Skip to content

AI-Enhanced MariaDB Jupyter Magic Commands with Vector RAG Integration

Notifications You must be signed in to change notification settings

RSN601KRI/MariaDB_Magic

Repository files navigation

AI-Enhanced MariaDB Jupyter Magic Commands with Vector RAG Integration🧠

Theme: Innovation

📘 About the Project

This project transforms MariaDB into an intelligent AI/ML platform by introducing Jupyter Magic Commands that seamlessly integrate vector operations, Retrieval-Augmented Generation (RAG) pipelines, and machine learning models directly with database workflows. Built entirely in Google Colab, it bridges the gap between traditional SQL analytics and modern AI inference, empowering developers and data scientists to perform complex AI tasks using intuitive, SQL-like commands within a familiar Jupyter environment.

🚀 Core Features

  • %%mariadb Magic Command: Executes SQL queries directly from Jupyter and returns results as Pandas DataFrames.
  • Automatic Visualisation: Generates charts using Matplotlib or Seaborn for numeric query results.
  • %%mariadb_vector Extension: Integrates Hugging Face embeddings with MariaDB Vector columns for semantic and similarity search.
  • RAG (Retrieval-Augmented Generation) Pipeline: Enables context-aware AI responses using database-driven document retrieval.
  • Google Colab Support: Complete setup and execution directly in Colab, requiring no local installation.
  • Reusable Module: Packaged code for quick integration in any AI or data workflow.

🧩 Dataset Information

This project uses the OpenFlights dataset as the primary data source.
You can load it in Colab with:

!wget https://raw.githubusercontent.com/MariaDB/openflights/main/airports.dat -O airports.csv
!wget https://raw.githubusercontent.com/MariaDB/openflights/main/routes.dat -O routes.csv

These datasets include information about airports, airlines, and global routes — perfect for demonstrating data querying, visualization, and vector-based similarity search (e.g., “Find airports similar to Delhi”).

⚙️ Setup Instructions (Google Colab)

  1. Install MariaDB and Dependencies

    !apt-get update
    !apt-get install -y mariadb-server
    !pip install mariadb sqlalchemy pandas matplotlib seaborn transformers
    !service mysql start
  2. Create and Load Database

    import pandas as pd, mariadb
    conn = mariadb.connect(user='root', password='', database='test_db')
    cursor = conn.cursor()
    cursor.execute("CREATE DATABASE IF NOT EXISTS test_db;")
  3. Run SQL Queries via Magic Command

    %%mariadb
    SELECT Country, COUNT(*) AS AirportCount
    FROM airports
    GROUP BY Country
    ORDER BY AirportCount DESC
    LIMIT 10;
  4. Visualize Results

    import matplotlib.pyplot as plt
    df.plot(kind='bar', x='Country', y='AirportCount')
    plt.title('Top 10 Countries by Number of Airports')
    plt.show()

Data Visualisation Insights

maria13 maria12 mariaaa Maria1 Maria2 maria3 maria8 maria9 maria10 maria11

🧠 AI / Vector Integration (Optional Advanced Feature)

Use %%mariadb_vector to generate and store embeddings from text fields (e.g., airport names or cities) using the Hugging Face model sentence-transformers/all-MiniLM-L6-v2, enabling semantic similarity and RAG-based querying.

📊 Additional Sample Datasets

This directory includes a few sample datasets to get you started:

  • california_housing_data*.csv – California housing data from the 1990 US Census More info: Census Dataset Documentation

  • mnist_*.csv – A small sample of the MNIST handwritten digits dataset Description: Yann LeCun’s MNIST Page

  • anscombe.json – Contains a copy of Anscombe’s Quartet Source: Anscombe, F. J. (1973). Graphs in Statistical Analysis. American Statistician, 27(1): 17–21. JSTOR 2682899. Prepared via the vega_datasets library.

🧩 Demo Notebook

A ready-to-use Colab demo notebook includes:

  • Environment setup
  • Dataset loading
  • Example SQL queries and plots
  • AI/Vector and RAG integration demos

➡️ Open in Colab: Open In Colab

📚 License

This project is open-source and released under the MIT License.

About

AI-Enhanced MariaDB Jupyter Magic Commands with Vector RAG Integration

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published