
rdf4j-python


A modern Python client for the Eclipse RDF4J framework, enabling seamless RDF data management and SPARQL operations from Python applications.

rdf4j-python bridges the gap between Python and the robust Eclipse RDF4J ecosystem, providing a clean, async-first API for managing RDF repositories, executing SPARQL queries, and handling semantic data with ease.

Features

  • Async-First Design: Native support for async/await with synchronous fallback
  • Repository Management: Create, access, and manage RDF4J repositories programmatically
  • SPARQL Support: Execute SELECT, ASK, CONSTRUCT, and UPDATE queries effortlessly
  • SPARQL Query Builder: Fluent, programmatic query construction with method chaining
  • Transaction Support: Atomic operations with commit/rollback and isolation levels
  • Flexible Data Handling: Add, retrieve, and manipulate RDF triples and quads
  • File Upload: Upload RDF files (Turtle, N-Triples, N-Quads, RDF/XML, JSON-LD, TriG, N3) directly to repositories
  • Multiple Formats: Support for various RDF serialization formats
  • Repository Types: Memory stores, native stores, HTTP repositories, and more
  • Named Graph Support: Work with multiple graphs within repositories
  • Inferencing: Built-in support for RDFS and custom inferencing rules

Installation

Prerequisites

  • Python 3.11 or higher
  • RDF4J Server (for remote repositories) or embedded usage

Install from PyPI

pip install rdf4j-python

Install with Optional Dependencies

# Include SPARQLWrapper integration
pip install "rdf4j-python[sparqlwrapper]"

Development Installation

git clone https://github.com/odysa/rdf4j-python.git
cd rdf4j-python
uv sync --group dev

Usage

Quick Start

import asyncio
from rdf4j_python import AsyncRdf4j
from rdf4j_python.model.repository_config import RepositoryConfig, MemoryStoreConfig, SailRepositoryConfig
from rdf4j_python.model.term import IRI, Literal

async def main():
    # Connect to RDF4J server
    async with AsyncRdf4j("http://localhost:19780/rdf4j-server") as db:
        # Create an in-memory repository
        config = RepositoryConfig(
            repo_id="my-repo",
            title="My Repository",
            impl=SailRepositoryConfig(sail_impl=MemoryStoreConfig(persist=False))
        )
        repo = await db.create_repository(config=config)
        
        # Add some data
        await repo.add_statement(
            IRI("http://example.com/person/alice"),
            IRI("http://xmlns.com/foaf/0.1/name"),
            Literal("Alice")
        )
        
        # Query the data
        results = await repo.query("SELECT * WHERE { ?s ?p ?o }")
        for result in results:
            print(f"Subject: {result['s']}, Predicate: {result['p']}, Object: {result['o']}")

if __name__ == "__main__":
    asyncio.run(main())

SPARQL Query Builder

Build queries programmatically with method chaining instead of writing raw SPARQL strings:

from rdf4j_python import select, ask, construct, describe, GraphPattern, Namespace

ex = Namespace("ex", "http://example.org/")
foaf = Namespace("foaf", "http://xmlns.com/foaf/0.1/")

# SELECT with typed terms — IRIs serialize automatically
query = (
    select("?person", "?name")
    .where("?person", "a", ex.Person)
    .where("?person", foaf.name, "?name")
    .optional("?person", foaf.email, "?email")
    .filter("?name != 'Bob'")
    .order_by("?name")
    .limit(10)
    .build()
)

# Or use string-based prefixed names
query = (
    select("?name")
    .prefix("foaf", "http://xmlns.com/foaf/0.1/")
    .where("?person", "a", "foaf:Person")
    .where("?person", "foaf:name", "?name")
    .build()
)

# GROUP BY with aggregation
query = (
    select("?city", "(COUNT(?person) AS ?count)")
    .where("?person", ex.city, "?city")
    .group_by("?city")
    .having("COUNT(?person) > 1")
    .order_by("DESC(?count)")
    .build()
)

# ASK, CONSTRUCT, and DESCRIBE
ask_query = ask().where("?s", ex.name, "?name").build()

construct_query = (
    construct(("?s", ex.fullName, "?name"))
    .where("?s", ex.firstName, "?fname")
    .where("?s", ex.lastName, "?lname")
    .bind("CONCAT(?fname, ' ', ?lname)", "?name")
    .build()
)

describe_query = describe(ex.alice).build()

The query builder supports FILTER, OPTIONAL, UNION, BIND, VALUES, sub-queries, DISTINCT, ORDER BY, GROUP BY, HAVING, LIMIT, and OFFSET. Both raw strings and typed objects (IRI, Variable, Literal, Namespace) work as terms.

Working with Multiple Graphs

from rdf4j_python import AsyncRdf4j
from rdf4j_python.model.term import IRI, Literal, Quad

async def multi_graph_example():
    async with AsyncRdf4j("http://localhost:19780/rdf4j-server") as db:
        repo = await db.get_repository("my-repo")
        
        # Add data to specific graphs
        statements = [
            Quad(
                IRI("http://example.com/person/bob"),
                IRI("http://xmlns.com/foaf/0.1/name"),
                Literal("Bob"),
                IRI("http://example.com/graph/people")
            ),
            Quad(
                IRI("http://example.com/person/bob"),
                IRI("http://xmlns.com/foaf/0.1/age"),
                Literal("30", datatype=IRI("http://www.w3.org/2001/XMLSchema#integer")),
                IRI("http://example.com/graph/demographics")
            )
        ]
        await repo.add_statements(statements)
        
        # Query specific graph
        graph_query = """
        SELECT * WHERE {
            GRAPH <http://example.com/graph/people> {
                ?person ?property ?value
            }
        }
        """
        results = await repo.query(graph_query)
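Listing the named graphs themselves needs no extra API; a plain SPARQL query projecting `GRAPH ?g` works through the same `repo.query` call (a sketch, where `repo` stands for an open repository handle as in the example above):

```python
# Project the graph variable to enumerate the named graphs in a repository
LIST_GRAPHS_QUERY = """
SELECT DISTINCT ?g WHERE {
    GRAPH ?g { ?s ?p ?o }
}
"""

async def list_named_graphs(repo):
    # repo: an open repository handle, e.g. await db.get_repository("my-repo")
    results = await repo.query(LIST_GRAPHS_QUERY)
    return [row["g"] for row in results]
```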

Advanced Repository Configuration

Here's a more comprehensive example showing repository creation with different configurations:

from rdf4j_python import AsyncRdf4j
from rdf4j_python.model.repository_config import RepositoryConfig, MemoryStoreConfig, SailRepositoryConfig
from rdf4j_python.model.term import IRI, Literal, Quad

async def advanced_example():
    async with AsyncRdf4j("http://localhost:19780/rdf4j-server") as db:
        # Memory store with persistence
        persistent_config = RepositoryConfig(
            repo_id="persistent-repo",
            title="Persistent Memory Store",
            impl=SailRepositoryConfig(sail_impl=MemoryStoreConfig(persist=True))
        )
        
        # Create and populate repository
        repo = await db.create_repository(config=persistent_config)
        
        # Bulk data operations
        data = [
            (IRI("http://example.com/alice"), IRI("http://xmlns.com/foaf/0.1/name"), Literal("Alice")),
            (IRI("http://example.com/alice"), IRI("http://xmlns.com/foaf/0.1/email"), Literal("alice@example.com")),
            (IRI("http://example.com/bob"), IRI("http://xmlns.com/foaf/0.1/name"), Literal("Bob")),
        ]
        
        statements = [
            Quad(subj, pred, obj, IRI("http://example.com/default"))
            for subj, pred, obj in data
        ]
        await repo.add_statements(statements)
        
        # Query with the fluent query builder
        from rdf4j_python import select
        from rdf4j_python import Namespace

        foaf = Namespace("foaf", "http://xmlns.com/foaf/0.1/")
        query = (
            select("?name", "?email")
            .where("?person", foaf.name, "?name")
            .optional("?person", foaf.email, "?email")
            .order_by("?name")
            .build()
        )
        results = await repo.query(query)

Uploading RDF Files

import pyoxigraph as og

from rdf4j_python import AsyncRdf4j
from rdf4j_python.model.term import IRI

async def upload_example():
    async with AsyncRdf4j("http://localhost:19780/rdf4j-server") as db:
        repo = await db.get_repository("my-repo")

        # Upload a Turtle file (format auto-detected from extension)
        await repo.upload_file("data.ttl")

        # Upload to a specific named graph
        await repo.upload_file("data.ttl", context=IRI("http://example.com/graph"))

        # Upload with explicit format
        await repo.upload_file("data.txt", rdf_format=og.RdfFormat.N_TRIPLES)

        # Upload with base URI for relative URIs
        await repo.upload_file("data.ttl", base_uri="http://example.com/")

Using Transactions

from rdf4j_python import AsyncRdf4j, IsolationLevel
from rdf4j_python.model.term import IRI, Literal, Quad

async def transaction_example():
    async with AsyncRdf4j("http://localhost:19780/rdf4j-server") as db:
        repo = await db.get_repository("my-repo")

        # A previously added statement to remove inside the transaction
        old_quad = Quad(
            IRI("http://example.com/carol"),
            IRI("http://xmlns.com/foaf/0.1/name"),
            Literal("Carol"),
        )

        # Atomic operations with auto-commit/rollback
        async with repo.transaction() as txn:
            await txn.add_statements([
                Quad(IRI("http://example.com/alice"), IRI("http://xmlns.com/foaf/0.1/name"), Literal("Alice")),
                Quad(IRI("http://example.com/bob"), IRI("http://xmlns.com/foaf/0.1/name"), Literal("Bob")),
            ])
            await txn.delete_statements([old_quad])
            # Commits automatically on success, rolls back on exception

        # With specific isolation level
        async with repo.transaction(IsolationLevel.SERIALIZABLE) as txn:
            await txn.update("""
                DELETE { ?s <http://example.com/status> "draft" }
                INSERT { ?s <http://example.com/status> "published" }
                WHERE { ?s <http://example.com/status> "draft" }
            """)
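The rollback path can be exercised by raising inside the context manager: nothing added before the exception becomes visible. This is a sketch reusing only the calls shown above, where `repo` is an open repository handle:

```python
async def rollback_example(repo):
    # repo: an open repository handle, e.g. await db.get_repository("my-repo")
    from rdf4j_python.model.term import IRI, Literal, Quad

    try:
        async with repo.transaction() as txn:
            await txn.add_statements([
                Quad(IRI("http://example.com/eve"),
                     IRI("http://xmlns.com/foaf/0.1/name"),
                     Literal("Eve")),
            ])
            # Raising here aborts the transaction: the context manager
            # rolls back, so "Eve" is never committed to the repository
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass
```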

For more detailed examples, see the examples directory.

Development

Setting up Development Environment

  1. Clone the repository:

    git clone https://github.com/odysa/rdf4j-python.git
    cd rdf4j-python
  2. Install development dependencies:

    uv sync --group dev
  3. Start RDF4J Server (for integration tests):

    # Using Docker
    docker run -p 19780:8080 eclipse/rdf4j:latest
  4. Run tests:

    pytest tests/
  5. Run linting:

    ruff check .
    ruff format .

Project Structure

rdf4j_python/
├── _driver/          # Core async driver implementation
├── model/            # Data models and configurations
├── query/            # SPARQL query builder
├── exception/        # Custom exceptions
└── utils/            # Utility functions

examples/             # Usage examples
tests/                # Test suite
docs/                 # Documentation

Contributing

We welcome contributions! Here's how to get involved:

  1. Fork the repository on GitHub
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes and add tests
  4. Run the test suite to ensure everything works
  5. Commit your changes (git commit -m 'Add amazing feature')
  6. Push to your branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

Running Examples

# Make sure RDF4J server is running on localhost:19780
python examples/complete_workflow.py
python examples/query.py

License

This project is licensed under the BSD 3-Clause License. See the LICENSE file for details.

Copyright (c) 2025, Chengxu Bian

Support

  • Issues & Bug Reports: GitHub Issues
  • Documentation: docs/
  • Questions: Feel free to open a discussion or issue

If you find this project useful, please consider starring the repository!
