94 changes: 94 additions & 0 deletions docs/api/query-filtering.md
@@ -0,0 +1,94 @@
# Query Filtering Framework

## Overview

A composable, reusable filtering system for common OMOP query patterns. This framework provides type-safe, extensible filters that can be applied to SQLAlchemy Select statements, enabling consistent query construction across different use cases.

## Implementation

### New Module Structure
```
omop_alchemy/cdm/query/
├── __init__.py # Public exports
└── filters.py # BaseConceptFilter, ConceptFilter
```

### Key Components

#### `BaseConceptFilter` (Abstract Base)
- Abstract base class (`abc.ABC`) for all concept-based filters
- Requires subclasses to implement `apply(query: Select) -> Select`
- New filter types are added by subclassing it
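
A minimal sketch of what the abstract base enforces, using only the standard library. `SketchBaseFilter`, `IncompleteFilter`, and `PassThroughFilter` are hypothetical stand-ins (they are not part of `omop_alchemy`), and the query is left untyped so the snippet runs without SQLAlchemy:

```python
from abc import ABC, abstractmethod


class SketchBaseFilter(ABC):
    """Stand-in for BaseConceptFilter (hypothetical, for illustration only)."""

    @abstractmethod
    def apply(self, query):
        """Return the query with this filter's constraints added."""


class IncompleteFilter(SketchBaseFilter):
    """Subclass that forgets to implement apply()."""


class PassThroughFilter(SketchBaseFilter):
    """Smallest valid subclass: returns the query unchanged."""

    def apply(self, query):
        return query


# Instantiating a subclass without apply() fails at construction time.
try:
    IncompleteFilter()
    instantiation_error = None
except TypeError as exc:
    instantiation_error = exc

passthrough_result = PassThroughFilter().apply("SELECT 1")
```

The error surfaces when the incomplete filter is created, not later when `apply` is first called.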

#### `ConceptFilter`
- Frozen dataclass, so instances are immutable
- Supports filtering by `concept_ids`, `domains`, `vocabularies`, and `require_standard`; an optional `limit` caps the number of rows returned
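
The sketch below illustrates what the frozen dataclass buys in practice. `SketchConceptFilter` is a hypothetical stand-in that copies only the field names listed above, so the snippet runs without `omop_alchemy` installed:

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class SketchConceptFilter:
    """Stand-in mirroring the ConceptFilter fields (hypothetical)."""

    concept_ids: Optional[Tuple[int, ...]] = None
    domains: Optional[Tuple[str, ...]] = None
    vocabularies: Optional[Tuple[str, ...]] = None
    require_standard: bool = False


base = SketchConceptFilter(domains=("Condition",), require_standard=True)

# Frozen instances reject attribute assignment.
try:
    base.domains = ("Drug",)
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

# dataclasses.replace builds a modified copy and leaves the original intact.
drug_variant = replace(base, domains=("Drug",))

# Frozen dataclasses with hashable fields are hashable, so equal filters
# can serve as dictionary or cache keys.
cache = {base: "cached result"}
same_key_hit = cache[SketchConceptFilter(domains=("Condition",), require_standard=True)]
```

Using tuples rather than lists for the collection fields is what keeps the instances hashable.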

### Export Paths
```python
# From query submodule
from omop_alchemy.cdm.query import ConceptFilter, BaseConceptFilter

# From CDM package (convenience)
from omop_alchemy.cdm import ConceptFilter, BaseConceptFilter
```

## Usage

### Basic Filtering

```python
from sqlalchemy import select
from omop_alchemy.cdm.query import ConceptFilter
from omop_alchemy.cdm.model.vocabulary import Concept

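# Single filter (named concept_filter to avoid shadowing the built-in filter())
concept_filter = ConceptFilter(
    domains=("Condition", "Drug"),
    require_standard=True,
)

query = select(Concept)
filtered_query = concept_filter.apply(query)

# `session` is assumed to be an existing SQLAlchemy Session bound to an OMOP database
results = session.execute(filtered_query).all()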
```
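
To check the filtering semantics without an OMOP database, the following sketch runs the same kind of constraints against an in-memory SQLite table. Everything here is an assumption made for illustration: the `concept` table is a stand-in with only the columns needed, `SketchConceptFilter` is a reduced copy of the real filter, and the rows are invented. It assumes SQLAlchemy 1.4 or later.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

from sqlalchemy import (
    Column, Integer, MetaData, String, Table, create_engine, insert, select,
)
from sqlalchemy.sql import Select

# Stand-in for the OMOP concept table (hypothetical, reduced column set).
metadata = MetaData()
concept = Table(
    "concept",
    metadata,
    Column("concept_id", Integer, primary_key=True),
    Column("domain_id", String),
    Column("standard_concept", String, nullable=True),
)


@dataclass(frozen=True)
class SketchConceptFilter:
    """Reduced stand-in for ConceptFilter (hypothetical)."""

    domains: Optional[Tuple[str, ...]] = None
    require_standard: bool = False

    def apply(self, query: Select) -> Select:
        if self.domains is not None:
            query = query.where(concept.c.domain_id.in_(self.domains))
        if self.require_standard:
            query = query.where(concept.c.standard_concept.in_(["S", "C"]))
        return query


engine = create_engine("sqlite://")  # in-memory database
metadata.create_all(engine)
with engine.begin() as conn:
    conn.execute(
        insert(concept),
        [
            {"concept_id": 1, "domain_id": "Condition", "standard_concept": "S"},
            {"concept_id": 2, "domain_id": "Drug", "standard_concept": None},
            {"concept_id": 3, "domain_id": "Measurement", "standard_concept": "S"},
            {"concept_id": 4, "domain_id": "Drug", "standard_concept": "C"},
        ],
    )

stmt = SketchConceptFilter(
    domains=("Condition", "Drug"), require_standard=True
).apply(select(concept.c.concept_id))

with engine.connect() as conn:
    matched_ids = sorted(row[0] for row in conn.execute(stmt))
```

Each set field adds one `WHERE` clause, and the clauses are combined with `AND`. Row 2 is dropped because a `NULL` `standard_concept` never matches the `IN ('S', 'C')` test, and row 3 is dropped by the domain constraint.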

## Benefits

- **Composable**: Chain multiple filters or apply each one individually
- **Type-Safe**: Frozen dataclasses with type-annotated, optional fields
- **Reusable**: Single implementation shared across different contexts
- **Extensible**: Subclassing `BaseConceptFilter` enables new filter types
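
Because every filter exposes the same `apply` method, chaining amounts to folding a sequence of filters over a query. The sketch below shows that fold with a hypothetical `CompositeFilter` helper, which is not part of the framework. To stay free of database dependencies, the query is a stand-in tuple of clause strings; with real filters the same fold would be applied to a SQLAlchemy `Select`.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Tuple

# Stand-in query type: a tuple of WHERE-clause strings (assumption for the sketch).
Query = Tuple[str, ...]


@dataclass(frozen=True)
class ClauseFilter:
    """Toy filter that appends one clause to the stand-in query."""

    clause: str

    def apply(self, query: Query) -> Query:
        return query + (self.clause,)


@dataclass(frozen=True)
class CompositeFilter:
    """Hypothetical helper that applies its filters in order."""

    filters: Tuple[ClauseFilter, ...]

    def apply(self, query: Query) -> Query:
        # Each filter receives the output of the previous one.
        return reduce(lambda q, f: f.apply(q), self.filters, query)


combined = CompositeFilter(
    (
        ClauseFilter("domain_id IN ('Drug')"),
        ClauseFilter("standard_concept IN ('S', 'C')"),
    )
)
clauses = combined.apply(())
unchanged = CompositeFilter(()).apply(("existing clause",))
```

An empty composite returns the query untouched, so it can act as a neutral default wherever a filter is optional.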

## Extensibility

New filter types are added by subclassing `BaseConceptFilter` and implementing `apply`:

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from sqlalchemy.sql import Select
from omop_alchemy.cdm.query.filters import BaseConceptFilter

@dataclass(frozen=True)
class DomainSpecificFilter(BaseConceptFilter):
    """Custom filter for specialized concept querying."""

    custom_constraint: Optional[Tuple[str, ...]] = None

    def apply(self, query: Select) -> Select:
        # Implement domain-specific filtering logic
        if self.custom_constraint is not None:
            query = query.where(...)  # Your constraint logic
        return query
```

### Potential Filter Types

While `ConceptFilter` covers the core concept-filtering patterns, the `BaseConceptFilter` base class supports domain-specific extensions:

- **Measurement filters** — Unit types, operator constraints (>, <, =)
- **Relationship filters** — Hierarchical traversal, predicate types, depth bounds
- **Temporal filters** — Valid date ranges, versioning constraints
- **Vocabulary-specific filters** — Code patterns, classification hierarchies
- **Domain composition filters** — Multi-table filtering across clinical events
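
As one illustration of the temporal case, the sketch below keeps only concepts valid on a given date, using the `valid_start_date` and `valid_end_date` columns of the OMOP concept table. `ValidOnDateFilter` is hypothetical, the `concept` table is a reduced stand-in, and the rows are invented. In the real framework the class would subclass `BaseConceptFilter` and reference the ORM `Concept` model instead. It assumes SQLAlchemy 1.4 or later.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

from sqlalchemy import (
    Column, Date, Integer, MetaData, Table, create_engine, insert, select,
)
from sqlalchemy.sql import Select

# Stand-in for the OMOP concept table (hypothetical, reduced column set).
metadata = MetaData()
concept = Table(
    "concept",
    metadata,
    Column("concept_id", Integer, primary_key=True),
    Column("valid_start_date", Date),
    Column("valid_end_date", Date),
)


@dataclass(frozen=True)
class ValidOnDateFilter:
    """Hypothetical temporal filter: keep concepts valid on ``as_of``."""

    as_of: Optional[date] = None

    def apply(self, query: Select) -> Select:
        if self.as_of is not None:
            query = query.where(concept.c.valid_start_date <= self.as_of)
            query = query.where(concept.c.valid_end_date >= self.as_of)
        return query


engine = create_engine("sqlite://")  # in-memory database
metadata.create_all(engine)
with engine.begin() as conn:
    conn.execute(
        insert(concept),
        [
            {"concept_id": 1, "valid_start_date": date(1970, 1, 1),
             "valid_end_date": date(2099, 12, 31)},
            {"concept_id": 2, "valid_start_date": date(1970, 1, 1),
             "valid_end_date": date(2015, 12, 31)},
            {"concept_id": 3, "valid_start_date": date(2021, 1, 1),
             "valid_end_date": date(2099, 12, 31)},
        ],
    )

stmt = ValidOnDateFilter(as_of=date(2020, 6, 1)).apply(select(concept.c.concept_id))
with engine.connect() as conn:
    valid_ids = sorted(row[0] for row in conn.execute(stmt))
```

Concept 2 has expired before the reference date and concept 3 has not yet started, so only concept 1 remains.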

13 changes: 13 additions & 0 deletions omop_alchemy/cdm/__init__.py
@@ -0,0 +1,13 @@
"""
OMOP Common Data Model ORM and utilities.

This package provides SQLAlchemy ORM mappings for the OMOP CDM, along with
configuration, querying helpers, and filtering utilities.
"""

from .query import ConceptFilter, BaseConceptFilter

__all__ = [
    "ConceptFilter",
    "BaseConceptFilter",
]
19 changes: 19 additions & 0 deletions omop_alchemy/cdm/query/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
"""
Query filtering framework for OMOP Alchemy.

Provides composable, reusable filters for common query patterns across
OMOP concepts and vocabularies. Filters can be combined and applied to
SQLAlchemy Select statements, enabling consistent filtering logic across
different projects (omop-emb, omop-graph, etc.) without circular imports.
"""

from .filters import (
    ConceptFilter,
    BaseConceptFilter,
)


__all__ = [
    "ConceptFilter",
    "BaseConceptFilter",
]
128 changes: 128 additions & 0 deletions omop_alchemy/cdm/query/filters.py
@@ -0,0 +1,128 @@
"""
Concept filtering for OMOP Concept table queries.

Provides a composable filter that applies constraints to SQLAlchemy queries
targeting the OMOP Concept table. This unified implementation is used by both
omop-emb and omop-graph to avoid code duplication and circular imports.
"""

from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional, Tuple

from sqlalchemy.sql import Select

from omop_alchemy.cdm.model.vocabulary import Concept


class BaseConceptFilter(ABC):
    """
    Abstract base for filters that can be applied to Concept queries.

    Subclasses implement the ``apply`` method to modify a SQLAlchemy Select
    statement with domain-specific constraints.
    """

    @abstractmethod
    def apply(self, query: Select) -> Select:
        """
        Apply filter constraints to a SQLAlchemy Select statement.

        Parameters
        ----------
        query : Select
            A SQLAlchemy Select statement, typically targeting the Concept table.

        Returns
        -------
        Select
            The modified Select statement with filter constraints appended.
        """
        pass


@dataclass(frozen=True)
class ConceptFilter(BaseConceptFilter):
    """
    Unified filter for OMOP Concept table queries.

    Consolidates filtering logic previously duplicated in omop-emb
    (EmbeddingConceptFilter) and omop-graph (SearchConstraintConcept).
    This filter can be used by any project that needs to constrain
    Concept queries by domain, vocabulary, concept IDs, or standardization status.

    Parameters
    ----------
    concept_ids : tuple[int, ...], optional
        A tuple of OMOP Concept IDs to filter by.
        If None, no concept ID filtering is applied.
    domains : tuple[str, ...], optional
        A tuple of OMOP Domain IDs to filter by (e.g., ('Condition', 'Drug')).
        If None, no domain filtering is applied.
    vocabularies : tuple[str, ...], optional
        A tuple of OMOP Vocabulary IDs to filter by (e.g., ('SNOMED', 'RxNorm')).
        If None, no vocabulary filtering is applied.
    require_standard : bool, optional
        If True, restricts results to standard ('S') or classification ('C') concepts.
        Default is False.
    limit : int, optional
        Maximum number of rows to return.
        If None, no limit is added and any limit already set on the query is kept.

    Examples
    --------
    >>> from omop_alchemy.cdm.query import ConceptFilter
    >>> from sqlalchemy import select
    >>> from omop_alchemy.cdm.model.vocabulary import Concept
    >>>
    >>> # Filter for conditions and drugs in SNOMED and RxNorm
    >>> concept_filter = ConceptFilter(
    ...     domains=("Condition", "Drug"),
    ...     vocabularies=("SNOMED", "RxNorm"),
    ...     require_standard=True
    ... )
    >>>
    >>> query = select(Concept)
    >>> filtered_query = concept_filter.apply(query)

    Notes
    -----
    - All parameters are optional; a constraint is added only when its parameter
      is set (not None, or True for `require_standard`).
    - The `require_standard` flag keeps both 'S' (Standard) and 'C' (Classification)
      concepts, so that curated classification concepts, which are not standard,
      are still returned.
    - Filters are composable with SQLAlchemy's native query building.
    """

    concept_ids: Optional[Tuple[int, ...]] = field(default=None)
    domains: Optional[Tuple[str, ...]] = field(default=None)
    vocabularies: Optional[Tuple[str, ...]] = field(default=None)
    require_standard: bool = False
    limit: Optional[int] = None

    def apply(self, query: Select) -> Select:
        """
        Apply the filter constraints to a SQLAlchemy Select statement.

        Parameters
        ----------
        query : Select
            The SQLAlchemy Select statement targeting the Concept table.

        Returns
        -------
        Select
            The modified Select statement with WHERE clauses and, if ``limit``
            is set, a LIMIT appended.
        """
        if self.concept_ids is not None:
            query = query.where(Concept.concept_id.in_(self.concept_ids))

        if self.domains is not None:
            query = query.where(Concept.domain_id.in_(self.domains))

        if self.vocabularies is not None:
            query = query.where(Concept.vocabulary_id.in_(self.vocabularies))

        if self.require_standard:
            # Filters for 'S' (Standard) or 'C' (Classification)
            query = query.where(Concept.standard_concept.in_(["S", "C"]))

        if self.limit is not None:
            # Only add LIMIT when requested: limit(None) would clear a limit
            # the caller had already set on the incoming query.
            query = query.limit(self.limit)

        return query