Implement CORE-01: Per-Tenant Routing Policies #9

Open
cto-new[bot] wants to merge 1 commit into main from feature/per-tenant-routing

cto-new Bot commented Apr 20, 2026

Contributor

This PR implements per-tenant routing policies by adding a routing_preference column to the database and updating the routing engine to respect these preferences during the scoring phase. It also adds API endpoints to get and update these settings.

Summary by Sourcery

Add support for per-tenant routing preferences and propagate them through authentication, routing policies, and scoring to influence provider selection.

New Features:

  • Expose tenant settings API endpoints to get and update a tenant's routing preference.
  • Store a routing_preference field on users in the database with a default of 'balanced'.
  • Include routing preference in the chat completion routing context so policies can tailor provider selection.

Enhancements:

  • Incorporate tenant routing preferences into routing directives and the scoring function to bias toward cost, balance, or performance.
  • Extend auth middleware to load routing preferences from Supabase and attach them to the request state for downstream use.

sourcery-ai Bot commented Apr 20, 2026

Reviewer's Guide

Implements per-tenant routing preferences that flow from Supabase user records through auth middleware into the routing engine and policy/scoring logic, plus adds tenant settings APIs and a new routing_preference column in the users table.

Sequence diagram for per-tenant routing preference in chat completions

sequenceDiagram
    actor Tenant
    participant ClientApp
    participant APIService
    participant AuthMiddleware
    participant Supabase
    participant RoutingEngine
    participant PolicyEngine
    participant Scorer

    Tenant->>ClientApp: Send chat request with API key
    ClientApp->>APIService: POST /v1/chat/completions
    APIService->>AuthMiddleware: Incoming request
    AuthMiddleware->>Supabase: Select api_keys and users(tier, routing_preference)
    Supabase-->>AuthMiddleware: user_id, tier, routing_preference
    AuthMiddleware->>APIService: Set request.state.user_id, tier, routing_preference

    APIService->>RoutingEngine: route(request, user_id, tier, routing_preference)
    RoutingEngine->>RoutingEngine: _prepare_context(..., routing_preference)
    RoutingEngine->>PolicyEngine: apply(context with tenant.routing_preference)
    PolicyEngine->>PolicyEngine: Set directive.routing_preference = tenant_pref
    PolicyEngine-->>RoutingEngine: ordered_providers, directive

    RoutingEngine->>Scorer: compute_expected_utility(..., directive)
    Scorer->>Scorer: Adjust weights based on directive.routing_preference
    Scorer-->>RoutingEngine: scores per provider
    RoutingEngine-->>APIService: ChatCompletionResponse
    APIService-->>ClientApp: Response to chat request

ER diagram for users table with routing_preference

erDiagram
    USERS {
        uuid id PK
        text email
        text tier
        text routing_preference
        timestamptz created_at
    }

Class diagram for routing context and tenant settings models

classDiagram
    class TenantSettingsRequest {
        +str routing_preference
    }

    class TenantSettingsResponse {
        +bool success
        +str routing_preference
    }

    class RequestContext {
        +ChatCompletionRequest request
        +WorkloadProfile workload_profile
        +ContextBundle compression
        +str user_id
        +int total_tokens
        +float schema_success_ratio
        +str tenant_tier
        +str routing_preference
        +RoutingDirective policy_directive
        +dict policy_context()
    }

    class RoutingDirective {
        +list~str~ prefer
        +list~str~ exclude
        +str require_hedging
        +bool human_gate
        +float policy_weight
        +str routing_preference
    }

    class RoutingPolicy {
        +list~RoutingRule~ rules
        +tuple~list~str~~ RoutingDirective~ apply(context, available_providers)
    }

    class RoutingEngine {
        +RequestContext _prepare_context(request, user_id, tier, routing_preference)
        +ChatCompletionResponse route(request, user_id, tier, routing_preference)
        +Any route_stream(request, user_id, tier, routing_preference)
    }

    class AuthMiddleware {
        +str api_key
        +dispatch(request, call_next)
        -_verify_token_supabase(token_hash) dict
    }

    RoutingEngine --> RequestContext : creates
    RoutingEngine --> RoutingPolicy : uses
    RoutingPolicy --> RoutingDirective : returns
    RoutingDirective <-- RequestContext : policy_directive
    AuthMiddleware --> TenantSettingsRequest : validates routing_preference via API
    AuthMiddleware --> TenantSettingsResponse : returns responses
    TenantSettingsRequest <.. TenantSettingsResponse : API models
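The relationship between RequestContext and the tenant section of policy_context can be sketched as follows. This is a trimmed, hypothetical version of the class: only the tenant-related fields are kept, and the dictionary keys ("tenant", "tier") and the helper tenant_preference are assumptions for illustration, not the PR's actual code.

```python
from dataclasses import dataclass


@dataclass
class RequestContext:
    """Trimmed, hypothetical stand-in for the routing context."""

    user_id: str
    tenant_tier: str = "free"
    routing_preference: str = "balanced"

    def policy_context(self):
        # Expose tenant attributes under one key so policy rules can read them.
        return {
            "tenant": {
                "tier": self.tenant_tier,
                "routing_preference": self.routing_preference,
            }
        }


def tenant_preference(context):
    """Read the tenant preference from a policy context, defaulting to balanced."""
    return context.get("tenant", {}).get("routing_preference", "balanced")
```

With this shape, a policy layer only needs the dictionary, so a context that lacks a tenant section still resolves to "balanced".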

File-Level Changes

Change | Details | Files
Plumb routing_preference from authentication into request handling and routing engine
  • Extend Supabase auth query to join users.routing_preference and return it alongside user_id and tier
  • Populate request.state.routing_preference for both admin and Supabase-authenticated users
  • Read routing_preference from request.state in the chat_completions handler with a default of 'balanced'
  • Pass routing_preference into engine.route and engine.route_stream
freerelay/middleware/auth.py
freerelay/main.py
Extend routing engine context and policy to carry tenant routing preference
  • Add routing_preference field to RequestContext and include it in the tenant section of policy_context
  • Update _prepare_context, route, and route_stream to accept a routing_preference argument and propagate it into RequestContext
  • Add routing_preference to RoutingDirective and default it from rules
  • In PolicyStore.apply, read tenant routing_preference from context, attach it to matched directives, and use it as the default directive preference when no rules match
freerelay/core/routing/engine.py
freerelay/core/routing/policy.py
Incorporate routing_preference into provider scoring
  • Derive a preference value from the RoutingDirective in compute_expected_utility
  • For cost-optimized preference, increase cost weighting and reduce latency/quality weighting via exponentiation
  • For performance-first preference, increase latency/quality weighting and reduce cost weighting via exponentiation
  • Retain balanced behavior as the default when no preference is set
freerelay/core/routing/scorer.py
Expose tenant routing preference via API models and endpoints
  • Introduce TenantSettingsRequest and TenantSettingsResponse Pydantic models
  • Add POST /v1/tenant/settings endpoint that validates routing_preference, updates the users.routing_preference field in Supabase, and returns the updated preference
  • Add GET /v1/tenant/settings endpoint that fetches users.routing_preference from Supabase for the authenticated user, defaulting to 'balanced' and handling missing records and errors
freerelay/shared/models/internal.py
freerelay/main.py
Persist routing preferences per tenant in the database schema
  • Add routing_preference column to users table with NOT NULL and default 'balanced'
  • Document allowed values for routing_preference in a comment
supabase_schema.sql

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

sourcery-ai Bot left a comment

Hey - I've found 2 issues, and left some high level feedback:

  • The set of allowed routing_preference values is hardcoded in multiple places (Pydantic model comment, validation in the settings endpoint, default values in policy/routing/auth); consider centralizing these as a shared enum/constant to avoid drift and typos.
  • The tenant settings endpoints are annotated with response_model=TenantSettingsResponse but sometimes return JSONResponse with error payloads; it may be clearer to use HTTPException or adjust response_model/return types so success and error responses are consistent with FastAPI’s typing.
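One way to centralize the allowed values, sketched under assumptions: the three strings come from the schema comment in this PR, while the names RoutingPreference, DEFAULT_ROUTING_PREFERENCE, and parse_routing_preference are hypothetical and do not appear in the diff.

```python
from enum import Enum


class RoutingPreference(str, Enum):
    """Allowed per-tenant routing preferences (values from the schema comment)."""

    COST_OPTIMIZED = "cost-optimized"
    BALANCED = "balanced"
    PERFORMANCE_FIRST = "performance-first"


DEFAULT_ROUTING_PREFERENCE = RoutingPreference.BALANCED


def parse_routing_preference(raw):
    """Return the matching enum member, or raise ValueError for unknown input."""
    try:
        return RoutingPreference(raw)
    except ValueError:
        allowed = ", ".join(p.value for p in RoutingPreference)
        raise ValueError(f"routing_preference must be one of: {allowed}") from None
```

An endpoint could catch the ValueError and raise FastAPI's HTTPException with a 4xx status, which would also keep error responses out of the response_model path.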
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The set of allowed routing_preference values is hardcoded in multiple places (Pydantic model comment, validation in the settings endpoint, default values in policy/routing/auth); consider centralizing these as a shared enum/constant to avoid drift and typos.
- The tenant settings endpoints are annotated with response_model=TenantSettingsResponse but sometimes return JSONResponse with error payloads; it may be clearer to use HTTPException or adjust response_model/return types so success and error responses are consistent with FastAPI’s typing.

## Individual Comments

### Comment 1
<location path="freerelay/core/routing/policy.py" line_range="87-88" />
<code_context>
             require_hedging=self.require_hedging,
             human_gate=self.human_gate,
             policy_weight=self.policy_weight,
+            routing_preference="balanced",  # Default, can be overridden by rule if we want
         )

</code_context>
<issue_to_address>
**issue (bug_risk):** Rule-level routing preference is effectively ignored/overridden by tenant preference in `apply`.

Because `apply` always overwrites `directive.routing_preference` with `tenant_pref`, any rule-specific routing preference is effectively ignored. If rule-level overrides are intended, only apply `tenant_pref` when the directive is still at its default (e.g., `"balanced"`). If tenant preference should always win, update the comment/default on `RoutingRule.directive` to match that behavior.
</issue_to_address>

### Comment 2
<location path="freerelay/core/routing/scorer.py" line_range="118-120" />
<code_context>
-    return success_prob * quality * schema * latency * cost * safety * policy_weight
+    # Adjust weights based on routing preference
+    pref = directive.routing_preference if directive else "balanced"
+    if pref == "cost-optimized":
+        # Square cost to make it more dominant, root latency and quality
+        cost = cost**1.5
+        latency = latency**0.5
+        quality = quality**0.5
</code_context>
<issue_to_address>
**nitpick:** The cost-optimized comment doesn't match the actual exponent used.

The inline description and the actual exponent don’t match: the comment says “square cost” but the code uses `cost**1.5`. Please update either the comment or the exponent so the intended weighting is clear to future readers.
</issue_to_address>

Comment on lines 87 to +88
policy_weight=self.policy_weight,
routing_preference="balanced", # Default, can be overridden by rule if we want

issue (bug_risk): Rule-level routing preference is effectively ignored/overridden by tenant preference in apply.

Because apply always overwrites directive.routing_preference with tenant_pref, any rule-specific routing preference is effectively ignored. If rule-level overrides are intended, only apply tenant_pref when the directive is still at its default (e.g., "balanced"). If tenant preference should always win, update the comment/default on RoutingRule.directive to match that behavior.
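The two precedence options described above can be made explicit in a small helper. This is a sketch, not the PR's code: Directive is a hypothetical stand-in for RoutingDirective, and it uses None (rather than "balanced") to mean that a rule did not set a preference, which is an assumption that avoids confusing an explicit "balanced" rule with the default.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_PREFERENCE = "balanced"


@dataclass
class Directive:
    """Hypothetical, trimmed stand-in for RoutingDirective."""

    # None means "the matched rule did not specify a preference".
    routing_preference: Optional[str] = None


def resolve_preference(directive, tenant_pref, rule_wins=True):
    """Pick the effective preference under an explicit precedence choice."""
    rule_pref = directive.routing_preference
    if rule_wins and rule_pref is not None:
        return rule_pref
    # Tenant preference first, then any rule value, then the global default.
    return tenant_pref or rule_pref or DEFAULT_PREFERENCE
```

Passing rule_wins=False reproduces the "tenant always wins" behavior; either choice is then visible at the call site instead of being implied by an overwrite.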

Comment on lines +118 to +120
if pref == "cost-optimized":
# Square cost to make it more dominant, root latency and quality
cost = cost**1.5

nitpick: The cost-optimized comment doesn't match the actual exponent used.

The inline description and the actual exponent don’t match: the comment says “square cost” but the code uses cost**1.5. Please update either the comment or the exponent so the intended weighting is clear to future readers.
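To see what the exponents do, here is a minimal sketch of the reweighting. It assumes each factor is a score in (0, 1] where higher is better; the cost-optimized exponents are the ones shown in the diff, while the performance-first exponents are an assumed mirror image because the PR excerpt does not show them. The function names are hypothetical.

```python
def apply_preference(cost, latency, quality, pref="balanced"):
    """Return (cost, latency, quality) reweighted for a routing preference."""
    if pref == "cost-optimized":
        # For scores below 1, an exponent above 1 lowers the score further
        # (stronger penalty) and an exponent below 1 pulls it toward 1.
        return cost ** 1.5, latency ** 0.5, quality ** 0.5
    if pref == "performance-first":
        # Assumed mirror of the branch above; not taken from the diff.
        return cost ** 0.5, latency ** 1.5, quality ** 1.5
    return cost, latency, quality


def utility(cost, latency, quality, pref="balanced"):
    """Multiply the reweighted factors, as in the product-form score."""
    c, lat, q = apply_preference(cost, latency, quality, pref)
    return c * lat * q
```

Under these assumptions, a provider that is cheap but slow overtakes a fast, expensive one only when pref is "cost-optimized". Note that an exponent of 1.5 is not squaring (that would be 2), which is the mismatch the nitpick points out.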

capy-ai Bot left a comment

Added 3 comments

Comment thread supabase_schema.sql
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
email TEXT UNIQUE NOT NULL,
tier TEXT NOT NULL DEFAULT 'free', -- 'free', 'bronze', 'silver', 'gold'
routing_preference TEXT NOT NULL DEFAULT 'balanced', -- 'cost-optimized', 'balanced', 'performance-first'

[🟠 High] [🔵 Bug]

This adds routing_preference only inside the bootstrap CREATE TABLE IF NOT EXISTS users block, which means existing Supabase deployments will never receive the new column, because PostgreSQL skips the whole statement once users already exists. Verified against @README.md:462, which instructs operators to run supabase_schema.sql, and against @freerelay/middleware/auth.py:36-50 and @freerelay/main.py:322-361, which now immediately read and write users.routing_preference; on an upgraded install those queries will fail due to the missing column. Add an explicit ALTER TABLE users ADD COLUMN IF NOT EXISTS routing_preference TEXT NOT NULL DEFAULT 'balanced' migration/backfill step before shipping the new readers.

-- supabase_schema.sql
CREATE TABLE IF NOT EXISTS users (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    email TEXT UNIQUE NOT NULL,
    tier TEXT NOT NULL DEFAULT 'free',
    routing_preference TEXT NOT NULL DEFAULT 'balanced',
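The skipped-statement behavior can be reproduced locally. The sketch below uses Python's built-in sqlite3 purely as a stand-in for PostgreSQL, which is an assumption made so the example runs without a database server. SQLite has no ADD COLUMN IF NOT EXISTS, so the existence check is done by hand; on PostgreSQL the single ALTER TABLE statement suggested above would cover it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# An existing deployment: users was created before routing_preference existed.
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, tier TEXT NOT NULL DEFAULT 'free')")
conn.execute("INSERT INTO users (id) VALUES ('u1')")

# Re-running the updated bootstrap statement is a no-op because the table exists.
conn.execute(
    "CREATE TABLE IF NOT EXISTS users ("
    "id TEXT PRIMARY KEY, "
    "tier TEXT NOT NULL DEFAULT 'free', "
    "routing_preference TEXT NOT NULL DEFAULT 'balanced')"
)


def column_names(connection, table):
    """List a table's column names via PRAGMA table_info."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


columns_after_bootstrap = column_names(conn, "users")

# An explicit migration adds the column; existing rows pick up the default.
if "routing_preference" not in columns_after_bootstrap:
    conn.execute(
        "ALTER TABLE users ADD COLUMN routing_preference TEXT NOT NULL DEFAULT 'balanced'"
    )

columns_after_migration = column_names(conn, "users")
backfilled = conn.execute(
    "SELECT routing_preference FROM users WHERE id = 'u1'"
).fetchone()[0]
```

After the bootstrap statement the column is still missing; it appears only after the explicit ALTER TABLE, and the pre-existing row reads back the default value.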

Comment thread freerelay/middleware/auth.py
 tier = str(users_data.get("tier", "free"))
-return {"user_id": user_id, "tier": tier}
+routing_preference = str(users_data.get("routing_preference", "balanced"))
+return {"user_id": user_id, "tier": tier, "routing_preference": routing_preference}

[🟠 High] [🔵 Bug]

_verify_token_supabase is wrapped in @lru_cache, and this PR adds routing_preference to the cached payload without adding any invalidation when /v1/tenant/settings updates users.routing_preference. After a tenant changes their preference, subsequent requests with the same API key will keep reusing the old cached value until process restart or cache eviction, so the new routing policy never takes effect even though the settings API returns success. Fix by not caching mutable tenant settings here, or by clearing/refreshing the cache when settings are updated.

# freerelay/middleware/auth.py
if isinstance(users_data, dict):
    tier = str(users_data.get("tier", "free"))
    routing_preference = str(users_data.get("routing_preference", "balanced"))
return {"user_id": user_id, "tier": tier, "routing_preference": routing_preference}
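The staleness described above is easy to reproduce with functools.lru_cache alone. In this sketch, USERS is a hypothetical in-memory stand-in for the Supabase users table, and verify_token and update_preference are illustrative names rather than the project's functions; the cache size is also an assumption.

```python
from functools import lru_cache

# Hypothetical in-memory stand-in for the Supabase users table.
USERS = {"hash-1": {"user_id": "u1", "tier": "free", "routing_preference": "balanced"}}


@lru_cache(maxsize=1024)
def verify_token(token_hash):
    """Stand-in for _verify_token_supabase: returns a snapshot of the user row."""
    return dict(USERS[token_hash])


def update_preference(token_hash, pref, invalidate):
    """Persist a new preference and optionally drop cached auth payloads."""
    USERS[token_hash]["routing_preference"] = pref
    if invalidate:
        verify_token.cache_clear()


first = verify_token("hash-1")["routing_preference"]

# Without invalidation the cached snapshot keeps the old value.
update_preference("hash-1", "cost-optimized", invalidate=False)
stale = verify_token("hash-1")["routing_preference"]

# Clearing the cache forces the next call to re-read the row.
update_preference("hash-1", "performance-first", invalidate=True)
fresh = verify_token("hash-1")["routing_preference"]
```

cache_clear() drops every cached entry, not just one tenant's, so a busier deployment might prefer a cache with per-key eviction or a short TTL, or might keep the mutable preference out of the cached payload altogether.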

Comment thread freerelay/main.py

try:
supabase = get_supabase_admin_client()
supabase.table("users").update(

[🟡 Medium] [🔵 Bug]

POST /v1/tenant/settings unconditionally returns success after the update call, and the paired GET path turns an empty result into 200 {success:false} instead of surfacing a missing tenant. This is reachable because authentication is cached in AuthMiddleware, so a token can remain authorized after its backing tenant row has been removed or the data becomes inconsistent; in that state the update matches zero rows, but the client still gets a success response even though nothing was persisted. Check the Supabase result and return 404/401 when no row matched before reporting success.

# freerelay/main.py
supabase.table("users").update(
    {"routing_preference": settings_req.routing_preference}
).eq("id", user_id).execute()

return TenantSettingsResponse(
    success=True,
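A possible shape for the missing check, sketched under an assumption: that the object returned by .execute() exposes the updated rows as a list on a data attribute. FakeResult, TenantNotFound, and confirm_update are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field


class TenantNotFound(Exception):
    """Raised when an update matched no users row."""


@dataclass
class FakeResult:
    """Hypothetical stand-in for the object returned by .execute()."""

    data: list = field(default_factory=list)


def confirm_update(result, user_id):
    """Return the persisted preference, or raise if no row was updated."""
    rows = getattr(result, "data", None) or []
    if not rows:
        raise TenantNotFound(f"no users row matched id={user_id}")
    return rows[0]["routing_preference"]
```

The endpoint could translate TenantNotFound into a 404 (or 401) response and build TenantSettingsResponse only from the value that confirm_update returns, so that success reflects what was actually stored.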
