agentic-db-query

Natural language → SQL TypeScript agent — install dependencies, point at a database, ask questions in plain English.

What is this?

agentic-db-query is a TypeScript agent that turns natural language questions into SQL, validates execution safety, runs the query, and returns rows plus a short explanation. It loads your live schema (tables, columns, types, foreign keys, optional sample values), uses an LLM via the Vercel AI SDK, and retries with error feedback when the database rejects generated SQL.

Typical uses: internal BI helpers, support dashboards, admin lookups, prototyping queries, teaching SQL by example.

Features

Feature	Description
Schema-aware text-to-SQL	Injects table/column definitions and relationships into the model prompt.
Safe execution	Read-only mode by default; row cap via `LIMIT`; query timeout (PostgreSQL `statement_timeout`).
Self-healing	On validation or execution failure, the agent retries with the error message (configurable attempts).
Session context	Optional `sessionId` keeps a short transcript for follow-up questions.
Streaming	`queryStream()` yields SQL fragments, then rows, then a done event.
Explain mode	`explain()` returns SQL + explanation without executing.
Databases	PostgreSQL, SQLite, MySQL (via `mysql2`).

Requirements

Node.js 22+
A reachable database (local or remote)
An API key for OpenAI or Anthropic (used by the model string, see below)

Installation

cd agentic-db-query
npm install

Run npm run build before npm start (the compiled API loads from dist/). For development, npm run dev runs TypeScript directly.

Configuration

Copy .env.example to .env and set:

Variable	Description
`DATABASE_URL`	Connection string (see below).
`OPENAI_API_KEY`	Required if you use `openai/...` models.
`ANTHROPIC_API_KEY`	Required if you use `anthropic/...` models.
`MODEL`	Default: `openai/gpt-4o-mini`. Format: `provider/model-id`.
`PORT`	HTTP port for `npm run dev` / `npm start` (default `3000`; must be ≤ 65535).
`READ_ONLY`	Set to `false` to disable read-only SQL checks (default `true`; use with care).
`MAX_ROWS`	Hard cap appended as `LIMIT` when missing (default `1000`).
`QUERY_TIMEOUT_MS`	Query timeout in ms (default `30000`).
`MAX_RETRIES`	Max LLM generate/execute attempts for `POST /query` (default `3`).
`INCLUDE_SAMPLE_VALUES`	Set to `true` to add per-column sample values to the schema prompt (default off; slower on large DBs).

Database URLs

PostgreSQL: postgresql://user:pass@localhost:5432/mydb
SQLite: sqlite:./data.db or file:./data.db (path after the prefix)
MySQL: mysql://user:pass@localhost:3306/mydb

Quick start (library)

import 'dotenv/config';
import { DatabaseAgent } from './src/index.js';

const agent = new DatabaseAgent({
  connectionString: process.env.DATABASE_URL!,
  model: process.env.MODEL ?? 'openai/gpt-4o-mini',
  readOnly: true,
  maxRows: 100,
});

const result = await agent.query('How many users joined in the last 30 days?');

console.log(result.sql);
console.log(result.results);
console.log(result.explanation);

await agent.close();

Explain only (no execution)

Returns generated SQL and explanation without running it. In read-only mode, the SQL is still validated the same way as for query() (mutating SQL is rejected).

const { sql, explanation } = await agent.explain('Show me top products by revenue');

Session memory

await agent.query('How many users do we have?', { sessionId: 'user-123' });
await agent.query('List their email addresses', { sessionId: 'user-123' });

Domain hints

agent.addHint('When asked for revenue, use orders.total_amount');

Schema filter

const agent = new DatabaseAgent({
  connectionString: process.env.DATABASE_URL!,
  model: 'openai/gpt-4o-mini',
  schemaFilter: (table) => !table.startsWith('_internal'),
});

Streaming

queryStream() emits partial SQL while the model generates it, then rows and a final done event. Unlike query(), it does not retry on execution errors (you get a single error event).

for await (const event of agent.queryStream('Show first 10 rows from users')) {
  if (event.type === 'sql') console.log(event.sql);
  if (event.type === 'row') console.log(event.row);
  if (event.type === 'done') console.log(event.explanation, event.rowCount);
  if (event.type === 'error') console.error(event.message);
}

HTTP API

npm run dev

GET /health — liveness check
POST /query — JSON body: { "question": string, "sessionId?: string }
Returns: { sql, results, explanation, rowCount, executionTimeMs, retries? }

Example:

curl -s -X POST http://localhost:3000/query \
  -H 'Content-Type: application/json' \
  -d '{"question":"List tables"}'

CLI

npm run cli

Interactive prompts; uses DATABASE_URL and MODEL from .env.

Docker

export OPENAI_API_KEY=sk-...
docker compose up --build

Services: PostgreSQL 16 and the API on port 3000. Ensure the database has data/schema you expect before querying.

How it works

Load schema — introspect tables, columns, types, foreign keys (optional row samples).
Build prompt — system rules + formatted schema + optional hints and session context.
Generate SQL — structured output (sql + explanation) via generateObject.
Validate — in read-only mode, allow a single SELECT / WITH ... SELECT and block DML/DDL patterns.
Execute — run with timeout and automatic LIMIT when absent.
Retry — on failure, feed the database error back into the next attempt.

API reference

`DatabaseAgentConfig`

Field	Type	Notes
`connectionString`	`string`	Required.
`model`	`string \| LanguageModel`	e.g. `openai/gpt-4o-mini` or a Vercel AI SDK model instance.
`readOnly`	`boolean`	Default `true`.
`maxRows`	`number`	Default `1000`.
`queryTimeoutMs`	`number`	Default `30000`.
`maxRetries`	`number`	Default `3`; max generation/execute attempts for `query()` (each failure feeds back to the model).
`includeSampleValues`	`boolean`	Default `false`; adds per-column samples to the prompt.
`schemaFilter`	`(table: string) => boolean`	Exclude tables from schema injection.
`hints`	`string[]`	Initial domain hints (same as `addHint`).

`QueryResult`

Field	Type
`sql`	`string`
`results`	`Record<string, unknown>[]`
`explanation`	`string`
`rowCount`	`number`
`executionTimeMs`	`number`
`retries`	`number` (optional)

Scripts

Command	Description
`npm run build`	Compile TypeScript to `dist/`.
`npm start`	Run compiled API (`node dist/api/server.js`; run `npm run build` first).
`npm run dev`	Run API with `tsx` watch.
`npm run cli`	Interactive CLI.
`npm test`	All automated tests (`tests/agent.integration.test.ts` is skipped unless `OPENAI_API_KEY` is set).
`npm run test:integration`	Run only `tests/agent.integration.test.ts` (faster when iterating on the LLM test).
`npm run test:watch`	Vitest in watch mode.

prepack runs npm run build automatically before npm pack / npm publish, so dist/ is fresh for the tarball.

Project layout

agentic-db-query/
├── src/
│   ├── DatabaseAgent.ts
│   ├── db/
│   │   ├── dialect.ts
│   │   └── runner.ts
│   ├── schema/
│   │   ├── extractor.ts
│   │   └── formatter.ts
│   ├── sql/
│   │   ├── validator.ts
│   │   ├── executor.ts
│   │   └── result-formatter.ts
│   ├── prompts/
│   │   └── text-to-sql.ts
│   ├── llm/
│   │   └── model.ts
│   ├── api/
│   │   └── server.ts
│   ├── cli/
│   │   └── index.ts
│   └── index.ts
├── tests/
├── .gitignore
├── .env.example
├── docker-compose.yml
├── Dockerfile
├── vitest.config.ts
├── package.json
├── tsconfig.json
├── LICENSE
└── README.md

Limitations

Read-only validation uses heuristics (not a full SQL parser). Rare edge cases can include identifiers that resemble keywords.
Query timeouts are applied via PostgreSQL statement_timeout and MySQL’s query timeout. SQLite does not apply a server-side timeout; maxRows still caps result size.

Security notes

Keep read-only enabled for untrusted users.
The LLM can still produce expensive queries; use timeouts and row limits.
Run against least-privilege DB users in production.
This tool is not a substitute for parameterized app queries in application code; it is for controlled analytics/admin use.

License

This project is licensed under the MIT License — see LICENSE.

Contact Info

Telegram: @AuraTerminal

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

agentic-db-query

What is this?

Features

Requirements

Installation

Configuration

Database URLs

Quick start (library)

Explain only (no execution)

Session memory

Domain hints

Schema filter

Streaming

HTTP API

CLI

Docker

How it works

API reference

`DatabaseAgentConfig`

`QueryResult`

Scripts

Project layout

Limitations

Security notes

License

Contact Info

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
src		src
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
vitest.config.ts		vitest.config.ts

Folders and files

Latest commit

History

Repository files navigation

agentic-db-query

What is this?

Features

Requirements

Installation

Configuration

Database URLs

Quick start (library)

Explain only (no execution)

Session memory

Domain hints

Schema filter

Streaming

HTTP API

CLI

Docker

How it works

API reference

DatabaseAgentConfig

QueryResult

Scripts

Project layout

Limitations

Security notes

License

Contact Info

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`DatabaseAgentConfig`

`QueryResult`

Packages