Natural language → SQL TypeScript agent — install dependencies, point at a database, ask questions in plain English.
agentic-db-query is a TypeScript agent that turns natural language questions into SQL, validates execution safety, runs the query, and returns rows plus a short explanation. It loads your live schema (tables, columns, types, foreign keys, optional sample values), uses an LLM via the Vercel AI SDK, and retries with error feedback when the database rejects generated SQL.
Typical uses: internal BI helpers, support dashboards, admin lookups, prototyping queries, teaching SQL by example.
| Feature | Description |
|---|---|
| Schema-aware text-to-SQL | Injects table/column definitions and relationships into the model prompt. |
| Safe execution | Read-only mode by default; row cap via LIMIT; query timeout (PostgreSQL statement_timeout). |
| Self-healing | On validation or execution failure, the agent retries with the error message (configurable attempts). |
| Session context | Optional sessionId keeps a short transcript for follow-up questions. |
| Streaming | queryStream() yields SQL fragments, then rows, then a done event. |
| Explain mode | explain() returns SQL + explanation without executing. |
| Databases | PostgreSQL, SQLite, MySQL (via mysql2). |
- Node.js 22+
- A reachable database (local or remote)
- An API key for OpenAI or Anthropic (used by the model string, see below)
```bash
cd agentic-db-query
npm install
```

Run `npm run build` before `npm start` (the compiled API loads from `dist/`). For development, `npm run dev` runs TypeScript directly.
Copy `.env.example` to `.env` and set:

| Variable | Description |
|---|---|
| `DATABASE_URL` | Connection string (see below). |
| `OPENAI_API_KEY` | Required if you use `openai/...` models. |
| `ANTHROPIC_API_KEY` | Required if you use `anthropic/...` models. |
| `MODEL` | Default: `openai/gpt-4o-mini`. Format: `provider/model-id`. |
| `PORT` | HTTP port for `npm run dev` / `npm start` (default 3000; must be ≤ 65535). |
| `READ_ONLY` | Set to `false` to disable read-only SQL checks (default `true`; use with care). |
| `MAX_ROWS` | Hard cap appended as `LIMIT` when missing (default 1000). |
| `QUERY_TIMEOUT_MS` | Query timeout in ms (default 30000). |
| `MAX_RETRIES` | Max LLM generate/execute attempts for `POST /query` (default 3). |
| `INCLUDE_SAMPLE_VALUES` | Set to `true` to add per-column sample values to the schema prompt (default off; slower on large DBs). |
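The `MODEL` variable uses a `provider/model-id` format. A minimal sketch of how such a string can be split (a hypothetical helper for illustration; the actual resolution lives in `src/llm/model.ts` and may differ):

```typescript
// Split "provider/model-id" into its two parts. Split on the first "/"
// only, since model ids may themselves contain slashes in some ecosystems.
// Hypothetical helper; not part of the package's public API.
function parseModelString(model: string): { provider: string; modelId: string } {
  const slash = model.indexOf('/');
  if (slash <= 0 || slash === model.length - 1) {
    throw new Error(`Expected "provider/model-id", got: ${model}`);
  }
  return { provider: model.slice(0, slash), modelId: model.slice(slash + 1) };
}
```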
- PostgreSQL: `postgresql://user:pass@localhost:5432/mydb`
- SQLite: `sqlite:./data.db` or `file:./data.db` (path after the prefix)
- MySQL: `mysql://user:pass@localhost:3306/mydb`
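The connection-string prefix is what selects the database dialect. A sketch of how such routing can work (illustrative only; `detectDialect` is a hypothetical name, and the real `src/db/dialect.ts` may differ):

```typescript
type Dialect = 'postgres' | 'sqlite' | 'mysql';

// Map a connection string to a dialect by its scheme prefix.
// Hypothetical helper for illustration, not the package's actual API.
function detectDialect(connectionString: string): Dialect {
  if (/^postgres(ql)?:\/\//.test(connectionString)) return 'postgres';
  if (/^(sqlite:|file:)/.test(connectionString)) return 'sqlite';
  if (/^mysql:\/\//.test(connectionString)) return 'mysql';
  throw new Error(`Unsupported connection string: ${connectionString}`);
}
```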
```ts
import 'dotenv/config';
import { DatabaseAgent } from './src/index.js';

const agent = new DatabaseAgent({
  connectionString: process.env.DATABASE_URL!,
  model: process.env.MODEL ?? 'openai/gpt-4o-mini',
  readOnly: true,
  maxRows: 100,
});

const result = await agent.query('How many users joined in the last 30 days?');
console.log(result.sql);
console.log(result.results);
console.log(result.explanation);

await agent.close();
```

`explain()` returns the generated SQL and explanation without running it. In read-only mode, the SQL is still validated the same way as for `query()` (mutating SQL is rejected).
```ts
const { sql, explanation } = await agent.explain('Show me top products by revenue');
```

Session context: pass a `sessionId` to keep a short transcript for follow-up questions.

```ts
await agent.query('How many users do we have?', { sessionId: 'user-123' });
await agent.query('List their email addresses', { sessionId: 'user-123' });
```

Domain hints:

```ts
agent.addHint('When asked for revenue, use orders.total_amount');
```

Schema filter, to exclude tables from schema injection:

```ts
const agent = new DatabaseAgent({
  connectionString: process.env.DATABASE_URL!,
  model: 'openai/gpt-4o-mini',
  schemaFilter: (table) => !table.startsWith('_internal'),
});
```

`queryStream()` emits partial SQL while the model generates it, then rows and a final `done` event. Unlike `query()`, it does not retry on execution errors (you get a single `error` event).
```ts
for await (const event of agent.queryStream('Show first 10 rows from users')) {
  if (event.type === 'sql') console.log(event.sql);
  if (event.type === 'row') console.log(event.row);
  if (event.type === 'done') console.log(event.explanation, event.rowCount);
  if (event.type === 'error') console.error(event.message);
}
```

Start the HTTP API with `npm run dev`.

- `GET /health` — liveness check
- `POST /query` — JSON body: `{ "question": string, "sessionId"?: string }`
Returns: `{ sql, results, explanation, rowCount, executionTimeMs, retries? }`
Example:

```bash
curl -s -X POST http://localhost:3000/query \
  -H 'Content-Type: application/json' \
  -d '{"question":"List tables"}'
```

CLI: `npm run cli` gives interactive prompts; it uses `DATABASE_URL` and `MODEL` from `.env`.
```bash
export OPENAI_API_KEY=sk-...
docker compose up --build
```

Services: PostgreSQL 16 and the API on port 3000. Ensure the database has the data/schema you expect before querying.
- Load schema — introspect tables, columns, types, foreign keys (optional row samples).
- Build prompt — system rules + formatted schema + optional hints and session context.
- Generate SQL — structured output (`sql` + `explanation`) via `generateObject`.
- Validate — in read-only mode, allow a single `SELECT`/`WITH ... SELECT` and block DML/DDL patterns.
- Execute — run with timeout and automatic `LIMIT` when absent.
- Retry — on failure, feed the database error back into the next attempt.
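The generate → validate → execute → retry loop above can be sketched as follows. This is a simplified illustration with injected functions, not the agent's actual internals; the key idea is that each failed attempt's error message is fed back into the next generation.

```typescript
// Sketch of a retry loop with error feedback. Each database or validation
// error is appended to the feedback passed to the next generation attempt.
// Hypothetical shape; the real agent's internals may differ.
async function generateWithRetries<T>(
  generate: (feedback: string[]) => Promise<string>,
  execute: (sql: string) => Promise<T>,
  maxRetries = 3,
): Promise<T> {
  const feedback: string[] = [];
  let lastError: unknown;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const sql = await generate(feedback);
    try {
      return await execute(sql);
    } catch (err) {
      lastError = err;
      feedback.push(`Previous SQL failed: ${sql} (${(err as Error).message})`);
    }
  }
  throw lastError;
}
```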
| Field | Type | Notes |
|---|---|---|
| `connectionString` | `string` | Required. |
| `model` | `string \| LanguageModel` | e.g. `openai/gpt-4o-mini` or a Vercel AI SDK model instance. |
| `readOnly` | `boolean` | Default `true`. |
| `maxRows` | `number` | Default 1000. |
| `queryTimeoutMs` | `number` | Default 30000. |
| `maxRetries` | `number` | Default 3; max generate/execute attempts for `query()` (each failure feeds back to the model). |
| `includeSampleValues` | `boolean` | Default `false`; adds per-column samples to the prompt. |
| `schemaFilter` | `(table: string) => boolean` | Exclude tables from schema injection. |
| `hints` | `string[]` | Initial domain hints (same as `addHint`). |
| Field | Type |
|---|---|
| `sql` | `string` |
| `results` | `Record<string, unknown>[]` |
| `explanation` | `string` |
| `rowCount` | `number` |
| `executionTimeMs` | `number` |
| `retries` | `number` (optional) |
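The result shape in the table above corresponds to a TypeScript interface along these lines. This is a sketch inferred from the table; the package's actual exported type may use a different name.

```typescript
// Shape of a query() result as described by the table above (sketch only).
interface QueryResult {
  sql: string;
  results: Record<string, unknown>[];
  explanation: string;
  rowCount: number;
  executionTimeMs: number;
  retries?: number; // present when the agent had to retry
}

// Illustrative value conforming to the interface.
const example: QueryResult = {
  sql: 'SELECT COUNT(*) AS n FROM users',
  results: [{ n: 42 }],
  explanation: 'Counts all rows in users.',
  rowCount: 1,
  executionTimeMs: 12,
};
```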
| Command | Description |
|---|---|
| `npm run build` | Compile TypeScript to `dist/`. |
| `npm start` | Run the compiled API (`node dist/api/server.js`; run `npm run build` first). |
| `npm run dev` | Run the API with `tsx` in watch mode. |
| `npm run cli` | Interactive CLI. |
| `npm test` | All automated tests (`tests/agent.integration.test.ts` is skipped unless `OPENAI_API_KEY` is set). |
| `npm run test:integration` | Run only `tests/agent.integration.test.ts` (faster when iterating on the LLM test). |
| `npm run test:watch` | Vitest in watch mode. |
`prepack` runs `npm run build` automatically before `npm pack` / `npm publish`, so `dist/` is fresh for the tarball.
```
agentic-db-query/
├── src/
│   ├── DatabaseAgent.ts
│   ├── db/
│   │   ├── dialect.ts
│   │   └── runner.ts
│   ├── schema/
│   │   ├── extractor.ts
│   │   └── formatter.ts
│   ├── sql/
│   │   ├── validator.ts
│   │   ├── executor.ts
│   │   └── result-formatter.ts
│   ├── prompts/
│   │   └── text-to-sql.ts
│   ├── llm/
│   │   └── model.ts
│   ├── api/
│   │   └── server.ts
│   ├── cli/
│   │   └── index.ts
│   └── index.ts
├── tests/
├── .gitignore
├── .env.example
├── docker-compose.yml
├── Dockerfile
├── vitest.config.ts
├── package.json
├── tsconfig.json
├── LICENSE
└── README.md
```
- Read-only validation uses heuristics rather than a full SQL parser; rare edge cases, such as identifiers that resemble keywords, can trip it.
- Query timeouts are applied via PostgreSQL `statement_timeout` and MySQL's query timeout. SQLite does not apply a server-side timeout; `maxRows` still caps result size.
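To make the heuristic nature of the read-only check concrete, here is a minimal sketch of what such a check can look like. This is illustrative only; the real `src/sql/validator.ts` may be stricter or looser, and the function name is hypothetical.

```typescript
// Heuristic read-only check: accept a single SELECT / WITH ... SELECT
// statement and reject obvious DML/DDL keywords. Not a full parser, so
// unusual SQL can slip past or be rejected incorrectly.
function isReadOnly(sql: string): boolean {
  const trimmed = sql.trim().replace(/;\s*$/, '');
  if (trimmed.includes(';')) return false; // multiple statements
  if (!/^(select|with)\b/i.test(trimmed)) return false;
  return !/\b(insert|update|delete|drop|alter|create|truncate|grant|revoke)\b/i.test(trimmed);
}
```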
- Keep read-only enabled for untrusted users.
- The LLM can still produce expensive queries; use timeouts and row limits.
- Run against least-privilege DB users in production.
- This tool is not a substitute for parameterized app queries in application code; it is for controlled analytics/admin use.
This project is licensed under the MIT License — see LICENSE.
Telegram: @AuraTerminal