diff --git a/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/dashboard-guide.md b/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/dashboard-guide.md
index 25a4d0f7e0..19665bd8bf 100644
--- a/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/dashboard-guide.md
+++ b/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/dashboard-guide.md
@@ -1 +1,201 @@
-# Grafana Dashboards for InfluxDB 3
\ No newline at end of file
+# Grafana Dashboards for InfluxDB 3

This guide covers creating Grafana dashboards and visualizations for InfluxDB 3 data.

---

## Rules

- SHOULD use the Grafana InfluxDB data source with the "InfluxQL" or "SQL" query type
- MUST NOT configure Grafana with the Flux query type for InfluxDB 3
- SHOULD use time-bucketed aggregations for time-series panels to avoid returning too many raw data points
- SHOULD use Last Value Cache queries for single-stat and gauge panels showing current state

---

## Data Source Configuration

### Grafana InfluxDB Data Source (InfluxQL)

| Setting | Value |
|---------|-------|
| Query Language | InfluxQL |
| URL | `https://your-influxdb3-endpoint:8181` |
| Database | Your database name |
| HTTP Header: Authorization | `Bearer <token>` |

### Grafana InfluxDB Data Source (SQL)

Grafana's official InfluxDB data source natively supports SQL as a query language for InfluxDB 3.x. No separate Flight SQL plugin is needed.
+ +| Setting | Value | +|---------|-------| +| Query Language | SQL | +| URL | `https://your-influxdb3-endpoint:8181` | +| Database | Your database name | +| Token | Your InfluxDB 3 token | + +--- + +## Panel Examples + +### Time-Series Panel — CPU Usage Over Time + +**InfluxQL:** +```sql +SELECT MEAN("cpu_usage") +FROM "system" +WHERE $timeFilter +GROUP BY time($__interval), "host" +``` + +**SQL:** +```sql +SELECT DATE_BIN(INTERVAL '1 minute', time, TIMESTAMP '1970-01-01T00:00:00Z') as time, + host, + AVG(cpu_usage) as cpu_usage +FROM system +WHERE time >= $__timeFrom AND time <= $__timeTo +GROUP BY 1, host +ORDER BY 1 +``` + +### Gauge Panel — Current Temperature + +**InfluxQL:** +```sql +SELECT LAST("value") +FROM "temperature" +WHERE "location" = 'server-room' + AND $timeFilter +``` + +**SQL:** +```sql +SELECT value +FROM temperature +WHERE location = 'server-room' +ORDER BY time DESC +LIMIT 1 +``` + +### Table Panel — Top Hosts by CPU + +**SQL:** +```sql +SELECT host, + AVG(cpu_usage) as avg_cpu, + MAX(cpu_usage) as max_cpu, + COUNT(*) as samples +FROM system +WHERE time >= $__timeFrom AND time <= $__timeTo +GROUP BY host +ORDER BY avg_cpu DESC +LIMIT 20 +``` + +### Bar Chart — Requests by Endpoint + +**SQL:** +```sql +SELECT endpoint, COUNT(*) as request_count +FROM http_requests +WHERE time >= $__timeFrom AND time <= $__timeTo +GROUP BY endpoint +ORDER BY request_count DESC +LIMIT 10 +``` + +### Stat Panel — Total Events + +**SQL:** +```sql +SELECT COUNT(*) as total_events +FROM events +WHERE time >= $__timeFrom AND time <= $__timeTo +``` + +--- + +## Grafana Variables + +Use Grafana template variables for dynamic dashboards: + +### Host selector variable (SQL) +```sql +SELECT DISTINCT host FROM system WHERE time >= now() - INTERVAL '1 hour' +``` + +### Location selector variable (SQL) +```sql +SELECT DISTINCT location FROM temperature WHERE time >= now() - INTERVAL '1 hour' +``` + +### Host selector variable (InfluxQL) +```sql +SHOW TAG VALUES FROM "system" 
WITH KEY = "host" +``` + +Use variables in InfluxQL queries with `$variable_name` syntax: +```sql +SELECT MEAN("cpu_usage") +FROM "system" +WHERE "host" =~ /^$host$/ + AND $timeFilter +GROUP BY time($__interval) +``` + +--- + +## Dashboard Design Best Practices + +- SHOULD use `$__interval` or `DATE_BIN` for time-series panels to auto-adjust resolution based on the time range +- SHOULD set appropriate refresh intervals (e.g., 30s for real-time monitoring, 5m for historical analysis) +- SHOULD use Grafana alerting rules on critical metrics rather than polling dashboards manually +- SHOULD group related panels into rows (e.g., "CPU Metrics", "Memory Metrics", "Disk Metrics") +- SHOULD use `LIMIT` in table panels to avoid loading excessive data +- SHOULD use Stat or Gauge panels with `LAST()` or `ORDER BY time DESC LIMIT 1` for current-value displays + +--- + +## Grafana Dashboard JSON Template — System Monitoring + +A minimal dashboard structure for system metrics: + +```json +{ + "title": "System Monitoring", + "panels": [ + { + "title": "CPU Usage", + "type": "timeseries", + "targets": [ + { + "rawSql": "SELECT DATE_BIN(INTERVAL '1 minute', time, TIMESTAMP '1970-01-01T00:00:00Z') as time, host, AVG(cpu_usage) as cpu FROM system WHERE time >= $__timeFrom AND time <= $__timeTo GROUP BY 1, host ORDER BY 1", + "format": "time_series" + } + ] + }, + { + "title": "Memory Usage", + "type": "timeseries", + "targets": [ + { + "rawSql": "SELECT DATE_BIN(INTERVAL '1 minute', time, TIMESTAMP '1970-01-01T00:00:00Z') as time, host, AVG(memory_pct) as memory FROM system WHERE time >= $__timeFrom AND time <= $__timeTo GROUP BY 1, host ORDER BY 1", + "format": "time_series" + } + ] + }, + { + "title": "Current CPU by Host", + "type": "gauge", + "targets": [ + { + "rawSql": "SELECT host, cpu_usage FROM system ORDER BY time DESC LIMIT 10", + "format": "table" + } + ] + } + ] +} +``` diff --git a/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/development-guide.md 
b/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/development-guide.md index 36b2a739c8..c6bc78134c 100644 --- a/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/development-guide.md +++ b/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/development-guide.md @@ -2,24 +2,265 @@ ## Overview -## Best Practices Guide +InfluxDB 3 is a columnar time-series database built on Apache Arrow and Parquet. It uses SQL as its primary query language and supports InfluxQL for compatibility. Data is organized into Databases and Tables. -- SHOULD never atempt FLUX queries +Amazon Timestream for InfluxDB 3 is available as managed DB clusters. Use the `awslabs.timestream-for-influxdb-mcp-server` for cluster management and the `influxdb3` MCP server for data operations. + +--- + +## Rules + +- MUST NOT attempt Flux queries — Flux is not supported in InfluxDB 3 +- SHOULD use SQL as the primary query language +- MAY use InfluxQL for compatibility with existing queries +- MUST use the `influxdb3` MCP server for data operations (read/write/schema) +- MUST use the `awslabs.timestream-for-influxdb-mcp-server` for AWS resource management (clusters, instances, parameter groups) +- SHOULD remind users of Core vs Enterprise limits when relevant +- MAY use the v2-compatible write endpoint (`/api/v2/write`) when migrating existing v2 write workloads — this avoids rewriting write code during migration + +--- + +## Best Practices + +- SHOULD design schemas with low-cardinality tags for optimal query performance +- SHOULD use SQL for new query development — it is the primary and most capable language in v3 +- SHOULD batch writes using line protocol for high-throughput ingestion +- SHOULD use Last Value Cache (LVC) for dashboards that display current state +- SHOULD use Distinct Value Cache (DVC) for fast enumeration of tag values +- MUST be aware of Core limits: 5 databases, 500 tables per DB, 250 columns per table +- SHOULD prefer `INTERVAL` syntax for 
time ranges in SQL: `WHERE time >= now() - INTERVAL '1 hour'` + +--- ## Tool Examples -### Queries +### Cluster Management (awslabs MCP server) + +#### Create a cluster +```json +{ + "name": "my-influxdb3-cluster", + "db_instance_type": "db.influx.xlarge", + "password": "securePassword123", + "allocated_storage_gb": 100, + "vpc_security_group_ids": ["sg-0123456789abcdef0"], + "vpc_subnet_ids": ["subnet-abc123", "subnet-def456"], + "publicly_accessible": true, + "tool_write_mode": true +} +``` + +#### List clusters +No parameters required. Use `ListDbClusters`. + +#### Check cluster status +```json +{ + "db_cluster_id": "cluster-abc123" +} +``` + +### Data Operations (influxdb3 MCP server) -### Writes +Refer to the influxdb3 MCP server documentation for tool-specific parameters. Common operations include: + +#### Write data +Write using line protocol format: +``` +cpu,host=server01,region=us-east usage=0.64,idle=0.36 1622505600000000000 +``` + +#### Query with SQL +```sql +SELECT host, AVG(usage) as avg_usage +FROM cpu +WHERE time >= now() - INTERVAL '1 hour' +GROUP BY host +ORDER BY avg_usage DESC +``` + +#### Query with InfluxQL +```sql +SELECT MEAN("usage") +FROM "cpu" +WHERE time > now() - 1h +GROUP BY "host" +``` ### Schema Operations +#### Create a database +Use the influxdb3 MCP server's database creation tool. + +#### List tables +Use the influxdb3 MCP server's schema inspection tools to view tables and columns. + +--- ## Workflow Examples +### IoT Sensor Monitoring Setup +1. Create a DB cluster using `CreateDbCluster` with appropriate instance type +2. Wait for cluster status to become `AVAILABLE` using `GetDbCluster` +3. Create a database for sensor data using the influxdb3 MCP server +4. Write sensor data using line protocol: + ``` + sensor,device_id=d001,location=factory1 temperature=23.5,humidity=45.2 + sensor,device_id=d002,location=factory1 temperature=24.1,humidity=44.8 + ``` +5. 
Query with SQL:
   ```sql
   SELECT device_id, location,
          AVG(temperature) as avg_temp,
          MAX(humidity) as max_humidity
   FROM sensor
   WHERE time >= now() - INTERVAL '24 hours'
   GROUP BY device_id, location
   ```

### DevOps Metrics Dashboard
1. Write system metrics:
   ```
   system,host=web01 cpu=0.72,memory=68.5,disk_pct=45.2
   system,host=web02 cpu=0.45,memory=72.1,disk_pct=38.7
   ```
2. Set up Last Value Cache for current state:
   - Cache the latest `cpu`, `memory`, `disk_pct` per `host`
3. Query current state through the LVC using the `last_cache()` table function:
   ```sql
   SELECT host, cpu, memory, disk_pct
   FROM last_cache('system')
   ```

---

## Core vs Enterprise Limits

| Resource | Core | Enterprise |
|----------|------|-----------|
| Databases | 5 | 500+ |
| Tables per database | 500 | 500+ |
| Columns per table | 250 | 500+ |
| Last Value Caches | Limited | Higher limits |
| Distinct Value Caches | Limited | Higher limits |

- SHOULD warn users when approaching Core limits
- SHOULD recommend Enterprise for production workloads with many databases or high-cardinality schemas

---

## Limitations

-- Core has significant restrictions compared to Enterprise; will need explicit guidelines in our steering docs
- Flux is NOT supported — all Flux queries must be rewritten as SQL or InfluxQL
- Core has significant restrictions compared to Enterprise (see limits table above)
- No built-in task engine — scheduled processing must be handled externally
- No built-in UI for dashboarding — use Grafana or similar tools
- Deletes of individual records are limited — design retention policies carefully

---

## Schema Design & Data Modelling

### Tag vs Field Decision Guide

| Put in Tags (indexed) | Put in Fields (not indexed) |
|----------------------|---------------------------|
| Host names, regions, environments | CPU %, memory %, latency values |
| Device IDs (if bounded set) | Temperature, humidity readings |
| Status categories 
(ok, warning, critical) | Request counts, byte counts | +| Application names, service names | Duration, response time | +| Sensor types, metric types | Status messages (strings) | + +**Key rules:** +- MUST use tags for values used in `WHERE` and `GROUP BY` clauses +- MUST use fields for numeric measurements and high-cardinality strings +- MUST NOT use tags for: UUIDs, session IDs, IP addresses, user IDs, timestamps, request IDs +- SHOULD keep total unique tag combinations (series) under 1 million per database for optimal performance + +### Naming Conventions + +- SHOULD use snake_case for measurement/table names: `cpu_usage`, `http_requests` +- SHOULD use snake_case for tag and field keys: `device_id`, `avg_latency_ms` +- MUST NOT use reserved SQL keywords as table or column names (e.g., `time`, `select`, `from`, `table`) + - If unavoidable, quote them: `"time"`, `"select"` +- MUST NOT use special characters in tag/field keys: avoid `.`, `/`, `(`, `)`, `{`, `}` +- SHOULD include units in field names for clarity: `temperature_celsius`, `latency_ms`, `disk_pct` +- SHOULD use consistent naming across related tables + +### Series Cardinality Audit Workflow + +When a user asks "which tags are blowing up series count?" or performance is degrading: + +**For InfluxDB 3 (SQL):** +1. Count distinct values per tag column: + ```sql + SELECT COUNT(DISTINCT host) as host_count, + COUNT(DISTINCT region) as region_count, + COUNT(DISTINCT device_id) as device_count + FROM my_table + WHERE time >= now() - INTERVAL '24 hours' + ``` +2. Identify the high-cardinality culprit — any tag with thousands+ of distinct values +3. Estimate total series: multiply distinct counts of all tags together +4. Recommend fixes: + - Move high-cardinality tags to fields + - Consolidate related tags (e.g., `city` + `state` → `region`) + - Split into separate tables if tag sets serve different query patterns + +**For InfluxDB 2 (Flux):** +1. 
Count distinct tag values: + ```flux + import "influxdata/influxdb/schema" + schema.tagValues(bucket: "my-bucket", tag: "device_id", start: -24h) + |> count() + ``` +2. List all tag keys: + ```flux + import "influxdata/influxdb/schema" + schema.tagKeys(bucket: "my-bucket", start: -24h) + ``` +3. For each tag with high distinct count, recommend moving to a field + +**Common redesign patterns:** +- `device_id` with 100K+ values → keep as tag only if you always filter by it; otherwise move to field +- `request_id` or `trace_id` → ALWAYS a field, never a tag +- `ip_address` → field (high cardinality) +- `user_id` → field unless bounded set (e.g., internal users only) + +--- + +## Ad-Hoc Data Export + +For scenarios like "export the last 7 days of tenant=acme data for incident analysis": + +### InfluxDB 3 — SQL Export +```sql +SELECT * +FROM events +WHERE tenant = 'acme' + AND time >= now() - INTERVAL '7 days' +ORDER BY time ASC +``` +Use the influxdb3 MCP server's query tool with CSV output format for export. + +### Filtered Export to Line Protocol +1. Query the data with filters +2. Convert results to line protocol format +3. Write to a new database/bucket or save as file + +### Time-Scoped Export for Large Datasets +For large exports, batch by day: +```sql +SELECT * FROM events +WHERE tenant = 'acme' + AND time >= TIMESTAMP '2025-03-01T00:00:00Z' + AND time < TIMESTAMP '2025-03-02T00:00:00Z' +ORDER BY time ASC +``` +Repeat for each day in the range. 
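
The day-by-day batching above can be sketched as a small helper that generates one export query per day; the `events` table and `tenant` filter follow the example in this section:

```python
from datetime import date, timedelta

def daily_export_queries(table: str, tenant: str, start: date, end: date):
    """Yield one time-scoped export query per day in [start, end)."""
    day = start
    while day < end:
        nxt = day + timedelta(days=1)
        # Each query covers exactly one day: [00:00 of `day`, 00:00 of `nxt`)
        yield (
            f"SELECT * FROM {table} "
            f"WHERE tenant = '{tenant}' "
            f"AND time >= TIMESTAMP '{day.isoformat()}T00:00:00Z' "
            f"AND time < TIMESTAMP '{nxt.isoformat()}T00:00:00Z' "
            f"ORDER BY time ASC"
        )
        day = nxt

# One query per day for a 7-day incident window
queries = list(daily_export_queries("events", "acme", date(2025, 3, 1), date(2025, 3, 8)))
```

Run each query in turn and verify row counts per batch before moving on. Interpolating identifiers like this assumes trusted inputs; prefer your client's parameterized-query support for anything user-supplied.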

---

## Troubleshooting
diff --git a/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/migrations.md b/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/migrations.md
index d3e981737b..1dfe94c170 100644
--- a/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/migrations.md
+++ b/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/migrations.md
@@ -1 +1,130 @@
# Migrations to Timestream for InfluxDB 3

This guide covers migration paths to Amazon Timestream for InfluxDB 3.

---

## Rules

- MUST back up source data before starting any migration
- SHOULD perform a test migration with a subset of data before migrating production workloads
- MUST rewrite Flux queries as SQL or InfluxQL when migrating from InfluxDB 2
- SHOULD plan for downtime or dual-write during migration
- MUST NOT delete source data until the migration is verified

---

## Migration Paths

### 1. InfluxDB OSS (Core) → Timestream for InfluxDB 3

**Scenario:** Moving from a self-managed InfluxDB 3 Core instance to a managed Timestream for InfluxDB 3 cluster.

**Steps:**
1. Create a Timestream for InfluxDB 3 cluster using `CreateDbCluster`
2. Wait for the cluster to reach `AVAILABLE` status
3. Create matching databases on the target cluster
4. Export data from the source using the InfluxDB 3 CLI:
   ```bash
   influxdb3 query --database <database-name> "SELECT * FROM <table>" --format csv > export.csv
   ```
5. Convert exported data to line protocol format
6. Write data to the target using line protocol
7. Verify data integrity by comparing row counts and sample queries
8. Update application connection strings to point to the new endpoint

**Considerations:**
- Last Value Caches and Distinct Value Caches must be recreated on the target
- Database tokens must be recreated on the target
- Schema (tables, columns) will be recreated automatically on first write

### 2. 
InfluxDB 3 Core → InfluxDB 3 Enterprise

**Scenario:** Upgrading from Core to Enterprise for higher limits and clustering.

**Steps:**
1. Deploy an Enterprise cluster using `CreateDbCluster`
2. Export data from Core using the InfluxDB 3 CLI or query API
3. Write data to the Enterprise cluster
4. Recreate databases, tokens, caches, and any configuration
5. Verify data and update connection strings

**Considerations:**
- Enterprise supports higher limits (500+ databases, 500+ columns per table)
- No query language changes needed — both use SQL and InfluxQL
- Plan for the transition window where both instances are running

### 3. InfluxDB Cloud → Timestream for InfluxDB 3

**Scenario:** Moving from InfluxDB Cloud (managed by InfluxData) to Amazon Timestream for InfluxDB 3.

**Steps:**
1. Create a Timestream for InfluxDB 3 cluster using `CreateDbCluster`
2. Export data from InfluxDB Cloud:
   - Use the InfluxDB Cloud API or CLI to query and export data
   - Export in CSV or line protocol format
3. Create matching databases on the target
4. Write exported data to the target using line protocol
5. Recreate tokens and any caches
6. Verify data integrity
7. Update application endpoints

**Considerations:**
- InfluxDB Cloud may have features not available in self-managed InfluxDB 3 (e.g., managed alerting)
- Ensure the Timestream cluster has sufficient storage for the migrated data
- Network latency may differ — test query performance after migration

---

## Data Export Strategies

### Line Protocol Export
Best for preserving the exact data format:
```bash
influxdb3 query --database <database-name> "SELECT * FROM <table> WHERE time >= TIMESTAMP '2025-01-01T00:00:00Z'" --format csv
```
Then convert CSV rows to line protocol format for writing.

### Time-Range Batching
For large datasets, export in time-range batches to avoid memory issues:
1. Query data in daily or hourly chunks
2. Write each chunk to the target
3. Verify each chunk before proceeding

### Dual-Write Strategy
For zero-downtime migration:
1. Configure applications to write to both source and target simultaneously
2. Backfill historical data from source to target
3. Verify data consistency
4. Switch reads to the target
5. Stop writes to the source

---

## Query Migration

### Flux → SQL Conversion Examples

| Flux | SQL Equivalent |
|------|---------------|
| `from(bucket: "b") \|> range(start: -1h)` | `SELECT * FROM table WHERE time >= now() - INTERVAL '1 hour'` |
| `\|> filter(fn: (r) => r._field == "temp")` | `SELECT temp FROM table ...` |
| `\|> mean()` | `SELECT AVG(temp) FROM table ...` |
| `\|> group(columns: ["host"])` | `... GROUP BY host` |
| `\|> aggregateWindow(every: 5m, fn: mean)` | `SELECT DATE_BIN(INTERVAL '5 minutes', time, TIMESTAMP '1970-01-01T00:00:00Z'), AVG(temp) ... 
GROUP BY 1` | +| `\|> last()` | `SELECT * FROM table ORDER BY time DESC LIMIT 1` | +| `\|> pivot(...)` | Fields are already columns in InfluxDB 3 — no pivot needed | + +--- + +## Post-Migration Checklist + +- [ ] All databases created on target +- [ ] Data written and row counts verified +- [ ] Sample queries return expected results +- [ ] Authentication tokens created +- [ ] Last Value Caches recreated (if used) +- [ ] Distinct Value Caches recreated (if used) +- [ ] Application connection strings updated +- [ ] Monitoring and alerting configured for the new cluster +- [ ] Source data retained until migration is fully verified diff --git a/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/query-guide.md b/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/query-guide.md index 8029ba43de..eb96d9b548 100644 --- a/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/query-guide.md +++ b/src/timestream-for-influxdb-mcp-server/power/steering/influxdb3/query-guide.md @@ -1 +1,330 @@ -# InfluxDB 3 Query Guide \ No newline at end of file +# InfluxDB 3 Query Guide + +InfluxDB 3 supports SQL (primary) and InfluxQL for querying time-series data. Flux is NOT supported. 
+ +--- + +## Rules + +- MUST NOT use Flux queries — they will fail on InfluxDB 3 +- SHOULD prefer SQL for new queries — it is the primary query language +- MAY use InfluxQL for compatibility with existing queries or cross-version portability +- SHOULD use parameterized queries when incorporating user input to prevent injection + +--- + +## SQL Query Examples + +### Basic Queries + +#### Select all data from a table (last hour) +```sql +SELECT * +FROM temperature +WHERE time >= now() - INTERVAL '1 hour' +ORDER BY time DESC +``` + +#### Select specific columns +```sql +SELECT time, location, value +FROM temperature +WHERE time >= now() - INTERVAL '24 hours' +ORDER BY time DESC +``` + +#### Filter by tag value +```sql +SELECT time, value +FROM temperature +WHERE location = 'office' + AND time >= now() - INTERVAL '1 hour' +ORDER BY time DESC +``` + +#### Limit results +```sql +SELECT time, location, value +FROM temperature +WHERE time >= now() - INTERVAL '1 hour' +ORDER BY time DESC +LIMIT 100 +``` + +### Aggregation Queries + +#### Average by group +```sql +SELECT location, AVG(value) as avg_temp +FROM temperature +WHERE time >= now() - INTERVAL '24 hours' +GROUP BY location +ORDER BY avg_temp DESC +``` + +#### Min, Max, Count +```sql +SELECT location, + MIN(value) as min_temp, + MAX(value) as max_temp, + COUNT(*) as readings +FROM temperature +WHERE time >= now() - INTERVAL '24 hours' +GROUP BY location +``` + +#### Time-bucketed aggregation +```sql +SELECT DATE_BIN(INTERVAL '15 minutes', time, TIMESTAMP '1970-01-01T00:00:00Z') as bucket, + location, + AVG(value) as avg_temp, + COUNT(*) as count +FROM temperature +WHERE time >= now() - INTERVAL '6 hours' +GROUP BY bucket, location +ORDER BY bucket DESC +``` + +#### Percentiles +```sql +SELECT location, + APPROX_PERCENTILE_CONT(value, 0.50) as p50, + APPROX_PERCENTILE_CONT(value, 0.95) as p95, + APPROX_PERCENTILE_CONT(value, 0.99) as p99 +FROM response_time +WHERE time >= now() - INTERVAL '1 hour' +GROUP BY location 
+``` + +### Advanced Queries + +#### Subquery — latest reading per device +```sql +SELECT * +FROM ( + SELECT *, + ROW_NUMBER() OVER (PARTITION BY device_id ORDER BY time DESC) as rn + FROM sensor_data + WHERE time >= now() - INTERVAL '1 hour' +) +WHERE rn = 1 +``` + +#### Join two tables +```sql +SELECT a.time, a.host, a.cpu_usage, b.memory_usage +FROM cpu a +JOIN memory b ON a.host = b.host AND a.time = b.time +WHERE a.time >= now() - INTERVAL '1 hour' +``` + +#### CASE expressions +```sql +SELECT time, host, cpu_usage, + CASE + WHEN cpu_usage > 90 THEN 'critical' + WHEN cpu_usage > 70 THEN 'warning' + ELSE 'normal' + END as status +FROM cpu +WHERE time >= now() - INTERVAL '1 hour' +``` + +--- + +## InfluxQL Query Examples + +### Basic Queries + +#### Select all fields from a measurement +```sql +SELECT * +FROM "temperature" +WHERE time > now() - 1h +``` + +#### Filter by tag +```sql +SELECT "value" +FROM "temperature" +WHERE "location" = 'office' + AND time > now() - 1h +``` + +### Aggregation Queries + +#### Mean grouped by tag +```sql +SELECT MEAN("value") +FROM "temperature" +WHERE time > now() - 24h +GROUP BY "location" +``` + +#### Time-bucketed aggregation +```sql +SELECT MEAN("value"), MAX("value"), MIN("value") +FROM "temperature" +WHERE time > now() - 6h +GROUP BY time(15m), "location" +``` + +#### Count and sum +```sql +SELECT COUNT("value"), SUM("value") +FROM "requests" +WHERE time > now() - 1h +GROUP BY "endpoint" +``` + +#### Last value per group +```sql +SELECT LAST("value") +FROM "temperature" +GROUP BY "location" +``` + +--- + +## Time Range Syntax + +### SQL +| Expression | Meaning | +|-----------|---------| +| `now() - INTERVAL '1 hour'` | Last hour | +| `now() - INTERVAL '24 hours'` | Last 24 hours | +| `now() - INTERVAL '7 days'` | Last 7 days | +| `now() - INTERVAL '30 days'` | Last 30 days | +| `'2025-01-01T00:00:00Z'` | Specific timestamp | + +### InfluxQL +| Expression | Meaning | +|-----------|---------| +| `now() - 1h` | Last hour | +| 
`now() - 24h` | Last 24 hours | +| `now() - 7d` | Last 7 days | +| `now() - 30d` | Last 30 days | + +--- + +## Common SQL Functions + +| Function | Description | Example | +|----------|-------------|---------| +| `AVG(col)` | Average value | `AVG(temperature)` | +| `MIN(col)` | Minimum value | `MIN(temperature)` | +| `MAX(col)` | Maximum value | `MAX(temperature)` | +| `SUM(col)` | Sum of values | `SUM(bytes_sent)` | +| `COUNT(*)` | Count of rows | `COUNT(*)` | +| `DATE_BIN(interval, time, origin)` | Bucket timestamps | `DATE_BIN(INTERVAL '5 minutes', time, TIMESTAMP '1970-01-01T00:00:00Z')` | +| `APPROX_PERCENTILE_CONT(col, p)` | Approximate percentile | `APPROX_PERCENTILE_CONT(latency, 0.99)` | +| `ROW_NUMBER() OVER (...)` | Window function | `ROW_NUMBER() OVER (PARTITION BY host ORDER BY time DESC)` | + +--- + +## Generating Queries from English (Text-to-SQL) + +When a user describes what they want in plain English, follow this pattern: + +### Pattern: Identify → Map → Build + +1. **Identify** the table, columns, filters, time range, and aggregation from the request +2. **Map** to SQL clauses: time range → `WHERE time >=`, filters → `WHERE col = 'val'`, aggregation → `AVG()`/`SUM()`/etc., grouping → `GROUP BY`, sorting → `ORDER BY` +3. 
**Build** the query following standard SQL order: `SELECT` → `FROM` → `WHERE` → `GROUP BY` → `ORDER BY` → `LIMIT` + +### Examples + +**"Show me the average temperature per location for the last 24 hours"** +```sql +SELECT location, AVG(value) as avg_temp +FROM temperature +WHERE time >= now() - INTERVAL '24 hours' +GROUP BY location +ORDER BY avg_temp DESC +``` + +**"What was the peak CPU usage on server web01 in the last hour?"** +```sql +SELECT MAX(cpu_usage) as peak_cpu +FROM system +WHERE host = 'web01' + AND time >= now() - INTERVAL '1 hour' +``` + +**"Give me 5-minute averages of memory usage grouped by host for the last 6 hours"** +```sql +SELECT DATE_BIN(INTERVAL '5 minutes', time, TIMESTAMP '1970-01-01T00:00:00Z') as bucket, + host, + AVG(memory_pct) as avg_memory +FROM system +WHERE time >= now() - INTERVAL '6 hours' +GROUP BY bucket, host +ORDER BY bucket DESC +``` + +**"Which hosts had CPU above 90% in the last hour?"** +```sql +SELECT host, MAX(cpu_usage) as max_cpu +FROM system +WHERE cpu_usage > 90 + AND time >= now() - INTERVAL '1 hour' +GROUP BY host +ORDER BY max_cpu DESC +``` + +**"Show me the 95th percentile response time per endpoint for today"** +```sql +SELECT endpoint, + APPROX_PERCENTILE_CONT(response_time_ms, 0.95) as p95 +FROM http_requests +WHERE time >= now() - INTERVAL '24 hours' +GROUP BY endpoint +ORDER BY p95 DESC +``` + +--- + +## Query Optimization + +When a user says "this query is slow", follow this checklist: + +### Optimization Checklist + +1. **Narrow the time range** — `INTERVAL '1 hour'` is much faster than `INTERVAL '30 days'` +2. **Add `WHERE` filters** — filter by tags/columns to reduce scan scope +3. **Use `DATE_BIN` for aggregation** — avoid returning millions of raw rows +4. **Add `LIMIT`** — cap result size +5. **Select only needed columns** — avoid `SELECT *` on large tables +6. 
**Use Last Value Cache** — for current-state queries, LVC avoids scanning historical data + +### Slow Query → Optimized Query Example + +**Slow:** +```sql +SELECT * FROM system +WHERE time >= now() - INTERVAL '30 days' +``` + +**Optimized:** +```sql +SELECT DATE_BIN(INTERVAL '1 hour', time, TIMESTAMP '1970-01-01T00:00:00Z') as bucket, + host, + AVG(cpu_usage) as avg_cpu, + MAX(memory_pct) as max_memory +FROM system +WHERE time >= now() - INTERVAL '30 days' + AND host = 'web01' +GROUP BY bucket, host +ORDER BY bucket DESC +LIMIT 1000 +``` + +### Schema Changes to Improve Performance + +If queries are consistently slow even after optimization: +- Move high-cardinality tags to fields (reduces series count) +- Split wide tables into focused ones (e.g., `system` → `cpu`, `memory`, `disk`) +- Use Last Value Cache for frequently-queried current-state data +- Use Distinct Value Cache for tag enumeration queries +- Consider separate databases with different retention for hot vs. cold data +- For InfluxDB 3 Core: ensure you're not hitting the 250 column limit per table
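
The wide-table split suggested above only changes the write path. A minimal sketch, assuming line-protocol ingestion; the focused table names (`cpu`, `memory`, `disk`) are illustrative:

```python
def split_system_point(host: str, cpu: float, memory_pct: float, disk_pct: float, ts_ns: int):
    """Rewrite one wide `system` row as three focused line-protocol points."""
    return [
        f"cpu,host={host} usage={cpu} {ts_ns}",
        f"memory,host={host} pct={memory_pct} {ts_ns}",
        f"disk,host={host} pct={disk_pct} {ts_ns}",
    ]

points = split_system_point("web01", 0.72, 68.5, 45.2, 1622505600000000000)
```

Queries that only touch CPU then scan the narrow `cpu` table instead of the full `system` schema, and each table stays well clear of the Core 250-column limit.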