@@ -12,6 +12,141 @@ For all documentation and guides related to InfluxDB 2 (Timestream for InfluxDB)
- SHOULD refer to `influxdb-2-vs-3` when working across both versions.
- SHOULD refer to `line-protocol.md` when dealing with Line Protocol.

---

## Onboarding: Deploy and Connect

### InfluxDB 2 — Full Setup

1. Deploy an instance:
```json
{
  "db_instance_name": "my-app-influxdb",
  "db_instance_type": "db.influx.large",
  "password": "<secure-password>",
  "allocated_storage_gb": 50,
  "vpc_security_group_ids": ["sg-xxxxxxxx"],
  "vpc_subnet_ids": ["subnet-aaa", "subnet-bbb"],
  "publicly_accessible": true,
  "username": "admin",
  "organization": "my-org",
  "bucket": "default",
  "tool_write_mode": true
}
```
2. Wait for `AVAILABLE` status using `GetDbInstance`
3. Save the operator token from initial setup — it grants full admin access
4. Create scoped tokens for applications:
- SHOULD use read-only tokens for query-only services
- SHOULD use read/write tokens scoped to specific buckets for application writes
- MUST use operator token only for admin operations (creating orgs, managing tokens)
5. Create application buckets with appropriate retention:
```json
{"bucket_name": "app-metrics", "retention_seconds": 2592000, "tool_write_mode": true}
```
6. Verify connectivity by listing buckets: `InfluxDBListBuckets`
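
Step 2 above (waiting for `AVAILABLE`) can be sketched as a small polling loop. This is a minimal sketch, not the MCP server's own implementation: `wait_for_status`, its parameters, and the failure-state names are hypothetical, and the status source is passed in as a callable so the loop does not depend on any particular SDK.

```python
import time
from typing import Callable, Tuple


def wait_for_status(
    fetch_status: Callable[[], str],
    target: str = "AVAILABLE",
    failure_states: Tuple[str, ...] = ("FAILED", "DELETED"),  # assumed names
    timeout_s: float = 1800.0,
    poll_interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns `target`.

    Raises RuntimeError on a failure state and TimeoutError once
    `timeout_s` seconds have passed without reaching `target`.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status == target:
            return status
        if status in failure_states:
            raise RuntimeError(f"deployment entered terminal state {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} s")
        sleep(poll_interval_s)
```

With boto3, `fetch_status` might be something like `lambda: client.get_db_instance(identifier=instance_id)["status"]` on a `timestream-influxdb` client; treat that call shape as an assumption and check it against the SDK reference before relying on it.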

### InfluxDB 3 — Full Setup

1. Deploy a cluster:
```json
{
  "name": "my-app-influxdb3",
  "db_instance_type": "db.influx.xlarge",
  "password": "<secure-password>",
  "allocated_storage_gb": 100,
  "vpc_security_group_ids": ["sg-xxxxxxxx"],
  "vpc_subnet_ids": ["subnet-aaa", "subnet-bbb"],
  "publicly_accessible": true,
  "tool_write_mode": true
}
```
2. Wait for `AVAILABLE` status using `GetDbCluster`
3. Create a database using the influxdb3 MCP server
4. Create scoped database tokens:
- SHOULD create read-only tokens for query services
- SHOULD create read/write tokens per database for application writes
- MUST use admin tokens only for database/token management
5. Write initial data to create tables (schema-on-write)
6. Verify connectivity by listing tables via the influxdb3 MCP server

### Ready-to-Run Client Snippets

#### Python — InfluxDB 2
```python
from influxdb_client import InfluxDBClient
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(
    url="https://your-endpoint:8086",
    token="your-token",
    org="your-org",
    verify_ssl=True,
)

# Write synchronously so the point is stored before the query below runs
# (the default write API batches writes in the background).
write_api = client.write_api(write_options=SYNCHRONOUS)
write_api.write(bucket="my-bucket", record="temperature,location=office value=23.5")

# Query the last hour with Flux
query_api = client.query_api()
tables = query_api.query('from(bucket: "my-bucket") |> range(start: -1h)')
for table in tables:
    for record in table.records:
        print(f"{record.get_time()}: {record.get_value()}")

client.close()
```

#### Python — InfluxDB 3
```python
from influxdb_client_3 import InfluxDBClient3

client = InfluxDBClient3(
    host="your-endpoint",
    token="your-token",
    database="my-database",
)

# Write
client.write("temperature,location=office value=23.5")

# Query (SQL)
table = client.query("SELECT * FROM temperature WHERE time >= now() - INTERVAL '1 hour'")
print(table.to_pandas())

client.close()
```

#### curl — Write (InfluxDB 2)
```bash
curl -X POST "https://your-endpoint:8086/api/v2/write?org=your-org&bucket=my-bucket&precision=ns" \
  -H "Authorization: Token your-token" \
  -H "Content-Type: text/plain" \
  -d "temperature,location=office value=23.5"
```

#### curl — Write (InfluxDB 3)
```bash
curl -X POST "https://your-endpoint:8181/api/v3/write_lp?db=my-database&precision=auto" \
  -H "Authorization: Bearer your-token" \
  -H "Content-Type: text/plain" \
  -d "temperature,location=office value=23.5"
```

---

## Token Best Practices (Least Privilege)

| Use Case | InfluxDB 2 Token Type | InfluxDB 3 Token Type |
|----------|----------------------|----------------------|
| Admin / setup | Operator token | Admin token |
| Application writes | Read/write token (bucket-scoped) | Database token (write) |
| Dashboard / query service | Read-only token (bucket-scoped) | Database token (read) |
| CI/CD / migrations | All-access token (temporary) | Admin token (temporary) |

- MUST NOT embed operator/admin tokens in application code
- SHOULD rotate tokens periodically
- SHOULD use environment variables for token storage, never hardcode
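
The environment-variable rule above can be sketched as follows. The helper name `load_token` and the variable name `INFLUXDB_TOKEN` are assumptions chosen for illustration; use whatever name your deployment already defines.

```python
import os


def load_token(var_name: str = "INFLUXDB_TOKEN") -> str:
    """Return the token stored in the named environment variable.

    Fails fast with a clear error instead of falling back to a
    hardcoded value when the variable is missing or blank.
    """
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"environment variable {var_name} is not set or is empty")
    return token
```

The returned value can then be passed as the `token` argument of the client constructors shown in the snippets above, which keeps the credential out of source control.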

## Troubleshooting

@@ -1 +1,83 @@
# Glossary

Key terms and concepts for working with Amazon Timestream for InfluxDB.

---

## InfluxDB Core Concepts

| Term | Definition |
|------|-----------|
| **Measurement** | The top-level grouping of data in InfluxDB, analogous to a table in relational databases. Each measurement contains tags, fields, and timestamps. |
| **Tag** | An indexed key-value pair used for metadata. Tags are strings and are optimized for grouping and filtering queries. Avoid tags for high-cardinality values that change frequently (for example, unique request IDs), because every distinct tag value creates a new series. |
| **Field** | A key-value pair that stores the actual data values (metrics). Fields are not indexed and can be integers, floats, strings, or booleans. |
| **Timestamp** | The time associated with a data point. Every point in InfluxDB has a timestamp. If omitted during write, the server assigns the current time. |
| **Point** | A single data record consisting of a measurement name, tag set, field set, and timestamp. A point is uniquely identified by its measurement, tag set, and timestamp. |
| **Series** | A unique combination of measurement and tag set (in InfluxDB 2, the series key also includes the field key). Each distinct tag set within a measurement creates a new series. |
| **Cardinality** | The total number of unique series in a database. High cardinality (millions of series) can degrade performance. |
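
To make series and cardinality concrete, the sketch below counts unique series in a batch of Line Protocol records, using the measurement-plus-tag-set definition from the table. It is a simplified illustration: the function names are hypothetical, and names containing escaped spaces or commas are not handled.

```python
from typing import Iterable


def series_key(line: str) -> str:
    """Return a normalized "measurement,tag_set" key for one record."""
    # Everything before the first space is "measurement,tag1=v1,tag2=v2".
    # Simplification: escaped spaces and commas inside names are not handled.
    head = line.strip().split(" ", 1)[0]
    measurement, *tags = head.split(",")
    # Tag order does not change series identity, so sort before comparing.
    return ",".join([measurement, *sorted(tags)])


def series_cardinality(lines: Iterable[str]) -> int:
    """Count distinct series across the given Line Protocol records."""
    return len({series_key(line) for line in lines if line.strip()})


sample = [
    "temperature,location=office value=23.5",
    "temperature,location=office value=24.0",  # same series, new point
    "temperature,location=lab value=19.1",     # new tag value, new series
]
print(series_cardinality(sample))
```

Adding a tag whose value is different on every write (a request ID, for instance) would make this count grow by one per record, which is the high-cardinality pattern the table warns about.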

---

## InfluxDB 2 Concepts

| Term | Definition |
|------|-----------|
| **Bucket** | A named storage location in InfluxDB 2 that holds time-series data. Buckets have a retention policy that defines how long data is kept. Analogous to a database + retention policy in InfluxDB 1.x. |
| **Organization** | A workspace for a group of users in InfluxDB 2. Buckets belong to organizations. |
| **Flux** | The functional query and scripting language for InfluxDB 2. Flux supports data transformations, aggregations, joins, and alerting. Not supported in InfluxDB 3. |
| **InfluxQL** | A SQL-like query language supported in both InfluxDB 2 and 3. In InfluxDB 2, it is a legacy alternative to Flux. |
| **Token** | An authentication credential in InfluxDB 2. Types include operator tokens (full access), all-access tokens, and read/write tokens scoped to specific buckets. |
| **Operator Token** | A special token created during initial setup that grants full administrative access. Required for creating new organizations. |
| **Task** | A scheduled Flux script in InfluxDB 2 that runs at defined intervals for downsampling, alerting, or data processing. |
| **DBRP Mapping** | Database and Retention Policy mapping that allows InfluxQL queries to target InfluxDB 2 buckets using the legacy database/retention-policy naming convention. |
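
Bucket retention is given in seconds in the bucket-creation payload (`retention_seconds`), which is easy to get wrong by hand. A minimal sketch of the conversion, with a hypothetical helper name:

```python
from datetime import timedelta


def retention_seconds(days: float = 0, hours: float = 0) -> int:
    """Convert a retention period to whole seconds."""
    seconds = int(timedelta(days=days, hours=hours).total_seconds())
    if seconds < 0:
        raise ValueError("retention period cannot be negative")
    return seconds


print(retention_seconds(days=30))  # 2592000, i.e. 30 days
```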

---

## InfluxDB 3 Concepts

| Term | Definition |
|------|-----------|
| **Database** | The primary storage container in InfluxDB 3, replacing the bucket concept from InfluxDB 2. Databases hold tables (measurements). |
| **Table** | A structured collection of data in InfluxDB 3, equivalent to a measurement. Tables have defined columns for tags, fields, and time. |
| **SQL** | The primary query language in InfluxDB 3. Standard SQL is supported for querying time-series data. |
| **InfluxQL** | Also supported in InfluxDB 3 as an alternative query language alongside SQL. |
| **Core** | The open-source, single-node edition of InfluxDB 3. Has limitations on database count, table count, and column count compared to Enterprise. |
| **Enterprise** | The commercial, multi-node edition of InfluxDB 3 with higher limits and additional features like clustering. |
| **Last Value Cache (LVC)** | An InfluxDB 3 feature that caches the most recent value for specified columns, enabling fast lookups of current state without scanning historical data. |
| **Distinct Value Cache (DVC)** | An InfluxDB 3 feature that maintains a cache of distinct values for specified columns, useful for fast enumeration of tag values. |

---

## Amazon Timestream for InfluxDB Concepts

| Term | Definition |
|------|-----------|
| **DB Instance** | A managed InfluxDB 2 deployment in Amazon Timestream for InfluxDB. Supports standalone and Multi-AZ deployment types. |
| **DB Cluster** | A managed InfluxDB deployment (v2 or v3) in Amazon Timestream for InfluxDB that can span multiple instances for high availability. |
| **DB Parameter Group** | A collection of engine configuration parameters that can be applied to DB instances or clusters. Used to tune InfluxDB behavior. |
| **DB Instance Type** | The compute class (e.g., db.influx.medium, db.influx.xlarge) that determines CPU and memory for the InfluxDB deployment. |
| **DB Storage Type** | The storage class for the InfluxDB deployment. Options include InfluxIOIncludedT1 and InfluxIOIncludedT2. |
| **Deployment Type** | Specifies whether a DB instance runs as a standalone instance or with a Multi-AZ standby for high availability. |
| **Failover Mode** | For clusters, specifies the behavior when the primary node fails. Options include automatic failover. |
| **Allocated Storage** | The amount of storage (in GiB) provisioned for the DB instance or cluster. |
| **Endpoint** | The DNS hostname used to connect to the InfluxDB instance or cluster. Provided after deployment. |

---

## Line Protocol Concepts

| Term | Definition |
|------|-----------|
| **Line Protocol** | InfluxDB's text-based format for writing data. Format: `measurement,tag_key=tag_val field_key=field_val timestamp`. |
| **Write Precision** | The timestamp precision used when writing data. Options: nanoseconds (ns), microseconds (us), milliseconds (ms), seconds (s). Default is ns. |
| **Batch Write** | Writing multiple data points in a single API call for improved throughput. Recommended for high-ingest workloads. |
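
The Line Protocol format above can be sketched as a small serializer. This is an illustrative helper, not part of any client library: `to_line` and its parameters are hypothetical names, and it covers only the common escaping rules (commas, spaces, and equals signs in names; quotes and backslashes in string fields).

```python
from typing import Dict, Optional


def _escape(text: str, chars: str) -> str:
    """Backslash-escape each character listed in `chars`, in order."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value) -> str:
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Strings are double-quoted; escape backslashes first, then quotes.
    return '"' + _escape(str(value), '\\"') + '"'


def to_line(
    measurement: str,
    tags: Dict[str, str],
    fields: Dict[str, object],
    timestamp_ns: Optional[int] = None,
) -> str:
    """Serialize one point as: measurement,tags fields [timestamp]."""
    if not fields:
        raise ValueError("a point needs at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys give a stable series key
        head += f",{_escape(key, ',= ')}={_escape(str(tags[key]), ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={_format_field(val)}" for key, val in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"  # integer timestamp in the write precision
    return line


print(to_line("temperature", {"location": "office"}, {"value": 23.5}))
```

For a batch write, join several such lines with `"\n"` and send them in a single request body.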

---

## Query Language Summary

| Language | InfluxDB 2 | InfluxDB 3 | Notes |
|----------|:----------:|:----------:|-------|
| Flux | Supported (primary) | Not supported | Functional scripting language |
| InfluxQL | Supported (legacy) | Supported | SQL-like, works across versions |
| SQL | Not supported | Supported (primary) | Standard SQL for InfluxDB 3 |
@@ -1 +1,144 @@
# InfluxDB 2 vs 3

This guide highlights the key differences between InfluxDB 2 and InfluxDB 3 to help with version selection, migration planning, and avoiding version-specific pitfalls.

---

## Rules

- MUST NOT use Flux queries against InfluxDB 3 instances — Flux is not supported in v3
- MUST NOT use SQL queries against InfluxDB 2 instances — SQL is not supported in v2
- SHOULD ask the user which version they are using if not specified
- SHOULD use InfluxQL when writing queries that need to work across both versions
- MUST use the correct MCP server for the target version:
- InfluxDB 2: `awslabs.timestream-for-influxdb-mcp-server`
- InfluxDB 3: `influxdb3` MCP server
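
The rules above can be expressed as a small guard that runs before any query is sent. This is only a sketch of one way to encode them; the function and constant names are hypothetical, while the language lists and MCP server names are taken from the rules themselves.

```python
# Query languages per major version, as listed in the rules above.
SUPPORTED_LANGUAGES = {
    2: {"flux", "influxql"},
    3: {"sql", "influxql"},
}

# MCP server to use for each major version, as named in the rules above.
MCP_SERVERS = {
    2: "awslabs.timestream-for-influxdb-mcp-server",
    3: "influxdb3",
}


def check_query_language(version: int, language: str) -> str:
    """Return the MCP server for `version`; raise if `language` is unsupported."""
    if version not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unknown InfluxDB major version {version!r}; ask the user")
    if language.lower() not in SUPPORTED_LANGUAGES[version]:
        allowed = ", ".join(sorted(SUPPORTED_LANGUAGES[version]))
        raise ValueError(
            f"{language} is not supported on InfluxDB {version}; use one of: {allowed}"
        )
    return MCP_SERVERS[version]


def portable_languages() -> set:
    """Languages that work on every listed version."""
    return set.intersection(*SUPPORTED_LANGUAGES.values())


print(check_query_language(3, "SQL"))
print(portable_languages())
```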

---

## Architecture Comparison

| Aspect | InfluxDB 2 | InfluxDB 3 |
|--------|-----------|-----------|
| Storage Engine | TSM (Time-Structured Merge Tree) | Apache Parquet + Arrow (columnar) |
| Query Languages | Flux (primary), InfluxQL (legacy) | SQL (primary), InfluxQL |
| Data Model | Buckets → Measurements | Databases → Tables |
| Schema | Schema-on-write (implicit) | Schema-on-write with explicit table structure |
| Compression | Good | Significantly better (Parquet columnar) |
| Write Protocol | Line Protocol via HTTP API | Line Protocol via HTTP API |
| Default Port | 8086 | 8181 |

---

## Data Model Differences

### InfluxDB 2
- Data is organized into **Organizations** → **Buckets** → **Measurements**
- Buckets have retention policies
- Schema is implicit — new tags and fields are created on first write
- Measurements contain tags (indexed strings) and fields (values)

### InfluxDB 3
- Data is organized into **Databases** → **Tables**
- Databases have retention policies
- Schema is more explicit — tables have defined columns
- Tables contain tags (dictionary-encoded), fields, and timestamps
- Supports **Last Value Cache** and **Distinct Value Cache** for fast lookups

---

## Query Language Differences

### Flux (InfluxDB 2 only)
```flux
from(bucket: "sensors")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> filter(fn: (r) => r._field == "value")
|> group(columns: ["location"])
|> mean()
```

### SQL (InfluxDB 3 only)
```sql
SELECT location, AVG(value) as avg_temp
FROM temperature
WHERE time >= now() - INTERVAL '1 hour'
GROUP BY location
```

### InfluxQL (Both versions)
```sql
SELECT MEAN("value")
FROM "temperature"
WHERE time > now() - 1h
GROUP BY "location"
```

---

## API Differences

| Operation | InfluxDB 2 | InfluxDB 3 |
|-----------|-----------|-----------|
| Write | `POST /api/v2/write?bucket=<name>&org=<org>` | `POST /api/v3/write_lp?db=<name>` (native) or `POST /api/v2/write` (v2 compat) |
| Query (native) | `POST /api/v2/query` (Flux) | `POST /api/v3/query_sql` or `query_influxql` |
| Auth Header | `Token <token>` | `Bearer <token>` |
| Bucket/DB Management | Buckets API (`/api/v2/buckets`) | Database API via InfluxDB 3 MCP or CLI |
| Organization | Required (`/api/v2/orgs`) | Not applicable |
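
The write and auth rows of the table can be sketched as one request builder that switches path, query parameters, and `Authorization` scheme by version. It uses only the Python standard library and does not send anything; `build_write_request` and its parameters are hypothetical names introduced for illustration.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_write_request(
    version: int,
    base_url: str,
    token: str,
    target: str,  # bucket name for v2, database name for v3
    line: str,
    org: Optional[str] = None,
) -> Request:
    """Build (but do not send) a native Line Protocol write request."""
    if version == 2:
        if org is None:
            raise ValueError("InfluxDB 2 writes require an organization")
        path, params, scheme = "/api/v2/write", {"org": org, "bucket": target}, "Token"
    elif version == 3:
        path, params, scheme = "/api/v3/write_lp", {"db": target}, "Bearer"
    else:
        raise ValueError(f"unsupported InfluxDB major version: {version!r}")
    return Request(
        url=f"{base_url}{path}?{urlencode(params)}",
        data=line.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"{scheme} {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
    )


req = build_write_request(
    3, "https://your-endpoint:8181", "your-token", "my-database",
    "temperature,location=office value=23.5",
)
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would perform the write; keeping construction separate makes the version-specific differences easy to inspect and test.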

---

## Limits Comparison

| Limit | InfluxDB 2 | InfluxDB 3 Core | InfluxDB 3 Enterprise |
|-------|-----------|----------------|----------------------|
| Databases/Buckets | Unlimited | 5 | 500+ |
| Tables/Measurements | Unlimited | 500 per DB | 500+ per DB |
| Columns per Table | Unlimited | 250 | 500+ |
| Query Languages | Flux, InfluxQL | SQL, InfluxQL | SQL, InfluxQL |

---

## Authentication Differences

### InfluxDB 2
- Uses **operator tokens**, **all-access tokens**, and **read/write tokens**
- Tokens are scoped to organizations and buckets
- Operator token is created during initial setup
- Token header format: `Authorization: Token <token>`

### InfluxDB 3
- Uses **admin tokens** and **database tokens**
- Tokens are scoped to databases with read/write permissions
- Token header format: `Authorization: Bearer <token>`
- Token management via CLI or API

---

## When to Use Which Version

### Choose InfluxDB 2 when:
- You need Flux for complex data transformations and alerting pipelines
- You have existing InfluxDB 2 workloads
- You need the built-in task engine for scheduled processing
- You need the InfluxDB 2 UI for dashboarding and exploration

### Choose InfluxDB 3 when:
- You prefer SQL for querying time-series data
- You need better compression and query performance on large datasets
- You are starting a new project without legacy Flux dependencies
- You need Last Value Cache or Distinct Value Cache features
- You want columnar storage benefits (Parquet/Arrow)

---

## Migration Considerations

- See `influxdb2/migrations.md` for migrating TO InfluxDB 2
- See `influxdb3/migrations.md` for migrating TO InfluxDB 3
- InfluxQL queries are the most portable across versions
- Line Protocol writes are compatible across both versions
- InfluxDB 3 exposes v2-compatible endpoints (`/api/v2/write`, `/api/v2/query`) so existing v2 write workloads can target a v3 instance without code changes
- Flux scripts MUST be rewritten as SQL when migrating from v2 to v3 — the v2 query compatibility endpoint does NOT support Flux
- Bucket → Database, Measurement → Table naming may need adjustment