Agent-first Bluesky data filtering and monitoring CLI built with Effect.
Sync posts from timelines, feeds, lists, authors, and the real-time Jetstream firehose into local SQLite stores. Query, filter, derive, and export data with a powerful filter DSL and multiple output formats. Designed to be operated by CLI coding agents — Claude Code, Codex CLI, and similar tools — with structured errors, machine-readable output, and self-describing capabilities.
# Install globally with Bun
bun add -g @mepuka/skygent
# Or run from source
git clone https://github.com/mepuka/skygent-bsky.git
cd skygent-bsky
bun install
bun run index.ts --help
Download a prebuilt binary from GitHub Releases, or build locally:
bun run build:binary
./skygent --help
Cross-platform targets: linux-x64, linux-arm64, darwin-x64, darwin-arm64.
Skygent is built so an LLM agent can discover commands, interpret output, recover from errors, and chain operations without human intervention.
Self-describing capabilities — an agent's first call can be skygent capabilities --format json, which returns every command, filter predicate, output format, and source type in machine-readable JSON.
Structured error envelope — set SKYGENT_JSON_ERRORS=1 and every error returns a JSON object with type, code, message, suggestion, and details. Validation errors include received, expected, and fix fields so the agent can self-correct:
{
  "type": "StoreNotFound",
  "code": "STORE_NOT_FOUND",
  "exitCode": 3,
  "message": "Store 'my-stor' not found",
  "suggestion": "Run: skygent store list"
}
Semantic exit codes: 0 success, 2 input validation, 3 store not found, 5 network error, 7 storage I/O, 8 filter compilation. Agents can implement retry strategies based on the code alone.
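As a rough illustration of how an agent might consume the envelope and exit codes described above, here is a minimal TypeScript sketch. The `Recovery` labels and both function names are hypothetical and not part of Skygent. Which stream the envelope is written to is not stated here, so the parser simply takes text.

```typescript
// Envelope shape, based on the fields listed above.
// `details` is left untyped because its contents are not documented here.
interface SkygentError {
  type: string;
  code: string;
  exitCode?: number;
  message: string;
  suggestion?: string;
  details?: unknown;
}

// Hypothetical recovery labels chosen for this sketch.
type Recovery = "ok" | "fix-input" | "pick-store" | "retry" | "abort";

// Map the documented exit codes to a recovery strategy.
function recoveryFor(exitCode: number): Recovery {
  switch (exitCode) {
    case 0:
      return "ok";
    case 2: // input validation
    case 8: // filter compilation
      return "fix-input";
    case 3: // store not found
      return "pick-store";
    case 5: // network error
      return "retry";
    default: // 7 (storage I/O) and any undocumented code
      return "abort";
  }
}

// Parse text into an envelope, or return undefined if it is not one.
function parseErrorEnvelope(text: string): SkygentError | undefined {
  try {
    const value: unknown = JSON.parse(text.trim());
    if (typeof value !== "object" || value === null) return undefined;
    const record = value as Record<string, unknown>;
    const hasRequired =
      typeof record["type"] === "string" &&
      typeof record["code"] === "string" &&
      typeof record["message"] === "string";
    return hasRequired ? (record as unknown as SkygentError) : undefined;
  } catch {
    return undefined; // not JSON at all
  }
}
```

Treating storage I/O errors as non-retryable is a design choice of this sketch, not documented behavior.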
NDJSON everywhere — every command supports --format ndjson for streaming, one-JSON-object-per-line output. Pair with the pipe command to chain filter operations without intermediate files.
Dry-run everything — sync, derive, and watch commands all support --dry-run so agents can validate before committing. Field projection (--fields @minimal) and --count reduce token cost when full output isn't needed.
Inline help for filter authoring — --filter-help on any filtering command dumps the full predicate reference with examples.
HTTP mode — skygent store serve exposes a REST API with SSE streaming for agents that prefer HTTP over CLI.
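For agents that take the HTTP route, the sketch below shows one way to decode a buffered Server-Sent Events payload in TypeScript. It follows the general SSE wire format only. The server's endpoint paths, event names, and payload shapes are not documented in this README, so the `post` event in the usage line is purely illustrative, and the function assumes it receives complete events rather than partial network chunks.

```typescript
interface SseEvent {
  event: string; // defaults to "message" when no event field is present
  data: string;  // multiple data lines are joined with "\n"
  id?: string;
}

// Decode a buffered SSE payload into events.
// Assumption: `payload` contains only complete events (no partial chunk).
function parseSse(payload: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line.
  for (const block of payload.split(/\r?\n\r?\n/)) {
    let event = "message";
    let id: string | undefined;
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // comment or keep-alive
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // one optional leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
      else if (field === "id") id = value;
    }
    if (data.length === 0) continue; // blocks without data produce no event
    const parsed: SseEvent = { event, data: data.join("\n") };
    if (id !== undefined) parsed.id = id;
    events.push(parsed);
  }
  return events;
}
```

A caller could then `JSON.parse` each event's `data`, for example `parseSse(body).filter((e) => e.event === "post")`.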
Skygent needs a Bluesky handle and app password. Credentials are resolved in this order:
- CLI flags: `--identifier` and `--password`
- Environment variables: `SKYGENT_IDENTIFIER` and `SKYGENT_PASSWORD`
- Encrypted credential file (`~/.skygent/credentials.json`, requires a credentials key)
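The resolution order above can be sketched as a simple first-match lookup. This is an illustration and not Skygent's implementation. In particular, it assumes a source only counts when it supplies both the identifier and the password, which the README does not state. `resolveCredentials` and `loadEncryptedFile` are hypothetical names.

```typescript
interface Credentials {
  identifier: string;
  password: string;
}

interface Resolved {
  source: "flags" | "env" | "file";
  credentials: Credentials;
}

// First source that supplies both values wins (assumption of this sketch).
function resolveCredentials(
  flags: Partial<Credentials>,
  env: Record<string, string | undefined>,
  loadEncryptedFile: () => Credentials | undefined, // hypothetical loader
): Resolved | undefined {
  // 1. CLI flags
  if (flags.identifier && flags.password) {
    return {
      source: "flags",
      credentials: { identifier: flags.identifier, password: flags.password },
    };
  }
  // 2. Environment variables
  const identifier = env["SKYGENT_IDENTIFIER"];
  const password = env["SKYGENT_PASSWORD"];
  if (identifier && password) {
    return { source: "env", credentials: { identifier, password } };
  }
  // 3. Encrypted credential file (decryption is out of scope here)
  const stored = loadEncryptedFile();
  return stored ? { source: "file", credentials: stored } : undefined;
}
```

Returning the winning source alongside the credentials makes it easier for an agent to report where a bad password came from.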
Credentials key resolution order:
- Environment: `SKYGENT_CREDENTIALS_KEY`
- Keyfile: `~/.skygent/credentials.key`
Manage the encrypted credential file with skygent config credentials.
Manage the credentials key with skygent config credentials key set|status|clear.
The simplest setup:
cp .env.example .env
# Edit .env with your handle and app password
Bun loads .env automatically.
# Create a store
skygent store create my-store
# Sync your timeline
skygent sync timeline --store my-store
# Query recent posts
skygent query my-store --limit 10 --format table
# Stream live posts from Jetstream
skygent watch jetstream --store my-store
# Derive a filtered store
skygent derive my-store ai-posts --filter 'hashtag:#ai OR hashtag:#ml'
# Search posts locally
skygent search posts "effect typescript" --store my-store
# Analyze social graph
skygent graph interactions my-store --format table
skygent graph centrality my-store --metric pagerank --limit 10

| Subcommand | Description |
|---|---|
| `store create <name>` | Create a new store |
| `store list` | List all stores |
| `store show <name>` | Show store config and metadata |
| `store update <name>` | Update store metadata |
| `store rename <from> <to>` | Rename a store |
| `store delete <name> --force` | Delete a store |
| `store stats <name>` | Show store statistics |
| `store info <name>` | Alias for `store stats` |
| `store analytics <name>` | Time-bucketed analytics (posts, authors, engagement by day/hour) |
| `store summary` | Summarize all stores |
| `store tree` | Visualize store lineage |
| `store materialize <name>` | Materialize filter outputs to disk |
| `store sources <name>` | List configured sources for a store |
| `store add-source <name>` | Add an author, feed, list, timeline, or jetstream source |
| `store remove-source <name> <id>` | Remove a configured source |
| `store authors <name>` | List authors with stats |
| `store remove-author <name> <actor>` | Remove an author's posts from a store |
| `store cache <name>` | Cache image embeds |
| `store cache-status <name>` | Report image cache coverage |
| `store cache-clean` | Clear the image cache |
| `store cache-sweep <name>` | Sweep orphaned image cache files |
| `store cache-ttl-sweep <name>` | Sweep expired cache files (TTL-based) |
| `store serve` | Start HTTP server with SSE streaming (`--port`, `--poll-interval`) |

| Subcommand | Description |
|---|---|
| `sync <store>` | Sync all configured sources |
| `sync timeline` | Sync your timeline |
| `sync feed <uri>` | Sync a feed generator |
| `sync list <uri>` | Sync a list feed |
| `sync author <actor>` | Sync posts from an author |
| `sync thread <uri>` | Sync a thread (parents + replies) |
| `sync notifications` | Sync notifications |
| `sync jetstream` | Sync from Jetstream firehose |
All sync commands accept --store, --filter, --post-filter, --quiet, --refresh, --dry-run, --limit, --cache-images, and --filter-help.
The watch command accepts the same subcommands as sync, with continuous polling. It supports --interval (default: 30s), --max-cycles, and --until.
skygent watch timeline --store my-store --interval "5 minutes"
skygent watch jetstream --store my-store --until "10 minutes"
skygent query my-store --limit 25 --format table
skygent query my-store --filter 'hashtag:#ai' --sort desc --format json
skygent query my-store --range 2024-01-01T00:00:00Z..2024-01-31T00:00:00Z
skygent query my-store --fields @minimal --newest-first
skygent query my-store --fields @images --resolve-images
skygent query my-store --extract-images --format json
skygent query store-a,store-b --format ndjson
skygent query my-store --filter 'hashtag:#ai' --count-by hashtag
skygent query my-store --filter 'engagement:minLikes=50' --count
Formats: json, ndjson, table, markdown, compact, card, thread
Sorting: --sort asc|desc|by-likes|by-reposts|by-replies|by-quotes|by-engagement, --newest-first
Field presets: @minimal, @social, @full, @images, @embeds, @media, or comma-separated field names with dot notation (use * to traverse arrays, e.g. images.*.alt).
Image options: --extract-images, --resolve-images, --cache-images, --no-cache-images-thumbnails.
Aggregation: --count for totals, --count-by author|hashtag|date|hour for breakdowns.
Multi-store queries accept comma-separated store lists and include store names in output by default.
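To make the dot-notation field paths concrete, here is a small TypeScript sketch of how a path such as `images.*.alt` could be resolved against a post object. It illustrates the idea and is not Skygent's projection code. The post shape in the usage note is hypothetical, and returning `undefined` for missing segments is an assumption.

```typescript
// Resolve a dot-separated path; "*" maps the remaining path over an array.
function getPath(value: unknown, path: string): unknown {
  const walk = (current: unknown, segments: string[]): unknown => {
    if (segments.length === 0) return current;
    const head = segments[0] as string;
    const rest = segments.slice(1);
    if (head === "*") {
      // "*" only makes sense on arrays; anything else resolves to undefined.
      return Array.isArray(current)
        ? current.map((item) => walk(item, rest))
        : undefined;
    }
    if (typeof current !== "object" || current === null) return undefined;
    return walk((current as Record<string, unknown>)[head], rest);
  };
  return walk(value, path.split("."));
}

// Project several fields at once, keyed by the path that produced them.
function project(value: unknown, fields: string[]): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const field of fields) out[field] = getPath(value, field);
  return out;
}
```

For a hypothetical post `{ uri: "at://x", images: [{ alt: "a" }, { alt: "b" }] }`, `getPath(post, "images.*.alt")` yields the array of both alt strings.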
Apply a filter to a source store to produce a new filtered store:
skygent derive source-store target-store --filter 'hashtag:#ai'
Modes:
- `event-time` (default): pure filters only, replayable
- `derive-time`: allows effectful filters (Trending, HasValidLinks)
Supports --include-author, --exclude-author, --reset --yes, and --dry-run.
Run skygent filter help for a compact list of predicates and aliases.

| Subcommand | Description |
|---|---|
| `filter create <name>` | Save a named filter |
| `filter list` | List saved filters |
| `filter show <name>` | Show a saved filter |
| `filter delete <name>` | Delete a saved filter |
| `filter help` | Show filter DSL and JSON help |
| `filter validate` | Validate a filter expression |
| `filter test` | Test a filter against a post |
| `filter explain` | Explain why a post matches |
| `filter benchmark` | Benchmark filter performance |
| `filter describe` | Describe a filter in plain text |

| Subcommand | Description |
|---|---|
| `search posts <query>` | Search posts locally or `--network` |
| `search handles <query>` | Search Bluesky profiles |
| `search feeds <query>` | Search feed generators |
Network search supports --ingest --store <name> to save results locally.

| Subcommand | Description |
|---|---|
| `graph followers <actor>` | List followers |
| `graph follows <actor>` | List follows |
| `graph known-followers <actor>` | Mutual followers |
| `graph relationships <actor>` | Relationship status (`--others actor1,actor2`) |
| `graph interactions <store>` | Build interaction network from store posts |
| `graph centrality <store>` | Rank actors by centrality (PageRank or degree) |
| `graph communities <store>` | Detect communities (label propagation) |
| `graph stores` | Cross-store topology from lineage data |
| `graph lists <actor>` | Lists created by actor |
| `graph list <uri>` | View a list's members |
| `graph blocks` | Your blocked accounts |
| `graph mutes` | Your muted accounts |

| Subcommand | Description |
|---|---|
| `feed show <uri>` | Show feed details |
| `feed batch <uri>...` | Fetch multiple feeds |
| `feed by <actor>` | List feeds by an actor |

| Subcommand | Description |
|---|---|
| `post likes <uri>` | Who liked a post |
| `post reposted-by <uri>` | Who reposted |
| `post quotes <uri>` | Quote posts |

| Subcommand | Description |
|---|---|
| `view thread <uri>` | Display a thread |
| `view status <view> <source>` | Check if a derived view is stale |
skygent digest my-store --since 24h --format table
Generates a summary of store content over a time range: top posts, hashtags, active authors, volume by hour/day.
skygent actor resolve alice.bsky.social bob.bsky.social --format json
Resolves handles to DIDs and vice versa. Supports --cache-only for offline use and --strict for API verification.
cat posts.ndjson | skygent pipe --filter 'hashtag:#ai' --on-error skip
Reads NDJSON from stdin, applies a filter expression, emits matching posts. Useful for chaining with other tools.
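On the consuming side, NDJSON is straightforward to read and write. The TypeScript sketch below shows one way a tool sitting before or after `skygent pipe` might do it. The `onBadLine` option is a convenience of this sketch and is not meant to mirror the exact semantics of `--on-error`.

```typescript
interface NdjsonResult<T> {
  values: T[];
  skipped: number; // count of lines that were not valid JSON
}

// Parse one-JSON-object-per-line text. Blank lines are ignored.
function parseNdjson<T = unknown>(
  text: string,
  onBadLine: "skip" | "throw" = "throw",
): NdjsonResult<T> {
  const values: T[] = [];
  let skipped = 0;
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line.length === 0) continue;
    try {
      values.push(JSON.parse(line) as T);
    } catch (error) {
      if (onBadLine === "throw") throw error;
      skipped += 1;
    }
  }
  return { values, skipped };
}

// Serialize values as NDJSON, one object per line with a trailing newline.
function toNdjson(values: readonly unknown[]): string {
  return values.map((value) => JSON.stringify(value) + "\n").join("");
}
```

Because each line is independent, a reader can process output incrementally instead of waiting for a full JSON array to close.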

| Subcommand | Description |
|---|---|
| `image-cache sweep` | Sweep expired images (`--force` to delete, default: dry-run) |
skygent capabilities --format json
Returns CLI version, all commands, filter predicates with examples, supported output formats, and source types. Designed for agent bootstrapping.
skygent config check # Run health checks
Filters are passed via --filter (DSL string) or --filter-json (JSON AST).

| Filter | Example | Description |
|---|---|---|
| `hashtag:#tag` | `hashtag:#ai` | Match posts with hashtag |
| `hashtagin:#a,#b` | `hashtagin:#ai,#ml,#dl` | Match any of several hashtags |
| `author:handle` | `author:alice.bsky.social` | Match posts by author |
| `authorin:a,b` | `authorin:alice.bsky.social,bob.bsky.social` | Match any of several authors |
| `contains:"text"` | `contains:"bluesky"` | Text search (case-insensitive) |
| `regex:/pattern/flags` | `regex:/hello\|world/i` | Regex match |
| `language:code` | `language:en,es` | Match languages |
| `date:<start>..<end>` | `date:2024-01-01..2024-01-31` | Date range (ISO 8601) |
| `since:duration` | `since:24h` | Posts from last N duration |
| `until:timestamp` | `until:2024-01-31T23:59:59Z` | Posts until timestamp |
| `age:comparator` | `age:<24h`, `age:>=7d` | Post age with comparator |
| `engagement:thresholds` | `engagement:minLikes=100,minReposts=5` | Engagement thresholds |
| `is:type` | `is:reply`, `is:quote`, `is:repost`, `is:original` | Post type |
| `has:media_type` | `has:images`, `has:video`, `has:links`, `has:media`, `has:embed` | Media presence |
| `min-images:N` | `min-images:2` | Minimum image count |
| `alt-text:text` | `alt-text:"accessibility"` | Alt text contains text or regex |
| `no-alt-text` | `no-alt-text` | Images without alt text |
| `link-contains:text` | `link-contains:substack.com` | Links containing substring |
| `links` | `links`, `links:/pattern/` | Valid external links (effectful) |
| `trending:#tag` | `trending:#ai` | Trending hashtag (effectful) |
| `@saved-name` | `@my-filter` | Reference a saved filter |
Aliases: from: = author:, tag: = hashtag:, text: = contains:, lang: = language:, tags: = hashtagin:, authors: = authorin:
hashtag:#ai AND author:user.bsky.social
hashtag:#ai OR hashtag:#ml
NOT hashtag:#spam
(hashtag:#ai OR hashtag:#ml) AND engagement:minLikes=10
Operators: AND / &&, OR / ||, NOT / !, parentheses for grouping.
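Agents that assemble filters programmatically may prefer to build the DSL string from parts instead of concatenating by hand. The helpers below are a hypothetical TypeScript sketch and not part of Skygent. Since operator precedence is not documented here, the sketch parenthesizes every compound operand so the result does not depend on it.

```typescript
// A DSL fragment plus a flag saying whether it needs parentheses when nested.
interface Expr {
  text: string;
  compound: boolean;
}

// A single predicate such as "hashtag:#ai".
const atom = (text: string): Expr => ({ text, compound: false });

const wrap = (expr: Expr): string =>
  expr.compound ? `(${expr.text})` : expr.text;

function combine(op: "AND" | "OR", parts: Expr[]): Expr {
  if (parts.length === 0) throw new Error(`${op} needs at least one operand`);
  if (parts.length === 1) return parts[0] as Expr;
  return { text: parts.map(wrap).join(` ${op} `), compound: true };
}

const and = (...parts: Expr[]): Expr => combine("AND", parts);
const or = (...parts: Expr[]): Expr => combine("OR", parts);

// Negations are marked compound so they are parenthesized when nested.
const not = (expr: Expr): Expr => ({
  text: `NOT ${wrap(expr)}`,
  compound: true,
});
```

For example, `and(or(atom("hashtag:#ai"), atom("hashtag:#ml")), atom("engagement:minLikes=10")).text` produces the last example expression shown above.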
trending and links (valid link checking) require network access and cannot be used in event-time derivation mode. Both support onError=include|exclude|retry policies.
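Before running a derivation, an agent may want a quick check for whether an expression uses an effectful predicate. The TypeScript sketch below is a token-level heuristic and not Skygent's parser. It ignores double-quoted strings but does not understand regex literals, and it cannot see inside saved filters referenced with `@name`. `filter validate` or `--dry-run` remain the authoritative checks.

```typescript
// Heuristic: does the DSL string mention `trending:` or `links`?
function usesEffectfulPredicate(dsl: string): boolean {
  // Blank out double-quoted strings so their contents are not mistaken for predicates.
  const withoutStrings = dsl.replace(/"(?:\\.|[^"\\])*"/g, '""');
  return withoutStrings
    .split(/[\s()]+/)                            // split on whitespace and parentheses
    .map((token) => token.replace(/^!+/, ""))    // drop a leading "!" (NOT)
    .some(
      (token) =>
        token === "links" ||
        token.startsWith("links:") ||
        token.startsWith("trending:"),
    );
}

// Name of the derivation mode the expression would need.
function requiredMode(dsl: string): "event-time" | "derive-time" {
  return usesEffectfulPredicate(dsl) ? "derive-time" : "event-time";
}
```

Note that `link-contains:` and `has:links` are not flagged, since only `links` and `trending` are described as effectful.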

| Variable | Default | Description |
|---|---|---|
| `SKYGENT_IDENTIFIER` | (none) | Bluesky handle or DID |
| `SKYGENT_PASSWORD` | (none) | App password |
| `SKYGENT_CREDENTIALS_KEY` | (none) | Master key for encrypted credential storage |
| `SKYGENT_SERVICE` | `https://bsky.social` | Bluesky service URL |
| `SKYGENT_STORE_ROOT` | `~/.skygent` | Root storage directory |
| `SKYGENT_OUTPUT_FORMAT` | `ndjson` | Default output format |
| `SKYGENT_JSON_ERRORS` | `false` | Enable structured JSON error envelope |
| `SKYGENT_BSKY_RATE_LIMIT` | `250 millis` | Min delay between API calls |
| `SKYGENT_BSKY_RETRY_MAX` | `5` | Max retry attempts |
| `SKYGENT_SYNC_CONCURRENCY` | `5` | Concurrent sync workers |
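A wrapper script that launches Skygent may want to know the effective settings before it starts. The TypeScript sketch below applies the defaults from the table to an environment map. It is illustrative only: how Skygent itself parses these values is not described here, so treating `1` and `true` as enabling `SKYGENT_JSON_ERRORS` and falling back to the default on unparsable numbers are assumptions.

```typescript
interface EffectiveConfig {
  service: string;
  storeRoot: string;
  outputFormat: string;
  jsonErrors: boolean;
  rateLimit: string; // kept as text, e.g. "250 millis"
  retryMax: number;
  syncConcurrency: number;
}

// Parse a positive integer, falling back to the default when unset or invalid.
function intOr(raw: string | undefined, fallback: number): number {
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function readConfig(env: Record<string, string | undefined>): EffectiveConfig {
  const jsonErrors = env["SKYGENT_JSON_ERRORS"];
  return {
    service: env["SKYGENT_SERVICE"] ?? "https://bsky.social",
    storeRoot: env["SKYGENT_STORE_ROOT"] ?? "~/.skygent",
    outputFormat: env["SKYGENT_OUTPUT_FORMAT"] ?? "ndjson",
    // Assumption: "1" and "true" both enable the envelope.
    jsonErrors: jsonErrors === "1" || jsonErrors === "true",
    rateLimit: env["SKYGENT_BSKY_RATE_LIMIT"] ?? "250 millis",
    retryMax: intOr(env["SKYGENT_BSKY_RETRY_MAX"], 5),
    syncConcurrency: intOr(env["SKYGENT_SYNC_CONCURRENCY"], 5),
  };
}
```

Passing the environment in as a plain record keeps the function easy to test and independent of any particular runtime.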
- `--full`: Use verbose JSON output (compact is the default)
- `--quiet`: Suppress progress output
- `--log-format json|human`: Control log format
Skygent is built entirely on Effect with a layered service architecture:
- Domain (`src/domain/`): Data models for posts, stores, filters, events, and derivations using Effect Schema
- Services (`src/services/`): Business logic: Bluesky API client, SQLite store, sync engine, filter runtime, derivation engine, graph builder
- CLI (`src/cli/`): Command definitions, output formatting, error handling
Stores are local SQLite databases with an append-only event log. Each store has its own index.sqlite with FTS5 full-text search, WAL mode, and optimized pragmas. Derivations track lineage between stores and support incremental processing with checkpoints.
The sync engine supports resumable checkpoints, configurable concurrency, and a four-stage pipeline: source fetch, parse, filter, store. The Jetstream engine provides a separate real-time path with batched commit processing.
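The four-stage pipeline with resumable checkpoints can be pictured with a deliberately simplified TypeScript sketch. It is synchronous and sequential for readability, whereas the real engine is built on Effect and runs with configurable concurrency. All names here (`SyncDeps`, `runSync`, and the dependency functions) are hypothetical, as is the choice to checkpoint once per page.

```typescript
interface Page<Raw> {
  items: Raw[];
  next: string | undefined; // cursor for the following page, if any
}

interface SyncDeps<Raw, Post> {
  fetchPage: (cursor: string | undefined) => Page<Raw>; // stage 1: source fetch
  parse: (raw: Raw) => Post | undefined;                // stage 2: parse
  keep: (post: Post) => boolean;                        // stage 3: filter
  append: (post: Post) => void;                         // stage 4: store
  saveCheckpoint: (cursor: string) => void;             // resume point
}

// Run the pipeline from `startCursor` and return how many posts were stored.
function runSync<Raw, Post>(
  deps: SyncDeps<Raw, Post>,
  startCursor?: string,
): number {
  let cursor = startCursor;
  let stored = 0;
  for (;;) {
    const page = deps.fetchPage(cursor);
    for (const raw of page.items) {
      const post = deps.parse(raw);
      if (post === undefined || !deps.keep(post)) continue; // unparsable or filtered out
      deps.append(post);
      stored += 1;
    }
    if (page.next === undefined) return stored; // source exhausted
    cursor = page.next;
    // Checkpoint after the page is fully stored, so a restart resumes here.
    deps.saveCheckpoint(cursor);
  }
}
```

Saving the checkpoint only after a page has been stored means an interrupted run may re-fetch one page but should not skip any.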
- Passwords are handled as `Redacted` values and never logged
- Encrypted credential storage uses AES-GCM with PBKDF2 (100,000 iterations)
- Filesystem permissions enforced (0700 directories, 0600 files)
- Avoid putting passwords in config files; use environment variables or the credential store
Detailed docs are in docs/.
License: MIT