feat(importers): add twitterapi.io and generic CSV import support#12

Open
Frostbite1536 wants to merge 11 commits into MaskyS:main from Frostbite1536:main

Conversation

@Frostbite1536

Summary

twitterapi.io is a third-party API that is much cheaper than the official Twitter API. I have been using it for another analysis tool, and it works well.

  • twitterapi_io.py — Live API fetcher + offline JSON loader for twitterapi.io. Fetches any public account's tweets with cursor-based pagination, rate-limit backoff, and retries. Also loads saved JSON responses from disk or in-memory dicts/lists. Maps camelCase API schema to tweetscope's flat _flatten_tweet() row format, including extra engagement fields (quotes, views, bookmarks). Pure stdlib — no external dependencies.
  • csv_import.py — Generic CSV/TSV importer with 60+ column name aliases for broad compatibility with Twitter data export tools (Chrome extensions, analytics platforms, etc.). Auto-detects delimiters, parses URL fields in multiple formats, and handles common column naming conventions.
  • Both produce rows schema-compatible with _flatten_tweet(), plugging directly into the ingestion pipeline
  • 29 new tests + 4 existing pass with no regressions
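To make the schema compatibility concrete, here is a minimal sketch of mapping a camelCase twitterapi.io tweet to a flat row. The exact field names of `_flatten_tweet()` live in tweetscope; the keys below are illustrative assumptions, not the canonical schema.

```python
import html

def _to_flat_row(tweet: dict) -> dict:
    """Map a camelCase twitterapi.io tweet to a flat row (sketch).

    Assumption: these key names mirror, but are not, the real
    _flatten_tweet() schema.
    """
    return {
        "id": str(tweet.get("id", "")),
        "text": html.unescape(tweet.get("text", "")),
        "created_at": tweet.get("createdAt", ""),
        "retweet_count": tweet.get("retweetCount", 0),
        "like_count": tweet.get("likeCount", 0),
        "reply_count": tweet.get("replyCount", 0),
        # Extra engagement fields mentioned in the PR summary:
        "quote_count": tweet.get("quoteCount", 0),
        "view_count": tweet.get("viewCount", 0),
        "bookmark_count": tweet.get("bookmarkCount", 0),
    }

row = _to_flat_row({"id": 123, "text": "hi &amp; bye", "likeCount": 4})
```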

Architecture note

The importers follow a drop-in pattern: they import ImportResult from twitter.py when running inside tweetscope and fall back to a local dataclass for standalone use. Anyone can add format-specific importers by following the same pattern.
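The drop-in pattern described above can be sketched as a try/except import. The fallback dataclass fields here are assumptions; the real ImportResult in twitter.py may differ.

```python
from dataclasses import dataclass, field

try:
    # Inside tweetscope, reuse the canonical result type.
    from twitter import ImportResult  # type: ignore
except ImportError:
    # Standalone fallback: a minimal local stand-in (fields are assumptions).
    @dataclass
    class ImportResult:
        rows: list = field(default_factory=list)
        profile: dict = field(default_factory=dict)
        errors: list = field(default_factory=list)

result = ImportResult(rows=[{"id": "1"}])
```

Because the fallback only defines the shape the importer itself produces, downstream tweetscope code never sees it; it exists purely so the module runs on its own.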

claude and others added 5 commits March 16, 2026 00:18
Add two new importers that plug directly into the existing ingestion
pipeline by producing rows matching the _flatten_tweet() schema:

- twitterapi_io.py: normalises camelCase JSON from the twitterapi.io
  REST API (accepts raw API response or bare tweet list)
- csv_import.py: auto-detects column names from common Twitter CSV
  export formats (X_Account_Analyzer, Chrome extensions, etc.) with
  TSV support and flexible column alias mapping

Both importers are pure stdlib with no external dependencies, use
ImportResult from twitter.py when inside tweetscope, and include a
standalone fallback for independent use.

Re-exports added to importers/__init__.py. 24 new tests covering
schema compatibility, HTML decoding, URL extraction, reply/retweet
detection, and column alias resolution.
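The delimiter auto-detection mentioned above can be done with the stdlib `csv.Sniffer`. This is a minimal sketch of the idea, not the actual csv_import.py code, and it omits the alias resolution the real importer performs.

```python
import csv
import io

def load_csv_rows(text: str) -> list:
    """Auto-detect the delimiter (comma vs tab) and parse rows (sketch)."""
    try:
        dialect = csv.Sniffer().sniff(text, delimiters=",\t")
    except csv.Error:
        dialect = csv.excel  # fall back to comma-separated
    return list(csv.DictReader(io.StringIO(text), dialect=dialect))

rows = load_csv_rows("id\ttext\n1\thello\n")
```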

https://claude.ai/code/session_019HSb1hE1xWXAkh6S9ZGub8
…and X_Account_Analyzer CSV

Rework importers to combine the best of both implementations:

twitterapi_io.py (merged):
- Add fetch_twitterapi_io() for live API fetching with pagination,
  rate-limit backoff, and configurable max_pages
- load_twitterapi_io_json() now accepts file paths, dicts, or lists
- Add extra engagement fields: quotes, views, bookmarks
- Richer profile with followers, following, statuses_count, is_verified
- Date parsing handles both ISO and Twitter native formats
- Pure stdlib HTTP (urllib) — no external dependencies
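The pagination-with-backoff loop can be sketched like this. `fetch_page` stands in for the real urllib call in fetch_twitterapi_io() and its `(tweets, next_cursor, rate_limited)` return shape is an assumption for illustration.

```python
import time

def paginate(fetch_page, max_pages=5):
    """Cursor-based pagination with exponential backoff (sketch).

    fetch_page(cursor) is a caller-supplied callable (an assumption)
    returning (tweets, next_cursor, rate_limited).
    """
    tweets, cursor, delay = [], None, 1.0
    for _ in range(max_pages):
        page, next_cursor, limited = fetch_page(cursor)
        if limited:
            time.sleep(delay)  # back off, then retry the same cursor
            delay *= 2
            continue
        tweets.extend(page)
        if not next_cursor:    # no next cursor: last page reached
            break
        cursor = next_cursor
    return tweets

def _fake_page(cursor):
    # Stand-in for a real HTTP call to twitterapi.io.
    if cursor is None:
        return (["tweet-1", "tweet-2"], "cursor-1", False)
    return (["tweet-3"], None, False)

tweets = paginate(_fake_page)
```

Note that `max_pages` also bounds retries, so a persistently rate-limited account cannot loop forever.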

xanalyzer_csv.py (new, from reference):
- Purpose-built for X_Account_Analyzer detailed.csv format
- Extracts tweet IDs from URLs (/status/123 → id: "123")
- Maps post_type ("reply"/"retweet"/"original") to is_reply/is_retweet
- Preserves sentiment_score, sentiment_label, engagement
- Auto-discovers summary.csv for profile enrichment (follower counts)
- Username filtering for multi-handle CSVs
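The `/status/123 → id: "123"` extraction above amounts to one regex. A minimal sketch:

```python
import re

_STATUS_RE = re.compile(r"/status/(\d+)")

def tweet_id_from_url(url: str) -> "str | None":
    """Pull the numeric tweet ID out of a /status/ URL (sketch).

    Returns None when the URL has no /status/<digits> segment.
    """
    m = _STATUS_RE.search(url)
    return m.group(1) if m else None
```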

csv_import.py (kept):
- Generic CSV/TSV importer for other Twitter export formats
- 60+ column name aliases for broad compatibility
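The alias mapping works by normalising each header and looking it up in a many-to-one table. The entries below are a hypothetical subset for illustration, not the importer's actual 60+ aliases.

```python
# Hypothetical subset of the alias table: several source column names
# collapse to one canonical name so different export tools line up.
ALIASES = {
    "tweet_text": "text", "content": "text", "full_text": "text",
    "likes": "like_count", "favorite_count": "like_count",
    "retweets": "retweet_count",
}

def resolve_columns(header: list) -> dict:
    """Map each CSV header to its canonical column name (sketch)."""
    return {col: ALIASES.get(col.strip().lower(), col.strip().lower())
            for col in header}

mapping = resolve_columns(["Content", "Likes", "retweets", "id"])
```

Unknown headers pass through lowercased, so extra columns are kept rather than dropped.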

__init__.py exports all five public functions:
  fetch_twitterapi_io, load_twitterapi_io_json,
  load_xanalyzer_csv, load_csv, load_csv_string

46 tests pass (17 twitterapi_io + 12 csv_import + 13 xanalyzer_csv + 4 existing).

https://claude.ai/code/session_019HSb1hE1xWXAkh6S9ZGub8
Remove X_Account_Analyzer-specific importer since the tool is not
publicly available. The generic csv_import.py and twitterapi_io.py
remain as generally useful importers for the community.

https://claude.ai/code/session_019HSb1hE1xWXAkh6S9ZGub8
- Remove unnecessary _flatten_twitterapi_tweet alias (no backwards
  compat needed on new code) and its test
- Remove unnecessary Content-Type header on GET requests in _api_request
- Fix inconsistent indices validation: URL entities now check
  isinstance(list) same as media entities
- Update csv_import.py docstring to remove reference to private tool
- Add missing test coverage: extendedEntities media extraction,
  TypeError on invalid input, fallback username/display_name

35 tests pass (19 twitterapi_io + 12 csv_import + 4 existing).

https://claude.ai/code/session_019HSb1hE1xWXAkh6S9ZGub8
…importer-XM8vb

Claude/integrate tweetscope importer xm8vb
@vercel

vercel bot commented Mar 16, 2026

@Frostbite1536 is attempting to deploy a commit to the maskys' projects Team on Vercel.

A member of the Team first needs to authorize it.

claude and others added 6 commits March 16, 2026 00:43
Add back xanalyzer_csv.py and its tests for private use. This was
excluded from the upstream PR but belongs in this fork.

https://claude.ai/code/session_019HSb1hE1xWXAkh6S9ZGub8
…importer-XM8vb

feat(importers): restore X_Account_Analyzer CSV importer
CRITICAL fixes:
- Python SSRF: restrict resolve-url to t.co domain only (was open proxy)
- Python path traversal: add _safe_dataset_path() with realpath validation
  to all dataset routes (16+ endpoints)
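The realpath-based validation can be sketched as follows. This is an illustration of the `_safe_dataset_path()` idea, not the committed code; the function name and error type are assumptions.

```python
import os

def safe_dataset_path(base_dir: str, name: str) -> str:
    """Resolve a dataset path and reject traversal outside base_dir (sketch).

    realpath-normalise both sides (resolving symlinks and ".."), then
    require the candidate to stay under the base directory.
    """
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, name))
    if candidate != base and not candidate.startswith(base + os.sep):
        raise ValueError(f"path escapes dataset dir: {name!r}")
    return candidate
```

Comparing with `base + os.sep` (rather than a bare prefix) avoids accepting sibling directories like `/data-evil` when the base is `/data`.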

HIGH fixes:
- SQL LIKE injection: escape %, _, \ in contains filter with ESCAPE clause
- Unbounded URL cache: add eviction at 10k entries (Python) and 5k (JS)
- Error message leakage: sanitize internal errors in search routes
- Batch DoS: limit resolve-urls to 50 URLs per request (both TS and Python)
- HTTP method misuse: change write endpoints from GET to POST
- Regex injection: disable regex in Python str.contains (use literal match)
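The LIKE-escaping fix can be sketched like this: escape the escape character first, then the wildcards, and pair the result with an `ESCAPE` clause in the query. A minimal sketch, assuming backslash is chosen as the escape character:

```python
def escape_like(term: str) -> str:
    """Escape LIKE wildcards so user input matches literally (sketch).

    Backslash is the ESCAPE character, so escape it first, then % and _.
    """
    return (term.replace("\\", "\\\\")
                .replace("%", r"\%")
                .replace("_", r"\_"))

# Used with a parameterised query, e.g. (illustrative, not the actual route):
# cur.execute("SELECT * FROM tweets WHERE text LIKE ? ESCAPE '\\'",
#             (f"%{escape_like(user_term)}%",))
```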

MEDIUM fixes:
- Graph query limits: add upper bounds (10k chain, 50k descendants)
- Frontend memory leaks: add cache eviction to urlResolver, destroy()
  method to EmbedScheduler for event listener cleanup
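A bounded cache with oldest-first eviction, as in the 10k-entry Python limit above, can be sketched with an `OrderedDict`. The class name and API are illustrative assumptions.

```python
from collections import OrderedDict

class BoundedCache(OrderedDict):
    """Insertion-ordered cache with a hard size cap (sketch).

    max_size=10_000 would mirror the Python-side limit mentioned above.
    """
    def __init__(self, max_size=10_000):
        super().__init__()
        self.max_size = max_size

    def __setitem__(self, key, value):
        if key not in self and len(self) >= self.max_size:
            self.popitem(last=False)  # evict the oldest entry
        super().__setitem__(key, value)

cache = BoundedCache(max_size=2)
cache["a"] = 1
cache["b"] = 2
cache["c"] = 3  # evicts "a"
```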

LLM agent patterns addressed:
- Legacy code left unpatched during TS rewrite
- No adversarial input consideration (happy path only)
- Unbounded operations throughout

https://claude.ai/code/session_01KBwYSnfgmhwSNuu9XBDcgA
- Cap max_edges graph parameter at 50k to prevent DoS via massive responses
- Cap page parameter at 10k in query routes to prevent excessive offsets
- Add 5s timeout to t.co URL resolution fetch to prevent hanging connections
- Add 30s timeout to VoyageAI embedding API calls

https://claude.ai/code/session_01KBwYSnfgmhwSNuu9XBDcgA
…-rk1xH

Claude/audit codebase issues rk1x h
