Sagnik-Chakravarty/README.md

Hi, I'm Sagnik Chakravarty

Survey & Data Science · NLP · Computational Social Science · Public Opinion Measurement

Portfolio · GitHub · LinkedIn · Google Scholar



About Me

I am a Survey and Data Science graduate student at the University of Maryland, College Park, with prior graduate training in Data Science and undergraduate training in Statistics.

My work sits at the intersection of survey methodology, machine learning, NLP, public discourse analysis, and computational social science. I am especially interested in how data are generated, measured, validated, and interpreted — whether the data come from surveys, social media, administrative records, news, or large language models.

I build projects that connect statistical rigor with practical data systems: survey response modeling, LLM-assisted discourse coding, sentiment and stance analysis, policy measurement, multilevel modeling, and reproducible research workflows.


Current Research Focus

  • Survey methodology, nonresponse, mode effects, and Total Survey Error
  • LLM evaluation for public opinion and digital trace measurement
  • NLP pipelines for framing, metaphor, stance, and sentiment analysis
  • Computational social science using news, Reddit, Bluesky, YouTube, and administrative data
  • Multilevel modeling, causal inference, and interpretable machine learning
  • Public-facing research tools, dashboards, and reproducible workflows

Featured Projects

FrameScope: LLM-assisted metaphor, stance, and framing analysis of AI discourse across news and social media.

Methods: LLM annotation, metaphor detection, stance classification, embeddings, NLP pipelines

ASHA: Survey methodology project evaluating postcard reminders, mail vs. web mode effects, response rates, and subgroup nonresponse.

Methods: response-rate analysis, bootstrap inference, logistic regression, chi-square tests
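
As a rough illustration of the bootstrap step, the sketch below estimates a percentile confidence interval for the difference in response rates between a reminder group and a control group. It is not the project's code (the ASHA analysis itself is in R): the function names, the 0/1 response coding, and the sample values are hypothetical and chosen only to make the example self-contained.

```python
import random


def response_rate(responses):
    """Share of sampled units that responded (responses coded 1/0)."""
    return sum(responses) / len(responses)


def bootstrap_rate_difference(treated, control, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the difference in response rates.

    Each group is resampled with replacement, independently of the other.
    Returns (point_estimate, lower_bound, upper_bound).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        t = rng.choices(treated, k=len(treated))
        c = rng.choices(control, k=len(control))
        diffs.append(response_rate(t) - response_rate(c))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    point = response_rate(treated) - response_rate(control)
    return point, lower, upper


# Hypothetical data, invented for illustration: 1 = responded, 0 = did not.
postcard_group = [1] * 40 + [0] * 60
control_group = [1] * 30 + [0] * 70

point, lower, upper = bootstrap_rate_difference(postcard_group, control_group)
print(f"difference = {point:.3f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```

The percentile interval is only one of several bootstrap intervals; a design-based analysis would also need to respect the survey's sampling weights and strata, which this sketch ignores.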

Immigration-Narrative-Vs-Enforcement: Policy/data project comparing immigration narratives from news and Reddit with CBP and ICE enforcement indicators.

Methods: monthly aggregation, z-scores, divergence measures, regression, residual diagnostics
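
To make the z-score and divergence idea concrete, here is a minimal sketch that standardizes two monthly series and takes their month-by-month gap. It assumes divergence is defined as a simple difference of z-scores, which may not match the measures used in the project (the project's analysis is in R/Quarto). The function names and the numbers are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a series to mean 0 and unit population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("series is constant; z-scores are undefined")
    return [(v - mu) / sigma for v in values]


def divergence(narrative, enforcement):
    """Month-by-month gap between two standardized series.

    Assumption: divergence is the difference of z-scores. Positive values
    mean narrative intensity is high relative to enforcement that month.
    """
    if len(narrative) != len(enforcement):
        raise ValueError("series must cover the same months")
    zn = z_scores(narrative)
    ze = z_scores(enforcement)
    return [a - b for a, b in zip(zn, ze)]


# Hypothetical monthly values, invented for illustration only.
narrative_volume = [120, 150, 180, 150]
enforcement_count = [1000, 1000, 3000, 3000]

gaps = divergence(narrative_volume, enforcement_count)
for month, gap in enumerate(gaps, start=1):
    print(f"month {month}: divergence = {gap:+.2f}")
```

Standardizing first puts series with very different units (article counts, enforcement actions) on a common scale, so the gap reflects relative movement and not raw magnitude.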

AAPOR_EV_Project: AAPOR-selected project comparing LLM-generated EV sentiment with observed public discourse from Reddit, news, and online data.

Methods: sentiment analysis, LLM evaluation, platform comparison, public opinion measurement

Media-Aware-GTI-ML: Interpretable ML project combining Global Terrorism Database event records with international news framing to analyze terrorism severity rankings.

Methods: random forest, decision trees, feature importance, residual analysis, media framing

NarrativePulse: Public narrative analytics project studying how ThuggerDaily activity aligned over time with YSL RICO trial discourse across sentiment, engagement, volume, and topic prevalence.

Methods: schema standardization, sentiment scoring, engagement normalization, topic grouping, event-window analysis, pre/post tests, lag correlations, regression summaries, DiD, interrupted time series, Streamlit, Neon/Postgres


Technical Stack

Methods

Survey methodology | sampling | nonresponse | mode effects | Total Survey Error
NLP | sentiment analysis | stance classification | metaphor/framing analysis
LLM evaluation | prompt workflows | text-as-data | computational social science
Multilevel modeling | causal inference | regression | interpretable machine learning
Data cleaning | reproducible reporting | dashboards | research communication


Selected Research and Conference Work

  • AAPOR 80th Annual Conference — EV public sentiment and LLM comparison project
  • IISA 2025 — Media-aware GTI ranking analysis
  • NCSET Best Paper — Topological Data Analysis on DNA/RNA structures
  • NCSET Best Paper — PageRank and HITS citation-network analysis

What I Am Building Toward

I am currently focused on research and applied data roles that combine:

  • rigorous survey/statistical methodology,
  • large-scale text and public discourse data,
  • LLM evaluation and AI-assisted coding workflows,
  • policy, social, and behavioral data analysis,
  • reproducible data products for research and decision-making.

Pinned

  1. FrameScope (Public)

    End-to-end pipeline for collecting, labeling, and analyzing metaphor framing and stance in Reddit and news discourse using LLMs.

    Jupyter Notebook 1

  2. Media-Aware-GTI-ML (Public)

    Interpretable ML project combining Global Terrorism Database event records with international news framing to analyze terrorism severity rankings, residuals, and feature importance.

    Python 1

  3. AAPOR_EV_Project (Public)

    LLM-vs-digital-trace sentiment analysis project comparing electric vehicle public opinion across Reddit, news, and model-generated sentiment; selected for AAPOR 2025.

    HTML 1 1

  4. ASHA (Public)

    Survey methodology project evaluating ASHA response behavior, postcard reminder effects, and postal mail vs. web survey mode differences using R, bootstrap inference, and logistic regression.

    R 1

  5. Immigration-Narrative-Vs-Enforcement (Public)

    R/Quarto analysis comparing immigration narratives with CBP and ICE enforcement indicators using GDELT news, Reddit sentiment, z-score standardization, divergence measures, regression, and residual…

    R

  6. NarrativePulse (Public)

    End-to-end research dashboard measuring shifts in public attention and sentiment around the YSL trial, with a focus on ThuggerDaily’s temporal alignment with major legal events.

    Jupyter Notebook