IrumShehryar/README.md

Hi, I'm Irum Shehryar 👋

NLP Researcher | MSc Information Technology, Metropolia University of Applied Sciences — Graduated May 2026 | Based in Espoo, Finland


🔬 Master's Thesis Research

Context-Aware Document-Level Location Selection for Finnish News Articles: A Hybrid Rule-Based and LLM Approach

Developed in collaboration with Superhood Oy — a Finnish neighbourhood-level news platform. Awarded grade 5/5 — the highest grade at Metropolia University of Applied Sciences.

The research built and evaluated a four-configuration NLP pipeline for extracting and ranking geographic locations at postal-code granularity from Finnish news articles. The system combined:

  • Stanza for Finnish NER and morphological normalisation
  • Geoapify for dynamic geocoding and candidate resolution
  • Postal-first hierarchical ranking for geographic level disambiguation
  • Model Context Protocol (MCP) with Llama 3.3 70B via Groq for contextual reasoning
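As a rough illustration of the postal-first idea (not the thesis implementation, which is private), candidates resolved by the geocoder could be ordered so that the most specific geographic level wins, with mention count as a tie-breaker. The level names, the `Candidate` fields, the tie-breaking rule, and the example places below are all assumptions made for this sketch.

```python
from dataclasses import dataclass

# Hypothetical geographic levels, most specific first. The actual hierarchy
# and tie-breaking rules used in the thesis are not described in this README.
LEVEL_PRIORITY = {"postcode": 0, "suburb": 1, "city": 2, "county": 3, "country": 4}


@dataclass(frozen=True)
class Candidate:
    name: str      # normalised place name
    level: str     # geographic level reported for the candidate
    mentions: int  # number of times the name occurs in the article


def rank_candidates(candidates):
    """Order candidates postal-first, then by mention count (descending)."""
    return sorted(
        candidates,
        key=lambda c: (
            LEVEL_PRIORITY.get(c.level, len(LEVEL_PRIORITY)),  # unknown levels last
            -c.mentions,
            c.name,  # alphabetical order keeps the result deterministic
        ),
    )


# Illustrative input only; not taken from the evaluation dataset.
candidates = [
    Candidate("Helsinki", "city", 5),
    Candidate("Kallio", "postcode", 2),
    Candidate("Uusimaa", "county", 1),
    Candidate("Tapiola", "postcode", 3),
]
ranked = rank_candidates(candidates)
print([c.name for c in ranked])  # ['Tapiola', 'Kallio', 'Helsinki', 'Uusimaa']
```

Here a frequently mentioned city still ranks below any postal-level candidate, which is the behaviour the "postal-first" label suggests.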

Key results: 83.33% exact match accuracy with the full hybrid configuration across 60 Finnish news articles — a 16.66 percentage point improvement over the rule-based baseline.
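For reference, exact match accuracy can be computed as the share of articles whose predicted location equals the annotated one. The sketch below uses a hypothetical helper name and toy postal codes, not the thesis data. The reported 83.33% over 60 articles is consistent with 50 exact matches (50/60), although the README does not state the raw count.

```python
def exact_match_accuracy(predicted, gold):
    """Return the percentage of positions where the prediction equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty dataset")
    hits = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * hits / len(gold)


# Toy example with made-up labels: 5 of 6 predictions match.
gold = ["00530", "02100", "00100", "33100", "20100", "90100"]
predicted = ["00530", "02100", "00100", "33100", "20100", "90500"]
score = exact_match_accuracy(predicted, gold)
print(f"{score:.2f}%")  # 83.33%
```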

Key finding: Systematic lemmatization failures on Finnish postal-level names point to a data coverage gap in Finnish NER training corpora — not a tool architecture limitation.

📄 Published thesis: URN:NBN:fi:amk-2026051311846

📊 Evaluation dataset & results: thesis-nlp-evaluation


🌱 Current Focus

  • Pursuing PhD opportunities in Clinical NLP, Biomedicine and cross-lingual health information extraction
  • Advanced NLP and Machine Learning certifications in progress

🛠️ Technical Skills

| Area | Tools & Technologies |
|------|----------------------|
| NLP | Stanza, Finnish NER, named entity recognition, lemmatization, evaluation framework design, error analysis |
| LLM Integration | MCP (Model Context Protocol), Groq, Llama 3.3 70B, FastMCP |
| Backend | Python, FastAPI, Flask, REST APIs |
| Databases | PostgreSQL, MongoDB, MySQL |
| ML | Regression, classification fundamentals, ablation study design |
| Data | Pandas, NumPy |

📌 Featured Projects

🔍 Finnish NLP Pipeline — Master's Thesis (private — Superhood Oy)

Four-configuration hybrid pipeline for Finnish postal-level location extraction. Evaluation dataset and results available in the thesis evaluation repo above. Grade 5/5.

Annotated evaluation dataset, ground truth annotations, and results across all four pipeline configurations. This is the live evaluation repository for the thesis research.

Hands-on exploration of Model Context Protocol — the tool orchestration framework used in the thesis disambiguation layer.

University coursework implementing regression and classification algorithms in Python including Ridge/Lasso regression, SVM, KNN, and structured evaluation.

Prototype language learning app providing real-time feedback on Finnish grammar and pronunciation.

Contributed backend REST API development for a VoIP application supporting calling and scheduling meetings. Implemented CRUD operations and database integration using Flask and MySQL.

Notebooks and experiments from NLP and Machine Learning specialisation coursework. Updated continuously as certifications progress.


📍 Contact

📧 irum.shehryar@gmail.com

🔗 LinkedIn

🌐 Portfolio

Pinned

  1. thesis-nlp-evaluation (Public)

    Evaluation dataset and results for Finnish postal-level location extraction — Master's Thesis, Metropolia 2026

    Python

  2. ML-NLP-Coursework (Public)

    Jupyter Notebook

  3. Finnish-mentor (Public)

    This repo contains code for the Finnish-mentor prototype app, which translates, corrects, and provides real-time feedback on language, grammar, and pronunciation.

    JavaScript

  4. MCP-practice (Public)

    This repo contains practice snippets for MCP

    Python

  5. portfolio (Public)

    This repo contains my personal portfolio

    HTML

  6. AI-with-Python (Public)

    This repo contains university assignments implementing regression and classification

    Jupyter Notebook