Hi, I'm Irum Shehryar 👋

NLP Researcher | MSc Information Technology, Metropolia University of Applied Sciences — Graduated May 2026 | Based in Espoo, Finland

🔬 Master's Thesis Research

Context-Aware Document-Level Location Selection for Finnish News Articles: A Hybrid Rule-Based and LLM Approach

Developed in collaboration with Superhood Oy — a Finnish neighbourhood-level news platform. Awarded grade 5/5 — the highest grade at Metropolia University of Applied Sciences.

The research built and evaluated a four-configuration NLP pipeline for extracting and ranking geographic locations at postal-code granularity from Finnish news articles. The system combined:

Stanza for Finnish NER and morphological normalisation
Geoapify for dynamic geocoding and candidate resolution
Postal-first hierarchical ranking for geographic level disambiguation
Model Context Protocol (MCP) with Llama 3.3 70B via Groq for contextual reasoning
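
As a rough illustration of the postal-first idea, the sketch below orders geocoded candidates so that the most specific geographic level comes first. It is not the thesis implementation: the `Candidate` fields, the `LEVEL_PRIORITY` ordering, the mention-count tie-breaker, and the sample place names are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical granularity order: a lower number means a more specific level.
LEVEL_PRIORITY = {"postal": 0, "suburb": 1, "city": 2, "county": 3}


@dataclass(frozen=True)
class Candidate:
    name: str       # resolved place name (illustrative)
    level: str      # geographic level assigned by the geocoder (assumed field)
    mentions: int   # how often the place is mentioned in the article (assumed field)


def rank_candidates(candidates):
    """Sort candidates postal-first, then by mention count, then by name."""
    return sorted(
        candidates,
        key=lambda c: (
            LEVEL_PRIORITY.get(c.level, len(LEVEL_PRIORITY)),  # unknown levels go last
            -c.mentions,                                       # more mentions first
            c.name,                                            # stable alphabetical tie-break
        ),
    )


# Made-up candidates for a single article.
candidates = [
    Candidate("Helsinki", "city", 5),
    Candidate("Kallio", "postal", 2),
    Candidate("Uusimaa", "county", 1),
    Candidate("Tapiola", "postal", 3),
]

ranked = rank_candidates(candidates)
print([c.name for c in ranked])
```

In this sketch a postal-level candidate always outranks a city-level one, even when the city is mentioned more often; a real pipeline would likely need contextual signals to decide when that preference should be overridden.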

Key results: 83.33% exact match accuracy with the full hybrid configuration across 60 Finnish news articles — a 16.66 percentage point improvement over the rule-based baseline.
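
For readers unfamiliar with the metric, here is a minimal sketch of how exact match accuracy and a percentage point difference can be computed. The function name and the toy postal codes are assumptions for illustration only; they are not the thesis data or evaluation code.

```python
def exact_match_accuracy(predicted, gold):
    """Return the percentage of articles whose predicted location equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty dataset")
    hits = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * hits / len(gold)


# Toy labels (made up): one postal code per article.
gold = ["00530", "02100", "33100", "90100"]
baseline_pred = ["00530", "02100", "33200", "90120"]
hybrid_pred = ["00530", "02100", "33100", "90120"]

baseline_acc = exact_match_accuracy(baseline_pred, gold)
hybrid_acc = exact_match_accuracy(hybrid_pred, gold)

# A percentage point improvement is the plain difference of two percentages,
# not a relative change.
improvement_pp = hybrid_acc - baseline_acc
print(f"baseline={baseline_acc:.2f}%  hybrid={hybrid_acc:.2f}%  gain={improvement_pp:.2f} pp")
```

With 60 articles, each additional correct prediction moves the score by about 1.67 percentage points.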

Key finding: Systematic lemmatization failures on Finnish postal-level names point to a data coverage gap in Finnish NER training corpora — not a tool architecture limitation.

📄 Published thesis: URN:NBN:fi:amk-2026051311846

📊 Evaluation dataset & results: thesis-nlp-evaluation

🌱 Current Focus

Pursuing PhD opportunities in clinical NLP, biomedicine, and cross-lingual health information extraction
Advanced NLP and Machine Learning certifications in progress

🛠️ Technical Skills

Area	Tools & Technologies
NLP	Stanza, Finnish NER, named entity recognition, lemmatization, evaluation framework design, error analysis
LLM Integration	MCP (Model Context Protocol), Groq, Llama 3.3 70B, FastMCP
Backend	Python, FastAPI, Flask, REST APIs
Databases	PostgreSQL, MongoDB, MySQL
ML	Regression, classification fundamentals, ablation study design
Data	Pandas, NumPy

📌 Featured Projects

🔍 Finnish NLP Pipeline — Master's Thesis (private — Superhood Oy)

Four-configuration hybrid pipeline for Finnish postal-level location extraction. Evaluation dataset and results available in the thesis evaluation repo above. Grade 5/5.

📊 thesis-nlp-evaluation

Annotated evaluation dataset, ground truth annotations, and results across all four pipeline configurations. This is the live evaluation repository for the thesis research.

🤖 MCP-practice

Hands-on exploration of the Model Context Protocol, the tool orchestration framework used in the thesis disambiguation layer.

🧠 AI-with-Python

University coursework implementing regression and classification algorithms in Python, including Ridge/Lasso regression, SVM, and KNN, with structured evaluation.

🇫🇮 Finnish-mentor

Prototype language learning app providing real-time feedback on Finnish grammar and pronunciation.

📞 AsiakasGroupOy

Contributed to backend REST API development for a VoIP application that supports calling and meeting scheduling. Implemented CRUD operations and database integration using Flask and MySQL.

📚 ML-NLP-coursework

Notebooks and experiments from NLP and Machine Learning specialisation coursework. Updated continuously as certifications progress.

📍 Contact

📧 irum.shehryar@gmail.com

🔗 LinkedIn

🌐 Portfolio
