
Daniil Kiselyov · Lead ML Engineer

Lead ML Engineer. I build LLM systems, RAG, and multimodal pipelines for enterprise clients in on-premise and air-gapped environments. 5.5 years in ML and engineering; the last three spent on end-to-end LLM, OCR/VLM, and RAG platforms in production.

A full-stack and team-lead background (Django/React) means I ship models as services, not proofs of concept.


Currently

  • Multi-platform AI for BIM design - production system for 160+ engineers at a major developer. Scenario Engine (n8n-style visual DAG with LLM-aware nodes, versioning, rollback) replacing an earlier multi-agent architecture. Polyglot monorepo: FastAPI backend, Tauri 2 + React 19 desktop, C# Revit plugin with a custom MCP protocol. Full MLOps and observability stack: OpenTelemetry + Langfuse + Prometheus + Grafana + Jaeger.
  • Visual RAG platform for engineering documentation - hybrid retrieval (Visual RAG on PDF/DWG via Jina v4 + Text2SQL via Qwen on vLLM), four domain profiles, air-gapped deployments. Three enterprise clients shipped.
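As a rough illustration of the visual-DAG idea behind a scenario engine, the sketch below runs nodes in dependency order using the standard-library `graphlib`. It is not the production Scenario Engine: `run_scenario`, the node names, and the stubbed "LLM" node are all hypothetical, and versioning and rollback are left out.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List

# Hypothetical node signature: receives the outputs of its upstream nodes
# (keyed by node name) and returns its own output.
NodeFn = Callable[[Dict[str, object]], object]


def run_scenario(nodes: Dict[str, NodeFn], deps: Dict[str, List[str]]) -> Dict[str, object]:
    """Run every node exactly once, after all of its dependencies."""
    graph = {name: deps.get(name, []) for name in nodes}
    results: Dict[str, object] = {}
    # static_order() yields dependencies before dependents and raises
    # graphlib.CycleError if the graph contains a cycle.
    for name in TopologicalSorter(graph).static_order():
        upstream = {dep: results[dep] for dep in graph[name]}
        results[name] = nodes[name](upstream)
    return results


# Toy scenario: a stubbed "LLM" node sits between two plain nodes.
nodes: Dict[str, NodeFn] = {
    "load_request": lambda up: "add a door to wall W1",
    "llm_plan": lambda up: {"action": "add_door", "source": up["load_request"]},
    "validate": lambda up: up["llm_plan"]["action"] in {"add_door", "add_window"},
}
deps = {"llm_plan": ["load_request"], "validate": ["llm_plan"]}

results = run_scenario(nodes, deps)
```

Keeping the graph as plain data (node names plus dependency lists) is what would make a scenario easy to serialize, version, and roll back; the executor itself stays small.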

Text2BIM / Scenario Engine architecture


Python · PyTorch · FastAPI · vLLM · Qwen · Langfuse · OpenTelemetry · Tauri · Rust · Docker · HuggingFace · Ruff · uv


Enterprise work

| Category | Role | What | Stack highlights |
|---|---|---|---|
| BIM / AEC | Full-stack, solo | Multi-platform AI for BIM design, 160+ engineers; Scenario Engine, MCP, OIDC | FastAPI · Tauri 2 · C#/Revit · MCP |
| Visual RAG | Solo, 3 clients | Hybrid Visual RAG + Text2SQL for engineering docs, 4 domain profiles, air-gapped | vLLM · ChromaDB · Jina v4 · PaddleOCR |
| Industrial B2B | Tech Lead, team 1+1 | LLM commercial-proposal automation, VLM field extraction on Qwen with xgrammar constrained decoding | Qwen 3.6 · xgrammar · networkx |
| EdTech VR | ML advisor / mentor | VR language-learning AI; Whisper + phoneme-alignment for pronunciation, MMS-TTS fine-tune on a low-resource creole | Whisper · MMS-TTS |

Presale & leadership

  • 10+ tenders and presale analyses over 10 months - multilingual ticket systems, LLM infrastructure cost modelling, HR analytics, GNN for BIM, air-gapped platforms for design institutes.
  • Tech lead on the BIM AI system - architecture, decomposition, code review, the Scenario Engine rewrite. CQRS split, OIDC + RBAC, full audit log.
  • Hiring & mentoring - 10 interviews on LLM + RecSys, 2 candidates accepted by the client. ML mentoring for external teams on speech pipelines and low-resource TTS/ASR.

🧪 Open-source ML Portfolio

Five production-grade ML projects on a shared cookiecutter template. Full stack: PyTorch Lightning · Hydra · MLflow · DVC · FastAPI · Docker · GitHub Actions · MkDocs · HuggingFace Hub.

| Project | Task | Main model | Metrics | Status |
|---|---|---|---|---|
| chest-xray-classifier (CI, HF) | 3-class pneumonia classification | ConvNeXt-V2-Tiny | acc 91.3% · F1 90.3% · AUROC 97.5% | ✅ v0.1.0 |
| brain-mri-segmentation (CI, HF) | Binary brain-tumor segmentation | SegFormer-B2 | Dice 65.5% · IoU 66.2% · Pixel acc 99.7% | ✅ v0.1.0 |
| vehicle-keypoints (CI, HF) | 14-keypoint car pose (CarFusion, n=12 761) | YOLO26-pose + ViTPose-S (baseline) | OKS-mAP 22.0% · mAP50 35.0% · PCK@0.05 49.6% | ✅ v0.1.0 |
| cardio-risk-rf (CI, HF) | Tabular cardiovascular-risk classification (n=70 000, test n=10 501) | LightGBM + RandomForest (baseline, ROC-AUC 79.5%) | ROC-AUC 79.8% · PR-AUC 78.1% · F1 73.8% · Brier 0.182 | ✅ v0.1.0 |
| grnti-text-classifier (CI, HF) | Russian scientific-text classification, 28 GRNTI codes (test n=2 772) | XLM-RoBERTa-base + ruBERT-base (baseline, Top-1 72.9%) | Top-1 72.4% · Top-5 96.8% · Macro F1 72.3% | ✅ v0.1.0 |
| ml-project-template (CI) | Cookiecutter scaffold for the five above | - | 12/12 meta-tests green | ✅ Stable |

Shared features across all five models: patient-/scene-level splits with no leakage, bilingual EN+RU README, multi-stage Docker, HF Hub model cards with widgets, DVC-tracked artefacts, Python 3.12+3.13 matrix CI, ruff + mypy + deptry + bandit + interrogate + pre-commit quality gates, self-hosted coverage badges.
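To make "patient-/scene-level splits with no leakage" concrete, here is a minimal sketch of one way to get such a split: hash the group ID (patient, scene) into a bucket so that every sample from the same group lands in the same split. This is an illustrative assumption, not necessarily how the repositories do it; `assign_split`, `split_by_group`, the 15%/15% proportions, and the sample IDs are all hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def assign_split(group_id: str, val_pct: int = 15, test_pct: int = 15) -> str:
    """Deterministically map a group ID (e.g. a patient) to a split.

    A stable hash is used instead of Python's built-in hash(), which is
    salted per process and would not give reproducible splits.
    """
    digest = hashlib.sha256(group_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in 0..99
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "val"
    return "train"


def split_by_group(samples: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """samples are (sample_id, group_id) pairs; a group never spans two splits."""
    splits: Dict[str, List[str]] = {"train": [], "val": [], "test": []}
    for sample_id, group_id in samples:
        splits[assign_split(group_id)].append(sample_id)
    return splits


# Hypothetical data: two images each for patients A and C, one for patient B.
samples = [
    ("img_001", "patient_A"),
    ("img_002", "patient_A"),
    ("img_003", "patient_B"),
    ("img_004", "patient_C"),
    ("img_005", "patient_C"),
]
splits = split_by_group(samples)
```

Because the assignment depends only on the group ID, adding new samples later never moves an existing patient between splits. With few groups the realised proportions can drift from the targets; scikit-learn's `GroupShuffleSplit` is a common alternative when exact proportions matter more than stability.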




🇷🇺 Russian version

Lead ML Engineer. I build LLM systems, RAG, and multimodal pipelines for enterprise clients in on-premise and air-gapped environments. 5.5 years in ML and software development; for the last three years, LLM, OCR/VLM, and RAG platforms in production, from architecture through support.

A full-stack and team-lead background (Django/React) helps me take models all the way to full services rather than PoCs.

Currently

  • Multi-platform AI system for BIM design - in production, 160+ engineers at a major developer. Scenario Engine (visual DAG with LLM nodes, versioning, rollback) in place of the earlier multi-agent architecture. Polyglot monorepo: FastAPI backend, Tauri 2 + React 19 desktop, C# plugin for Revit with a custom MCP protocol. MLOps stack: OpenTelemetry + Langfuse + Prometheus + Grafana + Jaeger.
  • Visual RAG platform for engineering documentation - hybrid retrieval (Visual RAG over PDF/DWG on Jina v4 + Text2SQL via Qwen on vLLM), 4 domain profiles, air-gapped deployments. Three enterprise clients delivered.

Enterprise projects

| Category | Role | What | Stack |
|---|---|---|---|
| BIM / AEC | Full-stack, solo | Multi-platform AI system, 160+ engineers; Scenario Engine, MCP, OIDC | FastAPI · Tauri 2 · C#/Revit · MCP |
| Visual RAG | Solo, 3 clients | Hybrid Visual RAG + Text2SQL for engineering documentation, 4 domain profiles, air-gapped | vLLM · ChromaDB · Jina v4 · PaddleOCR |
| Industrial B2B | Tech Lead, team 1+1 | LLM automation of commercial proposals, VLM extraction on Qwen via xgrammar | Qwen 3.6 · xgrammar · networkx |
| EdTech / VR | ML mentor | VR platform for language learning; Whisper + phoneme-alignment, MMS-TTS for a low-resource creole | Whisper · MMS-TTS |

Presale & leadership

  • 10+ tenders and presale analyses over 10 months - multilingual ticket systems, cost models for LLM infrastructure, HR analytics, a GNN planner for BIM, air-gapped platforms for design institutes.
  • Tech lead on the BIM AI system - architecture, decomposition, code review, the rewrite onto the Scenario Engine. CQRS, OIDC + RBAC, full audit log.
  • Hiring & mentoring - 10 interviews on LLM + RecSys, 2 candidates accepted by the client. ML mentoring for external teams on speech pipelines and low-resource TTS/ASR.

🧪 Open-source ML Portfolio

Five production-grade ML projects on a shared cookiecutter template. The set of models and metrics is the same as in the English table above; the GitHub and HuggingFace cards are available via the same links.

Pinned

  1. chest-xray-classifier

    3-class pneumonia classifier (normal/bacterial/viral) — ConvNeXt-V2-Tiny + DINOv2 baseline, FastAPI serving, HF Hub model

    Python

  2. brain-mri-segmentation

    Binary brain-tumor MRI segmentation (LGG) — SegFormer-B2 + U-Net baseline, FastAPI serving, HF Hub model

    Python

  3. vehicle-keypoints

    14-keypoint car pose estimation on CarFusion — YOLO26-pose main + ViTPose-S baseline, FastAPI serving, HF Hub model

    Python

  4. cardio-risk-rf

    Production-grade cardiovascular risk tabular classifier — LightGBM + SHAP main, RandomForest baseline, FastAPI serving, HF Hub model.

    Jupyter Notebook

  5. grnti-text-classifier

    Production-grade Russian multi-class text classifier (GRNTI) - XLM-RoBERTa main, ruBERT baseline, FastAPI serving, HF Hub model.

    Python

  6. ml-project-template

    Production-grade cookiecutter template for ML projects (PyTorch Lightning + Hydra + MLflow + DVC + FastAPI)

    Python