Bioinformatics Platform

A full-stack, visual bioinformatics pipeline execution platform built for the Turkish market. Build a pipeline by dragging nodes onto a canvas, upload your genomic data, pay per run, and get results, all in the browser. Designed for KVKK compliance and hosted on Turkish cloud infrastructure.

What It Does

Visual pipeline builder — drag nodes onto a canvas, connect them, and run
nf-core/sarek — variant calling (GATK HaplotypeCaller, DeepVariant, Strelka2, FreeBayes) — primary MVP workflow
nf-core pipelines — rnaseq, atacseq, methylseq, ampliseq, chipseq, fetchngs
Snakemake workflows — 4700+ community workflows + 454 wrappers
BioScript — custom bash script runs inside a Docker image pre-loaded with bio tools
Custom Linux pipelines — SPAdes, Kraken2, Prokka, IQ-TREE 2, Flye (de novo, metagenomics, annotation, phylogenomics, long-read assembly)
Mutation Assessment — post-sarek pipeline: annotates VCF variants against 17 public databases and generates a PDF report
CNV/SV analysis — structural variant parsing from VCF (DEL, DUP, INV, INS, BND, CNV, TRA)
Paired-end FASTQ — upload R1 + R2 and both get passed to the runner
Pay-per-run billing — Stripe (global) or iyzico (Turkey) checkout, cost estimated before every job
Live results — volcano plots, VCF tables, MultiQC HTML, file lists — auto-detected

Tech Stack

Layer	Technology
Frontend	React 18 + TypeScript + Vite, @xyflow/react v12
Backend	FastAPI + Uvicorn (async)
Job queue	Celery 5 + Redis 7
Database	PostgreSQL 16 + SQLAlchemy 2 + Alembic
Auth	JWT + RBAC (roles) + MFA/TOTP (pyotp)
Payments	Stripe Checkout (global) + iyzico (Turkey)
Cloud runners	Turkish cloud VMs (Huawei / Turkcell / CloudSigma) or AWS Batch
Storage	Local filesystem (dev), AWS S3, or Turkish S3-compatible (Huawei OBS / Turkcell nDepo)
Observability	Sentry (error tracking) + Prometheus `/metrics`
Containers	Docker + Docker Compose

Running Locally

Prerequisites

Docker Desktop (or Docker Engine + Docker Compose plugin)
8 GB RAM allocated to Docker recommended (4 GB minimum)

Start (demo / debug mode — no cloud credentials needed)

git clone <your-repo-url>
cd bioinformatics_platform
docker compose up

Wait for:

backend-1  | INFO:     Application startup complete.
frontend-1 | ➜  Local:   http://localhost:5173/

Service	URL
App	http://localhost:5173
API	http://localhost:8000
API docs	http://localhost:8000/docs
Prometheus metrics	http://localhost:8000/metrics

What each component does in demo mode:

Storage → files saved to a local Docker volume (/uploads)
sarek / nf-core → mock runner returns realistic fake results in ~10 s
Snakemake / BioScript / Custom → mock runners, no tools installed
Assessment pipeline → fully real (queries ClinVar, gnomAD, CADD, etc. live)
Payments → Stripe and iyzico are optional; jobs can be created directly via the API

Running modes

Every runner has four modes. NEXTFLOW_BACKEND, SNAKEMAKE_BACKEND, and BIOSCRIPT_BACKEND each select the mode for their own runner:

Mode	Value	What happens	When to use
Mock (default)	`mock`	Returns realistic fake data in ~10 s	Demos, UI development
Local	`local`	Runs real pipeline via Docker on your machine	Testing locally
Turkish cloud	`turkishcloud`	Spins up an ephemeral VM on Huawei/Turkcell/CloudSigma	Production (Turkey)
AWS Batch	`awsbatch`	Submits to AWS Batch	Production (AWS)

# Demo mode (default — no setup needed)
docker compose up

# Turkish cloud production
NEXTFLOW_BACKEND=turkishcloud SNAKEMAKE_BACKEND=turkishcloud docker compose up

# AWS Batch production
NEXTFLOW_BACKEND=awsbatch SNAKEMAKE_BACKEND=awsbatch docker compose up

Stop

docker compose down          # keep database volumes
docker compose down -v       # full reset (wipes all data)

Useful commands

docker compose logs -f
docker compose logs -f worker
docker compose exec frontend sh -c "cd /app && npx tsc --noEmit"
docker compose exec backend bash
docker compose exec backend pytest tests/ -v
docker compose restart worker

Troubleshooting

Problem	Fix
Port 5432 in use	Stop local Postgres or change `5432:5432` in docker-compose.yml
Port 6380 in use	Change `6380:6379` to another port
Assessment job fails with network errors	The assessment runner calls live public APIs — check internet connection
Jobs stuck in `pending`	Worker has probably crashed; check `docker compose logs worker`, then run `docker compose restart worker`
Turkish cloud VM never completes	Check `completion/{job_id}/done` exists in object storage; verify provider credentials

Project Structure

bioinformatics_platform/
├── backend/
│   ├── alembic/versions/          # DB migrations (0001 – 0016)
│   ├── app/
│   │   ├── api/v1/                # REST routers: auth, uploads, jobs,
│   │   │   │                      #   pipelines, nfcore, snakemake,
│   │   │   │                      #   payments, iyzico
│   │   ├── models/                # User, Job, Pipeline, NfCorePipeline,
│   │   │   │                      #   NfCoreModule, SnakemakeWrapper,
│   │   │   │                      #   SnakemakeWorkflow, AuditLog,
│   │   │   │                      #   ConsentRecord, Patient, Sample
│   │   ├── schemas/               # Pydantic request / response models
│   │   ├── services/
│   │   │   ├── storage/           # local.py + s3.py (S3-endpoint-aware)
│   │   │   ├── nextflow/          # mock, local, batch (AWS), turkishcloud
│   │   │   ├── snakemake/         # mock, local, batch (AWS), turkishcloud
│   │   │   ├── bioscript/         # mock, local, batch (AWS), turkishcloud
│   │   │   ├── vm_provisioner/    # base, huawei, turkcell, cloudsigma,
│   │   │   │   │                  #   factory (fallback), runner (shared)
│   │   │   ├── assessment/        # real.py + databases.py + report.py (+ SHA-256 hash)
│   │   │   ├── tckn.py            # TC Kimlik No checksum validator
│   │   │   ├── vcf_validator.py   # VCF header validator (plain + gzip)
│   │   │   ├── sv_parser.py       # structural variant VCF parser
│   │   │   ├── audit.py           # fire-and-forget audit log writer
│   │   │   └── auth.py            # JWT + refresh tokens + MFA tokens
│   │   ├── tasks/                 # Celery tasks: pipeline, scrape_*, retention
│   │   ├── config.py              # All env vars (Pydantic Settings)
│   │   └── main.py                # Sentry init + Prometheus + health check
│   ├── Dockerfile
│   ├── Dockerfile.worker          # Java 17 + Nextflow + Snakemake
│   ├── Dockerfile.tools           # samtools, bwa, STAR, Kraken2, SPAdes…
│   ├── bioplatform_helpers.sh     # Shell library for BioScript jobs
│   ├── nextflow_aws.config
│   ├── .env.example               # All env vars documented
│   └── requirements.txt
│
├── frontend/
│   └── src/
│       ├── api/
│       ├── builder/               # Canvas, nodes, validation, templates,
│       │   │                      #   Spotlight, TemplateGallery, undo/redo
│       ├── components/            # AuthGate, TierConfirm, JobProgress,
│       │   │                      #   JobHistory, ResultsPanel, ResultViewer
│       └── App.tsx
│
├── terraform/                     # AWS infra (S3, IAM, Batch, ECR)
├── .github/workflows/             # CI (ruff+mypy+pytest+tsc) + CD (ECR+S3)
├── .env.example
└── docker-compose.yml

Turkish Cloud Infrastructure

The platform is designed to run compute jobs on Turkish cloud providers to satisfy KVKK data residency requirements for health data. Compute is ephemeral and per-job — a VM is created when a job starts and terminated when it finishes. Customers pay only for what they use.

Provider stack

Priority	Provider	Location	API	Storage
1	Huawei Cloud	Istanbul (tr-west-1)	`huaweicloudsdkecs`	Huawei OBS (S3-compatible)
2	Turkcell Bulut	Istanbul / Ankara / İzmir	VMware vCD (`pyvcloud`)	Turkcell nDepo (S3-compatible)
3	CloudSigma / Siaflex	İzmir	REST API (no SDK)	Uses whichever OBS/nDepo is configured

How fallback works

On each job submission, COMPUTE_PROVIDERS (default: huawei,turkcell,cloudsigma) is tried in order. The first provider that passes a health check and successfully creates an instance is used. If Huawei's API is down or quota is exhausted, Turkcell is tried automatically; then CloudSigma. The pipeline task never needs to know which provider ran the job.
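
The fallback order described above can be sketched as a simple loop. This is a minimal illustration, not the platform's actual factory code: the `Provisioner` dataclass, `pick_provider`, and `ProvisioningError` are hypothetical names, and the real provisioners in `vm_provisioner/` may expose a different interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


class ProvisioningError(Exception):
    """Raised when every configured provider failed (hypothetical name)."""


@dataclass
class Provisioner:
    # Hypothetical minimal interface for one cloud provider.
    name: str
    health_check: Callable[[], bool]
    create_instance: Callable[[str], str]  # job_id -> instance_id


def pick_provider(
    providers: List[str],
    registry: Dict[str, Provisioner],
    job_id: str,
) -> Tuple[str, str]:
    """Try providers in priority order; return (provider_name, instance_id)."""
    errors: List[str] = []
    for name in providers:
        provisioner = registry.get(name)
        if provisioner is None:
            errors.append(f"{name}: not configured")
            continue
        try:
            if not provisioner.health_check():
                errors.append(f"{name}: health check failed")
                continue
            instance_id = provisioner.create_instance(job_id)
        except Exception as exc:  # API outage, exhausted quota, bad credentials
            errors.append(f"{name}: {exc}")
            continue
        return name, instance_id
    raise ProvisioningError("; ".join(errors))
```

The caller only receives the winning provider name and instance id, which matches the idea that the pipeline task does not need to know in advance which provider will run the job.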

VM lifecycle

Job submitted
    │
    ▼
factory.get_provisioner_with_fallback()
    │  tries huawei → turkcell → cloudsigma
    ▼
VM created with cloud-init user_data script
    │
    ▼  (VM runs independently)
    │  1. Installs Docker + AWS CLI
    │  2. Pulls bioplatform/worker image
    │  3. Runs pipeline (Nextflow / Snakemake / BioScript)
    │  4. Uploads results to object storage
    │  5. Writes completion/{job_id}/done or /error marker
    │  6. shutdown -h now
    ▼
Celery task polls object storage every 30 s (max 3 h 45 m)
    │
    ▼
Results collected → VM terminated (always, even on failure)
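
The polling step at the end of this lifecycle can be sketched as follows. It is an illustration under assumptions: `wait_for_completion`, `marker_exists`, and `terminate_vm` are hypothetical names, and the storage lookup is injected as a callable so the sketch does not depend on any particular S3 client. The 30 s interval, the 3 h 45 m cap, and the marker keys are taken from the diagram above.

```python
import time
from typing import Callable

POLL_INTERVAL_S = 30
MAX_WAIT_S = 3 * 3600 + 45 * 60  # 3 h 45 m


def wait_for_completion(
    job_id: str,
    marker_exists: Callable[[str], bool],   # e.g. a HEAD request on the object key
    terminate_vm: Callable[[], None],
    poll_interval: float = POLL_INTERVAL_S,
    max_wait: float = MAX_WAIT_S,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll object storage for the job's marker; return 'done', 'error' or 'timeout'."""
    done_key = f"completion/{job_id}/done"
    error_key = f"completion/{job_id}/error"
    deadline = clock() + max_wait
    try:
        while True:
            if marker_exists(done_key):
                return "done"
            if marker_exists(error_key):
                return "error"
            if clock() >= deadline:
                return "timeout"
            sleep(poll_interval)
    finally:
        # Runs on success, failure, timeout, and unexpected exceptions alike.
        terminate_vm()
```

Putting the termination call in `finally` is one way to get the "always, even on failure" behaviour; the VM also shuts itself down in step 6, so this acts as a second safety net against orphaned instances.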

Storage for Turkish cloud

Both Huawei OBS and Turkcell nDepo expose an S3-compatible API. Set S3_ENDPOINT_URL to switch the storage backend:

# Huawei OBS (Turkey)
S3_ENDPOINT_URL=https://obs.tr-west-1.myhuaweicloud.com
AWS_ACCESS_KEY_ID=<huawei_ak>
AWS_SECRET_ACCESS_KEY=<huawei_sk>
S3_BUCKET=<obs_bucket_name>

# Turkcell nDepo
S3_ENDPOINT_URL=<ndep0_endpoint_provided_by_turkcell>
AWS_ACCESS_KEY_ID=<ndep0_access_key>
AWS_SECRET_ACCESS_KEY=<ndep0_secret_key>

The existing STORAGE_BACKEND=s3 setting and all download/upload logic work unchanged.
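
As a rough sketch of why no other code has to change, an endpoint-aware client only needs one extra keyword argument. The helper name `s3_client_kwargs` is hypothetical and the platform's `s3.py` may be organised differently; the keyword names are the ones `boto3.client` accepts.

```python
import os
from typing import Dict, Mapping


def s3_client_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build keyword arguments for an S3 client from the documented env vars."""
    kwargs = {
        "aws_access_key_id": env.get("AWS_ACCESS_KEY_ID", ""),
        "aws_secret_access_key": env.get("AWS_SECRET_ACCESS_KEY", ""),
    }
    endpoint = env.get("S3_ENDPOINT_URL", "").strip()
    if endpoint:
        # Only set when targeting Huawei OBS or Turkcell nDepo;
        # leaving it out makes the client talk to AWS S3.
        kwargs["endpoint_url"] = endpoint
    return kwargs
```

With boto3 installed, the result could be passed straight through as `boto3.client("s3", **s3_client_kwargs())`.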

VM flavors

Flavor	Huawei ECS	Turkcell vCD	CloudSigma
`small`	c7n.large.4 (2 vCPU / 8 GB)	2 vCPU / 8 GB	4 GHz / 8 GB
`standard`	c7n.2xlarge.4 (8 vCPU / 32 GB)	8 vCPU / 32 GB	16 GHz / 32 GB
`large`	c7n.4xlarge.4 (16 vCPU / 64 GB)	16 vCPU / 64 GB	32 GHz / 64 GB
`xlarge`	m7n.4xlarge.8 (16 vCPU / 128 GB)	16 vCPU / 128 GB	48 GHz / 128 GB

Set DEFAULT_VM_FLAVOR (default: standard) to choose the flavor globally, or pass tier in workflow_config to override it per job.

Security & Compliance

Authentication

Short-lived JWT access tokens (15 min default) paired with Redis-backed refresh tokens (7 days). The /auth/refresh endpoint issues a new pair and revokes the old refresh token on every call (rotation).
Email verification — a signed token is emailed on registration; account is marked verified at GET /auth/verify-email?token=…
Password reset — POST /auth/forgot-password (always returns 202 to prevent email enumeration); POST /auth/reset-password with a 1-hour expiring token
Account lockout — after MAX_LOGIN_ATTEMPTS (default 5) consecutive failures the account is locked for LOCKOUT_MINUTES (default 30). Resets on successful login or password reset.
RBAC: role field on every user — user, clinician, admin
MFA/TOTP via pyotp — compatible with any authenticator app (Google Authenticator, Authy, etc.)
- POST /auth/mfa/setup → provisioning URI + QR data
- POST /auth/mfa/verify → activate MFA with a valid code
- POST /auth/mfa/complete → exchange MFA token + code for a full JWT (called at login challenge screen)
- DELETE /auth/mfa → disable
Login with MFA enabled returns mfa_required: true + a short-lived (5 min) mfa_token instead of a full JWT
POST /auth/logout revokes the refresh token immediately
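
For readers unfamiliar with TOTP, the code check behind /auth/mfa/verify and /auth/mfa/complete follows RFC 6238. The platform uses pyotp for this; the sketch below re-implements the algorithm with the standard library only, purely to show what is being verified. The function names and the one-step tolerance window are assumptions, not the platform's settings.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for a base32 secret."""
    key = base64.b32decode(secret_b32.upper())
    now = time.time() if at is None else at
    counter = int(now // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation (RFC 4226)
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


def verify_totp(secret_b32: str, code: str, at: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept the code for the current step or `window` steps either side."""
    now = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret_b32, now + drift * step, step), code)
        for drift in range(-window, window + 1)
    )
```

The small window tolerates clock drift between the server and the authenticator app; `hmac.compare_digest` keeps the comparison constant-time.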

RBAC

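from fastapi import APIRouter, Depends

from app.api.v1.deps import require_role

router = APIRouter()

# Only users whose role is "admin" may call this endpoint.
@router.delete("/admin/something")
async def admin_only(user = Depends(require_role("admin"))):
    ...

# Several roles can be allowed by passing more than one name.
@router.post("/clinical/report")
async def clinician_or_admin(user = Depends(require_role("clinician", "admin"))):
    ...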

Audit log

Every authentication event, job creation, cancellation, retry, and consent change is written to an append-only audit_log table. Writes are fire-and-forget (own DB session, never blocks the request). Logged fields: user_id, action, resource_type, resource_id, ip_address, user_agent, meta (JSON), created_at.
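
A fire-and-forget write of this kind can be sketched with a background asyncio task. This is an assumption-laden illustration: `audit`, `_write`, and the in-memory `AUDIT_ROWS` list are hypothetical stand-ins for the real `audit.py` writer and the audit_log table, and the actual implementation opens its own database session instead.

```python
import asyncio
import logging
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional, Set

logger = logging.getLogger("audit")

AUDIT_ROWS: List[Dict[str, Any]] = []  # stand-in for the append-only audit_log table
_pending: Set[asyncio.Task] = set()    # keeps references so tasks are not garbage-collected


async def _write(entry: Dict[str, Any]) -> None:
    try:
        await asyncio.sleep(0)  # stands in for "open own DB session, INSERT, commit"
        AUDIT_ROWS.append(entry)
    except Exception:
        # An audit failure is logged but never propagates to the request.
        logger.exception("audit write failed")


def audit(action: str, *, user_id: Optional[int] = None,
          resource_type: Optional[str] = None, resource_id: Optional[str] = None,
          ip_address: Optional[str] = None, user_agent: Optional[str] = None,
          meta: Optional[Dict[str, Any]] = None) -> None:
    """Schedule an audit write and return immediately (must be called from async code)."""
    entry = {
        "user_id": user_id, "action": action,
        "resource_type": resource_type, "resource_id": resource_id,
        "ip_address": ip_address, "user_agent": user_agent,
        "meta": meta or {}, "created_at": datetime.now(timezone.utc),
    }
    task = asyncio.get_running_loop().create_task(_write(entry))
    _pending.add(task)
    task.add_done_callback(_pending.discard)
```

A request handler would call something like `audit("job.create", user_id=..., resource_type="job", resource_id=...)` and continue without awaiting the write.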

KVKK compliance

Turkey's personal data protection law (KVKK) applies to all health data processed on this platform.

Consent records — POST /auth/consent records explicit KVKK consent per user per consent type (e.g. "kvkk", "marketing"). Upsert semantics; full audit trail.
Data residency — data_residency field on User (default "TR"). Turkish cloud infrastructure keeps compute and storage physically in Turkey.
Right to erasure — DELETE /auth/me deletes the user, all their jobs, and queues S3 object deletion (KVKK Article 7 + GDPR Article 17).
Automated data retention — a daily Celery beat task (run_retention) enforces configurable windows:
- Raw upload files deleted after RAW_FILE_RETENTION_DAYS (default 30 days)
- Report result JSON nulled after REPORT_RETENTION_DAYS (default 1825 days / 5 years)
- Enable with RETENTION_ENABLED=true
TC Kimlik No validation — the TCKN checksum algorithm (11-digit modulo verification) is applied to buyer identity numbers before submission to the iyzico payment gateway. Invalid numbers are rejected with a 400 error.
VCF file validation — uploaded .vcf / .vcf.gz / .bcf files are checked for a valid ##fileformat=VCF header and #CHROM line before being stored. Malformed files are rejected at upload time.
Upload size limit — all uploads are capped at MAX_UPLOAD_SIZE_BYTES (default 10 GB).
VERBİS — Register your data processing activities at verbis.kvkk.gov.tr before going live. Health data is a special category under KVKK Article 6 — explicit consent is required.
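
The TC Kimlik No check mentioned in the list above uses the publicly documented checksum rules: 11 digits, a non-zero first digit, and two check digits derived from the first nine. The sketch below shows those rules; the function name `is_valid_tckn` is an assumption and the platform's `tckn.py` may differ in detail.

```python
def is_valid_tckn(value: str) -> bool:
    """Return True if `value` passes the TC Kimlik No checksum rules."""
    if len(value) != 11 or not (value.isascii() and value.isdigit()):
        return False
    if value[0] == "0":
        return False
    d = [int(c) for c in value]
    odd_sum = sum(d[0:9:2])   # digits 1, 3, 5, 7, 9
    even_sum = sum(d[1:8:2])  # digits 2, 4, 6, 8
    # 10th digit: (7 * odd positions - even positions) mod 10
    if (odd_sum * 7 - even_sum) % 10 != d[9]:
        return False
    # 11th digit: sum of the first ten digits mod 10
    return sum(d[:10]) % 10 == d[10]
```

A passing checksum only shows that the number is well formed; it does not prove the number was ever issued to a real person.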

BioScript sandboxing

User bash scripts run inside Docker with hard resource limits:

--memory=8g
--cpus=$(nproc)
--read-only
--tmpfs /tmp:exec

In local runner mode, OS-level limits are also applied via resource.setrlimit:

CPU: 2 hours max
Virtual memory: 8 GB
File size: 10 GB
Processes: 256
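
The OS-level limits listed above map onto `resource.setrlimit` roughly as follows. This is a sketch, assuming a Linux host (the `resource` module is Unix-only) and that the limits are applied in the child process before the script starts; `LIMITS`, `_apply_limits`, and `run_limited` are hypothetical names.

```python
import resource
import subprocess
from typing import List

GIB = 1024 ** 3

# Values taken from the list above.
LIMITS = {
    resource.RLIMIT_CPU: 2 * 60 * 60,   # CPU time in seconds (2 hours)
    resource.RLIMIT_AS: 8 * GIB,        # virtual memory
    resource.RLIMIT_FSIZE: 10 * GIB,    # largest file the process may write
    resource.RLIMIT_NPROC: 256,         # processes per user
}


def _apply_limits() -> None:
    """Runs in the child between fork and exec, so only the script is limited."""
    for res, wanted in LIMITS.items():
        _, hard = resource.getrlimit(res)
        # An unprivileged process cannot raise a limit above the current hard limit.
        value = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
        resource.setrlimit(res, (value, value))


def run_limited(argv: List[str]) -> subprocess.CompletedProcess:
    """Run a command with the limits applied, capturing its output."""
    return subprocess.run(argv, preexec_fn=_apply_limits, capture_output=True, text=True)
```

For example, `run_limited(["bash", "user_script.sh"])` would start the script with all four limits in place, where `user_script.sh` is a placeholder path.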

Patient & Sample Management

The platform includes a structured clinical data model for linking genomic jobs to real patients and biological samples.

Hierarchy

User
 └── Patient  (name, date of birth, sex, notes)
      └── Sample  (sample_type, collection_date, description)
           └── Job  (pipeline run — sarek, assessment, etc.)

Clinical use

Create a patient record before uploading data — POST /patients
Create a sample tied to the patient — POST /patients/{id}/samples
Attach the sample to a job by passing sample_id in the job create request
Retrieve all jobs ever run on a patient — GET /patients/{id}/jobs

FHIR R4 export

Both Patient and Specimen resources can be exported as standard FHIR R4 JSON:

# FHIR Patient resource
GET /api/v1/patients/{patient_id}/fhir

# FHIR Specimen resource
GET /api/v1/patients/{patient_id}/samples/{sample_id}/fhir

These are plain JSON responses — no FHIR server required. They can be imported into any FHIR-compatible EHR (HAPI FHIR, Microsoft Azure Health Data Services, etc.).
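
To give a sense of what such an export contains, here is a minimal sketch of mapping a patient record to a FHIR R4 Patient resource. The `Patient` dataclass, the `patient_to_fhir` function, and the way `sex` values are encoded are assumptions for illustration; only the FHIR field names and gender codes come from the FHIR R4 specification.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Dict, Optional

# FHIR R4 administrative gender codes: male | female | other | unknown.
_GENDER = {"m": "male", "male": "male", "f": "female", "female": "female", "other": "other"}


@dataclass
class Patient:
    # Hypothetical shape based on the hierarchy above (name, date of birth, sex).
    id: str
    name: str
    date_of_birth: Optional[date] = None
    sex: Optional[str] = None


def patient_to_fhir(patient: Patient) -> Dict[str, Any]:
    """Build a plain dict that serialises to a FHIR R4 Patient resource."""
    resource: Dict[str, Any] = {
        "resourceType": "Patient",
        "id": patient.id,
        "name": [{"text": patient.name}],
        "gender": _GENDER.get((patient.sex or "").strip().lower(), "unknown"),
    }
    if patient.date_of_birth is not None:
        resource["birthDate"] = patient.date_of_birth.isoformat()  # YYYY-MM-DD
    return resource
```

Because the result is a plain dict, `json.dumps(patient_to_fhir(p))` is enough to produce the response body.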

Payments

Stripe (global)

Standard Stripe Checkout flow. Set STRIPE_SECRET_KEY and STRIPE_WEBHOOK_SECRET.

iyzico (Turkey)

iyzico is a payment gateway widely used by Turkish businesses. The flow:

POST /payments/iyzico/checkout — creates a CheckoutFormInitialize and returns the iyzico HTML form snippet + conversation_token
Frontend embeds the form; customer completes payment
iyzico sends a POST to /payments/iyzico/callback — signature verified, Job created, job_id stored in Redis
Frontend polls GET /payments/iyzico/session/{token} to retrieve the job_id

Required env vars: IYZICO_API_KEY, IYZICO_SECRET_KEY. Sandbox URL: https://sandbox.iyzipay.com (default). Production: https://api.iyzipay.com.

Pipeline Runners

`pipeline_id`	Runner	Backend env var	Notes
`"sarek"`	Nextflow	`NEXTFLOW_BACKEND`	nf-core variant calling; auto-generates samplesheet
other nf-core	Nextflow	`NEXTFLOW_BACKEND`	rnaseq, atacseq, etc.
`"snakemake"`	Snakemake	`SNAKEMAKE_BACKEND`	Generates Snakefile from canvas wrappers/workflows
`"bioscript"`	BioScript	`BIOSCRIPT_BACKEND`	Runs user's bash script in tools Docker image
`"custom-*"`	Custom	`CUSTOM_BACKEND`	spades, kraken2, prokka, iqtree, flye
`"assessment"`	Assessment	always real	Annotates VCF against 17 databases; generates PDF

Every runner except Assessment (which always runs for real) supports four backends: mock → local → turkishcloud → awsbatch
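
The table above amounts to a lookup from pipeline_id to an environment variable. The sketch below shows one way to express that; `backend_env_var` and `resolve_backend` are hypothetical helpers, and the default of `mock` follows the Runners table in the Environment Variables section.

```python
import os
from typing import Mapping, Optional

VALID_BACKENDS = ("mock", "local", "turkishcloud", "awsbatch")


def backend_env_var(pipeline_id: str) -> Optional[str]:
    """Map a pipeline_id to the env var that selects its backend."""
    if pipeline_id == "assessment":
        return None  # always real, no backend switch
    if pipeline_id == "snakemake":
        return "SNAKEMAKE_BACKEND"
    if pipeline_id == "bioscript":
        return "BIOSCRIPT_BACKEND"
    if pipeline_id.startswith("custom-"):
        return "CUSTOM_BACKEND"
    return "NEXTFLOW_BACKEND"  # sarek and every other nf-core pipeline


def resolve_backend(pipeline_id: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the backend name for a job, or 'real' for the assessment runner."""
    var = backend_env_var(pipeline_id)
    if var is None:
        return "real"
    value = env.get(var, "mock")
    if value not in VALID_BACKENDS:
        raise ValueError(f"{var}={value!r} is not one of {VALID_BACKENDS}")
    return value
```

Rejecting unknown values early avoids silently falling back to mock results in a production deployment.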

Mutation Assessment Pipeline

The Assessment pipeline takes a completed sarek job's VCF output and annotates every variant against 17 public databases.

How to use

Run a sarek job (or select a completed one).
Drop an Assessment node on the canvas and connect it to the sarek Results node.
Pick the source job from the dropdown in the Assessment node.
Submit — no file upload needed.
Results: interactive variant table + downloadable PDF report.

PDF report contents

Summary stats — total variants, pathogenic/LP count, cancer hotspot count
Classification chart — bar chart by ACMG bucket
Table A — ClinVar significance, InterVar/ACMG classification + criteria, gnomAD AF, popmax AF, hotspot flag, rsID
Table B — SIFT, PolyPhen-2, CADD phred, REVEL, MetaLR, MetaSVM, MutationTaster, SpliceAI Δmax, GERP++, PhyloP
Table C — protein name + function (UniProt), OMIM disease, ClinGen validity, GenCC, Orphanet diseases, HPO terms, LOVD variant count
Report signing — a SHA-256 digest of the PDF bytes is computed after generation and stored in the job result (report_sha256). Retrieve it at GET /jobs/{id} to verify the report has not been tampered with.
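
Checking a downloaded report against the stored report_sha256 value takes only a few lines. This is a minimal sketch with hypothetical function names; it assumes the digest is stored as a lowercase hex string.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def report_matches(pdf_bytes: bytes, expected_hex: str) -> bool:
    """True if the PDF bytes hash to the digest returned by GET /jobs/{id}."""
    return hmac.compare_digest(sha256_hex(pdf_bytes), expected_hex.strip().lower())
```

A typical check reads the PDF with `open(path, "rb").read()` and compares it to the report_sha256 field of the job result. Note that a bare digest detects accidental or later modification of the file, but it is not a cryptographic signature: anyone able to change both the PDF and the stored digest could make them agree.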

Variant-level databases (queried per variant, all free)

#	Database	What it provides
1	ClinVar	Pathogenicity classification, HGVS notation
2	gnomAD v4.1	Population allele frequency, popmax AF, AC/AN
3	Ensembl VEP	SIFT, PolyPhen-2, consequence terms, canonical transcript
4	CADD v1.7	Phred-scaled deleteriousness score
5	MyVariant.info	REVEL, MetaLR, MetaSVM, MutationTaster, GERP++, PhyloP
6	SpliceAI	Splice site disruption Δ scores
7	InterVar	ACMG/AMP 2015 auto-classification + criteria met
8	CancerHotspots.org	Recurrent cancer driver mutation hotspot flag
9	dbSNP	rsID fallback

Gene-level databases (queried once per gene, cached 7 days)

#	Database	What it provides
10	UniProt	Protein name + function
11	HGNC	Authoritative gene symbol, Entrez/Ensembl IDs
12	ClinGen	Gene-disease validity classification
13	GenCC	Aggregated gene-disease classifications
14	HPO / Ensembl	Phenotype terms
15	LOVD	Locus-specific variant count

Optional (free registration required)

#	Database	How to enable
16	OMIM	Set `OMIM_API_KEY` — register at omim.org/api
17	Orphanet	Set `ORPHANET_API_KEY` — register at orphacode.org

API Overview

All endpoints are under /api/v1. A JWT is required in the Authorization: Bearer <token> header, except on auth endpoints and webhooks.

# Auth
POST   /auth/register                 Register (sends verification email)
POST   /auth/login                    Login → JWT + refresh token (or mfa_required)
GET    /auth/me                       Current user
DELETE /auth/me                       Delete account (KVKK right to erasure)
POST   /auth/refresh                  Exchange refresh token → new access + refresh token
POST   /auth/logout                   Revoke refresh token
GET    /auth/verify-email?token=…     Verify email address
POST   /auth/forgot-password          Send password reset email
POST   /auth/reset-password           Set new password with reset token
POST   /auth/mfa/setup                Generate TOTP secret + provisioning URI
POST   /auth/mfa/verify               Activate MFA with first valid code
POST   /auth/mfa/complete             Exchange mfa_token + code → full JWT
DELETE /auth/mfa                      Disable MFA
POST   /auth/consent                  Record KVKK/GDPR consent
GET    /auth/consent                  List consent records

# Uploads
POST   /uploads/presign               Presigned upload URL + cost estimate (size-checked)
GET    /uploads/estimate              Cost estimate
GET    /uploads/local/{filename}      Serve local file (PDF reports etc.)

# Jobs
GET    /jobs                          List jobs (last 50)
POST   /jobs                          Create + dispatch job
GET    /jobs/{id}                     Job details + result
DELETE /jobs/{id}                     Cancel job
POST   /jobs/{id}/retry               Retry failed/cancelled job
GET    /jobs/{id}/logs?offset=N       Stream log lines
GET    /jobs/{id}/download?path=…     Presigned S3 download URL
GET    /jobs/{id}/vcf                 Paginated VCF variant table
GET    /jobs/{id}/sv                  Structural variant / CNV records

# Pipelines
GET    /pipelines                     List saved pipeline graphs
POST   /pipelines                     Save pipeline graph
GET/PUT/DELETE /pipelines/{id}

# Catalogs
GET    /nfcore/pipelines              nf-core pipeline catalog
GET    /nfcore/modules                nf-core module catalog
POST   /nfcore/refresh
GET    /snakemake/wrappers
GET    /snakemake/workflows
POST   /snakemake/refresh

# Payments
POST   /payments/checkout             Stripe checkout session
POST   /payments/webhook              Stripe webhook
GET    /payments/session/{id}         Poll for job_id after Stripe redirect
POST   /payments/iyzico/checkout      iyzico CheckoutFormInitialize
POST   /payments/iyzico/callback      iyzico result callback
GET    /payments/iyzico/session/{token} Poll for job_id after iyzico payment

# Patients & Samples
POST   /patients                      Create patient
GET    /patients                      List patients
GET    /patients/{id}                 Get patient
PUT    /patients/{id}                 Update patient
DELETE /patients/{id}                 Delete patient
GET    /patients/{id}/fhir            FHIR R4 Patient resource (JSON)
POST   /patients/{id}/samples         Create sample
GET    /patients/{id}/samples         List samples for patient
GET    /patients/{id}/samples/{sid}   Get sample
PUT    /patients/{id}/samples/{sid}   Update sample
DELETE /patients/{id}/samples/{sid}   Delete sample
GET    /patients/{id}/samples/{sid}/fhir  FHIR R4 Specimen resource (JSON)
GET    /patients/{id}/jobs            All jobs linked to this patient

# System
GET    /health                        DB + Redis connectivity check
GET    /metrics                       Prometheus metrics (text/plain)
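
A client call against these endpoints could be prepared as sketched below, using only the standard library. `build_request` is a hypothetical helper, the base URL is the local development address from the Running Locally section, and request body fields are deliberately left to the caller because the exact schemas are documented at /docs.

```python
import json
import urllib.request
from typing import Any, Optional

BASE_URL = "http://localhost:8000/api/v1"  # local dev address; adjust per deployment


def build_request(method: str, path: str, token: Optional[str] = None,
                  body: Optional[Any] = None) -> urllib.request.Request:
    """Prepare an API request, adding the Bearer token and JSON body if given."""
    headers = {"Accept": "application/json"}
    data = None
    if token:
        headers["Authorization"] = f"Bearer {token}"
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)
```

For instance, `urllib.request.urlopen(build_request("GET", "/jobs", token=access_token))` would list the most recent jobs, where `access_token` is the JWT obtained from /auth/login.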

Environment Variables

Copy backend/.env.example to .env. Key groups:

Core

Variable	Default	Description
`JWT_SECRET`	`changeme-…`	Change in production. Min 32 random chars.
`JWT_ACCESS_EXPIRY_MINUTES`	`15`	Access token lifetime in minutes
`JWT_REFRESH_EXPIRY_DAYS`	`7`	Refresh token lifetime in days (stored in Redis)
`MAX_LOGIN_ATTEMPTS`	`5`	Failed attempts before account lockout
`LOCKOUT_MINUTES`	`30`	Lockout duration after too many failed attempts
`MAX_UPLOAD_SIZE_BYTES`	`10737418240`	Maximum upload size (default 10 GB)
`RETENTION_ENABLED`	`false`	Enable automated data retention (KVKK)
`RAW_FILE_RETENTION_DAYS`	`30`	Days before raw upload files are deleted
`REPORT_RETENTION_DAYS`	`1825`	Days before report result data is nulled (5 years)
`ALLOWED_ORIGINS`	`http://localhost:5173`	Comma-separated CORS origins
`MFA_ISSUER`	`BioplatformMD`	Issuer name shown in authenticator apps
`DEBUG`	`true`	Set `false` in production (enforces JWT_SECRET check)

Runners

Variable	Default	Options
`NEXTFLOW_BACKEND`	`mock`	`mock`, `local`, `awsbatch`, `turkishcloud`
`SNAKEMAKE_BACKEND`	`mock`	`mock`, `local`, `awsbatch`, `turkishcloud`
`BIOSCRIPT_BACKEND`	`mock`	`mock`, `local`, `awsbatch`, `turkishcloud`

Turkish cloud compute

Variable	Default	Description
`COMPUTE_PROVIDERS`	`huawei,turkcell,cloudsigma`	Priority-ordered fallback list
`DEFAULT_VM_FLAVOR`	`standard`	`small`, `standard`, `large`, `xlarge`
`S3_ENDPOINT_URL`	``	Huawei OBS or Turkcell nDepo endpoint
`HUAWEI_AK`	``	Huawei Cloud Access Key
`HUAWEI_SK`	``	Huawei Cloud Secret Key
`HUAWEI_PROJECT_ID`	``	Huawei IAM project ID
`HUAWEI_REGION`	`tr-west-1`	Huawei region
`HUAWEI_VPC_ID`	``	VPC for VM instances
`HUAWEI_SUBNET_ID`	``	Subnet for VM instances
`HUAWEI_SECURITY_GROUP_ID`	``	Security group
`HUAWEI_IMAGE_ID`	``	Ubuntu 22.04 base image ID in tr-west-1
`HUAWEI_FLAVOR_DEFAULT`	`c7n.2xlarge.4`	Default ECS flavor
`HUAWEI_OBS_ENDPOINT`	`https://obs.tr-west-1.myhuaweicloud.com`	OBS storage endpoint
`TURKCELL_VCD_URL`	`https://svm.turkcellbulut.com`	vCloud Director URL
`TURKCELL_VCD_ORG`	``	vCD organisation name
`TURKCELL_VCD_VDC`	``	Virtual Datacenter name
`TURKCELL_VCD_USER`	``	vCD username
`TURKCELL_VCD_PASSWORD`	``	vCD password
`TURKCELL_VCD_NETWORK`	``	Org network name
`TURKCELL_VCD_CATALOG`	``	Catalog containing VM template
`TURKCELL_VCD_TEMPLATE`	``	VM template name (Ubuntu 22.04)
`TURKCELL_NDEP0_ENDPOINT`	``	nDepo S3-compatible storage endpoint
`TURKCELL_NDEP0_ACCESS_KEY`	``	nDepo access key
`TURKCELL_NDEP0_SECRET_KEY`	``	nDepo secret key
`CLOUDSIGMA_API_ENDPOINT`	`https://siaflex.cloud/api/2.0`	CloudSigma/Siaflex API
`CLOUDSIGMA_USERNAME`	``	CloudSigma email
`CLOUDSIGMA_PASSWORD`	``	CloudSigma password

Storage

Variable	Default	Description
`STORAGE_BACKEND`	`local`	`local` or `s3`
`S3_ENDPOINT_URL`	``	Override for Turkish-cloud S3-compatible storage
`AWS_ACCESS_KEY_ID`	``	AK for S3 / Huawei OBS / Turkcell nDepo
`AWS_SECRET_ACCESS_KEY`	``	SK
`S3_BUCKET`	``	Bucket name

Payments

Variable	Default	Description
`STRIPE_SECRET_KEY`	``	`sk_test_…` or `sk_live_…`
`STRIPE_WEBHOOK_SECRET`	``	`whsec_…`
`IYZICO_API_KEY`	``	iyzico merchant API key
`IYZICO_SECRET_KEY`	``	iyzico merchant secret
`IYZICO_BASE_URL`	`https://sandbox.iyzipay.com`	Use `https://api.iyzipay.com` in production
`IYZICO_USD_TO_TRY_RATE`	`33.0`	USD→TRY conversion rate

Observability

Variable	Default	Description
`SENTRY_DSN`	``	Leave empty to disable. Get from sentry.io

Mutation Assessment

Variable	Default	Description
`ASSESSMENT_GENOME`	`hg38`	`hg19` or `hg38`
`OMIM_API_KEY`	``	Optional — omim.org/api
`ORPHANET_API_KEY`	``	Optional — orphacode.org

BioScript Shell Helpers

Pre-loaded functions inside the BioScript container:

bioplatform_qc          <input.fastq.gz> <outdir> [r2.fastq.gz]
bioplatform_align       <reads.fastq.gz> <genome.fa> <outdir> [r2]
bioplatform_star_align  <reads.fastq.gz> <star_index_dir> <outdir> [r2]
bioplatform_call        <input.bam> <genome.fa> <outdir>
bioplatform_featurecount <bam> <gtf> <outdir>
bioplatform_multiqc     <results_dir> <outdir>
bioplatform_spades      <r1.fastq.gz> <outdir> [r2.fastq.gz]
bioplatform_kraken2     <r1.fastq.gz> <db_dir> <outdir> [r2.fastq.gz]
bioplatform_prokka      <assembly.fasta> <outdir>
bioplatform_iqtree      <alignment.fasta> <outdir>
bioplatform_flye        <reads.fastq.gz> <outdir>

Available env vars in every BioScript job:

$INPUT_FILE    # storage URI of the uploaded input file
$OUTPUT_DIR    # storage prefix where outputs should be written
$JOB_ID        # unique job identifier

Database Migrations

Migrations run automatically at startup (alembic upgrade head).

Version	Description
0001	Create jobs table
0002	Create pipelines table
0003	Create nf-core catalog tables
0004	Add pipeline input formats
0005	Add pipeline_id to jobs
0006	Create Snakemake catalog tables
0007	Create users table; add user_id to jobs + pipelines
0008	Add stripe_session_id to jobs
0009	Add storage_key_r2 + workflow_config to jobs
0010	Add job_name to jobs
0011	Add role to users (RBAC)
0012	Create audit_log table
0013	Add mfa_secret, mfa_enabled, data_residency to users
0014	Create consent_records table (KVKK)
0015	Add email_verified, password_reset, failed_login_attempts, locked_until to users
0016	Create patients + samples tables; add sample_id to jobs

License

MIT
