08 May 10:35

FullLengthFanatic

0150428

v0.4.0 Latest

Changelog

[0.4.0] — 2026-05-07

Added

- Self-explanatory prose blocks in HTML reports for non-expert
  readers (manuscript reviewers, conference attendees, collaborators
  who haven't run tecap themselves). Three additions, sourced from
  new constants in tecap/constants.py:
  - REPORT_INTRO: top-level intro paragraph rendered above every
    single-sample and multi-sample report. Names what the report is,
    what was measured, and where the input came from.
  - HOW_TO_READ_COMPARE: reading guide rendered only in
    multi-sample comparison reports. Maps polylines / grouped bars /
    colours back to samples.
  - GLOSSARY: two-column glossary table at the bottom of every
    report. Defines TE, UTR, CDS, polyA, PAS, APA, oligo-dT.
- New report helpers _intro_html, _how_to_read_html,
  _glossary_html in tecap/report.py. Wired into both
  build_single_report and build_compare_report.
- New CSS classes .intro and .howto for visually framed prose
  callouts (matches existing .tile styling).
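
As an illustration of the glossary helper described above, here is a minimal sketch of rendering a term/definition mapping as a two-column HTML table. Only the names GLOSSARY and _glossary_html come from this changelog; the entries, the function body, and the `glossary` CSS class below are assumptions, not tecap's actual code.

```python
from html import escape

# Hypothetical stand-in for tecap.constants.GLOSSARY. The real entries and
# their wording are not shown in this changelog.
GLOSSARY = {
    "UTR": "Untranslated region",
    "CDS": "Coding sequence",
    "APA": "Alternative polyadenylation",
}


def glossary_html(glossary: dict[str, str]) -> str:
    """Render a term -> definition mapping as a two-column HTML table."""
    rows = "\n".join(
        # Escape both cells so definitions containing '&' or '<' stay valid HTML.
        f"  <tr><th>{escape(term)}</th><td>{escape(text)}</td></tr>"
        for term, text in glossary.items()
    )
    # The "glossary" class name is an assumption made for this sketch.
    return f'<table class="glossary">\n{rows}\n</table>'


print(glossary_html(GLOSSARY))
```

Escaping at render time keeps the prose constants free of markup, so the same strings could plausibly be reused for terminal output.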

04 May 09:53

FullLengthFanatic

v0.3.2

b636cca

Changelog

[0.3.2] — 2026-05-04

Prose-only patch. No code, schema, CLI, or behavior change.

Changed

- Removed an unsupported mechanistic attribution from the README,
  CITATION.cff, .zenodo.json, and the basecomp plot caption emitted
  by tecap/constants.py. Earlier wording attributed the moderate-A
  vs classical-A split to "saturating-local-concentration oligo-dT
  chemistries (10x GEM droplets, BD Rhapsody capture beads)". That
  claim is not supported by available bench data: bulk Iso-Seq final
  oligo-dT is 1.2 µM, in line with FS-ONT and BD Rhapsody, so bulk
  oligo-dT concentration cannot be the axis driving the split.
- All retracted prose has been replaced with empirical-only language:
  single-cell prep datasets (10x, BD Rhapsody, ArgenTag, plate
  FLASH-seq) cluster in the 30-50% A regime; bulk Iso-Seq datasets
  cluster past the >=60% A line. The biochemical driver of the split
  is currently uncharacterized.
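
The two regimes named above can be expressed as a small lookup. This is a sketch only: the 30-50% and >=60% thresholds come from the changelog, but the function name, the inclusive bounds, and the "unassigned" label for everything else are assumptions rather than tecap's actual bucketing.

```python
def a_content_regime(pct_a: float) -> str:
    """Label a percent-A value with the regime names used in this changelog."""
    if not 0.0 <= pct_a <= 100.0:
        raise ValueError(f"percent A must be within 0-100, got {pct_a}")
    if pct_a >= 60.0:
        return "classical A-tract"   # at or past the >=60% A line
    if 30.0 <= pct_a <= 50.0:
        return "moderate-A"          # inside the 30-50% A band (bounds assumed inclusive)
    return "unassigned"              # assumption: the changelog names no other regime


for value in (40.0, 55.0, 60.0):
    print(value, a_content_regime(value))
```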

Note on [0.3.0] history

The [0.3.0] block below still contains the original
saturating-local-concentration prose. That entry is preserved verbatim
because v0.3.0 / v0.3.1 are already published on GitHub, Zenodo, and
bioconda; rewriting locked release notes would be misleading. This
[0.3.2] entry documents the retraction.

01 May 13:05

FullLengthFanatic

v0.3.1

44ee830

[0.3.1] — 2026-05-01

CLI ergonomics. JSON schema unchanged.

Changed

tecap report now accepts space-separated paths for
--classify-json and --basecomp-json (nargs="+") instead of
comma-separated ones. Commas are legal in POSIX paths, so the
comma-separated form was genuinely ambiguous. The new form matches
standard argparse conventions.
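
A minimal sketch of the nargs="+" pattern described above, using only the standard library. The option names come from this changelog; which options are required, the defaults, and the helper name build_report_parser are assumptions for illustration, not tecap's actual parser.

```python
import argparse


def build_report_parser() -> argparse.ArgumentParser:
    """Build a parser that takes one or more space-separated paths per option."""
    parser = argparse.ArgumentParser(prog="tecap report")
    # nargs="+" collects every following token up to the next option into a list.
    parser.add_argument("--classify-json", nargs="+", required=True, metavar="JSON")
    parser.add_argument("--basecomp-json", nargs="+", default=[], metavar="JSON")
    parser.add_argument("--out-html", required=True, metavar="HTML")
    return parser


# A path containing commas is now passed through untouched.
args = build_report_parser().parse_args(
    ["--classify-json", "A.json", "B,with,commas.json", "--out-html", "out.html"]
)
print(args.classify_json)
```

Because the shell already splits arguments on whitespace, no custom splitting logic is needed, and quoting handles paths that contain spaces.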

Migration

Replace any comma-separated invocation

    tecap report --classify-json A.json,B.json --basecomp-json A_bc.json,B_bc.json ...

with the space-separated form

    tecap report --classify-json A.json B.json --basecomp-json A_bc.json B_bc.json ...

30 Apr 13:25

FullLengthFanatic

v0.3.0

6503058

[0.3.0] — 2026-04-30

Readability, reporting, and ergonomics. JSON schema unchanged.

Added

- Single source of truth for mechanism / bucket prose in
  tecap/constants.py (MECHANISM_DEFINITIONS, BUCKET_DEFINITIONS,
  PLOT_CAPTIONS). README, HTML report, plot captions, and tecap explain
  all render from these dicts.
- tecap explain [--mechanism NAME] [--scope classify|basecomp|all] [--format text|json]: print the glossary at the terminal.
- tecap report --classify-json A.json[,B.json,...] [--basecomp-json ...] --out-html OUT.html: single self-contained HTML
  per sample, or cross-sample comparison. Embeds PNGs as base64. No JS,
  no external CSS, no CDN.
- tecap classify / basecomp accept --genome {GRCh38,GRCm38,GRCm39}
  to auto-fetch missing references via the existing
  fetch_polya_atlas / fetch_gencode_gtf. Adds --gtf-version and
  --ref-cache (default $XDG_CACHE_HOME/tecap / ~/.cache/tecap).
  Explicit --polya-sites / --gtf still win.
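
The cache default and the "explicit path wins" rule in the last item can be sketched as follows. This assumes the usual XDG convention (use $XDG_CACHE_HOME when set and non-empty, otherwise ~/.cache). The function names and the `fetch` callable are hypothetical stand-ins, not tecap's API.

```python
import os
from pathlib import Path


def default_ref_cache(env=None) -> Path:
    """Return $XDG_CACHE_HOME/tecap, falling back to ~/.cache/tecap."""
    env = os.environ if env is None else env
    xdg = env.get("XDG_CACHE_HOME")
    # An unset or empty variable falls back to ~/.cache, per the XDG convention.
    base = Path(xdg) if xdg else Path.home() / ".cache"
    return base / "tecap"


def pick_reference(explicit_path, genome, cache_dir, fetch):
    """Prefer an explicitly supplied file; otherwise fetch by genome name.

    `fetch` is a hypothetical callable (genome, cache_dir) -> Path standing in
    for helpers such as fetch_polya_atlas / fetch_gencode_gtf.
    """
    if explicit_path is not None:
        return Path(explicit_path)
    if genome is None:
        raise ValueError("need either an explicit path or a genome name")
    return fetch(genome, cache_dir)


print(default_ref_cache({"XDG_CACHE_HOME": "/tmp/xdg"}))
```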

Changed

- comparison_terminal_exon.png panel 1 is now horizontal grouped bars,
  one row per category, samples grouped within each row. Long category
  names are readable; the previous overlapped vertical x-axis labels are
  gone.
- comparison_terminal_exon.png panels 2 and 3 (UTR-bin rates) switched
  from grouped vertical bars to line+marker plots, one polyline per
  sample. Bars hid samples with low MechA-correct rates at N>=4; lines
  scale to any sample count. Caption rewritten to describe
  left/middle/right panels (was "top/bottom").
- comparison_basecomp.png switched from side-by-side bars to step-line
  overlay, one polyline per sample per bucket, with a single
  figure-level sample legend. Side-by-side bars were unreadable at
  N>=4. Figure size scales with sample count.
- All four plotting functions now draw a figure-level caption explaining
  what the plot shows.
- Basecomp PNGs now carry an explicit figure-level legend for the grey
  band ("30-50% A: moderate-A priming") and the dashed line
  (">=60% A: classical A-tract"). Per-bucket subplot titles include the
  bucket's interpretation.
- Moderate-A priming attribution updated based on a 4-sample comparison
  (10x Kinnex, BD Rhapsody Kinnex, PacBio Kinnex bulk cerebellum, PacBio
  Kinnex bulk heart, all human GRCh38). MechB_aspecific frac[30,50]:
  10x 0.36, BD46 0.25, Kinnex cerebellum 0.16, Kinnex heart 0.16. Bulk
  Iso-Seq samples instead show heavy classical-A enrichment (frac>=60
  ~ 0.47). The moderate-A signature is characteristic of
  saturating-local-concentration oligo-dT chemistries: 10x GEM droplets
  (gel bead dissolves and releases oligo-dT into a ~1 nL droplet) and
  BD Rhapsody capture beads (oligo-dT density at the bead surface). Free
  oligo-dT at standard concentrations (Iso-Seq, ~20 µL RT) shows
  classical-A internal priming instead. The previous attribution to
  "saturating in-solution oligo-dT" generally was too broad: 10x and
  Iso-Seq both use in-solution oligo-dT, but only 10x's droplet volume
  creates the saturating local concentration that drives moderate-A
  priming. Captions, README, CITATION.cff, and .zenodo.json updated.

Fixed

- download-atlas: PolyASite URLs in download.py 404'd against the live
  service. Replaced with the actual paths under
  polyasite.unibas.ch/download/atlas/{2.0,3.0}/.... Human is v3.0
  (GENCODE_42), mouse is v2.0 (GRCm38.96); PolyASite v3.0 mouse is not
  published.
- --genome now accepts {GRCh38, GRCm38, GRCm39} to reflect the asymmetric
  PolyASite/GENCODE coverage (GRCh38 has both; GRCm38 has PolyASite only;
  GRCm39 has GENCODE only).
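
The asymmetric coverage described above fits naturally into a lookup table. The coverage facts are taken from this changelog. The table name, the function, and the error handling are illustrative assumptions, not tecap's internals.

```python
# Which reference sources exist for each accepted --genome value,
# as stated in the changelog entry above.
REFERENCE_COVERAGE = {
    "GRCh38": {"polyasite": True, "gencode": True},
    "GRCm38": {"polyasite": True, "gencode": False},
    "GRCm39": {"polyasite": False, "gencode": True},
}


def available_references(genome: str) -> list[str]:
    """Return the reference sources that can be auto-fetched for a genome."""
    try:
        coverage = REFERENCE_COVERAGE[genome]
    except KeyError:
        choices = ", ".join(sorted(REFERENCE_COVERAGE))
        raise ValueError(f"unsupported genome {genome!r}; choose from {choices}") from None
    return [source for source, present in coverage.items() if present]


for name in REFERENCE_COVERAGE:
    print(name, available_references(name))
```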

Known limitations / open questions

- The 4-sample chemistry comparison covers droplet-scale (10x),
  bead-surface (BD Rhapsody), and bulk-tube (Kinnex Iso-Seq, ~20 µL RT)
  oligo-dT environments. It does not cover plate-scale RT (1-10 µL
  per well, in-solution oligo-dT, e.g. Smart-seq2 / Smart-seq3 /
  FLASH-seq), which would test whether the moderate-A signature tracks
  reaction-volume scale or chemistry lineage. FLASH-seq amplification
  (used in BD Rhapsody) is a Smart-seq descendant, so plate Smart-seq
  could plausibly look like BD46 (chemistry-driven moderate-A) or like
  Iso-Seq (scale-driven, no moderate-A).
- This gap exists because no clean public human Smart-seq2/3 +
  PacBio HiFi dataset was findable as of 2026-04-29: PacBio's
  Kinnex-single-cell-RNA and MAS-Seq buckets are 10x-only;
  HIT-scISOseq corneal limbus is Smart-seq2-derived but not directly
  downloadable; Al'Khafaji TIL T cell data (dbGaP phs003200) is
  10x-derived. Mouse Smart-seq2 + Iso-Seq exists (SRP225196) but
  cross-species adds noise on top of PolyASite v2.0 cluster-type
  filter mismatches.
- Follow-up paths for v0.4: (a) request HIT-scISOseq raw BAMs from
  Zheng/Chen et al. (Sun Yat-sen University); (b) watch PacBio's
  public bucket for plate Smart-seq deposits; (c) generate an in-house
  Smart-seq3 + Kinnex library if the question stays open.

25 Apr 12:43

FullLengthFanatic

v0.2.0

38d5683

Performance

Multi-threaded classify and basecomp now scale. Workers inherit
gene_index, gene_records, and polya_index via fork() copy-on-write
instead of receiving a pickled copy with every task. Previously the
parent process sat at 100% CPU pickling these structures, which made
--threads N>1 slower than --threads 1 on 10x Kinnex BAMs (195 declared
contigs). That bottleneck is gone.

Contigs with mapped == 0 in the BAM index are skipped in the parent
process, avoiding ~165 wasted dispatches on typical 10x Kinnex inputs.
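
The pattern described above can be sketched with the standard library alone. This is not tecap's code: the function names, the `_SHARED` holder, and the toy index are assumptions, and the (contig, mapped) pairs stand in for whatever the BAM index reports. It relies on the "fork" start method, so it is POSIX-only.

```python
import multiprocessing as mp

# Large read-only lookup tables are parked at module level. Workers created
# with fork() see them through copy-on-write pages, so they are never pickled.
_SHARED = {}


def _count_genes(contig):
    """Worker task: only the contig name crosses the process boundary."""
    gene_index = _SHARED["gene_index"]
    return contig, len(gene_index.get(contig, ()))


def run_contigs(gene_index, contig_stats, threads):
    """Process every non-empty contig; contig_stats holds (contig, mapped) pairs."""
    # Skip contigs with no mapped reads in the parent, before any dispatch.
    todo = [contig for contig, mapped in contig_stats if mapped > 0]
    _SHARED["gene_index"] = gene_index   # must be populated before the pool forks
    ctx = mp.get_context("fork")         # not available on Windows
    with ctx.Pool(processes=threads) as pool:
        return dict(pool.map(_count_genes, todo))


if __name__ == "__main__":
    index = {"chr1": ["geneA", "geneB"], "chr2": ["geneC"], "chrUn": []}
    stats = [("chr1", 10), ("chr2", 5), ("chrUn", 0)]
    print(run_contigs(index, stats, threads=2))
```

The trade-off is that the shared structures must be treated as read-only in the workers; any write triggers a page copy and the change is invisible to the parent.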

Added

- tecap download-atlas --genome {GRCh38,GRCm38,GRCm39} fetches the
  PolyASite atlas and (with --gtf-version) the matching GENCODE GTF.
- {sample}_tecap_mqc.json: MultiQC custom-content table written by
  classify. Reports captured %, MechA-correct %, MechB-aspecific %,
  PAS+ fractions, and orientation-match fraction.
- conda/meta.yaml: bioconda recipe (PR #64853 pending).
- docker/Dockerfile: micromamba-based image.
- GitHub Actions: pytest matrix on Python 3.10/3.11/3.12 plus ruff lint.
- CITATION.cff and .zenodo.json for the Zenodo DOI
  10.5281/zenodo.19762736.
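
For readers unfamiliar with MultiQC custom content, the sketch below shows the general shape of a table payload of the kind a `{sample}_tecap_mqc.json` file could contain. The top-level keys follow MultiQC's custom-content format. The section id, the metric key names, and the numbers are placeholders invented for illustration and do not reflect tecap's real output.

```python
import json


def build_mqc_table(sample: str, metrics: dict[str, float]) -> dict:
    """Wrap per-sample metrics in a MultiQC custom-content table payload."""
    return {
        "id": "tecap",              # assumed section id
        "section_name": "tecap",    # assumed display name
        "plot_type": "table",
        "data": {sample: metrics},  # one row per sample, one column per metric
    }


# Placeholder metric names and values, not real measurements.
payload = build_mqc_table("sampleA", {"captured_pct": 0.0, "mecha_correct_pct": 0.0})
print(json.dumps(payload, indent=2))
```

MultiQC discovers such files by their `_mqc` filename suffix, which is why the output name ends in `_tecap_mqc.json`.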

Unchanged

JSON output of classify and basecomp is byte-for-byte identical to v0.1
on the same input. No schema bump.

Releases: FullLengthFanatic/tecap
