Releases: AmrasElessar/adminpdftoolkit

v1.13.2 — Polish: CI fix · dead code · system fonts UI · SSRF docs · +5 tests

16 May 13:53

Admin PDF Toolkit v1.13.2 — D Brand — Polish release
Follow-up to v1.13.1 hot-fixes: CI coverage fix, dead code removal, system-fonts dropdown polish, SSRF docs, +5 new tests. AGPL-3.0.

📥 Which file should I download?

Scenario / Download
  • Ordinary home/office PC: AdminPDFToolkit_Setup.exe (~33 MB), online installer
  • Firewalled / internet-restricted work PC: AdminPDFToolkit_Setup_Offline.exe (~525 MB), offline installer
  • Trying it without installing: Admin_PDF_Toolkit_Portable_v1.13.2.zip, unzip and click the .bat

Follow-up to v1.13.1 — closes the small punch-list left after the
external code reviews, plus fixes a CI failure introduced when the
packaging scripts landed in v1.13.0.

CI

  • Coverage threshold no longer fails on packaging scripts. v1.13.0
    added 7 build/installer helpers (build_exe.py, installer.py,
    launcher.py, etc., ~770 stmts total) that run only on the
    maintainer's machine and never execute under pytest. They dragged
    TOTAL coverage from ~64% down to 54.92%, tripping the 62% gate.
    Excluded via [tool.coverage.run].omit; metric now reflects the
    runtime app, not the packaging toolchain.

Cleanup

  • Dead code removed from templates/index.html. Per-group
    "Bu grubu birleştir" ("Merge this group") buttons were replaced by
    auto-orchestration in v1.13.0, but doGroupMerge() and its
    _mergeFilter / _forceMerge globals were left behind. Removed
    (~50 lines).
  • .gitattributes added. Normalizes line endings per file type;
    stops the Windows working tree from emitting LF↔CRLF warnings on
    every commit. .bat / .cmd / .ps1 / .iss stay CRLF; everything
    else is LF.
  • .gitignore extended. Ad-hoc bug-report screenshots
    (sorun*.png, debug*.png) no longer show up as untracked files.

Testing

  • /batch-deduplicate HTTP-layer validation tests. Four new tests
    cover the form-parameter validation paths the helper-level tests
    don't reach: malformed match_columns JSON, non-list payload,
    unknown column name, and the other_table "you must pick a column"
    required path.
  • batch_convert_worker other_table shape test. Confirms that
    when called with group_kind="other_table" the worker persists
    group_kind / group_headers / group_label in data.json,
    defaults state.match_columns to [], surfaces the metadata in
    the result dict, and keeps the records keyed by the group's own
    headers instead of forcing them into the call-log schema.
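
The four validation paths listed above can be pictured with a minimal sketch. The helper name parse_match_columns, its signature, and the error messages are hypothetical; only the match_columns parameter, the other_table group kind, and the four failure cases come from these notes.

```python
import json


def parse_match_columns(raw: str, headers: list[str], group_kind: str) -> list[str]:
    """Validate a JSON-encoded match_columns form field (hypothetical helper)."""
    try:
        payload = json.loads(raw) if raw else []
    except json.JSONDecodeError as exc:
        # Path 1: malformed JSON.
        raise ValueError("match_columns is not valid JSON") from exc
    if not isinstance(payload, list):
        # Path 2: valid JSON, but not a list.
        raise ValueError("match_columns must be a JSON list")
    unknown = [col for col in payload if col not in headers]
    if unknown:
        # Path 3: a column the group does not have.
        raise ValueError(f"unknown column(s): {unknown}")
    if group_kind == "other_table" and not payload:
        # Path 4: other_table groups have no default key, so one is required.
        raise ValueError("other_table groups need at least one match column")
    return payload
```

In an HTTP handler each ValueError would presumably map to a 4xx response; that mapping is not shown here.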

Documentation

  • SSRF DNS-rebinding TOCTOU window documented as accepted residual
    risk. The previous one-line "acceptable for LAN-only deployment"
    comment is expanded into an explicit rationale: the listener binds
    to 127.0.0.1 by default, /pdf/from-url is operator-only, and a
    rebound fetch can't reach anything the operator can't reach
    directly. Includes the trigger conditions under which the
    assumption breaks (bind to 0.0.0.0 + reverse proxy) and what the
    hardening path looks like.

v1.13.1 — Hot-fix: external review (Defender fail-open · race · memory · workers)

16 May 13:03

Admin PDF Toolkit v1.13.1 — D Brand — Hot-fix release
External code review (ChatGPT + Gemini) yielded four real bugs in the v1.13.0 hardening surface. All four fixed here. AGPL-3.0.

📥 Which file should I download?

Scenario / Download
  • Ordinary home/office PC: AdminPDFToolkit_Setup.exe (~33 MB), online installer
  • Firewalled / internet-restricted work PC: AdminPDFToolkit_Setup_Offline.exe (~525 MB), offline installer
  • Trying it without installing: Admin_PDF_Toolkit_Portable_v1.13.1.zip, unzip and click the .bat

External review (ChatGPT + Gemini) pointed to four real bugs in the
hardening / state-management surface. This release fixes those four and
leaves out the "multi-worker scaling" suggestion, because running as a
single process is a deliberate design choice:

Security

  • pdf_safety.mpcmdrun_scan no longer fails open. The generic
    except Exception path used to return clean=True, status="error",
    which let any unexpected MpCmdRun failure (corrupted PDF, locked
    signature DB, missing binary, …) silently bypass Defender. Now
    returns clean=False — the result still surfaces as status="error"
    for telemetry, but full_scan no longer treats the leg as a clean
    vote. Closes the only fail-open path in the scanner pipeline.
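
The fail-closed behaviour described above might look roughly like the sketch below. ScanResult, scan_fail_closed, all_legs_clean, and the "ok" / "infected" status strings are illustrative assumptions; only clean=False with status="error" on unexpected failures comes from this note.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ScanResult:
    clean: bool
    status: str
    detail: str = ""


def scan_fail_closed(run_scanner: Callable[[str], bool], pdf_path: str) -> ScanResult:
    """Run one scanner leg; any unexpected failure counts as 'not clean'."""
    try:
        is_clean = run_scanner(pdf_path)
    except Exception as exc:  # scanner crashed, binary missing, DB locked, ...
        # Fail closed: keep status="error" for telemetry, but never vote clean.
        return ScanResult(clean=False, status="error", detail=str(exc))
    return ScanResult(clean=is_clean, status="ok" if is_clean else "infected")


def all_legs_clean(results: list[ScanResult]) -> bool:
    """A file passes only when every scanner leg voted clean."""
    return all(result.clean for result in results)
```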

Reliability

  • state.JobStore.snapshot now returns a deep copy. Workers mutate
    nested lists (files_progress, files_safety) after update();
    the old shallow dict(job) aliased those references, so readers
    could iterate over a list mid-mutation and either crash or read
    inconsistent rows. Lock-scoped copy.deepcopy makes each snapshot
    caller-owned.
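
A simplified sketch of the lock-scoped deep copy follows. The class below is a stand-in rather than the project's actual JobStore; the token argument and the field layout are assumptions.

```python
import copy
import threading
from typing import Optional


class JobStore:
    """Process-local job registry (simplified stand-in)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._jobs: dict[str, dict] = {}

    def update(self, token: str, **fields) -> None:
        with self._lock:
            self._jobs.setdefault(token, {}).update(fields)

    def snapshot(self, token: str) -> Optional[dict]:
        # Deep copy while holding the lock, so nested lists such as
        # files_progress are no longer shared with the worker.
        with self._lock:
            job = self._jobs.get(token)
            return copy.deepcopy(job) if job is not None else None
```

With a shallow dict(job), mutating snap["files_progress"] would also change the stored job; with the deep copy it does not.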

Performance / safety

  • pdf_safety.check_structure reads in 4 MB chunks instead of slurping
    the whole PDF. Old path was raw = pdf_path.read_bytes() followed
    by a full latin-1 decode — a 200 MB upload meant ~400 MB heap, and
    three concurrent scans could push the process to ~1.5 GB. New streaming
    loop with a 256-byte overlap window keeps heap at ~5 MB regardless of
    file size and preserves the existing pattern-match semantics
    (verified: 100 MB PDF scan, RSS delta 0.8 MB, /JavaScript matches
    100/100 expected hits).
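
As an illustration of the streaming idea, here is a minimal sketch of chunked pattern counting with an overlap window. The function name count_pattern is hypothetical and the real check_structure may differ; the 4 MB chunk size and 256-byte overlap are the figures quoted above.

```python
from typing import BinaryIO


def count_pattern(stream: BinaryIO, pattern: bytes,
                  chunk_size: int = 4 * 1024 * 1024, overlap: int = 256) -> int:
    """Count occurrences of pattern without loading the whole file."""
    if not pattern or overlap < 1 or len(pattern) > overlap + 1:
        raise ValueError("pattern must be non-empty and fit the overlap window")
    count = 0
    tail = b""  # last `overlap` bytes already seen
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        start = 0
        while True:
            idx = buf.find(pattern, start)
            if idx == -1:
                break
            # Count only matches that end inside the new chunk; matches lying
            # entirely in the tail were counted in an earlier iteration.
            if idx + len(pattern) > len(tail):
                count += 1
            start = idx + 1
        tail = buf[-overlap:]
    return count
```

Peak memory is roughly one chunk plus the overlap, independent of file size, and a match that straddles a chunk boundary is still found because its first bytes survive in the tail.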

Defensive

  • Refuse to start under multi-worker uvicorn. New
    _check_single_worker_invariant aborts startup with a clear message
    when WEB_CONCURRENCY > 1. JobStore is intentionally process-local;
    multi-worker mode would let one worker poll a job another worker
    owns and return phantom 404s. uvicorn.run(workers=1, ...) is now
    explicit at the entry-point too.
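
One plausible shape for such a startup guard is sketched below. The function body and the message text are assumptions; WEB_CONCURRENCY is the environment variable named in this note.

```python
import os
from typing import Mapping


def check_single_worker_invariant(environ: Mapping[str, str] = os.environ) -> None:
    """Abort startup when more than one worker process is requested."""
    raw = environ.get("WEB_CONCURRENCY", "1").strip() or "1"
    try:
        workers = int(raw)
    except ValueError:
        raise SystemExit(f"WEB_CONCURRENCY must be an integer, got {raw!r}")
    if workers > 1:
        # JobStore is process-local: a second worker could not see jobs
        # owned by the first and would answer polls with 404.
        raise SystemExit(
            f"WEB_CONCURRENCY={workers} is not supported; run a single worker."
        )
```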

v1.13.0 — Multi-group merge · System fonts · 3-variant installer

16 May 12:31

Admin PDF Toolkit v1.13.0 — D Brand
Multi-format PDF conversion tool (Word / Excel / JPG), batch merge + team distribution, AGPL-3.0.

📥 Which file should I download?

Scenario / Download
  • Ordinary home/office PC: AdminPDFToolkit_Setup.exe (33 MB), Windows setup wizard that downloads the required components over the internet
  • Firewalled / internet-restricted work PC: AdminPDFToolkit_Setup_Offline.exe (549 MB), everything bundled, no internet required
  • Trying it without installing: Admin_PDF_Toolkit_Portable_v1.13.0.zip, unzip and click Admin PDF Toolkit Baslat.bat

On first launch, the ClamAV signature database (~300 MB) and, if needed, the EasyOCR models are downloaded in the background (online installer and portable ZIP only).


[1.13.0] — 2026-05-16

Added — multi-group batch merge

Reshape the batch-Excel-merge flow from "one merged Excel" to "one
merged Excel per detected PDF format". When /batch-analyze detects
multiple compatible groups (e.g. 6 call-log + 4 same-format tabular
PDFs), each group is processed in its own /batch-convert round and
gets its own result tab — preview, dedup, filter, and 3-tier team
distribution panel scoped per group.

  • Other-table groups carry their own column headers all the way through;
    no more force-mapping to the call-log schema.
  • Dedup now accepts user-picked match_columns for other-table groups
    (column-picker UI in the dedupe tab). Call-log keeps the Telefon
    default with normalize_phone collapsing.
  • Defensive inline doBatchAnalyze in the submit handler closes the
    regression where 3-PDF batches sometimes fell through to the ZIP
    branch when the analyze cache wasn't populated in time.

Added — PDF editor: locally installed system fonts

/pdf/edit/fonts now lists the host machine's installed TTF/OTF
families alongside the bundled Noto/DejaVu set, grouped under "🖥 Bu
bilgisayar (N font)" ("This computer (N fonts)") in the dropdown.
Microsoft fonts are NOT bundled
(EULA forbids redistribution); we read them from C:\Windows\Fonts at
runtime. fsType=Restricted-License-Embedding fonts (Tahoma, Calibri,
Times New Roman, …) are filtered out so the editor never embeds a font
we don't have legal rights to redistribute when PyMuPDF subsets it.
Typical Win11 yield: 6 bundled + ~148 system fonts in ~150 ms first
scan, cached thereafter.
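
The fsType filter can be pictured with a small sketch. In the OpenType OS/2 table, bit 1 of fsType (value 0x0002) marks Restricted License embedding. The helper names and the sample font names below are hypothetical, and reading fsType out of a font file is not shown.

```python
RESTRICTED_LICENSE_EMBEDDING = 0x0002  # OS/2 fsType bit 1


def may_embed(fs_type: int) -> bool:
    """True when the fsType flags do not mark the font as restricted.

    Conservative: any font with the restricted bit set is rejected,
    even if other embedding bits are set as well.
    """
    return not (fs_type & RESTRICTED_LICENSE_EMBEDDING)


def filter_embeddable(fonts: dict[str, int]) -> list[str]:
    """Return the sorted family names whose fsType allows embedding."""
    return sorted(name for name, fs_type in fonts.items() if may_embed(fs_type))
```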

Added — safety pipeline: parallel scan + danger review

  • pdf_safety.full_scan and pipelines/safety.scan_files_with_progress
    fan scanners / files out in parallel (ClamAV INSTREAM handles
    concurrent sockets natively); 16-file batches drop from ~15 s to
    ~1-2 s.
  • New danger_review phase suspends the worker when a scanner flags a
    PDF and surfaces danger_file + danger_findings via SSE; the user
    clicks "Yine de Dönüştür" ("Convert anyway", skip_safety) or "İptal"
    ("Cancel", cancel_safety) on a modal. One review at a time via
    danger_lock.
  • Unsafe-accepted outputs get an "⚠ UYARI" ("WARNING") red sheet
    (Excel) or top paragraph (Word) and a _GUVENSIZ filename suffix.

Added — 3-variant installer distribution

  • Portable ZIP (build_portable.py) — self-contained folder with
    vendored Python; double-click Sunucuyu Başlat.bat.
  • Online Inno Setup wizard (build_setup_inno.py) — ~32 MB
    installer that pulls Python + ClamAV + EasyOCR models from the
    network during install.
  • Offline Inno Setup wizard (build_setup_offline.py) — ~500-700
    MB installer with everything bundled for firewalled corporate PCs.
  • pystray-based tray launcher replaces the old console window;
    right-click → "Web arayüzünü aç" ("Open web interface") /
    "Sunucuyu durdur" ("Stop server").

Changed

  • Lifespan startup now BLOCKS on clamd readiness (25 s timeout) so the
    first scan never hits a half-warm daemon; freshclam still runs in
    background.
  • write_merged_excel accepts an optional schema= parameter; the
    team-distribution download honours it for per-team Excels.
  • core/batch.py:parse_pdf_for_batch accepts an optional mode=
    parameter (5-tuple call site) — mode="other_table" keys rows by
    the group's headers without forcing the call-log schema.

Security — full audit sweep (2026-05-10)

Triggered by SECURITY_AUDIT_2026_05_10.md; closes ~20 issues beyond
the four documented in the Chrome extension roadmap. 394 tests still
green; one Windows-admin-only symlink test skipped.

Cross-origin / SSRF / LFI

  • /pdf/from-url redirect bypass closed — every 301/302 target is
    re-validated through _assert_public_url; URL basic-auth rejected;
    response body capped at 50 MB.
  • /pdf/from-html body capped at 10 MB (was the upload limit, i.e.
    ~2 GB before the cap drop).
  • xhtml2pdf link_callback now strict: only ht-font:// and
    data: schemes resolve; file:///etc/passwd, external http://
    return empty (defense against LFI through <img src=…> in
    attacker-supplied HTML).
  • save_pdf_upload runs gate_pdf_safety by default — closes 30+
    /pdf/* and /pdf/edit/* endpoints that previously bypassed the
    safety scanner.
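
A public-URL check of the kind described for /pdf/from-url could be sketched as follows. This is not the project's _assert_public_url; the function below is an illustrative stand-in built on the standard library, and it would have to be called again on every redirect target.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def assert_public_url(url: str) -> None:
    """Raise ValueError unless url is http(s), credential-free, and public."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("only http and https are allowed")
    if parts.username is not None or parts.password is not None:
        raise ValueError("URL basic-auth is not allowed")
    if not parts.hostname:
        raise ValueError("missing host")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    except OSError as exc:
        raise ValueError("host does not resolve") from exc
    for info in infos:
        # Every resolved address must be globally routable: this rejects
        # loopback, RFC 1918 ranges, link-local, and similar targets.
        if not ipaddress.ip_address(info[4][0]).is_global:
            raise ValueError("host resolves to a non-public address")
```

A check like this validates the address at lookup time only. The HTTP client resolves the name again when it connects, which is the DNS-rebinding window the v1.13.2 notes describe as accepted residual risk.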

CSRF

  • /admin/enable-mobile, /admin/disable-mobile,
    /admin/clamav-update, and DELETE /history now require an
    Origin/Referer matching the server's own host; blocks the
    cross-origin POST-from-evil-site attack against the operator's
    loopback session.
  • Mobile token migrated from ?key= URL parameter to #key= URL
    fragment — fragments never reach the server, never log, never
    appear in referer headers. ?key= fallback dropped from
    middleware. Existing JS already captures the bootstrap token from
    the URL into localStorage.
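
The Origin/Referer rule can be sketched as a small predicate. The name is_same_origin is hypothetical, and the sketch assumes header names have been lower-cased and that server_host is the host:port the server itself is reached on.

```python
from typing import Mapping
from urllib.parse import urlsplit


def is_same_origin(headers: Mapping[str, str], server_host: str) -> bool:
    """Accept a state-changing request only when it comes from our own host."""
    source = headers.get("origin") or headers.get("referer")
    if not source:
        # Neither header present: reject rather than guess.
        return False
    # Compare host[:port] only; "Origin: null" yields an empty netloc.
    return urlsplit(source).netloc.lower() == server_host.lower()
```

A page on another site cannot forge these headers from the victim's browser, so a cross-origin POST to /admin/enable-mobile would fail this check.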

Path traversal / DoS

  • make_job_dir hardened in three layers: explicit separator/..
    reject, pre-mkdir resolve+containment check (avoids leftover
    directories from rejected probes), post-mkdir symlink check
    (guards against TOCTOU symlink races).
  • routers/batch.py token-taking endpoints validate check_token
    before touching the filesystem.
  • parse_int_list capped at 100 000 entries (defends against
    1-99999999 style page lists OOMing the worker).
  • Bounded background-worker concurrency via submit_worker /
    HT_MAX_INFLIGHT_JOBS (default 4); saturation returns 503 instead
    of unbounded thread spawn.
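
To illustrate the parse_int_list cap, here is a minimal sketch that checks the size of each range before materialising it. The accepted syntax ("1-3, 7") and the error messages are assumptions; the 100 000 limit is the one stated above.

```python
MAX_ENTRIES = 100_000


def parse_int_list(spec: str, max_entries: int = MAX_ENTRIES) -> list[int]:
    """Parse a page list such as "1-3, 7" into [1, 2, 3, 7], with a hard cap."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start_s, sep, end_s = part.partition("-")
        start = int(start_s)
        end = int(end_s) if sep else start
        if end < start:
            raise ValueError(f"descending range: {part!r}")
        # Reject oversized input BEFORE building the list, so a spec like
        # "1-99999999" costs no memory.
        if len(pages) + (end - start + 1) > max_entries:
            raise ValueError(f"more than {max_entries} entries requested")
        pages.extend(range(start, end + 1))
    return pages
```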

Authn / info disclosure

  • /health returns only {ok: true} to unauthenticated remote
    callers; version + telemetry restricted to loopback / mobile-token
    callers.
  • pdfid.py is now invoked through sys.executable (PATH-poisoning
    defense).
  • HT_LOOPBACK_BYPASS=false setting for reverse-proxy deployments
    where the proxy connects via 127.0.0.1 and would otherwise hide
    every remote client from the auth middleware.

Crypto / cert

  • Self-signed cert: datetime.now(timezone.utc) (replaces deprecated
    utcnow); validity reduced 5 y → 1 y; auto-rotates when fewer than
    30 days remain; key file is chmod 0600 on POSIX.
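
The rotation rule can be expressed in a few lines. The function name needs_rotation is hypothetical; the 30-day threshold and the timezone-aware datetime.now(timezone.utc) call come from the note above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

ROTATE_BEFORE = timedelta(days=30)


def needs_rotation(not_after: datetime, now: Optional[datetime] = None) -> bool:
    """True when fewer than 30 days of certificate validity remain.

    not_after must be timezone-aware; naive and aware datetimes
    cannot be subtracted from each other.
    """
    now = now or datetime.now(timezone.utc)
    return not_after - now < ROTATE_BEFORE
```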

Hardening defaults

  • Default MAX_UPLOAD_MB lowered 2048 → 200 (raise via
    HT_MAX_UPLOAD_MB).
  • safe_filename adds NFKC normalisation + control-char strip.
  • extract_images capped (50 MB / image, 200 MB / job),
    pdf_to_csv at 100 k rows, pdf_to_markdown at 50 MB.
  • Baseline browser headers on every response:
    X-Content-Type-Options: nosniff, X-Frame-Options: DENY,
    Referrer-Policy: no-referrer,
    Cross-Origin-Opener-Policy: same-origin.
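
The two safe_filename additions might look like the sketch below. It covers only the NFKC step and the control-character strip; the function name is hypothetical, and whatever else the real safe_filename does (separators, reserved names) is not shown.

```python
import unicodedata


def normalise_filename(name: str) -> str:
    """Apply NFKC normalisation, then drop control characters."""
    # NFKC folds compatibility forms (ligatures, full-width letters)
    # into their plain equivalents.
    normalised = unicodedata.normalize("NFKC", name)
    # Category "Cc" covers NUL, newlines, escape, and the other C0/C1 controls.
    return "".join(ch for ch in normalised if unicodedata.category(ch) != "Cc")
```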

Added

  • CHROME_EXTENSION_ROADMAP.md — deferred plan for a Manifest V3
    browser extension that bridges the operator's browser to the
    locally-running app via localhost.

v1.12.0 — Convert Workspace · ClamAV daemon · smart Excel batch

30 Apr 19:56

🇹🇷 What's new (translated from Turkish)

🔄 Convert Workspace: redesigned conversion flow

  • Modal removed: single page with a sticky 3-step breadcrumb (Hazırlık → Dönüştürme → Sonuç, i.e. Preparation → Conversion → Result).
  • Format cards:
    • 📊 Excel & 📝 Word: "OCR ile dene" ("Try with OCR") toggle, for scanned / garbled text
    • 🖼 JPG: resolution (72 / 150 / 200 / 300 / 600 DPI) + quality slider (50–100)
  • Tabbed result panel; for Excel: Tüm Kayıtlar / Filtre / Mükerrer / Dağıtım (All Records / Filter / Duplicates / Distribution)

⚡ ClamAV daemon (clamd): ~300× speed-up

  • core/clamav_daemon.py: clamd starts in the background during lifespan startup, so the signature DB stays hot
  • INSTREAM TCP protocol (no subprocess, no Turkish-character encoding issues)
  • Single PDF: 5–15 s → ~50 ms · 4-PDF batch: 20–60 s → ~30 ms
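
For readers unfamiliar with INSTREAM, the framing used by clamd can be sketched as below: the command zINSTREAM\0, then each chunk prefixed by its length as a 4-byte big-endian integer, then a zero-length chunk as terminator. The helper names, the chunk size, and the host/port defaults (3310 is clamd's usual TCP port) are assumptions, not the project's code.

```python
import socket
import struct
from typing import BinaryIO, Iterator


def instream_frames(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the byte frames of one INSTREAM session for the given stream."""
    yield b"zINSTREAM\0"
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Each chunk is sent as <4-byte big-endian length><data>.
        yield struct.pack("!I", len(chunk)) + chunk
    # A zero-length chunk tells clamd the stream is complete.
    yield struct.pack("!I", 0)


def scan_stream(stream: BinaryIO, host: str = "127.0.0.1", port: int = 3310,
                timeout: float = 30.0) -> str:
    """Send the stream to a running clamd and return its one-line reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for frame in instream_frames(stream):
            sock.sendall(frame)
        reply = b""
        while not reply.endswith(b"\0"):
            data = sock.recv(4096)
            if not data:
                break
            reply += data
    return reply.rstrip(b"\0").decode("utf-8", "replace")
```

Because the file content travels over the socket, the file name never reaches clamd, which is consistent with the encoding problems disappearing.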

🛡 Visible safety scan + cancel button

  • Scan runs in a worker thread · large percentage readout + pulsing shield · per-file mini progress bar
  • "⏩ Atla & Devam" ("Skip & Continue") button (POST /job-skip-safety/{kind}/{token})
  • Responsive grid (1 → 7 columns, up to 50+ files)

📊 Smart Excel batch

  • Multi-PDF Excel: each file gets its own .xlsx (delivered as a ZIP)
  • Automatic kind detection: call-log / scanned / fatura (invoice) / sözleşme (contract) / ekstre (statement) / fiş (receipt) / mektup (letter) / rapor (report) / form / kimlik (ID)
  • 2+ call lists → a bonus _cagrilar_birlesik.xlsx is added to the ZIP
  • A failing file does not stop the batch

🐚 Powered by ClamAV® branding

  • Official attribution in the scan UI (link to clamav.net)
  • NOTICE.txt + THIRD_PARTY_LICENSES.md: Cisco Systems Inc. trademark notice

🔓 One-time SmartScreen unblock

  • Sunucuyu Başlat.bat calls Get-ChildItem | Unblock-File on first launch
  • The "Yayımcı doğrulanmadı" ("Publisher could not be verified") warning appears only once, on the launcher; later runs are clean

🛡 Independent security verification

The published ZIP was checked by 70+ antivirus engines; none reported a detection:

Scanner / Result / Report
  • VirusTotal: 66 / 66 engines clean. https://www.virustotal.com/gui/file/40e7d5ff7210b1de389496274d915a82708ad2123ede82bed20a9a93804ea538
  • Hybrid Analysis (CrowdStrike Falcon): AV multi-scan + MetaDefender Multi Scan clean · "no specific threat". https://hybrid-analysis.com/sample/40e7d5ff7210b1de389496274d915a82708ad2123ede82bed20a9a93804ea538
  • MetaDefender Cloud (OPSWAT): Sandbox No Threats · 0 threat indicators · 0 IOC · 0 CVE. https://metadefender.com/results/file/bzI2MDQzMER2M1FwRjl5NjJfVEU5enBMbjN4_mdaas
  • Kaspersky (enterprise endpoint): ZIP + extracted folder, 0 detections (local scan, no public link)

Verified ZIP SHA-256: 40e7d5ff7210b1de389496274d915a82708ad2123ede82bed20a9a93804ea538


🇬🇧 English — Highlights

  • Convert Workspace — modal-free single-page redesign; sticky 3-step breadcrumb; tabbed Excel result panel; JPG DPI dropdown + quality slider; Word/Excel OCR toggles.
  • ClamAV daemon (clamd) — ~300× faster scans (5–15 s → ~50 ms) via INSTREAM TCP socket; bypasses Windows Turkish-filename ANSI/UTF-8 encoding issues.
  • Visible safety scan UI with cancel button (/job-skip-safety/{kind}/{token}); per-file mini progress bars; responsive grid handles 50+ files.
  • Smart per-file Excel batch: /batch-files accepts target=excel; auto-detects each PDF's kind; produces an extra _cagrilar_birlesik.xlsx when 2+ inputs are call-logs.
  • One-shot SmartScreen unblock — first run of Sunucuyu Başlat.bat strips MotW from every file.
  • Powered by ClamAV® branding + Cisco trademark notice in NOTICE.txt.
  • Security: verified clean by VirusTotal (66/66), Hybrid Analysis, MetaDefender, and enterprise Kaspersky.

Tests: 394 passed across Win/macOS/Ubuntu × Python 3.11 / 3.12 / 3.13.

Full Changelog: v1.11.0...v1.12.0