
feat(trukno): add external import connector (#6286) #6285

Open
hieuttmmo wants to merge 22 commits into OpenCTI-Platform:master from hieuttmmo:codex/trukno-external-import

Conversation

@hieuttmmo

@hieuttmmo hieuttmmo commented Apr 22, 2026


Proposed changes

  • Add a new external-import/trukno connector for TruKno breach intelligence.
  • Import updated TruKno breaches into OpenCTI as STIX 2.1 reports with linked attack-pattern and malware objects.
  • Include Connector Manager metadata, configuration schema/docs, Docker packaging, and an upstream test suite.

Related issues

Checklist

  • I consider the submitted work as finished
  • I have signed my commits using GPG key.
  • I tested the code for its functionality using different use cases
  • I added/updated the relevant documentation (either on GitHub or on Notion)
  • Where necessary I refactored code to improve the overall quality

Validation

  • uv run --python 3.11 python -m compileall external-import/trukno/src
  • docker build -t opencti/connector-trukno:verify external-import/trukno
  • in-container import smoke test for trukno_connector, pycti, yaml, and requests
  • entrypoint startup smoke test with dummy OpenCTI settings, confirming the container reaches the expected OpenCTI connectivity failure path
  • docker compose -f external-import/trukno/docker-compose.yml config
  • upstream connector test suite added and verified locally: python -m pytest external-import/trukno/tests -q (28 passed; black / isort / flake8 clean)

Scope of this first submission

This initial version intentionally keeps the ingestion scope narrow and reviewable: report, attack-pattern, malware. The connector does not yet create threat actors, intrusion sets, indicators, vulnerabilities, or richer relationship graphs beyond report object references.

Maintainer review updates

The following changes were applied during review to make this PR merge-ready (on top of the contributor's already-resolved review threads).

  • Coverage collection: dropped pytest-cov from tests/test-requirements.txt. The connector ships no pytest --cov configuration, so with pytest-cov pre-installed run_test.sh ran the suite without a coverage target and uploaded an empty report (codecov/patch was 0%). Without the pin, run_test.sh adds a bare --cov and measures the connector source (~88%), so codecov/patch now passes.
  • Config schema / docs convention (commit e8f68ea1ef): reduced the connector_config_schema.json required list to the repo convention (OPENCTI_URL, OPENCTI_TOKEN, TRUKNO_API_KEY) so the Connector Manager no longer forces platform-managed fields such as CONNECTOR_ID. To keep the schema and runtime consistent, load_config and the helper config now fall back to the schema-declared defaults for the optional fields, with added regression tests. CONNECTOR_CONFIG_DOC.md and the README requirements were updated to match.
  • Stable shared-object timestamps (commit 882fed4e86): attack-pattern and malware are shared reference objects keyed by a stable deterministic STIX id, but their created/modified were taken from the enclosing breach's publishedAt. The same TTP/malware referenced by several breaches therefore emitted the same id with different timestamps, making OpenCTI flip-flop the object on each ingest. They now use a fixed REFERENCE_OBJECT_TIMESTAMP, so a given id always yields a byte-identical object (the per-breach report keeps its publish date). Added a regression test asserting two breaches with different publish dates produce identical shared objects, and clarified in the README that CONNECTOR_ID is required at runtime but auto-injected by the Connector Manager (hence optional in the schema).
  • Work-item error handling + Alpine base image (commit b85c987dbf): run_once now wraps the per-breach loop and marks the OpenCTI work item to_processed(..., in_error=True) before re-raising on a mid-batch fetch/transform/send failure, instead of leaving the work stuck in a running state (the per-item checkpoint means the next cycle resumes after the last successfully imported breach). The Dockerfile was switched from the Debian python:3.12-slim base to the repo-standard python:3.12-alpine + apk toolchain (mirroring templates/external-import/Dockerfile), with the build-only git/build-base purged after install; verified the image builds and the runtime imports work in-container with git absent from the final image. Added regression tests for the errored/successful work paths and updated the packaging test to assert the Alpine pattern.
  • Empty-object_refs reports + helper-config consistency + dependency bump (commit b51b933b74): a breach with no relatedTTPs/relatedMalwares previously emitted a report with an empty object_refs list, which is invalid STIX 2.1 (a report MUST reference at least one object) and can be rejected on ingestion. transform_breach_to_bundle now builds the linked objects first and only emits the report when it has at least one object_ref; run_once skips the resulting empty bundle but still advances the per-item checkpoint so the breach is not refetched forever. Separately, _prepare_helper_config used setdefault, so an explicit empty string (e.g. connector.name: "") was passed through to the helper while load_config treats blanks as missing and applies the default - it now mirrors load_config and falls back to the default for missing or blank type/name/scope/log_level. Finally, pycti was bumped to the latest released 7.260609.0 (was 7.260604.0) in src/requirements.txt, with support_version in the manifest and the README requirements synced. Added regression tests for all three behaviours (28 passing).
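The helper-config item in the last bullet turns on a small difference between `dict.setdefault` and a blank-aware fallback. A minimal sketch of that difference, assuming a flat connector mapping and hypothetical default values (the real `_prepare_helper_config` and its defaults are not reproduced here):

```python
# Hypothetical defaults, used only for this illustration.
DEFAULTS = {
    "type": "EXTERNAL_IMPORT",
    "name": "TruKno",
    "scope": "trukno",
    "log_level": "info",
}


def with_setdefault(connector: dict) -> dict:
    """Old behaviour: only a missing key receives the default."""
    merged = dict(connector)
    for key, default in DEFAULTS.items():
        merged.setdefault(key, default)
    return merged


def with_blank_fallback(connector: dict) -> dict:
    """New behaviour: a missing, None, or blank value receives the default."""
    merged = dict(connector)
    for key, default in DEFAULTS.items():
        value = merged.get(key)
        if value is None or not str(value).strip():
            merged[key] = default
    return merged


explicit_blank = {"name": "", "scope": "custom-scope"}
old = with_setdefault(explicit_blank)
new = with_blank_fallback(explicit_blank)
```

With `setdefault`, the explicit empty `name` survives because the key exists; the blank-aware version replaces it with the default while leaving the explicit `scope` untouched.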

All GitHub Actions checks (including the Lint & Format and STIX ID Linter workflows) and Codecov patch/project are green; 0 unresolved review threads.

Notes for the final merge

  • Non-blocking follow-ups for a future iteration (kept out of this narrow first submission): the report/attack-pattern/malware objects have no author Identity / created_by_ref or TLP marking (which is also why the report can have no other object to reference for entity-less breaches), and STIX ids are minted from a connector-local uuid5 namespace rather than the canonical pycti generators (functional and deterministic, but objects won't de-duplicate against other connectors). Both are reasonable to address as the ingestion scope grows.
  • The manifest ships verified: false - live end-to-end validation against the TruKno API remains the contributor's responsibility.
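The de-duplication caveat in the first note follows from how `uuid5` works: the same name hashed under two different namespaces gives two different ids. A small sketch, assuming two made-up namespaces (neither is the connector's real namespace nor the one used by the pycti generators):

```python
import uuid

# Two hypothetical connector-local namespaces, for illustration only.
NAMESPACE_A = uuid.uuid5(uuid.NAMESPACE_DNS, "connector-a.example")
NAMESPACE_B = uuid.uuid5(uuid.NAMESPACE_DNS, "connector-b.example")


def malware_id(namespace: uuid.UUID, name: str) -> str:
    """Mint a deterministic STIX-style id from a namespace and a name."""
    return f"malware--{uuid.uuid5(namespace, name.strip().lower())}"


same_connector_twice = (
    malware_id(NAMESPACE_A, "Emotet"),
    malware_id(NAMESPACE_A, " emotet "),
)
other_connector = malware_id(NAMESPACE_B, "Emotet")
```

Ids are stable within one namespace, which is what makes the connector deterministic, but they only line up with other connectors if every connector derives them from the same namespace and the same key.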

@filigran-cla-bot filigran-cla-bot Bot added the cla:pending label Apr 22, 2026
@filigran-cla-bot

filigran-cla-bot Bot commented Apr 22, 2026


Contributor License Agreement

CLA signed 💚

Thank you @hieuttmmo for signing the Contributor License Agreement! Your pull request can now be reviewed and merged.

We appreciate your contribution to Filigran's open source projects! ❤️

This is an automated message from the Filigran CLA Bot.

@filigran-cla-bot filigran-cla-bot Bot removed the cla:pending label Apr 22, 2026
@hieuttmmo hieuttmmo marked this pull request as ready for review April 22, 2026 19:32
@hieuttmmo hieuttmmo changed the title from "[external-import] Add TruKno connector" to "[TruKno] Add external import connector" Apr 22, 2026
@hieuttmmo hieuttmmo force-pushed the codex/trukno-external-import branch from 0f538b5 to 4c706be April 22, 2026 20:37
@romain-filigran romain-filigran added the community label Apr 27, 2026

@romain-filigran romain-filigran left a comment

Member

Hey @hieuttmmo. Thank you for your contribution. Can you resolve my comments?
Thank you.

Comment thread external-import/trukno/__metadata__/connector_manifest.json Outdated
@hieuttmmo hieuttmmo force-pushed the codex/trukno-external-import branch 3 times, most recently from 23aa5e3 to d5c1edb April 30, 2026 08:49
@hieuttmmo hieuttmmo force-pushed the codex/trukno-external-import branch from d5c1edb to 05fb803 April 30, 2026 09:06
@codecov

codecov Bot commented Apr 30, 2026


Codecov Report

❌ Patch coverage is 90.10989% with 27 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
...rnal-import/trukno/src/trukno_connector/runtime.py | 86.53% | 14 Missing ⚠️
...ernal-import/trukno/src/trukno_connector/config.py | 89.85% | 7 Missing ⚠️
external-import/trukno/src/main.py | 44.44% | 5 Missing ⚠️
...ternal-import/trukno/src/trukno_connector/state.py | 95.65% | 1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (4f03e87) and HEAD (b51b933).

HEAD has 109 uploads less than BASE
Flag | BASE (4f03e87) | HEAD (b51b933)
connectors | 111 | 2
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #6285       +/-   ##
===========================================
- Coverage   29.27%    0.44%   -28.84%     
===========================================
  Files        1918     1894       -24     
  Lines      120028   119447      -581     
===========================================
- Hits        35144      530    -34614     
- Misses      84884   118917    +34033     
Files with missing lines | Coverage Δ
...nal-import/trukno/src/trukno_connector/__init__.py | 100.00% <100.00%> (ø)
...ernal-import/trukno/src/trukno_connector/client.py | 100.00% <100.00%> (ø)
...ernal-import/trukno/src/trukno_connector/models.py | 100.00% <100.00%> (ø)
...port/trukno/src/trukno_connector/opencti_compat.py | 100.00% <100.00%> (ø)
...al-import/trukno/src/trukno_connector/transform.py | 100.00% <100.00%> (ø)
...ternal-import/trukno/src/trukno_connector/state.py | 95.65% <95.65%> (ø)
external-import/trukno/src/main.py | 44.44% <44.44%> (ø)
...ernal-import/trukno/src/trukno_connector/config.py | 89.85% <89.85%> (ø)
...rnal-import/trukno/src/trukno_connector/runtime.py | 86.53% <86.53%> (ø)

... and 1089 files with indirect coverage changes


@hieuttmmo

Author

hey @romain-filigran - is there anything else I can do to accelerate the process for merging this PR? Thanks.

Copilot AI left a comment

Contributor

Pull request overview

This pull request adds a new external-import/trukno connector to ingest TruKno breach intelligence into OpenCTI as STIX 2.1 Reports with linked attack-pattern and malware objects, and wires it into the repository’s packaging and test tooling.

Changes:

  • Added the TruKno connector implementation (client, config/runtime, state management, STIX transformation), plus Docker packaging and operator docs/metadata.
  • Added an upstream test suite for the connector (client/config/runtime/state/transform + fixtures).
  • Updated CI/test harness to install required system dependencies and refined run_test.sh dependency handling.

Reviewed changes

Copilot reviewed 32 out of 33 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
run_test.sh Adjusts dependency scope detection and pycti handling during test runs.
external-import/trukno/tests/test-requirements.txt Declares connector test dependencies.
external-import/trukno/tests/test_state_transform.py Tests state checkpointing, transform output, and bundle cleanup behavior.
external-import/trukno/tests/test_runtime.py Tests one-cycle runtime behavior and config loading paths.
external-import/trukno/tests/test_main.py Smoke test for importability of entrypoint and key modules.
external-import/trukno/tests/test_config.py Tests config validation and required fields.
external-import/trukno/tests/test_client.py Tests TruKno API client request construction and error propagation.
external-import/trukno/tests/fixtures/breach_with_entities.json Fixture payload containing related TTPs/malware.
external-import/trukno/tests/fixtures/breach_list.json Fixture payload for breach list endpoint.
external-import/trukno/tests/fixtures/breach_detail.json Fixture payload for breach details endpoint.
external-import/trukno/tests/conftest.py Test path/bootstrap helpers and fixture root.
external-import/trukno/src/trukno_connector/transform.py Converts TruKno breach payloads into STIX bundle objects.
external-import/trukno/src/trukno_connector/state.py Implements incremental checkpoint state and timestamp parsing/formatting.
external-import/trukno/src/trukno_connector/runtime.py Loads config, builds helper/client, runs polling loop and persists checkpoints.
external-import/trukno/src/trukno_connector/opencti_compat.py Performs bundle cleanup filtering for OpenCTI ingestion.
external-import/trukno/src/trukno_connector/models.py Defines minimal typed model(s) for API list results.
external-import/trukno/src/trukno_connector/config.py Loads/merges config from YAML/env and validates required values.
external-import/trukno/src/trukno_connector/client.py Implements TruKno HTTP client for listing updates and fetching details.
external-import/trukno/src/trukno_connector/__init__.py Exposes the connector’s public API surface.
external-import/trukno/src/requirements.txt Connector runtime dependencies.
external-import/trukno/src/main.py Connector entrypoint wrapper around runtime main loop.
external-import/trukno/src/config.yml.sample Sample YAML configuration for manual runs.
external-import/trukno/README.md Connector documentation (scope, config, behavior, deployment).
external-import/trukno/entrypoint.sh Container entrypoint script.
external-import/trukno/Dockerfile Docker image build for the connector.
external-import/trukno/docker-compose.yml Example compose deployment for the connector.
external-import/trukno/.dockerignore Docker build context exclusions.
external-import/trukno/__metadata__/logo.svg Connector logo asset for metadata/manager.
external-import/trukno/__metadata__/connector_manifest.json Connector manifest metadata (title, version support, image, etc.).
external-import/trukno/__metadata__/connector_config_schema.json JSON schema for connector configuration (Connector Manager).
external-import/trukno/__metadata__/CONNECTOR_CONFIG_DOC.md Generated/maintained config documentation for the connector.
.github/workflows/tests-connectors.yml Installs system dependencies needed for connector test runs in GitHub Actions.
.circleci/config.yml Installs libmagic1 in CircleCI job dependencies.

Comment thread external-import/trukno/Dockerfile Outdated
Comment thread run_test.sh Outdated
Comment thread external-import/trukno/src/trukno_connector/transform.py Outdated

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 3 comments.

Comment thread external-import/trukno/README.md Outdated
Comment thread external-import/trukno/__metadata__/connector_config_schema.json
Comment thread external-import/trukno/__metadata__/CONNECTOR_CONFIG_DOC.md Outdated
@SamuelHassine SamuelHassine dismissed their stale review June 5, 2026 05:16

Waiting for the team

Address the remaining Copilot review threads on the TruKno connector.

- Reduce the connector_config_schema.json `required` list to the repo
  convention (`OPENCTI_URL`, `OPENCTI_TOKEN`, `TRUKNO_API_KEY`) so the
  Connector Manager no longer forces operators to supply platform-managed
  fields such as `CONNECTOR_ID`.
- Make the runtime honour the schema-declared defaults: `load_config` and
  the helper config now fall back to the documented defaults for the
  optional fields (connector name/scope, TruKno base URL, interval,
  lookback), keeping the schema and runtime consistent.
- Update CONNECTOR_CONFIG_DOC.md to flag only the genuinely required fields.
- Fix the README requirements so the supported platform version matches the
  manifest `support_version` (>= 7.x) and clarify required vs optional vars.
- Add regression tests covering default application and the still-required
  connector id.
SamuelHassine previously approved these changes Jun 5, 2026

@SamuelHassine SamuelHassine left a comment

Member

Independent full-file review of the trukno external-import connector (client, config, runtime, transform, state, opencti_compat, packaging + tests). The code is clean and well-factored: env/YAML config merge with required-field validation, deterministic STIX bundle building, x_opencti_*-preserving cleanup, incremental checkpoint state, and a solid 20-test suite (~88% source coverage).

I pushed e8f68ea1ef to resolve the three remaining Copilot threads and keep the connector consistent with repo conventions:

  • Reduced the connector_config_schema.json required list to OPENCTI_URL, OPENCTI_TOKEN, TRUKNO_API_KEY (matching alienvault / ctm360-hackerview-feed), so the Connector Manager no longer forces platform-managed fields like CONNECTOR_ID.
  • Made load_config and the helper config fall back to the schema-declared defaults for the now-optional fields, so the schema and runtime stay in sync and the connector can't crash when only the required fields are supplied. Added regression tests.
  • Synced CONNECTOR_CONFIG_DOC.md and the README requirements (OpenCTI Platform >= 7.x, consistent with support_version).
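The schema/runtime alignment in the first two bullets can be sketched as follows. This is an illustration under assumptions, not the connector's `load_config`: the three required keys come from the review above, while the optional keys and their default values are hypothetical placeholders.

```python
from typing import Dict, Mapping, Optional

# Required keys as listed in the review.
REQUIRED = ("OPENCTI_URL", "OPENCTI_TOKEN", "TRUKNO_API_KEY")

# Hypothetical optional keys and defaults; the real ones are declared in
# connector_config_schema.json and are not reproduced here.
SCHEMA_DEFAULTS = {
    "CONNECTOR_NAME": "TruKno",
    "CONNECTOR_SCOPE": "trukno",
    "CONNECTOR_LOG_LEVEL": "info",
}


def resolve_settings(raw: Mapping[str, Optional[str]]) -> Dict[str, str]:
    """Fail on missing required keys, then fill optional keys from defaults."""
    missing = [key for key in REQUIRED if not (raw.get(key) or "").strip()]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    resolved = {key: (raw.get(key) or "").strip() for key in REQUIRED}
    for key, default in SCHEMA_DEFAULTS.items():
        resolved[key] = (raw.get(key) or "").strip() or default
    return resolved


settings = resolve_settings(
    {
        "OPENCTI_URL": "http://localhost:8080",
        "OPENCTI_TOKEN": "changeme",
        "TRUKNO_API_KEY": "changeme",
    }
)
```

Supplying only the three required values yields a complete settings mapping, which is the property the regression tests are described as covering.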

All GitHub Actions checks and Codecov patch/project are green; 0 unresolved threads; mergeable: MERGEABLE. Non-blocking follow-ups for a future iteration remain noted in the description (author Identity/created_by_ref + TLP marking, and switching the STIX ids from the connector-local uuid5 namespace to the canonical pycti generators). Approving.

@SamuelHassine

Member

Review & fix summary

  • Independent full-file review of the connector + tests — no blocking issues found.
  • Resolved the three remaining Copilot threads in signed commit e8f68ea1ef:
    • connector_config_schema.json required now matches the repo convention (OPENCTI_URL, OPENCTI_TOKEN, TRUKNO_API_KEY) — CONNECTOR_ID and other platform-managed/default fields are no longer forced on Connector Manager users.
    • Runtime (load_config + helper config) now applies the schema-declared defaults for the optional fields, so schema and runtime stay consistent; added regression tests (20 passing).
    • CONNECTOR_CONFIG_DOC.md required column and the README requirements updated; README now states OpenCTI Platform >= 7.x to match the manifest support_version.
  • All GitHub Actions checks + Codecov patch/project are green; 0 unresolved review threads; mergeable: MERGEABLE.

Non-blocking follow-ups for a future iteration (kept out of this narrow first submission): add an author Identity / created_by_ref + TLP marking on emitted objects, and switch STIX ids from the connector-local uuid5 namespace to the canonical pycti generators so objects de-duplicate across connectors. verified: false remains (live end-to-end validation pending).

@hieuttmmo

Author

Thank you so much @SamuelHassine and @romain-filigran. I will keep that in mind for the next iteration.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 3 comments.

Comment thread external-import/trukno/src/trukno_connector/transform.py
Comment thread external-import/trukno/src/trukno_connector/transform.py
Comment thread external-import/trukno/README.md
@SamuelHassine SamuelHassine changed the title from "[trukno] Add external import connector" to "feat(trukno): add external import connector (#6286)" Jun 7, 2026
attack-pattern and malware objects are shared reference entities keyed by a
stable deterministic STIX id, but their created/modified were taken from the
enclosing breach's publishedAt. The same TTP/malware referenced by multiple
breaches therefore produced the same STIX id with different timestamps, making
OpenCTI flip-flop the object on every ingest. Use a fixed reference timestamp
for these shared objects so a given id always yields an identical object; the
per-breach report keeps its publish date. Adds a regression test asserting two
breaches with different publish dates emit byte-identical shared objects.

Also clarify in the README that CONNECTOR_ID is required at runtime but is
auto-injected by the OpenCTI Connector Manager (hence optional in the schema),
so it only has to be supplied for manual / docker-compose deployments.
SamuelHassine previously approved these changes Jun 7, 2026

@SamuelHassine SamuelHassine left a comment

Member

Re-approving after addressing the three remaining Copilot threads in signed commit 882fed4e86.

Independent full-file re-review of the connector (client, config, runtime, transform, state, opencti_compat, packaging + tests): clean and well-factored, with env/YAML config merge + required-field validation, meta= structured logging, x_opencti_*-preserving bundle cleanup, incremental checkpoint state, and a 21-test suite (~88% source coverage).

The one substantive item from this round: attack-pattern/malware are shared reference objects keyed by a stable deterministic id, but their created/modified were derived from each breach's publishedAt, so the same TTP/malware referenced by multiple breaches emitted the same id with different timestamps (OpenCTI would flip-flop the object on re-ingest). They now use a fixed REFERENCE_OBJECT_TIMESTAMP so a given id always yields a byte-identical object; the per-breach report keeps its publish date. Added a regression test, and clarified in the README that CONNECTOR_ID is required at runtime but auto-injected by the Connector Manager (hence optional in the schema).

All GitHub Actions checks and Codecov patch/project are green; 0 unresolved threads; mergeable: MERGEABLE. Non-blocking follow-ups remain noted in the description (author Identity/created_by_ref + TLP marking, and switching the STIX ids from the connector-local uuid5 namespace to the canonical pycti generators for cross-connector de-duplication). verified: false stays pending live end-to-end validation. LGTM.

@SamuelHassine

Member

Review & fix summary (round 3)

Independent full-file re-review of the trukno connector plus the three remaining Copilot threads from the last review.

  • Stable shared-object timestamps (the substantive one): attack-pattern/malware are shared reference objects with stable deterministic STIX ids, but their created/modified came from each breach's publishedAt — so the same TTP/malware across multiple breaches emitted the same id with different timestamps, which OpenCTI would flip-flop on every ingest. Fixed in 882fed4e86 by using a fixed REFERENCE_OBJECT_TIMESTAMP for these shared objects (the per-breach report keeps its publish date). I took Copilot's "stable timestamps for a stable id" option rather than per-breach id namespacing, so the TTP/malware stay single shared nodes instead of fragmenting per report. Added a regression test asserting two breaches with different publish dates emit byte-identical shared objects.
  • README CONNECTOR_ID: clarified that it is required at runtime but auto-injected by the OpenCTI Connector Manager (hence intentionally not in the schema's required list), so the schema/UI and README no longer contradict.
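The stable-timestamp fix can be illustrated with a short sketch. The namespace, the timestamp value, and the shape of the breach payload (`relatedTTPs` and `relatedMalwares` as lists of names) are assumptions made for this example, and the objects are trimmed (a real STIX 2.1 malware object also needs `is_family`, for instance).

```python
import uuid
from typing import List

# Hypothetical namespace and timestamp, chosen only for this sketch.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/trukno")
REFERENCE_OBJECT_TIMESTAMP = "2020-01-01T00:00:00.000Z"


def shared_object(stix_type: str, name: str) -> dict:
    """Build a shared object that depends only on its type and name."""
    key = f"{stix_type}:{name.strip().lower()}"
    return {
        "type": stix_type,
        "spec_version": "2.1",
        "id": f"{stix_type}--{uuid.uuid5(NAMESPACE, key)}",
        # Fixed on purpose: the breach's publishedAt is never used here.
        "created": REFERENCE_OBJECT_TIMESTAMP,
        "modified": REFERENCE_OBJECT_TIMESTAMP,
        "name": name,
    }


def shared_objects_for(breach: dict) -> List[dict]:
    """Collect the attack-pattern and malware objects a breach refers to."""
    ttps = [shared_object("attack-pattern", n) for n in breach.get("relatedTTPs", [])]
    malware = [shared_object("malware", n) for n in breach.get("relatedMalwares", [])]
    return ttps + malware


first = shared_objects_for(
    {"publishedAt": "2026-01-05T10:00:00Z", "relatedMalwares": ["Emotet"]}
)
second = shared_objects_for(
    {"publishedAt": "2026-03-17T08:30:00Z", "relatedMalwares": ["Emotet"]}
)
```

Because nothing breach-specific feeds into the shared object, `first` and `second` compare equal, which mirrors what the regression test is described as asserting.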

Status

  • All checks green: connector tests (21 passing locally), codecov/patch, codecov/project, signed commits, PR-conventions, CLA.
  • 0 unresolved review threads.
  • mergeable: MERGEABLE, re-approved.

Non-blocking follow-ups (future iteration)

  • Author Identity / created_by_ref + TLP marking on the emitted objects.
  • Switch the STIX ids from the connector-local uuid5 namespace to the canonical pycti generators so objects de-duplicate across connectors.
  • verified: false remains pending live end-to-end validation against the TruKno API.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 2 comments.

Comment thread external-import/trukno/src/trukno_connector/runtime.py
Comment thread external-import/trukno/Dockerfile Outdated
- Wrap the per-breach import loop in run_once so a mid-batch fetch/transform/send failure marks the OpenCTI work item as processed with in_error=True before re-raising, instead of leaving it stuck in a running state. The per-item checkpoint is preserved, so the next cycle resumes after the last successfully imported breach (matches the to_processed(..., in_error=True) pattern used by ctm360-cyna-feed).
- Switch the Dockerfile from the Debian python:3.12-slim base to the repo-standard python:3.12-alpine + apk toolchain (mirroring templates/external-import/Dockerfile), purging the build-only git/build-base after installing requirements. Verified the image builds and the runtime imports work, with git absent from the final image.
- Add regression tests for the errored and successful work paths, and update the packaging test to assert the Alpine pattern.
SamuelHassine previously approved these changes Jun 7, 2026

@SamuelHassine SamuelHassine left a comment

Member

Re-approving after addressing the two remaining Copilot threads from the latest review in signed commit b85c987dbf.

Independent full-file re-review of the trukno connector (client, config, runtime, transform, state, opencti_compat, packaging + tests). Two items applied this round:

  • run_once now marks the OpenCTI work item to_processed(..., in_error=True) before re-raising on a mid-batch failure, so a failed cycle no longer leaves a work item stuck running (the per-item checkpoint means the next cycle resumes after the last successfully imported breach).
  • The Dockerfile was switched from Debian python:3.12-slim to the repo-standard python:3.12-alpine + apk toolchain (mirroring templates/external-import/Dockerfile). I verified the Alpine image builds and the runtime imports work in-container, with git absent from the final image.

Added regression tests for both work paths and updated the packaging test to the Alpine pattern. All checks green; 0 unresolved threads; mergeable: MERGEABLE. LGTM.

@SamuelHassine

Member

Review & fix summary (round 4)

Independent full-file re-review of the trukno connector plus the two remaining Copilot threads from the latest review, fixed in signed commit b85c987dbf.

  • Work-item error handling: run_once now wraps the per-breach loop and marks the OpenCTI work item to_processed(..., in_error=True) before re-raising on a mid-batch fetch/transform/send failure, instead of leaving it stuck in a running state (matches the ctm360-cyna-feed pattern). The per-item checkpoint is preserved, so the next cycle resumes after the last successfully imported breach.
  • Alpine base image: switched the Dockerfile from Debian python:3.12-slim to the repo-standard python:3.12-alpine + apk toolchain (mirroring templates/external-import/Dockerfile), purging the build-only git/build-base after install. Verified the image builds and the runtime imports (trukno_connector, pycti, yaml, requests) work in-container with git absent from the final image.
  • Tests: added regression tests for the errored/successful work paths and updated the packaging test to assert the Alpine pattern (23 passing locally); black / isort / flake8 clean.
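The work-item handling described in the first bullet can be sketched as below. `RecordingWorkApi` is a stand-in written for this example so the block runs without OpenCTI; it only mimics the `to_processed(work_id, message, in_error=...)` call shape, and `run_batch`, `import_one`, and `save_checkpoint` are hypothetical names rather than the connector's own.

```python
from typing import Callable, Iterable, List, Tuple


class RecordingWorkApi:
    """Stand-in for the helper's work API; records calls instead of sending."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str, bool]] = []

    def to_processed(self, work_id: str, message: str, in_error: bool = False) -> None:
        self.calls.append((work_id, message, in_error))


def run_batch(
    work_api: RecordingWorkApi,
    work_id: str,
    breach_ids: Iterable[str],
    import_one: Callable[[str], None],
    save_checkpoint: Callable[[str], None],
) -> int:
    """Import breaches one by one and always close the work item."""
    imported = 0
    try:
        for breach_id in breach_ids:
            import_one(breach_id)       # fetch, transform, and send one breach
            save_checkpoint(breach_id)  # per-item checkpoint after success
            imported += 1
    except Exception as exc:
        # Close the work item in error before re-raising, so it is not
        # left in a running state.
        message = f"Failed after {imported} breaches: {exc}"
        work_api.to_processed(work_id, message, in_error=True)
        raise
    work_api.to_processed(work_id, f"Imported {imported} breaches")
    return imported
```

Since the checkpoint is saved after each successful breach, a failure on the third item leaves the checkpoint at the second, and the next cycle can pick up from there.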

Status

  • All checks green: connector tests, codecov/patch, codecov/project, signed commits, PR-conventions, CLA.
  • 0 unresolved review threads. mergeable: MERGEABLE, re-approved.

Non-blocking follow-ups (future iteration)

  • Author Identity / created_by_ref + TLP marking on the emitted objects.
  • Switch the STIX ids from the connector-local uuid5 namespace to the canonical pycti generators so objects de-duplicate across connectors.
  • verified: false remains pending live end-to-end validation against the TruKno API.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 2 comments.

Comment thread external-import/trukno/src/trukno_connector/transform.py Outdated
Comment thread external-import/trukno/src/trukno_connector/runtime.py
…, bump pycti

Resolve the two remaining Copilot review threads and keep dependencies current.

- transform: a breach with no relatedTTPs/relatedMalwares produced a STIX 2.1
  report with an empty object_refs list, which is invalid (object_refs MUST
  reference at least one object) and can be rejected by OpenCTI/pycti ingestion.
  The report is now only emitted when it has at least one linkable
  attack-pattern/malware; run_once skips the resulting empty bundle but still
  advances the per-item checkpoint so the breach is not refetched forever.
- runtime: _prepare_helper_config used setdefault(), so an explicit empty string
  (e.g. connector.name: "") was passed through to the helper while load_config
  treated it as missing and applied the default - leaving the helper and parsed
  config inconsistent. It now mirrors load_config and falls back to the default
  for missing or blank values.
- deps: bump pycti to the latest released 7.260609.0 (was 7.260604.0) in
  src/requirements.txt and sync support_version in the manifest and the README
  requirements, per the repo convention.
- tests: add regression coverage for the skipped empty breach, the non-empty
  object_refs guarantee, and the blank/explicit helper-config fields (28 passing;
  black / isort / flake8 clean).

@SamuelHassine SamuelHassine left a comment

Member

Re-approving after an independent full-file re-review of the trukno connector and resolving the two remaining Copilot threads from the latest review, in signed commit b51b933b74.

Two items addressed plus a dependency bump:

  • Empty object_refs (the substantive one): a breach with no relatedTTPs/relatedMalwares emitted a report with an empty object_refs list, which is invalid STIX 2.1 and can be rejected on ingestion. transform_breach_to_bundle now only emits the report when it has at least one linkable attack-pattern/malware, and run_once skips the resulting empty bundle while still advancing the per-item checkpoint (so the breach is not refetched forever and the checkpoint never stalls).
  • Helper-config consistency: _prepare_helper_config used setdefault, so an explicit blank value bypassed the defaults that load_config applies; it now mirrors load_config and falls back to the default for missing or blank connector fields.
  • Bumped pycti to the latest released 7.260609.0 and synced support_version / the README requirements.

Added regression tests for all three (28 passing locally; black / isort / flake8 clean). All GitHub Actions checks - including the new Lint & Format and STIX ID Linter workflows - plus Codecov patch/project are green; 0 unresolved review threads; mergeable: MERGEABLE. LGTM.

@SamuelHassine

Member

Review & fix summary (round 5)

Independent full-file re-review of the trukno connector plus the two remaining Copilot threads from the latest review, fixed in signed commit b51b933b74.

  • Empty object_refs reports: a breach with no relatedTTPs/relatedMalwares produced a report with an empty object_refs list - invalid STIX 2.1 (a report MUST reference at least one object) and rejectable on ingestion. transform_breach_to_bundle now only emits the report when it has at least one linkable attack-pattern/malware; run_once skips the resulting empty bundle but still advances the per-item checkpoint so the breach is not refetched forever.
  • Helper-config consistency: _prepare_helper_config used setdefault, so an explicit blank (e.g. connector.name: "") bypassed the defaults that load_config applies. It now mirrors load_config and falls back to the default for missing or blank type/name/scope/log_level.
  • Dependency bump: pycti bumped to the latest released 7.260609.0 (was 7.260604.0), with support_version in the manifest and the README requirements synced.
  • Tests: added regression coverage for the skipped empty breach, the non-empty object_refs guarantee, and the blank/explicit helper-config fields (28 passing; black / isort / flake8 clean).
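The empty `object_refs` guard and the checkpoint behaviour from the first bullet can be sketched together. The function names and the breach shape (`id`, `linked`) are hypothetical; this is not the body of `transform_breach_to_bundle` or `run_once`.

```python
from typing import Callable, Iterable, List


def build_objects(report: dict, linked: List[dict]) -> List[dict]:
    """Return linked objects plus the report, or nothing if nothing links."""
    if not linked:
        return []  # a report with empty object_refs would be invalid
    refs = [obj["id"] for obj in linked]
    return [*linked, {**report, "object_refs": refs}]


def import_breaches(
    breaches: Iterable[dict],
    transform: Callable[[dict], List[dict]],
    send: Callable[[List[dict]], None],
    save_checkpoint: Callable[[str], None],
) -> int:
    """Send non-empty results; advance the checkpoint for every breach."""
    sent = 0
    for breach in breaches:
        objects = transform(breach)
        if objects:
            send(objects)
            sent += 1
        # Advance even when nothing was sent, so an entity-less breach
        # is not fetched again on the next cycle.
        save_checkpoint(breach["id"])
    return sent
```

A breach without linked objects therefore produces no bundle but still moves the checkpoint forward.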

Status

  • All checks green: connector tests, Base Linter (flake8), Ensure Formatting, STIX ID Linter, codecov/patch, codecov/project, signed commits, PR-conventions, CLA.
  • 0 unresolved review threads. mergeable: MERGEABLE, re-approved.

Non-blocking follow-ups (future iteration)

  • Author Identity / created_by_ref + TLP marking on the emitted objects (this would also let entity-less breaches keep a report, since it would always carry the author ref).
  • Switch the STIX ids from the connector-local uuid5 namespace to the canonical pycti generators so objects de-duplicate across connectors.
  • verified: false remains pending live end-to-end validation against the TruKno API.


Labels

community Contribution from the community.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(trukno): add external import connector

5 participants