feat(trukno): add external import connector (#6286) #6285
Contributor License Agreement
✅ CLA signed
💚 Thank you @hieuttmmo for signing the Contributor License Agreement! Your pull request can now be reviewed and merged. We appreciate your contribution to Filigran's open source projects! ❤️
This is an automated message from the Filigran CLA Bot.
0f538b5 to 4c706be
romain-filigran
left a comment
Hey @hieuttmmo. Thank you for your contribution. Can you resolve my comments?
Thank you
23aa5e3 to d5c1edb
d5c1edb to 05fb803
Codecov Report
❌ Patch coverage is
Additional details and impacted files
Coverage diff, master vs. #6285:
| Metric | master | #6285 | +/- |
|---|---|---|---|
| Coverage | 29.27% | 0.44% | -28.84% |
| Files | 1918 | 1894 | -24 |
| Lines | 120028 | 119447 | -581 |
| Hits | 35144 | 530 | -34614 |
| Misses | 84884 | 118917 | +34033 |
Hey @romain-filigran, is there anything else I can do to speed up merging this PR? Thanks.
Pull request overview
This pull request adds a new external-import/trukno connector to ingest TruKno breach intelligence into OpenCTI as STIX 2.1 Reports with linked attack-pattern and malware objects, and wires it into the repository’s packaging and test tooling.
Changes:
- Added the TruKno connector implementation (client, config/runtime, state management, STIX transformation), plus Docker packaging and operator docs/metadata.
- Added an upstream test suite for the connector (client/config/runtime/state/transform + fixtures).
- Updated CI/test harness to install required system dependencies and refined `run_test.sh` dependency handling.
Reviewed changes
Copilot reviewed 32 out of 33 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `run_test.sh` | Adjusts dependency scope detection and pycti handling during test runs. |
| `external-import/trukno/tests/test-requirements.txt` | Declares connector test dependencies. |
| `external-import/trukno/tests/test_state_transform.py` | Tests state checkpointing, transform output, and bundle cleanup behavior. |
| `external-import/trukno/tests/test_runtime.py` | Tests one-cycle runtime behavior and config loading paths. |
| `external-import/trukno/tests/test_main.py` | Smoke test for importability of entrypoint and key modules. |
| `external-import/trukno/tests/test_config.py` | Tests config validation and required fields. |
| `external-import/trukno/tests/test_client.py` | Tests TruKno API client request construction and error propagation. |
| `external-import/trukno/tests/fixtures/breach_with_entities.json` | Fixture payload containing related TTPs/malware. |
| `external-import/trukno/tests/fixtures/breach_list.json` | Fixture payload for breach list endpoint. |
| `external-import/trukno/tests/fixtures/breach_detail.json` | Fixture payload for breach details endpoint. |
| `external-import/trukno/tests/conftest.py` | Test path/bootstrap helpers and fixture root. |
| `external-import/trukno/src/trukno_connector/transform.py` | Converts TruKno breach payloads into STIX bundle objects. |
| `external-import/trukno/src/trukno_connector/state.py` | Implements incremental checkpoint state and timestamp parsing/formatting. |
| `external-import/trukno/src/trukno_connector/runtime.py` | Loads config, builds helper/client, runs polling loop and persists checkpoints. |
| `external-import/trukno/src/trukno_connector/opencti_compat.py` | Performs bundle cleanup filtering for OpenCTI ingestion. |
| `external-import/trukno/src/trukno_connector/models.py` | Defines minimal typed model(s) for API list results. |
| `external-import/trukno/src/trukno_connector/config.py` | Loads/merges config from YAML/env and validates required values. |
| `external-import/trukno/src/trukno_connector/client.py` | Implements TruKno HTTP client for listing updates and fetching details. |
| `external-import/trukno/src/trukno_connector/__init__.py` | Exposes the connector’s public API surface. |
| `external-import/trukno/src/requirements.txt` | Connector runtime dependencies. |
| `external-import/trukno/src/main.py` | Connector entrypoint wrapper around runtime main loop. |
| `external-import/trukno/src/config.yml.sample` | Sample YAML configuration for manual runs. |
| `external-import/trukno/README.md` | Connector documentation (scope, config, behavior, deployment). |
| `external-import/trukno/entrypoint.sh` | Container entrypoint script. |
| `external-import/trukno/Dockerfile` | Docker image build for the connector. |
| `external-import/trukno/docker-compose.yml` | Example compose deployment for the connector. |
| `external-import/trukno/.dockerignore` | Docker build context exclusions. |
| `external-import/trukno/__metadata__/logo.svg` | Connector logo asset for metadata/manager. |
| `external-import/trukno/__metadata__/connector_manifest.json` | Connector manifest metadata (title, version support, image, etc.). |
| `external-import/trukno/__metadata__/connector_config_schema.json` | JSON schema for connector configuration (Connector Manager). |
| `external-import/trukno/__metadata__/CONNECTOR_CONFIG_DOC.md` | Generated/maintained config documentation for the connector. |
| `.github/workflows/tests-connectors.yml` | Installs system dependencies needed for connector test runs in GitHub Actions. |
| `.circleci/config.yml` | Installs libmagic1 in CircleCI job dependencies. |
Address the remaining Copilot review threads on the TruKno connector.
- Reduce the connector_config_schema.json `required` list to the repo convention (`OPENCTI_URL`, `OPENCTI_TOKEN`, `TRUKNO_API_KEY`) so the Connector Manager no longer forces operators to supply platform-managed fields such as `CONNECTOR_ID`.
- Make the runtime honour the schema-declared defaults: `load_config` and the helper config now fall back to the documented defaults for the optional fields (connector name/scope, TruKno base URL, interval, lookback), keeping the schema and runtime consistent.
- Update CONNECTOR_CONFIG_DOC.md to flag only the genuinely required fields.
- Fix the README requirements so the supported platform version matches the manifest `support_version` (>= 7.x) and clarify required vs optional vars.
- Add regression tests covering default application and the still-required connector id.
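The required-versus-default split described in this commit can be sketched roughly as follows. This is an illustrative sketch only, not the connector's `load_config`: the function name `load_settings`, the optional key names, and the default values are assumptions; only the three required keys come from the commit message.

```python
from typing import Mapping

# The three required keys are named in the commit message above.
REQUIRED = ("OPENCTI_URL", "OPENCTI_TOKEN", "TRUKNO_API_KEY")

# Hypothetical optional keys and defaults, for illustration only.
DEFAULTS = {
    "CONNECTOR_NAME": "TruKno",
    "CONNECTOR_SCOPE": "trukno",
    "TRUKNO_INTERVAL_MINUTES": "60",
}


def load_settings(env: Mapping[str, str]) -> dict:
    """Reject missing required keys, then fill optional keys from DEFAULTS."""
    missing = [key for key in REQUIRED if not env.get(key, "").strip()]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    settings = {key: env[key].strip() for key in REQUIRED}
    for key, default in DEFAULTS.items():
        # A missing or blank optional value falls back to the default.
        settings[key] = env.get(key, "").strip() or default
    return settings
```

With only the three required values supplied, the optional fields resolve to their defaults instead of raising, which is the behaviour the regression tests in this commit are described as covering.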
SamuelHassine
left a comment
Independent full-file review of the trukno external-import connector (client, config, runtime, transform, state, opencti_compat, packaging + tests). The code is clean and well-factored: env/YAML config merge with required-field validation, deterministic STIX bundle building, x_opencti_*-preserving cleanup, incremental checkpoint state, and a solid 20-test suite (~88% source coverage).
I pushed e8f68ea1ef to resolve the three remaining Copilot threads and keep the connector consistent with repo conventions:
- Reduced the `connector_config_schema.json` `required` list to `OPENCTI_URL`, `OPENCTI_TOKEN`, `TRUKNO_API_KEY` (matching `alienvault` / `ctm360-hackerview-feed`), so the Connector Manager no longer forces platform-managed fields like `CONNECTOR_ID`.
- Made `load_config` and the helper config fall back to the schema-declared defaults for the now-optional fields, so the schema and runtime stay in sync and the connector can't crash when only the required fields are supplied. Added regression tests.
- Synced `CONNECTOR_CONFIG_DOC.md` and the README requirements (OpenCTI Platform >= 7.x, consistent with `support_version`).
All GitHub Actions checks and Codecov patch/project are green; 0 unresolved threads; mergeable: MERGEABLE. Non-blocking follow-ups for a future iteration remain noted in the description (author Identity/created_by_ref + TLP marking, and switching the STIX ids from the connector-local uuid5 namespace to the canonical pycti generators). Approving.
Review & fix summary
Non-blocking follow-ups for a future iteration (kept out of this narrow first submission): add an author
Thank you so much @SamuelHassine and @romain-filigran. Will keep that in mind for the next iteration.
attack-pattern and malware objects are shared reference entities keyed by a stable deterministic STIX id, but their created/modified were taken from the enclosing breach's publishedAt. The same TTP/malware referenced by multiple breaches therefore produced the same STIX id with different timestamps, making OpenCTI flip-flop the object on every ingest.
Use a fixed reference timestamp for these shared objects so a given id always yields an identical object; the per-breach report keeps its publish date. Adds a regression test asserting two breaches with different publish dates emit byte-identical shared objects.
Also clarify in the README that CONNECTOR_ID is required at runtime but is auto-injected by the OpenCTI Connector Manager (hence optional in the schema), so it only has to be supplied for manual / docker-compose deployments.
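A minimal sketch of the idea in this commit, assuming a `uuid5`-based id scheme: the namespace, the name normalisation, and the timestamp value below are placeholders, and `attack_pattern` is a hypothetical helper rather than the connector's transform code. Only the constant name `REFERENCE_OBJECT_TIMESTAMP` is taken from the review thread.

```python
import uuid

# Placeholder namespace; the connector's real uuid5 namespace is not shown here.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/trukno")

# Placeholder value; only the constant's name appears in the review.
REFERENCE_OBJECT_TIMESTAMP = "1970-01-01T00:00:00.000Z"


def attack_pattern(name: str) -> dict:
    """Build a shared attack-pattern whose content depends only on its name."""
    key = "attack-pattern:" + name.strip().lower()
    return {
        "type": "attack-pattern",
        "spec_version": "2.1",
        "id": "attack-pattern--" + str(uuid.uuid5(NAMESPACE, key)),
        # Fixed timestamps: no breach publish date leaks into the object.
        "created": REFERENCE_OBJECT_TIMESTAMP,
        "modified": REFERENCE_OBJECT_TIMESTAMP,
        "name": name.strip(),
    }
```

Because nothing breach-specific enters the function, two breaches that reference the same technique produce equal dictionaries, so re-ingesting either one leaves the stored object unchanged.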
SamuelHassine
left a comment
Re-approving after addressing the three remaining Copilot threads in signed commit 882fed4e86.
Independent full-file re-review of the connector (client, config, runtime, transform, state, opencti_compat, packaging + tests): clean and well-factored, with env/YAML config merge + required-field validation, meta= structured logging, x_opencti_*-preserving bundle cleanup, incremental checkpoint state, and a 21-test suite (~88% source coverage).
The one substantive item from this round: attack-pattern/malware are shared reference objects keyed by a stable deterministic id, but their created/modified were derived from each breach's publishedAt, so the same TTP/malware referenced by multiple breaches emitted the same id with different timestamps (OpenCTI would flip-flop the object on re-ingest). They now use a fixed REFERENCE_OBJECT_TIMESTAMP so a given id always yields a byte-identical object; the per-breach report keeps its publish date. Added a regression test, and clarified in the README that CONNECTOR_ID is required at runtime but auto-injected by the Connector Manager (hence optional in the schema).
All GitHub Actions checks and Codecov patch/project are green; 0 unresolved threads; mergeable: MERGEABLE. Non-blocking follow-ups remain noted in the description (author Identity/created_by_ref + TLP marking, and switching the STIX ids from the connector-local uuid5 namespace to the canonical pycti generators for cross-connector de-duplication). verified: false stays pending live end-to-end validation. LGTM.
- Wrap the per-breach import loop in run_once so a mid-batch fetch/transform/send failure marks the OpenCTI work item as processed with in_error=True before re-raising, instead of leaving it stuck in a running state. The per-item checkpoint is preserved, so the next cycle resumes after the last successfully imported breach (matches the to_processed(..., in_error=True) pattern used by ctm360-cyna-feed).
- Switch the Dockerfile from the Debian python:3.12-slim base to the repo-standard python:3.12-alpine + apk toolchain (mirroring templates/external-import/Dockerfile), purging the build-only git/build-base after installing requirements. Verified the image builds and the runtime imports work, with git absent from the final image.
- Add regression tests for the errored and successful work paths, and update the packaging test to assert the Alpine pattern.
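The error-handling shape described in the first bullet can be sketched as below. This is a simplified illustration, not the connector's `run_once`: `FakeWorkApi` is a stand-in that only mimics a `to_processed(work_id, message, in_error=...)` call, and the `state` dictionary and `import_one` callback are hypothetical.

```python
class FakeWorkApi:
    """Stand-in for a work API; records calls instead of contacting OpenCTI."""

    def __init__(self):
        self.calls = []

    def to_processed(self, work_id, message, in_error=False):
        self.calls.append((work_id, message, in_error))


def run_once(work_api, work_id, breaches, import_one, state):
    """Import each breach, checkpointing per item and closing the work item."""
    try:
        for breach in breaches:
            import_one(breach)               # fetch / transform / send
            state["last_id"] = breach["id"]  # per-item checkpoint
    except Exception as exc:
        # Close the work item as errored, then let the caller see the failure.
        work_api.to_processed(work_id, f"import failed: {exc}", in_error=True)
        raise
    work_api.to_processed(work_id, "import finished")
```

Since the checkpoint is written after each successful item, a failure on the second breach leaves `state["last_id"]` pointing at the first, so the next cycle can resume from there.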
SamuelHassine
left a comment
Re-approving after addressing the two remaining Copilot threads from the latest review in signed commit b85c987dbf.
Independent full-file re-review of the trukno connector (client, config, runtime, transform, state, opencti_compat, packaging + tests). Two items applied this round:
- `run_once` now marks the OpenCTI work item `to_processed(..., in_error=True)` before re-raising on a mid-batch failure, so a failed cycle no longer leaves a work item stuck running (the per-item checkpoint means the next cycle resumes after the last successfully imported breach).
- The Dockerfile was switched from Debian `python:3.12-slim` to the repo-standard `python:3.12-alpine` + `apk` toolchain (mirroring `templates/external-import/Dockerfile`). I verified the Alpine image builds and the runtime imports work in-container, with `git` absent from the final image.
Added regression tests for both work paths and updated the packaging test to the Alpine pattern. All checks green; 0 unresolved threads; mergeable: MERGEABLE. LGTM.
…, bump pycti
Resolve the two remaining Copilot review threads and keep dependencies current.
- transform: a breach with no relatedTTPs/relatedMalwares produced a STIX 2.1 report with an empty object_refs list, which is invalid (object_refs MUST reference at least one object) and can be rejected by OpenCTI/pycti ingestion. The report is now only emitted when it has at least one linkable attack-pattern/malware; run_once skips the resulting empty bundle but still advances the per-item checkpoint so the breach is not refetched forever.
- runtime: _prepare_helper_config used setdefault(), so an explicit empty string (e.g. connector.name: "") was passed through to the helper while load_config treated it as missing and applied the default, leaving the helper and parsed config inconsistent. It now mirrors load_config and falls back to the default for missing or blank values.
- deps: bump pycti to the latest released 7.260609.0 (was 7.260604.0) in src/requirements.txt and sync support_version in the manifest and the README requirements, per the repo convention.
- tests: add regression coverage for the skipped empty breach, the non-empty object_refs guarantee, and the blank/explicit helper-config fields (28 passing; black / isort / flake8 clean).
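The `setdefault()` pitfall in the runtime bullet is easy to reproduce in isolation. The two helpers below are hypothetical and only contrast the behaviours; they are not the connector's `_prepare_helper_config`, and the blank-aware rule shown is an assumed reading of "missing or blank".

```python
def apply_defaults_setdefault(section: dict, defaults: dict) -> dict:
    """Fill only absent keys; a present-but-blank value is kept as-is."""
    merged = dict(section)
    for key, default in defaults.items():
        merged.setdefault(key, default)
    return merged


def apply_defaults_blank_aware(section: dict, defaults: dict) -> dict:
    """Fill absent, None, and whitespace-only values with the default."""
    merged = dict(section)
    for key, default in defaults.items():
        value = merged.get(key)
        if value is None or not str(value).strip():
            merged[key] = default
    return merged
```

With `{"name": ""}` and a default of `"TruKno"`, the first helper keeps the empty string while the second yields `"TruKno"`, which is the inconsistency the commit describes between the helper config and the parsed config.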
SamuelHassine
left a comment
Re-approving after an independent full-file re-review of the trukno connector and resolving the two remaining Copilot threads from the latest review, in signed commit b51b933b74.
Two items addressed plus a dependency bump:
- Empty `object_refs` (the substantive one): a breach with no `relatedTTPs` / `relatedMalwares` emitted a `report` with an empty `object_refs` list, which is invalid STIX 2.1 and can be rejected on ingestion. `transform_breach_to_bundle` now only emits the report when it has at least one linkable `attack-pattern` / `malware`, and `run_once` skips the resulting empty bundle while still advancing the per-item checkpoint (so the breach is not refetched forever and the checkpoint never stalls).
- Helper-config consistency: `_prepare_helper_config` used `setdefault`, so an explicit blank value bypassed the defaults that `load_config` applies; it now mirrors `load_config` and falls back to the default for missing or blank connector fields.
- Bumped `pycti` to the latest released `7.260609.0` and synced `support_version` / the README requirements.
Added regression tests for all three (28 passing locally; black / isort / flake8 clean). All GitHub Actions checks - including the new Lint & Format and STIX ID Linter workflows - plus Codecov patch/project are green; 0 unresolved review threads; mergeable: MERGEABLE. LGTM.
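The empty-`object_refs` guard and the "skip but still checkpoint" behaviour described in this round can be sketched together. All names here (`build_bundle_objects`, `run_cycle`, the `report` / `linked` keys, the `state` dictionary) are hypothetical; the sketch only illustrates the control flow, not `transform_breach_to_bundle` itself.

```python
def build_bundle_objects(report: dict, linked: list) -> list:
    """Return linked objects plus the report, or nothing if no links exist."""
    if not linked:
        # A report with an empty object_refs list would be invalid STIX 2.1.
        return []
    refs = [obj["id"] for obj in linked]
    return [*linked, {**report, "object_refs": refs}]


def run_cycle(breaches: list, send, state: dict) -> None:
    """Send non-empty bundles; advance the checkpoint for every breach."""
    for breach in breaches:
        objects = build_bundle_objects(breach["report"], breach["linked"])
        if objects:
            send(objects)
        # Advance even when skipped, so the breach is not fetched again.
        state["last_id"] = breach["id"]
```

The checkpoint update sits outside the `if` on purpose: tying it to a successful send would make an entity-less breach block progress on every cycle.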
Proposed changes
- `external-import/trukno` connector for TruKno breach intelligence.
- `attack-pattern` and `malware` objects.
Related issues
Checklist
Validation
- `uv run --python 3.11 python -m compileall external-import/trukno/src`
- `docker build -t opencti/connector-trukno:verify external-import/trukno`
- `trukno_connector`, `pycti`, `yaml`, and `requests`
- `docker compose -f external-import/trukno/docker-compose.yml config`
- `python -m pytest external-import/trukno/tests -q` (28 passed; `black` / `isort` / `flake8` clean)
Scope of this first submission
This initial version intentionally keeps the ingestion scope narrow and reviewable:
- `report`, `attack-pattern`, `malware`. The connector does not yet create threat actors, intrusion sets, indicators, vulnerabilities, or richer relationship graphs beyond report object references.
Maintainer review updates
The following changes were applied during review to make this PR merge-ready (on top of the contributor's already-resolved review threads).
- `pytest-cov` from `tests/test-requirements.txt`. The connector ships no pytest `--cov` configuration, so with `pytest-cov` pre-installed `run_test.sh` ran the suite without a coverage target and uploaded an empty report (`codecov/patch` was 0%). Without the pin, `run_test.sh` adds a bare `--cov` and measures the connector source (~88%), so `codecov/patch` now passes.
- (`e8f68ea1ef`): reduced the `connector_config_schema.json` `required` list to the repo convention (`OPENCTI_URL`, `OPENCTI_TOKEN`, `TRUKNO_API_KEY`) so the Connector Manager no longer forces platform-managed fields such as `CONNECTOR_ID`. To keep the schema and runtime consistent, `load_config` and the helper config now fall back to the schema-declared defaults for the optional fields, with added regression tests. `CONNECTOR_CONFIG_DOC.md` and the README requirements were updated to match.
- (`882fed4e86`): `attack-pattern` and `malware` are shared reference objects keyed by a stable deterministic STIX id, but their `created` / `modified` were taken from the enclosing breach's `publishedAt`. The same TTP/malware referenced by several breaches therefore emitted the same id with different timestamps, making OpenCTI flip-flop the object on each ingest. They now use a fixed `REFERENCE_OBJECT_TIMESTAMP`, so a given id always yields a byte-identical object (the per-breach `report` keeps its publish date). Added a regression test asserting two breaches with different publish dates produce identical shared objects, and clarified in the README that `CONNECTOR_ID` is required at runtime but auto-injected by the Connector Manager (hence optional in the schema).
- (`b85c987dbf`): `run_once` now wraps the per-breach loop and marks the OpenCTI work item `to_processed(..., in_error=True)` before re-raising on a mid-batch fetch/transform/send failure, instead of leaving the work stuck in a running state (the per-item checkpoint means the next cycle resumes after the last successfully imported breach). The Dockerfile was switched from the Debian `python:3.12-slim` base to the repo-standard `python:3.12-alpine` + `apk` toolchain (mirroring `templates/external-import/Dockerfile`), with the build-only `git` / `build-base` purged after install; verified the image builds and the runtime imports work in-container with `git` absent from the final image. Added regression tests for the errored/successful work paths and updated the packaging test to assert the Alpine pattern.
- (`b51b933b74`): a breach with no `relatedTTPs` / `relatedMalwares` previously emitted a `report` with an empty `object_refs` list, which is invalid STIX 2.1 (a report MUST reference at least one object) and can be rejected on ingestion. `transform_breach_to_bundle` now builds the linked objects first and only emits the report when it has at least one `object_ref`; `run_once` skips the resulting empty bundle but still advances the per-item checkpoint so the breach is not refetched forever. Separately, `_prepare_helper_config` used `setdefault`, so an explicit empty string (e.g. `connector.name: ""`) was passed through to the helper while `load_config` treats blanks as missing and applies the default; it now mirrors `load_config` and falls back to the default for missing or blank `type` / `name` / `scope` / `log_level`. Finally, `pycti` was bumped to the latest released `7.260609.0` (was `7.260604.0`) in `src/requirements.txt`, with `support_version` in the manifest and the README requirements synced. Added regression tests for all three behaviours (28 passing).
All GitHub Actions checks (including the Lint & Format and STIX ID Linter workflows) and Codecov patch/project are green; 0 unresolved review threads.
Notes for the final merge
- `Identity` / `created_by_ref` or TLP marking (which is also why the report can have no other object to reference for entity-less breaches), and STIX ids are minted from a connector-local `uuid5` namespace rather than the canonical `pycti` generators (functional and deterministic, but objects won't de-duplicate against other connectors). Both are reasonable to address as the ingestion scope grows.
- `verified: false`: live end-to-end validation against the TruKno API remains the contributor's responsibility.