Skip to content

chore: address PR comments#6581

Draft
carlostomas-dd wants to merge 5 commits into
OpenCTI-Platform:masterfrom
carlostomas-dd:carlos.tomas/pr-comments
Draft

chore: address PR comments#6581
carlostomas-dd wants to merge 5 commits into
OpenCTI-Platform:masterfrom
carlostomas-dd:carlos.tomas/pr-comments

Conversation

@carlostomas-dd

Copy link
Copy Markdown

Proposed changes

Related issues

Checklist

  • I consider the submitted work as finished
  • I have signed my commits using GPG key.
  • I tested the code for its functionality using different use cases
  • I added/update the relevant documentation (either on github or on notion)
  • Where necessary I refactored code to improve the overall quality

Further comments

carlostomas-dd and others added 5 commits May 20, 2026 09:40
… Threat Intel

Adds a stream connector that subscribes to the OpenCTI live stream and
forwards indicator events (create/update/delete) to the Datadog Threat
Intelligence API. Supports ip_address, domain, and sha256 indicator
types; batches up to 10,000 indicators or flushes every 30s with gzip
compression and tenacity-based retry.
Resolves the seven outstanding Copilot review threads on the PR
plus one senior-reviewer correctness finding folded into the same
signed commit. Each fix ships with a regression test so the
contract is pinned end-to-end.

Copilot threads
---------------

* **``utils.py:143`` — ``datetime.fromisoformat`` crashes on the
  ``Z`` suffix** that OpenCTI / STIX stream payloads commonly use
  (``2024-04-29T12:33:20.098Z``). The previous shape raised
  ``ValueError`` on Python < 3.11 and killed the stream callback
  for the offending event. New ``_parse_valid_until`` helper
  normalises ``Z`` → ``+00:00``, coerces tz-naive to UTC, and
  returns ``None`` on unparseable input so a malformed upstream
  field cannot take down the connector. The expiry comparison is
  now always aware-vs-aware, so the downstream
  ``< datetime.now(timezone.utc)`` cannot raise ``TypeError``
  either. Three new tests
  (``test_is_valid_event_accepts_z_suffixed_future_valid_until``,
  ``test_is_valid_event_rejects_z_suffixed_expired_valid_until``,
  ``test_is_valid_event_does_not_crash_on_unparseable_valid_until``)
  pin the behaviour.

* **``utils.py:106`` — docstring/return-type mismatch**. The old
  docstring said the function returned the event ``data`` dict
  (truthy) when valid, but it actually returned a ``bool``.
  Rewrote the docstring to say ``True`` if the event should be
  forwarded, ``False`` otherwise — matches the actual return and
  what the call site (`connector.process_message`) consumes.

* **``api_client.py:67–70`` / ``api_client.py:127`` /
  ``api_client.py:144–146`` — positional dicts in
  ``connector_logger.*`` calls**. Switched every call site
  (``_post_indicators`` tenacity ``before_sleep``, the POST-response
  debug log, ``_flush_batch``'s success / failure logs, and
  ``_append_to_batch``'s missing-id error) to the explicit
  ``meta={...}`` keyword shape used everywhere else in the repo
  (e.g. ``external-import/email-intel-imap/src/base_connector/connector.py``).
  Passing a dict positionally to pycti's logger lets the dict
  be misinterpreted as ``*args`` format args and the structured
  context get dropped from the JSON log line.

* **``api_client.py:126`` — ``_flush_batch`` releasing
  ``batch_lock`` during the HTTP POST**. The previous shape held
  the lock for the full retry cycle (up to ~60 s of exponential
  backoff during a Datadog outage), blocking every incoming
  ``_append_to_batch`` call and serialising the timer-driven
  flushes against ``process_indicator``. Rewrote to snapshot the
  batch under the lock (shallow copy keeps value references
  shared with the live batch), POST without the lock, then
  re-acquire only to clean up the entries we actually sent.
  Cleanup uses an ``is`` identity check so a concurrent
  ``_append_to_batch`` that REPLACES one of the entries during
  the POST (e.g. an update arriving mid-flight) leaves the new
  state in the batch for the next flush instead of being
  silently dropped. Two new tests:
  ``test_append_can_proceed_concurrently_with_in_flight_post``
  uses a threading.Event-coordinated slow POST to assert
  ``_append_to_batch`` runs while the POST is in flight (the
  previous lock-held-during-POST shape would deadlock-block this
  test);
  ``test_flush_preserves_concurrent_replacement`` simulates an
  update arriving during the POST and asserts the new state is
  retained in the batch.

* **``connector.py:75`` — positional dicts in
  ``process_message``'s logs**. Same fix pattern as
  ``api_client.py``; switched ``"Message received"``,
  ``"Cannot parse message"`` and ``"Message parsed"`` to the
  ``meta={...}`` keyword shape.

* **``config.yml.sample:8`` — template placeholders**. Renamed
  the ``connector.name`` placeholder from ``Template Connector``
  to ``Datadog Threat Intel`` and set ``connector.scope`` to
  ``indicator`` so an operator can ``cp config.yml.sample
  config.yml`` and run the connector with only the secret
  values to fill in.

* **``connector_manifest.json:14`` — ``support_version`` /
  README mismatch**. The manifest declared ``>=6.8.12`` but the
  README said ``>=6.8.13``. Aligned the manifest to ``>=6.8.13``
  to match the README; the platform's connector registry uses
  the manifest as the source of truth for the OpenCTI version
  gate, so the README's stricter constraint must win to avoid
  letting an operator install the connector on a platform that
  cannot run it.

Senior-reviewer finding folded into the same commit
---------------------------------------------------

* **``utils.py:138`` — ``delete`` events for expired indicators
  silently dropped, leaving stale entries in Datadog**. The
  pre-fix expiry filter dropped every event for an indicator
  whose ``valid_until`` was in the past — including ``delete``
  events. The realistic scenario: an indicator is created and
  forwarded to Datadog, later expires in OpenCTI, then a user
  deletes it on the OpenCTI side. The previous shape silently
  swallowed the delete, leaving the stale indicator in Datadog
  indefinitely. The expiry gate now exempts ``delete`` events
  so the remote feed stays consistent with OpenCTI. Two new
  tests
  (``test_is_valid_event_delete_passes_through_expired_valid_until``,
  ``test_is_valid_event_delete_passes_through_expired_z_suffixed_valid_until``)
  pin the bypass; a sanity test
  (``test_is_valid_event_create_still_rejected_when_expired``)
  ensures the bypass does NOT widen the gate for ``create`` /
  ``update`` events on an already-expired indicator.

Lint clean (``python -m py_compile``, ``black --check``,
``isort --check-only --profile black --line-length 88``,
``flake8 --select=F``, ``flake8 --ignore=E,W``) across
``stream/datadog-intel/{src,tests}/``. Full test suite is
**76 / 76 green** locally
(``cd stream/datadog-intel && python -m pytest tests/``); the
eight new tests are the eight added by this commit and the
68 pre-existing tests continue to pass unchanged.

Co-authored-by: Carlos Tomás <carlos.tomas@datadoghq.com>
@filigran-cla-bot filigran-cla-bot Bot added the cla:pending CLA signature required. label Jun 2, 2026
@filigran-cla-bot

filigran-cla-bot Bot commented Jun 2, 2026

Copy link
Copy Markdown

Contributor License Agreement

CLA signed 💚

Thank you @carlostomas-dd for signing the Contributor License Agreement! Your pull request can now be reviewed and merged.

We appreciate your contribution to Filigran's open source projects! ❤️

This is an automated message from the Filigran CLA Bot.

@filigran-cla-bot filigran-cla-bot Bot removed the cla:pending CLA signature required. label Jun 5, 2026
@SamuelHassine SamuelHassine changed the title Carlos.tomas/pr comments chore: address PR comments Jun 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants