Add mongosync_insights to repository by Nixiss · Pull Request #24 · mongodb/kb-assets

Nixiss · 2026-06-15T15:21:24Z

No description provided.

+    Returns the full snapshot dict (including template_data) or None.
+    """
+    path = _snapshot_path(snapshot_id)
+    if not os.path.exists(path):


+        return None
+
+    try:
+        with open(path, 'r', encoding='utf-8') as f:


+
+    # Refresh mtime on snapshot file
+    try:
+        os.utime(path, None)


+        pass
+
+    meta_path = _snapshot_meta_path(snapshot_id)
+    if os.path.exists(meta_path):


+    meta_path = _snapshot_meta_path(snapshot_id)
+    if os.path.exists(meta_path):
+        try:
+            os.utime(meta_path, None)


+
+        if os.path.exists(meta_path):
+            try:
+                os.remove(meta_path)


+
+    response.set_cookie(
+        SESSION_COOKIE_NAME,
+        session_id,


+
+    response.set_cookie(
+        SESSION_COOKIE_NAME,
+        session_id,


+        ), 400
+
+    db_name = (session_data or {}).get("verifier_db_name", "migration_verification_metadata")
+    return jsonify(gatherVerifierMetrics(connection_string, db_name))


+        return jsonify(result)
+    except Exception as e:
+        logger.error("Log search error: %s", e)
+        return jsonify({"error": "Search failed", "detail": str(e)}), 500


semgrep-code-mongodb · 2026-06-15T15:24:15Z

+        # Make HTTP GET request to the endpoint
+        url = f"http://{endpoint_url}"
+        logger.info(f"Fetching data from endpoint: {url}")
+        response = requests.get(url, timeout=10)


Semgrep identified an issue in your code:

requests.get fetches url over plain HTTP, so anyone on the network can read or alter the endpoint response.

More details about this

requests.get(url, timeout=10) sends data to the url built as f"http://{endpoint_url}", so this call always uses plain HTTP instead of encrypted HTTPS.

A plausible attack is:

A user runs this code on a shared office, cloud, or coffee-shop network, and endpoint_url points to a real service such as api.internal.example.com/status.

Because url starts with http://, the request from requests.get(...) is sent in cleartext. An attacker on the same network can intercept it with tools like tcpdump or mitmproxy.

The attacker can read the full request and response, including any headers, tokens, cookies, or operational data this endpoint returns in data = response.json().

The attacker can also tamper with the response before your code parses it. That lets them inject fake JSON into data, which then flows into progress, warnings_list, and the dashboard values shown to users.

Example interception on the same network:

sudo tcpdump -A -i wlan0 'tcp port 80 and host api.internal.example.com'

If this request includes sensitive headers or the endpoint returns internal replication state, those values are exposed because the url is fetched over HTTP.

Dataflow graph

flowchart LR
    classDef invis fill:white, stroke: none
    classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none

    subgraph File0["mongosync_insights/lib/live_migration_metrics.py"]
        direction LR
        %% Source

        subgraph Source
            direction LR

            v0["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/live_migration_metrics.py#L561 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 561] http://</a>"]
        end
        %% Intermediate

        subgraph Traces0[Traces]
            direction TB

            v2["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/live_migration_metrics.py#L561 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 561] url</a>"]
        end
        %% Sink

        subgraph Sink
            direction LR

            v1["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/live_migration_metrics.py#L563 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 563] url</a>"]
        end
    end
    %% Class Assignment
    Source:::invis
    Sink:::invis

    Traces0:::invis
    File0:::invis

    %% Connections

    Source --> Traces0
    Traces0 --> Sink

To resolve this comment:

✨ Commit fix suggestion

Change the URL construction to use https:// instead of http://, for example url = f"https://{endpoint_url}".

Keep the requests.get call the same, but point it at the new HTTPS URL: response = requests.get(url, timeout=10).

If endpoint_url can already include a scheme, normalize it before building the request so you do not accidentally keep an insecure URL. For example, replace a leading http:// with https://, or parse it and rebuild it with the https scheme.

If the endpoint does not support HTTPS yet, update the service configuration or caller input so endpoint_url refers to an HTTPS-enabled endpoint such as example.com:443 instead of a plain HTTP listener.

Manually verify that the request still succeeds against the target endpoint and that certificate errors do not occur. If you see TLS verification failures, fix the server certificate or trust configuration rather than disabling verification with verify=False.
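The scheme normalization described in the steps above could look roughly like the following sketch. The helper name to_https_url is hypothetical and not part of this pull request, and the sketch assumes endpoint_url is either a bare host[:port][/path] value or a full http/https URL.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches a leading URL scheme such as "http://" or "ftp://".
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def to_https_url(endpoint_url: str) -> str:
    """Rebuild endpoint_url with the https scheme (hypothetical helper)."""
    candidate = endpoint_url.strip()
    # Without a scheme, urlsplit would not recognise "host:port" as the
    # network location, so prepend one before parsing.
    if not _SCHEME_RE.match(candidate):
        candidate = "https://" + candidate
    parts = urlsplit(candidate)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.netloc:
        raise ValueError("endpoint_url has no host")
    # Keep host, path, query and fragment; force the scheme to https.
    return urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))
```

With a helper like this, the call site would become response = requests.get(to_https_url(endpoint_url), timeout=10), leaving certificate verification at its default.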

💬 Ignore this finding

Reply with Semgrep commands to ignore this finding.

/fp <comment> for false positive

/ar <comment> for acceptable risk

/other <comment> for all other reasons

Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by request-with-http.

🛟 Help? Slack #semgrep-help or go/semgrep-help.

Resolution Options:

Fix the code

Reply /fp $reason (if security gap doesn’t exist)

Reply /ar $reason (if gap is valid but intentional; add mitigations/monitoring)

Reply /other $reason (e.g., test-only)

You can view more details about this finding in the Semgrep AppSec Platform.

semgrep-code-mongodb · 2026-06-15T15:24:16Z

+        )
+        response.headers["X-XSS-Protection"] = "1; mode=block"
+        response.headers["Permissions-Policy"] = "geolocation=(), microphone=(), camera=()"
+        return response


Semgrep identified an issue in your code:

add_security_headers sets several security headers but omits X-Permitted-Cross-Domain-Policies before returning the Flask response. This can leave older Flash/Silverlight cross-domain access to your origin less restricted than intended.

More details about this

add_security_headers(response) adds several browser security headers to every Flask response, but it returns response without setting response.headers["X-Permitted-Cross-Domain-Policies"].

A plausible misuse is:

An attacker hosts a malicious Flash or Silverlight file on another site.

That content tries to load data from this app's origin, such as pages returned by hub() or files served by mi_static_js(filename).

Because add_security_headers does not send X-Permitted-Cross-Domain-Policies, older Adobe/Microsoft cross-domain policy behavior is not explicitly locked down, so the plugin may rely on permissive defaults or other policy files.

The attacker then reads or embeds information from your domain through that plugin content.

The issue is specifically at the end of add_security_headers: the function sets Strict-Transport-Security, X-Content-Type-Options, X-Frame-Options, Referrer-Policy, Content-Security-Policy, and Permissions-Policy, then returns response without adding X-Permitted-Cross-Domain-Policies.

Dataflow graph

flowchart LR
    classDef invis fill:white, stroke: none
    classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none

    subgraph File0["mongosync_insights/mongosync_insights.py"]
        direction LR
        %% Source

        subgraph Source
            direction LR

            v0["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/mongosync_insights.py#L64 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 64] response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"</a>"]
        end
        %% Intermediate
        %% Sink

        subgraph Sink
            direction LR

            v1["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/mongosync_insights.py#L78 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 78] return response</a>"]
        end
    end
    %% Class Assignment
    Source:::invis
    Sink:::invis

    File0:::invis

    %% Connections

    Source --> Sink

To resolve this comment:

✨ Commit fix suggestion

Suggested change

-        return response
+        response.headers["X-Permitted-Cross-Domain-Policies"] = "none"
+        return response

View step-by-step instructions

Update the add_security_headers function to set the missing X-Permitted-Cross-Domain-Policies header on every response.

Add a header assignment before return response, for example: response.headers["X-Permitted-Cross-Domain-Policies"] = "none".

Use "none" unless the application must explicitly allow Adobe cross-domain policy files. This tells clients not to load cross-domain policies from your site.

Keep this header in the existing @app.after_request handler so it is applied consistently to all routes.

Alternatively, if you intentionally serve a valid cross-domain policy file for legacy Flash or Silverlight integrations, set the header to the narrowest value that still works, such as response.headers["X-Permitted-Cross-Domain-Policies"] = "master-only" instead of a broader policy.

This finding was created by the Semgrep rule after-request-permitted-cross-domain-policies.

semgrep-code-mongodb · 2026-06-15T15:24:18Z

+        )
+        response.headers["X-XSS-Protection"] = "1; mode=block"
+        response.headers["Permissions-Policy"] = "geolocation=(), microphone=(), camera=()"
+        return response


Semgrep identified an issue in your code:

add_security_headers(response) sets several security headers but leaves caching unspecified, so authenticated responses may be stored and later exposed from browser or proxy caches.

More details about this

add_security_headers(response) adds headers like Strict-Transport-Security, X-Frame-Options, and Content-Security-Policy, but it returns the same response without setting any Cache-Control header. If /, /logout, or an error page later includes user-specific data, a shared browser, disk cache, or proxy can store that authenticated response and show it to the next person who uses the machine or intermediary.

Plausible exploit path:

A victim signs in and requests a page that returns sensitive content through Flask, and add_security_headers(response) runs on that response.

Because response.headers never gets a Cache-Control value, the browser or an intermediate cache is free to keep a copy of that page.

On a shared kiosk, VDI, or corporate proxy, an attacker opens the same URL after the victim logs out and the cached response is reused.

The attacker can read whatever the original response contained, such as account details, internal app data, or tokens embedded in the page, without needing the victim’s current session.

The risky part here is not the existing security headers themselves; it is that this global @app.after_request handler applies to every returned response but leaves caching behavior unspecified.

Dataflow graph

flowchart LR
    classDef invis fill:white, stroke: none
    classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none

    subgraph File0["mongosync_insights/mongosync_insights.py"]
        direction LR
        %% Source

        subgraph Source
            direction LR

            v0["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/mongosync_insights.py#L64 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 64] response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"</a>"]
        end
        %% Intermediate
        %% Sink

        subgraph Sink
            direction LR

            v1["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/mongosync_insights.py#L78 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 78] return response</a>"]
        end
    end
    %% Class Assignment
    Source:::invis
    Sink:::invis

    File0:::invis

    %% Connections

    Source --> Sink

To resolve this comment:

✨ Commit fix suggestion

Suggested change

-        return response
+    @app.after_request
+    def add_security_headers(response):
+        response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
+        response.headers["X-Content-Type-Options"] = "nosniff"
+        response.headers["X-Frame-Options"] = "DENY"
+        response.headers["Referrer-Policy"] = "no-referrer"
+        response.headers["Content-Security-Policy"] = (
+            "default-src 'self'; "
+            "script-src 'self' 'unsafe-inline' 'unsafe-eval' https://cdn.plot.ly; "
+            "style-src 'self' 'unsafe-inline'; "
+            "img-src 'self' data: blob: https:; "
+            "font-src 'self' data:; "
+            "connect-src 'self' blob:;"
+        )
+        response.headers["X-XSS-Protection"] = "1; mode=block"
+        response.headers["Permissions-Policy"] = "geolocation=(), microphone=(), camera=()"
+        # Keep static/public assets cacheable while preventing caching of pages that may
+        # contain authenticated, session-bound, or user-specific content.
+        if request.path.startswith("/static/") or request.path.startswith("/images/"):
+            response.headers["Cache-Control"] = "public, max-age=31536000, immutable"
+        else:
+            response.headers["Cache-Control"] = "no-store, no-cache, must-revalidate, private"
+            response.headers["Pragma"] = "no-cache"
+            response.headers["Expires"] = "0"
+        return response

View step-by-step instructions

Update the add_security_headers @app.after_request handler to also set a Cache-Control header on responses.

Use a restrictive value for authenticated or user-specific pages, for example response.headers["Cache-Control"] = "no-store, no-cache, must-revalidate, private".

Add a legacy compatibility header if you need to support older HTTP/1.0 caches: response.headers["Pragma"] = "no-cache".

Add an Expires header so that any copy a cache does keep is treated as already expired, for example response.headers["Expires"] = "0". It is the no-store directive in Cache-Control, not Expires, that tells browsers and proxies not to store sensitive responses.

Apply the strict cache headers at least to routes that return authenticated content or anything tied to a session cookie, such as pages reached after login, logout responses, and any page that shows user-specific data.

Alternatively, if some routes serve static or fully public content and you want them to remain cacheable, set Cache-Control conditionally in add_security_headers, such as checking request.path and only using no-store, no-cache, must-revalidate, private for sensitive routes while leaving public assets with a separate cache policy.
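One way to keep the path-based decision easy to unit-test is to move it into a small pure function that the after_request handler calls. The sketch below is an illustration only: the name cache_headers_for and the CACHEABLE_PREFIXES constant are hypothetical, and the two prefixes are taken from the suggested change rather than from the application's actual route list.

```python
from typing import Dict

# Path prefixes treated as public, cacheable assets. These mirror the
# suggested change; adjust them to the routes the application really serves.
CACHEABLE_PREFIXES = ("/static/", "/images/")


def cache_headers_for(path: str) -> Dict[str, str]:
    """Return the caching headers to set for a request path (hypothetical helper)."""
    # str.startswith accepts a tuple and matches any of its entries.
    if path.startswith(CACHEABLE_PREFIXES):
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # Everything else may carry session-bound content, so forbid storage.
    return {
        "Cache-Control": "no-store, no-cache, must-revalidate, private",
        "Pragma": "no-cache",
        "Expires": "0",
    }
```

Inside add_security_headers, the result could be applied with a loop such as for name, value in cache_headers_for(request.path).items(): response.headers[name] = value. Either way, request has to be imported from flask in that module for request.path to resolve.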

This finding was created by the Semgrep rule after-request-cache-control.

semgrep-code-mongodb · 2026-06-15T15:24:23Z

Semgrep found 1 tainted-log-injection-stdlib-flask finding:

mongosync_insights/lib/logs_metrics.py
- L113

Detected a logger that logs user input without properly neutralizing the output. The log message could contain characters such as carriage return (\r) and line feed (\n), which would let an attacker forge log entries or inject malicious content into the logs. Use proper input validation and/or output encoding to prevent log entries from being forged.
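As an illustration of the output encoding this finding asks for, the sketch below escapes non-printable characters before a user-influenced value reaches the logger. The helper name neutralize_for_log and the 200-character cap are assumptions made for this example, not code from the pull request.

```python
def neutralize_for_log(value: object, max_len: int = 200) -> str:
    """Return a single-line, length-limited form of value for logging.

    Hypothetical helper: non-printable characters (including CR and LF)
    are replaced by their backslash escapes so that one logged value can
    never start a new log line.
    """
    text = str(value)
    cleaned = "".join(
        ch if ch.isprintable() else ch.encode("unicode_escape").decode("ascii")
        for ch in text
    )
    # Truncate so an oversized value cannot flood the log.
    return cleaned[:max_len]
```

At the flagged call site this might be used as logger.error("Invalid MIME type: %s. Allowed: %s", neutralize_for_log(file_mime_type), ALLOWED_MIME_TYPES), which also passes the value as a logging argument instead of building the message with an f-string.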

View Dataflow Graph

flowchart LR
    classDef invis fill:white, stroke: none
    classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none

    subgraph File0["<b>mongosync_insights/lib/logs_metrics.py</b>"]
        direction LR
        %% Source

        subgraph Source
            direction LR

            v0["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L64 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 64] request.files</a>"]
        end
        %% Intermediate

        subgraph Traces0[Traces]
            direction TB

            v2["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L64 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 64] file</a>"]

            v3["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L76 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 76] filename</a>"]

            v4["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L107 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 107] detect_mime_type</a>"]

            v5["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L29 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 29] filename</a>"]

            v6["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L43 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 43] mime_type</a>"]

            v7["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L107 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 107] file_mime_type</a>"]
        end
            v2 --> v3
            v3 --> v4
            v4 --> v5
            v5 --> v6
            v6 --> v7
        %% Sink

        subgraph Sink
            direction LR

            v1["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L113 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 113] f&quot;Invalid MIME type: {file_mime_type}. Allowed: {ALLOWED_MIME_TYPES}&quot;</a>"]
        end
    end
    %% Class Assignment
    Source:::invis
    Sink:::invis

    Traces0:::invis
    File0:::invis

    %% Connections

    Source --> Traces0
    Traces0 --> Sink

iammfh · 2026-06-16T15:37:11Z

+    --license "Apache-2.0" \
+    --vendor "MongoDB Support" \
+    --description "Mongosync Insights — MongoDB migration monitoring dashboard" \
+    --url "https://github.com/mongodb/support-tools" \


This build script will pull the code from the support-tools repo.

Add mongosync_insights to repository

b185be6

Nixiss requested review from iammfh, kallimachos and kevinadi as code owners June 15, 2026 15:21

github-advanced-security AI found potential problems Jun 15, 2026

semgrep-code-mongodb Bot reviewed Jun 15, 2026

iammfh reviewed Jun 16, 2026

Nixiss wants to merge 1 commit into main from mongosync_insights


3 participants
