Add mongosync_insights to repository #24

Open
Nixiss wants to merge 1 commit into main from mongosync_insights

Nixiss (Collaborator) commented Jun 15, 2026

No description provided.

Returns the full snapshot dict (including template_data) or None.
"""
path = _snapshot_path(snapshot_id)
if not os.path.exists(path):
return None

try:
with open(path, 'r', encoding='utf-8') as f:

# Refresh mtime on snapshot file
try:
os.utime(path, None)
pass

meta_path = _snapshot_meta_path(snapshot_id)
if os.path.exists(meta_path):
meta_path = _snapshot_meta_path(snapshot_id)
if os.path.exists(meta_path):
try:
os.utime(meta_path, None)

if os.path.exists(meta_path):
try:
os.remove(meta_path)

response.set_cookie(
SESSION_COOKIE_NAME,
session_id,

response.set_cookie(
SESSION_COOKIE_NAME,
session_id,
), 400

db_name = (session_data or {}).get("verifier_db_name", "migration_verification_metadata")
return jsonify(gatherVerifierMetrics(connection_string, db_name))
return jsonify(result)
except Exception as e:
logger.error("Log search error: %s", e)
return jsonify({"error": "Search failed", "detail": str(e)}), 500
# Make HTTP GET request to the endpoint
url = f"http://{endpoint_url}"
logger.info(f"Fetching data from endpoint: {url}")
response = requests.get(url, timeout=10)

Contributor


Semgrep identified an issue in your code:

requests.get fetches url over plain HTTP, so anyone on the network can read or alter the endpoint response.

More details about this

requests.get(url, timeout=10) sends data to the url built as f"http://{endpoint_url}", so this call always uses plain HTTP instead of encrypted HTTPS.

A plausible attack is:

  1. A user runs this code on a shared office, cloud, or coffee-shop network, and endpoint_url points to a real service such as api.internal.example.com/status.
  2. Because url starts with http://, the request from requests.get(...) is sent in cleartext. An attacker on the same network can intercept it with tools like tcpdump or mitmproxy.
  3. The attacker can read the full request and response, including any headers, tokens, cookies, or operational data this endpoint returns in data = response.json().
  4. The attacker can also tamper with the response before your code parses it. That lets them inject fake JSON into data, which then flows into progress, warnings_list, and the dashboard values shown to users.

Example interception on the same network:

sudo tcpdump -A -i wlan0 'tcp port 80 and host api.internal.example.com'

If this request includes sensitive headers or the endpoint returns internal replication state, those values are exposed because the url is fetched over HTTP.

Dataflow graph
flowchart LR
    classDef invis fill:white, stroke: none
    classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none

    subgraph File0["<b>mongosync_insights/lib/live_migration_metrics.py</b>"]
        direction LR
        %% Source

        subgraph Source
            direction LR

            v0["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/live_migration_metrics.py#L561 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 561] http://</a>"]
        end
        %% Intermediate

        subgraph Traces0[Traces]
            direction TB

            v2["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/live_migration_metrics.py#L561 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 561] url</a>"]
        end
        %% Sink

        subgraph Sink
            direction LR

            v1["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/live_migration_metrics.py#L563 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 563] url</a>"]
        end
    end
    %% Class Assignment
    Source:::invis
    Sink:::invis

    Traces0:::invis
    File0:::invis

    %% Connections

    Source --> Traces0
    Traces0 --> Sink



To resolve this comment:

✨ Commit fix suggestion
  1. Change the URL construction to use https:// instead of http://, for example url = f"https://{endpoint_url}".
  2. Keep the requests.get call the same, but point it at the new HTTPS URL: response = requests.get(url, timeout=10).
  3. If endpoint_url can already include a scheme, normalize it before building the request so you do not accidentally keep an insecure URL. For example, replace a leading http:// with https://, or parse it and rebuild it with the https scheme.
  4. If the endpoint does not support HTTPS yet, update the service configuration or caller input so endpoint_url refers to an HTTPS-enabled endpoint such as example.com:443 instead of a plain HTTP listener.
  5. Manually verify that the request still succeeds against the target endpoint and that certificate errors do not occur. If you see TLS verification failures, fix the server certificate or trust configuration rather than disabling verification with verify=False.
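
The scheme normalization described in step 3 could be sketched as follows. This is a minimal illustration rather than code from the pull request: the helper name `to_https_url` is hypothetical, and it assumes `endpoint_url` is either a bare `host[:port][/path]` string or a full `http://` or `https://` URL.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https_url(endpoint_url: str) -> str:
    """Return endpoint_url rewritten to use the https scheme.

    Hypothetical helper: accepts a bare "host[:port][/path]" value or a
    full http(s) URL and rejects anything else.
    """
    raw = endpoint_url.strip()
    # A bare "host:port" would be misparsed by urlsplit, so add a scheme first.
    if "://" not in raw:
        raw = f"https://{raw}"

    parts = urlsplit(raw)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"Unsupported URL scheme: {parts.scheme!r}")
    if not parts.netloc:
        raise ValueError("Endpoint URL has no host")

    # Rebuild the URL with https, keeping host, port, path, and query as given.
    return urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))
```

The caller would then pass the result to `requests.get(url, timeout=10)` unchanged. An explicit port such as `:80` is preserved as given, so an endpoint configured with a plain-HTTP port still needs to be corrected at the source, as step 4 describes.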
💬 Ignore this finding

Reply with Semgrep commands to ignore this finding.

  • /fp <comment> for false positive
  • /ar <comment> for acceptable risk
  • /other <comment> for all other reasons

Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by request-with-http.

🛟 Help? Slack #semgrep-help or go/semgrep-help.

Resolution Options:

  • Fix the code
  • Reply /fp $reason (if security gap doesn’t exist)
  • Reply /ar $reason (if gap is valid but intentional; add mitigations/monitoring)
  • Reply /other $reason (e.g., test-only)

You can view more details about this finding in the Semgrep AppSec Platform.

)
response.headers["X-XSS-Protection"] = "1; mode=block"
response.headers["Permissions-Policy"] = "geolocation=(), microphone=(), camera=()"
return response

Contributor


Semgrep identified an issue in your code:

add_security_headers sets several security headers but omits X-Permitted-Cross-Domain-Policies before returning the Flask response. This can leave older Flash/Silverlight cross-domain access to your origin less restricted than intended.

More details about this

add_security_headers(response) adds several browser security headers to every Flask response, but it returns response without setting response.headers["X-Permitted-Cross-Domain-Policies"].

A plausible misuse is:

  1. An attacker hosts a malicious Flash or Silverlight file on another site.
  2. That content tries to load data from this app's origin, such as pages returned by hub() or files served by mi_static_js(filename).
  3. Because add_security_headers does not send X-Permitted-Cross-Domain-Policies, older Adobe/Microsoft cross-domain policy behavior is not explicitly locked down, so the plugin may rely on permissive defaults or other policy files.
  4. The attacker then reads or embeds information from your domain through that plugin content.

The issue is specifically at the end of add_security_headers: the function sets Strict-Transport-Security, X-Content-Type-Options, X-Frame-Options, Referrer-Policy, Content-Security-Policy, and Permissions-Policy, then returns response without adding X-Permitted-Cross-Domain-Policies.

Dataflow graph
flowchart LR
    classDef invis fill:white, stroke: none
    classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none

    subgraph File0["<b>mongosync_insights/mongosync_insights.py</b>"]
        direction LR
        %% Source

        subgraph Source
            direction LR

            v0["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/mongosync_insights.py#L64 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 64] response.headers[&quot;Strict-Transport-Security&quot;] = &quot;max-age=31536000; includeSubDomains&quot;</a>"]
        end
        %% Intermediate

        %% Sink

        subgraph Sink
            direction LR

            v1["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/mongosync_insights.py#L78 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 78] return response</a>"]
        end
    end
    %% Class Assignment
    Source:::invis
    Sink:::invis

    File0:::invis

    %% Connections

    Source --> Sink



To resolve this comment:

✨ Commit fix suggestion

Suggested change
- return response
+ response.headers["X-Permitted-Cross-Domain-Policies"] = "none"
+ return response

View step-by-step instructions
  1. Update the add_security_headers function to set the missing X-Permitted-Cross-Domain-Policies header on every response.
  2. Add a header assignment before return response, for example: response.headers["X-Permitted-Cross-Domain-Policies"] = "none".
  3. Use "none" unless the application must explicitly allow Adobe cross-domain policy files. This tells clients not to load cross-domain policies from your site.
  4. Keep this header in the existing @app.after_request handler so it is applied consistently to all routes.

Alternatively, if you intentionally serve a valid cross-domain policy file for legacy Flash or Silverlight integrations, set the header to the narrowest value that still works, such as response.headers["X-Permitted-Cross-Domain-Policies"] = "master-only" instead of a broader policy.
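
The choice between "none" and a narrower legacy value could be wrapped in a small helper like the one below. This is a sketch under stated assumptions, not the project's code: `set_cross_domain_policy` and `ALLOWED_POLICIES` are hypothetical names, and the helper only needs an object that supports item assignment, which is how `response.headers` is already used in the handler above.

```python
from collections.abc import MutableMapping

# Values defined for the X-Permitted-Cross-Domain-Policies header.
ALLOWED_POLICIES = frozenset(
    {"none", "master-only", "by-content-type", "by-ftp-filename", "all"}
)


def set_cross_domain_policy(headers: MutableMapping, policy: str = "none") -> None:
    """Set X-Permitted-Cross-Domain-Policies, defaulting to the strictest value.

    Hypothetical helper: rejects unknown values so a typo cannot silently
    produce a header that clients ignore.
    """
    if policy not in ALLOWED_POLICIES:
        raise ValueError(f"Unknown cross-domain policy: {policy!r}")
    headers["X-Permitted-Cross-Domain-Policies"] = policy


# Inside add_security_headers this would be called just before
# `return response`, for example:
#     set_cross_domain_policy(response.headers)
```

Defaulting to "none" keeps the strict behavior unless a caller opts in to a legacy value explicitly.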

💬 Ignore this finding

Reply with Semgrep commands to ignore this finding.

  • /fp <comment> for false positive
  • /ar <comment> for acceptable risk
  • /other <comment> for all other reasons

Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by after-request-permitted-cross-domain-policies.

🛟 Help? Slack #semgrep-help or go/semgrep-help.

Resolution Options:

  • Fix the code
  • Reply /fp $reason (if security gap doesn’t exist)
  • Reply /ar $reason (if gap is valid but intentional; add mitigations/monitoring)
  • Reply /other $reason (e.g., test-only)

You can view more details about this finding in the Semgrep AppSec Platform.

)
response.headers["X-XSS-Protection"] = "1; mode=block"
response.headers["Permissions-Policy"] = "geolocation=(), microphone=(), camera=()"
return response

Contributor


Semgrep identified an issue in your code:

add_security_headers(response) sets several security headers but leaves caching unspecified, so authenticated responses may be stored and later exposed from browser or proxy caches.

More details about this

add_security_headers(response) adds headers like Strict-Transport-Security, X-Frame-Options, and Content-Security-Policy, but it returns the same response without setting any Cache-Control header. If /, /logout, or an error page later includes user-specific data, a shared browser, disk cache, or proxy can store that authenticated response and show it to the next person who uses the machine or intermediary.

Plausible exploit path:

  1. A victim signs in and requests a page that returns sensitive content through Flask, and add_security_headers(response) runs on that response.
  2. Because response.headers never gets a Cache-Control value, the browser or an intermediate cache is free to keep a copy of that page.
  3. On a shared kiosk, VDI, or corporate proxy, an attacker opens the same URL after the victim logs out and the cached response is reused.
  4. The attacker can read whatever the original response contained, such as account details, internal app data, or tokens embedded in the page, without needing the victim’s current session.

The risky part here is not the existing security headers themselves; it is that this global @app.after_request handler applies to every returned response but leaves caching behavior unspecified.

Dataflow graph
flowchart LR
    classDef invis fill:white, stroke: none
    classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none

    subgraph File0["<b>mongosync_insights/mongosync_insights.py</b>"]
        direction LR
        %% Source

        subgraph Source
            direction LR

            v0["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/mongosync_insights.py#L64 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 64] response.headers[&quot;Strict-Transport-Security&quot;] = &quot;max-age=31536000; includeSubDomains&quot;</a>"]
        end
        %% Intermediate

        %% Sink

        subgraph Sink
            direction LR

            v1["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/mongosync_insights.py#L78 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 78] return response</a>"]
        end
    end
    %% Class Assignment
    Source:::invis
    Sink:::invis

    File0:::invis

    %% Connections

    Source --> Sink



To resolve this comment:

✨ Commit fix suggestion

Suggested change
- return response
+ @app.after_request
+ def add_security_headers(response):
+     response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
+     response.headers["X-Content-Type-Options"] = "nosniff"
+     response.headers["X-Frame-Options"] = "DENY"
+     response.headers["Referrer-Policy"] = "no-referrer"
+     response.headers["Content-Security-Policy"] = (
+         "default-src 'self'; "
+         "script-src 'self' 'unsafe-inline' 'unsafe-eval' https://cdn.plot.ly; "
+         "style-src 'self' 'unsafe-inline'; "
+         "img-src 'self' data: blob: https:; "
+         "font-src 'self' data:; "
+         "connect-src 'self' blob:;"
+     )
+     response.headers["X-XSS-Protection"] = "1; mode=block"
+     response.headers["Permissions-Policy"] = "geolocation=(), microphone=(), camera=()"
+     # Keep static/public assets cacheable while preventing caching of pages that may
+     # contain authenticated, session-bound, or user-specific content.
+     if request.path.startswith("/static/") or request.path.startswith("/images/"):
+         response.headers["Cache-Control"] = "public, max-age=31536000, immutable"
+     else:
+         response.headers["Cache-Control"] = "no-store, no-cache, must-revalidate, private"
+         response.headers["Pragma"] = "no-cache"
+         response.headers["Expires"] = "0"
+     return response

View step-by-step instructions
  1. Update the add_security_headers @app.after_request handler to also set a Cache-Control header on responses.
  2. Use a restrictive value for authenticated or user-specific pages, for example response.headers["Cache-Control"] = "no-store, no-cache, must-revalidate, private".
  3. Add a legacy compatibility header if you need to support older HTTP/1.0 caches: response.headers["Pragma"] = "no-cache".
  4. Add an expiration header to prevent stale cached copies, for example response.headers["Expires"] = "0". This makes browsers and proxies avoid storing sensitive responses.
  5. Apply the strict cache headers at least to routes that return authenticated content or anything tied to a session cookie, such as pages reached after login, logout responses, and any page that shows user-specific data.
  6. Alternatively, if some routes serve static or fully public content and you want them to remain cacheable, set Cache-Control conditionally in add_security_headers, such as checking request.path and only using no-store, no-cache, must-revalidate, private for sensitive routes while leaving public assets with a separate cache policy.
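
The path-based split in step 6 can be isolated into a pure function so it is easy to test apart from Flask. The sketch below is illustrative only: `cache_headers_for` and `STATIC_PREFIXES` are hypothetical names, and the `/static/` and `/images/` prefixes are taken from the suggested change above rather than verified against the application's routes.

```python
from typing import Dict

# Assumed prefixes for public, non-user-specific assets (from the suggestion above).
STATIC_PREFIXES = ("/static/", "/images/")


def cache_headers_for(path: str) -> Dict[str, str]:
    """Return the caching headers to apply for a given request path."""
    if path.startswith(STATIC_PREFIXES):
        # Public assets may be cached for a long time.
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # Everything else may carry session-bound content and must not be stored.
    return {
        "Cache-Control": "no-store, no-cache, must-revalidate, private",
        "Pragma": "no-cache",
        "Expires": "0",
    }


# Inside add_security_headers this could be applied as:
#     for name, value in cache_headers_for(request.path).items():
#         response.headers[name] = value
```

A one-year `immutable` policy is only safe if static asset URLs change whenever their contents change (for example, versioned filenames); otherwise a shorter `max-age` would be the more cautious choice.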
💬 Ignore this finding

Reply with Semgrep commands to ignore this finding.

  • /fp <comment> for false positive
  • /ar <comment> for acceptable risk
  • /other <comment> for all other reasons

Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by after-request-cache-control.

🛟 Help? Slack #semgrep-help or go/semgrep-help.

Resolution Options:

  • Fix the code
  • Reply /fp $reason (if security gap doesn’t exist)
  • Reply /ar $reason (if gap is valid but intentional; add mitigations/monitoring)
  • Reply /other $reason (e.g., test-only)

You can view more details about this finding in the Semgrep AppSec Platform.

@semgrep-code-mongodb

Contributor

Semgrep found 1 tainted-log-injection-stdlib-flask finding:

  • mongosync_insights/lib/logs_metrics.py

Detected a logger that logs user input without properly neutralizing the output. The log message could contain characters such as carriage return (\r) and line feed (\n), which would allow an attacker to forge log entries or include malicious content in the logs. Use proper input validation and/or output encoding to prevent log entries from being forged.
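
One way to neutralize such input before it reaches the logger is to escape non-printable characters. The following is a minimal sketch, not the fix adopted in this pull request: `neutralize_for_log` is a hypothetical helper, and the 200-character limit is an arbitrary assumption.

```python
import logging

logger = logging.getLogger(__name__)


def neutralize_for_log(value: object, max_len: int = 200) -> str:
    """Escape control characters so a value cannot start a new log line.

    Hypothetical helper: non-printable characters (including CR and LF)
    are replaced with their backslash escapes, then the result is truncated.
    """
    text = str(value)
    cleaned = "".join(
        ch if ch.isprintable() else ch.encode("unicode_escape").decode("ascii")
        for ch in text
    )
    return cleaned[:max_len]


# Example with an attacker-controlled value similar to file_mime_type:
suspicious = "text/plain\r\nINFO forged entry"
logger.warning("Invalid MIME type: %s", neutralize_for_log(suspicious))
```

Passing the sanitized value as a `%s` argument, rather than building the message with an f-string, also keeps the log format string constant.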

View Dataflow Graph
flowchart LR
    classDef invis fill:white, stroke: none
    classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none

    subgraph File0["<b>mongosync_insights/lib/logs_metrics.py</b>"]
        direction LR
        %% Source

        subgraph Source
            direction LR

            v0["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L64 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 64] request.files</a>"]
        end
        %% Intermediate

        subgraph Traces0[Traces]
            direction TB

            v2["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L64 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 64] file</a>"]

            v3["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L76 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 76] filename</a>"]

            v4["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L107 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 107] detect_mime_type</a>"]

            v5["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L29 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 29] filename</a>"]

            v6["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L43 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 43] mime_type</a>"]

            v7["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L107 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 107] file_mime_type</a>"]
        end
            v2 --> v3
            v3 --> v4
            v4 --> v5
            v5 --> v6
            v6 --> v7
        %% Sink

        subgraph Sink
            direction LR

            v1["<a href=https://github.com/mongodb/kb-assets/blob/b185be682d0b9442bd10bf49cedcee7d23ed754f/mongosync_insights/lib/logs_metrics.py#L113 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 113] f&quot;Invalid MIME type: {file_mime_type}. Allowed: {ALLOWED_MIME_TYPES}&quot;</a>"]
        end
    end
    %% Class Assignment
    Source:::invis
    Sink:::invis

    Traces0:::invis
    File0:::invis

    %% Connections

    Source --> Traces0
    Traces0 --> Sink


🛟 Help? Slack #semgrep-help or go/semgrep-help.

Resolution Options:

  • Fix the code
  • Reply /fp $reason (if security gap doesn’t exist)
  • Reply /ar $reason (if gap is valid but intentional; add mitigations/monitoring)
  • Reply /other $reason (e.g., test-only)

--license "Apache-2.0" \
--vendor "MongoDB Support" \
--description "Mongosync Insights — MongoDB migration monitoring dashboard" \
--url "https://github.com/mongodb/support-tools" \

Collaborator


This build script will pull the code from the support-tools repo.
