feat: improve Application Insights logging and telemetry handling #842

Open
Abdul-Microsoft wants to merge 9 commits into dev-v4 from psl-logging-improvements

Conversation

@Abdul-Microsoft
Collaborator

Purpose

  • This pull request introduces significant improvements to telemetry and event tracking in the backend, focusing on better integration with Azure Application Insights and OpenTelemetry. The main enhancements include enabling automatic request tracing for FastAPI, adding manual spans for WebSocket connections, attaching session IDs to telemetry spans for improved traceability, and standardizing event names for easier monitoring and analysis.

Telemetry and OpenTelemetry integration:

  • Added OpenTelemetry instrumentation for FastAPI, enabling automatic request tracing, dependency tracking, and operation_id propagation. WebSocket URLs are excluded from auto-instrumentation to reduce telemetry noise. [1] [2]
  • Suppressed noisy Azure Monitor exporter logs for cleaner logging output.
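
A minimal sketch of how this setup might look, not the PR's actual code. `configure_azure_monitor` and `FastAPIInstrumentor.instrument_app` (with its `excluded_urls` parameter, a comma-separated list of regexes) are real APIs; the helper names `quiet_loggers` and `configure_telemetry`, the logger names, and any URL pattern passed in are assumptions for illustration.

```python
import logging

# Assumption: these logger names cover the noisy exporter output. Adjust them
# to whatever actually shows up in the application's logs.
NOISY_LOGGERS = (
    "azure.monitor.opentelemetry.exporter",
    "azure.core.pipeline.policies.http_logging_policy",
)


def quiet_loggers(names=NOISY_LOGGERS, level=logging.WARNING):
    """Raise the level of the given loggers so routine export chatter is dropped."""
    for name in names:
        logging.getLogger(name).setLevel(level)


def configure_telemetry(app, connection_string, excluded_urls):
    """Enable Azure Monitor export and FastAPI request tracing (hypothetical helper).

    `excluded_urls` is a comma-separated string of regexes; matching request
    URLs are skipped by the auto-instrumentation.
    """
    # Imported lazily so this module still loads where the packages are absent.
    from azure.monitor.opentelemetry import configure_azure_monitor
    from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

    configure_azure_monitor(connection_string=connection_string)
    FastAPIInstrumentor.instrument_app(app, excluded_urls=excluded_urls)
    quiet_loggers()
```

A call might look like `configure_telemetry(app, conn_str, excluded_urls="/socket")`, where `/socket` is a placeholder for whatever path the WebSocket routes use. How the distro's bundled instrumentations interact with an explicit `instrument_app` call can depend on package versions, so this is worth verifying against the installed releases.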

WebSocket telemetry improvements:

  • Manually created telemetry spans for WebSocket connections, including session_id attributes when available, to ensure proper traceability since auto-instrumentation is excluded for these endpoints.
  • Updated WebSocket event tracking to include session_id and improved event naming conventions (WebSocket_Connected, WebSocket_Disconnected). [1] [2]
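
Because the WebSocket routes are excluded from auto-instrumentation, the span has to be opened by hand. The sketch below shows one way to do that under stated assumptions: the helper name `websocket_span` and the span name `WebSocket_Connection` are hypothetical, and span events stand in for whatever event-tracking helper the backend actually calls. The `tracer` argument is expected to behave like an OpenTelemetry `Tracer`, such as the object returned by `opentelemetry.trace.get_tracer(__name__)`.

```python
from contextlib import contextmanager


@contextmanager
def websocket_span(tracer, session_id=None):
    """Wrap one WebSocket connection in a manually created span."""
    # "WebSocket_Connection" is an illustrative span name, not taken from the PR.
    with tracer.start_as_current_span("WebSocket_Connection") as span:
        if session_id:
            # Attach the session only when the client supplied one.
            span.set_attribute("session_id", session_id)
        span.add_event("WebSocket_Connected")
        try:
            yield span
        finally:
            # Runs on normal close and on errors raised inside the block.
            span.add_event("WebSocket_Disconnected")
```

In a handler, the connection loop would sit inside `with websocket_span(tracer, session_id): ...`, so that everything recorded while the socket is open shares one span.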

Session ID propagation:

  • Attached session_id to telemetry spans in multiple endpoints (process_request, plan_approval, user_clarification, agent_message_user) to provide end-to-end traceability across user sessions. [1] [2] [3] [4]
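
For the HTTP endpoints, auto-instrumentation already creates the request span, so the session only needs to be added to the current one. A hedged sketch follows; `attach_session_id` is a hypothetical helper name, while `is_recording` and `set_attribute` are standard OpenTelemetry span methods.

```python
def attach_session_id(span, session_id):
    """Set session_id on a span; return True if the attribute was written."""
    if not session_id:
        return False  # nothing to attach
    if not span.is_recording():
        return False  # non-recording spans drop attributes, so skip the work
    span.set_attribute("session_id", str(session_id))
    return True
```

Inside an endpoint this could be called as `attach_session_id(trace.get_current_span(), session_id)`, with `trace` imported from `opentelemetry` and `session_id` taken from the request payload.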

Event naming standardization:

  • Standardized event names for telemetry tracking, replacing generic or inconsistent names with clearer, structured names (e.g., Plan_Created, Error_Plan_Creation_Failed, Config_Team_Uploaded, Error_Config_Model_Validation_Failed). Dynamic event names are used for agent messages and plan approvals to improve monitoring granularity. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]

Error tracking improvements:

  • Updated error event names throughout the codebase for greater clarity and consistency, making it easier to monitor and diagnose failures (e.g., Error_User_Not_Found, Error_Init_Team_Failed, Error_RAI_Check_Failed). [1] [2] [3] [4] [5] [6]
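
The naming pattern in the examples above (underscore-separated capitalized words, with an `Error_` prefix for failures) could be enforced by a small helper rather than by hand-written strings. This is only a sketch of that idea; `event_name` is a hypothetical function and is not part of the PR.

```python
import re


def event_name(*parts, error=False):
    """Build a name like Plan_Created or Error_Plan_Creation_Failed."""
    words = []
    for part in parts:
        # Split on anything that is not a letter or digit.
        words.extend(w for w in re.split(r"[^0-9A-Za-z]+", part) if w)
    # Capitalize only the first character so acronyms such as RAI survive.
    name = "_".join(w[:1].upper() + w[1:] for w in words)
    return f"Error_{name}" if error else name
```

A dynamic name could then be built as `event_name("plan", "approved" if approved else "rejected")`, where the `approved` flag and the resulting names are illustrative rather than the ones the PR emits.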

These changes collectively enhance observability, simplify telemetry analysis, and improve traceability for user sessions and backend operations.

Does this introduce a breaking change?

  • Yes
  • No

How to Test

  • Get the code
git clone [repo-address]
cd [repo-name]
git checkout [branch-name]
npm install
  • Test the code

What to Check

Verify that the following are valid

  • ...

Other Information

…PI application

- Attach session IDs to spans for better traceability in Application Insights.
Contributor

Copilot AI left a comment

Pull request overview

This PR enhances backend observability by integrating Azure Application Insights with OpenTelemetry, adding request auto-instrumentation for FastAPI, and improving manual telemetry for WebSocket flows and session-level traceability.

Changes:

  • Configure Azure Monitor (Application Insights) with FastAPI OpenTelemetry auto-instrumentation and reduced exporter noise.
  • Add manual spans for WebSocket connections and propagate session_id into spans/events across key endpoints.
  • Standardize telemetry event names for clearer monitoring and alerting.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/backend/v4/common/services/plan_service.py | Removes plan-approval event tracking from the service layer (telemetry is now handled at the API layer). |
| src/backend/v4/api/router.py | Adds WebSocket spans/session_id propagation and standardizes event names across multiple endpoints. |
| src/backend/app.py | Configures Azure Monitor + FastAPI OpenTelemetry instrumentation and suppresses noisy exporter logs. |

@Abdul-Microsoft
Collaborator Author

@copilot open a new pull request to apply changes based on the comments in this thread

Contributor

Copilot AI commented Mar 10, 2026

@Abdul-Microsoft I've opened a new pull request, #843, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 5 commits March 10, 2026 06:14
…gic in user_clarification endpoint

Co-authored-by: Abdul-Microsoft <192570837+Abdul-Microsoft@users.noreply.github.com>
fix: resolve memory_store unbound error and RAI condition logic in user_clarification endpoint
3 participants