feat: improve Application Insights logging and telemetry handling #842
Abdul-Microsoft wants to merge 9 commits into dev-v4
…PI application - Attach session IDs to spans for better traceability in Application Insights.
Pull request overview
This PR enhances backend observability by integrating Azure Application Insights with OpenTelemetry, adding request auto-instrumentation for FastAPI, and improving manual telemetry for WebSocket flows and session-level traceability.
Changes:
- Configure Azure Monitor (Application Insights) with FastAPI OpenTelemetry auto-instrumentation and reduced exporter noise.
- Add manual spans for WebSocket connections and propagate `session_id` into spans/events across key endpoints.
- Standardize telemetry event names for clearer monitoring and alerting.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/backend/v4/common/services/plan_service.py` | Removes plan-approval event tracking from the service layer (telemetry is now handled at the API layer). |
| `src/backend/v4/api/router.py` | Adds WebSocket spans/`session_id` propagation and standardizes event names across multiple endpoints. |
| `src/backend/app.py` | Configures Azure Monitor + FastAPI OpenTelemetry instrumentation and suppresses noisy exporter logs. |
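The log suppression described for `src/backend/app.py` can be sketched with the standard library alone. The logger names below are assumptions chosen for illustration (loggers commonly associated with the Azure SDK and its exporter); the PR summary does not list which loggers are actually quieted, and `quiet_exporter_logs` is a hypothetical helper name.

```python
import logging

# Assumed logger names; the PR summary does not say which ones it quiets.
NOISY_LOGGERS = (
    "azure.core.pipeline.policies.http_logging_policy",
    "azure.monitor.opentelemetry.exporter",
)


def quiet_exporter_logs(level: int = logging.WARNING) -> None:
    """Raise the threshold of chatty loggers so routine INFO records are dropped."""
    for name in NOISY_LOGGERS:
        logging.getLogger(name).setLevel(level)


quiet_exporter_logs()
```

Setting the level on the parent logger also covers its child loggers, since they inherit the effective level unless they set their own.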
@copilot open a new pull request to apply changes based on the comments in this thread

@Abdul-Microsoft I've opened a new pull request, #843, to work on those changes. Once the pull request is ready, I'll request review from you.
…gic in user_clarification endpoint Co-authored-by: Abdul-Microsoft <192570837+Abdul-Microsoft@users.noreply.github.com>
fix: resolve memory_store unbound error and RAI condition logic in user_clarification endpoint
Purpose
Telemetry and OpenTelemetry integration:

WebSocket telemetry improvements:
- … (`WebSocket_Connected`, `WebSocket_Disconnected`). [1] [2]

Session ID propagation:
- … (`process_request`, `plan_approval`, `user_clarification`, `agent_message_user`) to provide end-to-end traceability across user sessions. [1] [2] [3] [4]

Event naming standardization:
- … (`Plan_Created`, `Error_Plan_Creation_Failed`, `Config_Team_Uploaded`, `Error_Config_Model_Validation_Failed`). Dynamic event names are used for agent messages and plan approvals to improve monitoring granularity. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]

Error tracking improvements:
- … (`Error_User_Not_Found`, `Error_Init_Team_Failed`, `Error_RAI_Check_Failed`). [1] [2] [3] [4] [5] [6]

These changes collectively enhance observability, simplify telemetry analysis, and improve traceability for user sessions and backend operations.
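To illustrate the naming convention and session-level traceability described above, here is a minimal sketch. The helpers `build_event_name` and `build_event` are hypothetical, and the word-joining rule is inferred from the example names in this description; neither is taken from the PR's code.

```python
from typing import Any, Dict, Optional


def build_event_name(*words: str, error: bool = False) -> str:
    """Join words into a name such as Plan_Created or Error_Plan_Creation_Failed."""
    # Upper-case only the first letter so acronyms like "RAI" stay intact.
    titled = [w[:1].upper() + w[1:] for w in words if w]
    if error:
        titled.insert(0, "Error")
    return "_".join(titled)


def build_event(
    name: str, session_id: Optional[str] = None, **properties: Any
) -> Dict[str, Any]:
    """Bundle an event name with its properties, adding session_id when known."""
    props = dict(properties)
    if session_id:
        props["session_id"] = session_id
    return {"name": name, "properties": props}


# Hypothetical usage: the IDs below are placeholders, not values from the PR.
event = build_event(
    build_event_name("plan", "created"), session_id="abc-123", plan_id="p1"
)
```

Carrying `session_id` in every event's properties is what lets events from different endpoints be filtered and correlated for a single user session.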
Does this introduce a breaking change?
How to Test
What to Check
Verify that the following are valid
Other Information