[WIP] Fix KeyError in import_traces method for missing fields#12
[WIP] Fix KeyError in import_traces method for missing fields#12
Conversation
… fields - Added import_traces() function to project-book.ipynb - Validates required fields: trace_id, state, input, output - Provides clear error messages for missing/invalid fields - Handles FileNotFoundError, JSONDecodeError, and ValueError - Added comprehensive test suite in tests/ Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
- Added import_traces_demo.py to demonstrate the function - Updated examples/README.md to document the new example - Demo shows all error handling scenarios Co-authored-by: toolate28 <105518313+toolate28@users.noreply.github.com>
a83ad87 to
1d8d219
Compare
There was a problem hiding this comment.
Pull request overview
This PR adds a new import_traces() function to safely load and validate trace data from JSON files. The function performs comprehensive validation of required fields (trace_id, state, input, output) and provides clear error messages when validation fails.
Changes:
- Added
import_traces()function implementation to project-book.ipynb with validation and error handling - Added comprehensive test suite with both pytest (test_import_traces.py) and standalone (test_import_traces_simple.py) versions
- Added demonstration script (import_traces_demo.py) showing usage scenarios
- Updated examples/README.md with documentation for the new functionality
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| project-book.ipynb | Added import_traces() function with comprehensive validation logic for trace data |
| tests/test_import_traces.py | Added pytest-based test suite covering 13 test scenarios including edge cases |
| tests/test_import_traces_simple.py | Added standalone test runner (no pytest dependency) with 10 test scenarios |
| examples/import_traces_demo.py | Added demonstration script showing 4 usage scenarios and error handling |
| examples/README.md | Added documentation section for import_traces_demo.py with usage instructions |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Build a helpful error message | ||
| error_msg = ( | ||
| f"Trace at index {idx} is missing required fields: {', '.join(missing_fields)}. " | ||
| f"Available fields: {', '.join(trace.keys()) if trace.keys() else 'none'}. " |
There was a problem hiding this comment.
The condition if trace.keys() will always evaluate to True because dict.keys() returns a dict_keys view object which is always truthy, even for an empty dict. This means the 'none' fallback will never be shown. The condition should be if trace or if len(trace) > 0 to properly check if the dictionary has any keys.
| f"Available fields: {', '.join(trace.keys()) if trace.keys() else 'none'}. " | |
| f"Available fields: {', '.join(trace.keys()) if trace else 'none'}. " |
| # Build a helpful error message | ||
| error_msg = ( | ||
| f"Trace at index {idx} is missing required fields: {', '.join(missing_fields)}. " | ||
| f"Available fields: {', '.join(trace.keys()) if trace.keys() else 'none'}. " |
There was a problem hiding this comment.
The condition if trace.keys() will always evaluate to True because dict.keys() returns a dict_keys view object which is always truthy, even for an empty dict. This means the 'none' fallback will never be shown. The condition should be if trace or if len(trace) > 0 to properly check if the dictionary has any keys.
| f"Available fields: {', '.join(trace.keys()) if trace.keys() else 'none'}. " | |
| f"Available fields: {', '.join(trace.keys()) if trace else 'none'}. " |
| "def import_traces(json_file_path: str) -> List[Dict[str, Any]]:", | ||
| " \"\"\"", | ||
| " Import trace data from a JSON file with proper error handling.", | ||
| " ", | ||
| " Handles JSON files containing trace data with required fields:", | ||
| " - trace_id: Unique identifier for the trace", | ||
| " - state: Current state of the trace", | ||
| " - input: Input data for the trace", | ||
| " - output: Output data from the trace", | ||
| " ", | ||
| " Args:", | ||
| " json_file_path: Path to the JSON file containing trace data", | ||
| " ", | ||
| " Returns:", | ||
| " List of trace dictionaries with validated fields", | ||
| " ", | ||
| " Raises:", | ||
| " FileNotFoundError: If the JSON file doesn't exist", | ||
| " json.JSONDecodeError: If the file contains invalid JSON", | ||
| " ValueError: If required fields are missing or invalid", | ||
| " ", | ||
| " Example:", | ||
| " >>> traces = import_traces(\"traces.json\")", | ||
| " >>> for trace in traces:", | ||
| " ... print(f\"Trace {trace['trace_id']}: {trace['state']}\")", | ||
| " \"\"\"", |
There was a problem hiding this comment.
The function has an all-or-nothing behavior where it rejects the entire file if any trace is invalid, even if other traces are valid. This behavior should be explicitly documented in the docstring, as users might expect partial imports or a way to skip invalid traces. Consider adding a note in the docstring explaining this design choice.
| if missing_fields: | ||
| error_msg = ( | ||
| f"Trace at index {idx} is missing required fields: {', '.join(missing_fields)}. " | ||
| f"Available fields: {', '.join(trace.keys()) if trace.keys() else 'none'}. " |
There was a problem hiding this comment.
The condition if trace.keys() will always evaluate to True because dict.keys() returns a dict_keys view object which is always truthy, even for an empty dict. This means the 'none' fallback will never be shown. The condition should be if trace or if len(trace) > 0 to properly check if the dictionary has any keys.
| f"Available fields: {', '.join(trace.keys()) if trace.keys() else 'none'}. " | |
| f"Available fields: {', '.join(trace.keys()) if trace else 'none'}. " |
|
|
||
|
|
||
| # Function is defined locally for testing purposes. | ||
| # The actual implementation is in project-book.ipynb. | ||
| def import_traces(json_file_path: str) -> List[Dict[str, Any]]: | ||
| """ | ||
| Import trace data from a JSON file with proper error handling. | ||
|
|
||
| Handles JSON files containing trace data with required fields: | ||
| - trace_id: Unique identifier for the trace | ||
| - state: Current state of the trace | ||
| - input: Input data for the trace | ||
| - output: Output data from the trace | ||
|
|
||
| Args: | ||
| json_file_path: Path to the JSON file containing trace data | ||
|
|
||
| Returns: | ||
| List of trace dictionaries with validated fields | ||
|
|
||
| Raises: | ||
| FileNotFoundError: If the JSON file doesn't exist | ||
| json.JSONDecodeError: If the file contains invalid JSON | ||
| ValueError: If required fields are missing or invalid | ||
| """ | ||
| # Check if file exists | ||
| if not Path(json_file_path).exists(): | ||
| raise FileNotFoundError(f"Trace file not found: {json_file_path}") | ||
|
|
||
| # Read and parse JSON | ||
| try: | ||
| with open(json_file_path, 'r', encoding='utf-8') as f: | ||
| data = json.load(f) | ||
| except json.JSONDecodeError as e: | ||
| raise json.JSONDecodeError( | ||
| f"Invalid JSON in file {json_file_path}: {str(e)}", | ||
| e.doc, | ||
| e.pos | ||
| ) | ||
|
|
||
| # Ensure data is a list | ||
| if isinstance(data, dict): | ||
| # If it's a single trace object, wrap it in a list | ||
| data = [data] | ||
| elif not isinstance(data, list): | ||
| raise ValueError( | ||
| f"Expected JSON to contain a list or dict, got {type(data).__name__}" | ||
| ) | ||
|
|
||
| # Required fields for trace data | ||
| required_fields = ['trace_id', 'state', 'input', 'output'] | ||
|
|
||
| # Validate each trace | ||
| validated_traces = [] | ||
| errors = [] | ||
|
|
||
| for idx, trace in enumerate(data): | ||
| if not isinstance(trace, dict): | ||
| errors.append(f"Trace at index {idx} is not a dictionary: {type(trace).__name__}") | ||
| continue | ||
|
|
||
| # Check for missing required fields | ||
| missing_fields = [field for field in required_fields if field not in trace] | ||
|
|
||
| if missing_fields: | ||
| # Build a helpful error message | ||
| error_msg = ( | ||
| f"Trace at index {idx} is missing required fields: {', '.join(missing_fields)}. " | ||
| f"Available fields: {', '.join(trace.keys()) if trace.keys() else 'none'}. " | ||
| f"Required fields are: {', '.join(required_fields)}" | ||
| ) | ||
| errors.append(error_msg) | ||
| continue | ||
|
|
||
| # Validate that required fields are not None | ||
| none_fields = [field for field in required_fields if trace[field] is None] | ||
| if none_fields: | ||
| error_msg = ( | ||
| f"Trace at index {idx} has null values for required fields: {', '.join(none_fields)}" | ||
| ) | ||
| errors.append(error_msg) | ||
| continue | ||
|
|
||
| validated_traces.append(trace) | ||
|
|
||
| # If we have errors, raise a comprehensive error message | ||
| if errors: | ||
| error_summary = f"Found {len(errors)} invalid trace(s) in {json_file_path}:\n" | ||
| error_summary += "\n".join(f" - {err}" for err in errors[:5]) # Show first 5 errors | ||
| if len(errors) > 5: | ||
| error_summary += f"\n ... and {len(errors) - 5} more error(s)" | ||
| raise ValueError(error_summary) | ||
|
|
||
| if not validated_traces: | ||
| raise ValueError(f"No valid traces found in {json_file_path}") | ||
|
|
||
| return validated_traces | ||
|
|
||
|
|
There was a problem hiding this comment.
The import_traces function is duplicated across multiple files (tests/test_import_traces.py, tests/test_import_traces_simple.py, examples/import_traces_demo.py, and project-book.ipynb). This code duplication makes maintenance difficult and error-prone. Consider creating a shared module or package structure where the function is defined once and imported by all test and example files.
| # Function is defined locally for testing purposes. | |
| # The actual implementation is in project-book.ipynb. | |
| def import_traces(json_file_path: str) -> List[Dict[str, Any]]: | |
| """ | |
| Import trace data from a JSON file with proper error handling. | |
| Handles JSON files containing trace data with required fields: | |
| - trace_id: Unique identifier for the trace | |
| - state: Current state of the trace | |
| - input: Input data for the trace | |
| - output: Output data from the trace | |
| Args: | |
| json_file_path: Path to the JSON file containing trace data | |
| Returns: | |
| List of trace dictionaries with validated fields | |
| Raises: | |
| FileNotFoundError: If the JSON file doesn't exist | |
| json.JSONDecodeError: If the file contains invalid JSON | |
| ValueError: If required fields are missing or invalid | |
| """ | |
| # Check if file exists | |
| if not Path(json_file_path).exists(): | |
| raise FileNotFoundError(f"Trace file not found: {json_file_path}") | |
| # Read and parse JSON | |
| try: | |
| with open(json_file_path, 'r', encoding='utf-8') as f: | |
| data = json.load(f) | |
| except json.JSONDecodeError as e: | |
| raise json.JSONDecodeError( | |
| f"Invalid JSON in file {json_file_path}: {str(e)}", | |
| e.doc, | |
| e.pos | |
| ) | |
| # Ensure data is a list | |
| if isinstance(data, dict): | |
| # If it's a single trace object, wrap it in a list | |
| data = [data] | |
| elif not isinstance(data, list): | |
| raise ValueError( | |
| f"Expected JSON to contain a list or dict, got {type(data).__name__}" | |
| ) | |
| # Required fields for trace data | |
| required_fields = ['trace_id', 'state', 'input', 'output'] | |
| # Validate each trace | |
| validated_traces = [] | |
| errors = [] | |
| for idx, trace in enumerate(data): | |
| if not isinstance(trace, dict): | |
| errors.append(f"Trace at index {idx} is not a dictionary: {type(trace).__name__}") | |
| continue | |
| # Check for missing required fields | |
| missing_fields = [field for field in required_fields if field not in trace] | |
| if missing_fields: | |
| # Build a helpful error message | |
| error_msg = ( | |
| f"Trace at index {idx} is missing required fields: {', '.join(missing_fields)}. " | |
| f"Available fields: {', '.join(trace.keys()) if trace.keys() else 'none'}. " | |
| f"Required fields are: {', '.join(required_fields)}" | |
| ) | |
| errors.append(error_msg) | |
| continue | |
| # Validate that required fields are not None | |
| none_fields = [field for field in required_fields if trace[field] is None] | |
| if none_fields: | |
| error_msg = ( | |
| f"Trace at index {idx} has null values for required fields: {', '.join(none_fields)}" | |
| ) | |
| errors.append(error_msg) | |
| continue | |
| validated_traces.append(trace) | |
| # If we have errors, raise a comprehensive error message | |
| if errors: | |
| error_summary = f"Found {len(errors)} invalid trace(s) in {json_file_path}:\n" | |
| error_summary += "\n".join(f" - {err}" for err in errors[:5]) # Show first 5 errors | |
| if len(errors) > 5: | |
| error_summary += f"\n ... and {len(errors) - 5} more error(s)" | |
| raise ValueError(error_summary) | |
| if not validated_traces: | |
| raise ValueError(f"No valid traces found in {json_file_path}") | |
| return validated_traces | |
| import importlib.util | |
| # Load the shared import_traces implementation from the examples module | |
| _examples_dir = Path(__file__).resolve().parents[1] / "examples" | |
| _import_traces_path = _examples_dir / "import_traces_demo.py" | |
| _spec = importlib.util.spec_from_file_location( | |
| "import_traces_demo", _import_traces_path | |
| ) | |
| _import_traces_module = importlib.util.module_from_spec(_spec) | |
| assert _spec is not None and _spec.loader is not None | |
| _spec.loader.exec_module(_import_traces_module) | |
| import_traces = _import_traces_module.import_traces |
| " # Build a helpful error message", | ||
| " error_msg = (", | ||
| " f\"Trace at index {idx} is missing required fields: {', '.join(missing_fields)}. \"", | ||
| " f\"Available fields: {', '.join(trace.keys()) if trace.keys() else 'none'}. \"", |
There was a problem hiding this comment.
The condition if trace.keys() will always evaluate to True because dict.keys() returns a dict_keys view object which is always truthy, even for an empty dict. This means the 'none' fallback will never be shown. The condition should be if trace or if len(trace) > 0 to properly check if the dictionary has any keys.
| " f\"Available fields: {', '.join(trace.keys()) if trace.keys() else 'none'}. \"", | |
| " f\"Available fields: {', '.join(trace.keys()) if trace else 'none'}. \"", |
| malformed_file.write_text("{not valid json") | ||
|
|
||
| try: | ||
| traces = import_traces(str(malformed_file)) |
There was a problem hiding this comment.
Variable traces is not used.
| traces = import_traces(str(malformed_file)) | |
| import_traces(str(malformed_file)) |
| invalid_file.write_text(json.dumps(invalid_data, indent=2)) | ||
|
|
||
| try: | ||
| traces = import_traces(str(invalid_file)) |
There was a problem hiding this comment.
| print("\n📋 Demo 3: Handling file not found") | ||
| print("-" * 60) | ||
| try: | ||
| traces = import_traces("nonexistent_file.json") |
| """ | ||
| import json | ||
| import pytest | ||
| import tempfile |
There was a problem hiding this comment.
Import of 'tempfile' is not used.
| import tempfile |
|
@copilot apply changes based on the comments in this thread |
Fix: Add error handling for import_traces() method with missing JSON fields
Implementation Plan
✅ Implementation Complete
Changes Made
Added
import_traces()function toproject-book.ipynb:Added comprehensive test suite:
tests/test_import_traces.py- pytest-based tests (for when pytest is available)tests/test_import_traces_simple.py- standalone test runnerAdded demonstration script:
examples/import_traces_demo.py- Shows all usage scenariosexamples/README.mdwith documentationQuality Assurance:
Error Handling
The function now properly handles:
Example Error Messages
Before (would crash with KeyError):
After (clear, actionable error):
Original prompt
This section details on the original issue you should resolve
<issue_title>The
import_traces()method will crash with an unhandledKeyErrorif a JSON file contains trace data with missing required fields liketrace_id,state,input, oroutput.</issue_title><issue_description>The
import_traces()method will crash with an unhandledKeyErrorif a JSON file contains trace data with missing required fields liketrace_id,state,input, oroutput.View Details
📝 Patch Details
Analysis
Unhandled KeyError when importing trace data with missing required fields
What fails:
NpcTeleprompter.import_traces()raises unhandledKeyErrorif a JSON file contains trace data missing required fields (trace_id,state,input, oroutput)How to reproduce:
Result: Unhandled
KeyError: 'trace_id'with no context about which trace caused the problem or what fields were missing. Entire import operation fails without loading any traces.Expected: Should provide clear error message indicating which trace failed and what fields were missing, helping users fix corrupted or third-party generated JSON files.
Fix implemented:
NpcTrace.from_dict()to check all required fields and raise descriptiveValueErrorwith field names and trace dataimport_traces()method will crash with an unhandledKeyErrorif a JSON file contains trace data with missing required fields liketrace_id\,state\,input\, oroutput\. #7✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.