feat(services): add OpenTelemetry tracing to Storage, Function, Database, and Cache services#369
Forward iteration with slice removal during the loop causes the next element to be skipped (classic off-by-one bug). When streams[i] is removed, all subsequent elements shift left, but the loop's i++ then moves past the element that just shifted into position i. Fix by iterating backwards - removing streams[i] doesn't affect indices of elements we haven't processed yet (0 through i-1).
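The backward-iteration fix described above can be sketched as follows. This is an illustration only: the `stream` type, its `closed` field, and the `removeClosed` helper are hypothetical names, not taken from the actual change.

```go
package main

import "fmt"

// stream is a hypothetical stand-in for the element type held in the slice.
type stream struct {
	id     int
	closed bool
}

// removeClosed deletes closed streams in place. It walks the slice from the
// end, so removing streams[i] only shifts elements at indices above i, all of
// which have already been visited. Indices 0 through i-1 are unaffected.
func removeClosed(streams []stream) []stream {
	for i := len(streams) - 1; i >= 0; i-- {
		if streams[i].closed {
			streams = append(streams[:i], streams[i+1:]...)
		}
	}
	return streams
}

func main() {
	// Two adjacent closed streams: a forward loop with i++ would skip id 3.
	streams := []stream{{1, false}, {2, true}, {3, true}, {4, false}}
	fmt.Println(removeClosed(streams)) // prints [{1 false} {4 false}]
}
```

Two adjacent removable elements are the case a forward loop gets wrong, which is why the example places them side by side.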
Addresses multiple DoS vectors from unbounded io operations and maps:
- storage_handler.go: Wrap Upload/UploadPart/ServePresignedUpload bodies with io.LimitReader(maxUploadSize=5GB)
- core/services/storage.go: Wrap UploadPart with io.LimitReader(maxPartSize=5GB)
- workers/pipeline_worker.go: Reject payloads >10MB before json.Unmarshal
- core/services/instance.go: Wrap stats stream with io.LimitReader(maxStatsSize=1MB)
- core/services/cache.go: Same for cache stats stream
- core/services/function.go: Wrap log reader with io.LimitReader(maxLogSize=1MB)
- core/services/auth.go: Purge expired lockout/failure map entries probabilistically on each incrementFailure to prevent unbounded growth
- storage/coordinator/service.go: Track totalSize and reject if >5GB
- storage/node/store.go: Wrap WriteStream and Assemble with size checks
- repositories/filesystem/adapter.go: Wrap Write with io.LimitReader(maxObjectSize=5GB)
…ase, and Cache services
Instrument the following high-value methods with spans following the existing tracing pattern (otel.Tracer + defer span.End + span.RecordError):
- StorageService: Upload, Download, CreateBucket, DeleteBucket
- FunctionService: CreateFunction, InvokeFunction, GetFunctionLogs
- DatabaseService: CreateDatabase, RestoreDatabase, doRotateCredentials
- CacheService: CreateCache, GetCacheStats
Uses tracer name constants (e.g. "storage-service") for consistency with existing otelgin middleware and pkg/tracing infrastructure.
Closes #339.
Summary
Instrument StorageService, FunctionService, DatabaseService, and CacheService with OpenTelemetry spans for improved observability. Closes #339.
Changes
- StorageService (storage-service): Upload, Download, CreateBucket, DeleteBucket
- FunctionService (function-service): CreateFunction, InvokeFunction, GetFunctionLogs
- DatabaseService (database-service): CreateDatabase, RestoreDatabase, doRotateCredentials
- CacheService (cache-service): CreateCache, GetCacheStats
Implementation
Each method follows the established OTel pattern:
Uses `_` for the span-wrapped context to avoid propagating trace context into repo/mock calls (which would break mock equality assertions in unit tests).
Test Plan
- go build ./internal/core/services/...
- go test ./internal/core/services/... -count=1 (all pass)
Verification
With TRACING_ENABLED=true and Jaeger running locally, traces will appear at http://localhost:16686.