2 changes: 1 addition & 1 deletion go-sdk/VERSION
@@ -1 +1 @@
-v1.0.0
+v1.1.0
153 changes: 153 additions & 0 deletions inferflow/PREDICT_APIS_AND_FEATURE_LOGGING.md
@@ -0,0 +1,153 @@
# Predict APIs and Feature Logging

This document describes the Predict gRPC APIs exposed by Inferflow, and the high-level request/logging flow for PointWise, PairWise, and SlateWise inference.

## APIs Exposed

The `Predict` service exposes three RPCs:

- `InferPointWise(PointWiseRequest) returns (PointWiseResponse)`
- `InferPairWise(PairWiseRequest) returns (PairWiseResponse)`
- `InferSlateWise(SlateWiseRequest) returns (SlateWiseResponse)`

Reference: `inferflow/server/proto/predict.proto`.

## API Intent

### PointWise (sometimes referred to as "pintwise")

- Use when you need per-target scoring.
- Input:
  - `context_features` (request-level features)
  - `target_input_schema` (schema for `Target.feature_values`)
  - `targets` (entities/items to score)
- Output:
  - `target_output_schema`
  - `target_scores` (one score row per target)
  - `request_error` (request-level error)

### PairWise

- Use when you need pair-level scoring/ranking and optional per-target outputs.
- Input:
  - `targets` (base target pool)
  - `pairs` (`first_target_index` + `second_target_index`)
  - `pair_input_schema` for pair-level features
  - `target_input_schema` for target-level features
- Output:
  - `pair_scores` aligned with `pair_output_schema`
  - `target_scores` aligned with `target_output_schema`
  - `request_error`
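
How `pairs` refer back into `targets` can be illustrated with a small sketch. The `pair` struct and the `resolvePairs` helper are hypothetical and are not taken from `predict.proto` or the adapter; only the field names `first_target_index` and `second_target_index` come from the description above. The bounds check is an assumption about reasonable behavior, not a statement of what Inferflow does.

```go
package main

import "fmt"

// pair is a hypothetical stand-in for one entry of `pairs`.
type pair struct {
	FirstTargetIndex  int
	SecondTargetIndex int
}

// resolvePairs maps each pair's indices to the target IDs they point at.
// It returns an error for any index that falls outside the target pool.
func resolvePairs(targets []string, pairs []pair) ([][2]string, error) {
	out := make([][2]string, 0, len(pairs))
	for i, p := range pairs {
		for _, idx := range []int{p.FirstTargetIndex, p.SecondTargetIndex} {
			if idx < 0 || idx >= len(targets) {
				return nil, fmt.Errorf("pair %d: target index %d out of range [0, %d)", i, idx, len(targets))
			}
		}
		out = append(out, [2]string{targets[p.FirstTargetIndex], targets[p.SecondTargetIndex]})
	}
	return out, nil
}

func main() {
	targets := []string{"item-a", "item-b", "item-c"}
	resolved, err := resolvePairs(targets, []pair{{0, 2}, {1, 0}})
	fmt.Println(resolved, err)
}
```

Because pairs carry indices rather than copies of the targets, target-level features only need to be sent once per request, even when a target appears in many pairs.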

### SlateWise

- Use when you need slate-level scoring plus optional per-target outputs.
- Input:
  - `targets` (base target pool)
  - `slates` (`target_indices` per slate)
  - `slate_input_schema` for slate-level features
  - `target_input_schema` for target-level features
- Output:
  - `slate_scores` aligned with `slate_output_schema`
  - `target_scores` aligned with `target_output_schema`
  - `request_error`

## High-Level Flow

The runtime path is common across all three APIs with adapter-specific shaping:

1. Receive gRPC request in `PredictService` (`predict_handler.go`).
2. Load model config via `config.GetModelConfig(model_config_id)`.
3. Adapt request into `components.ComponentRequest` (`predict_adapter.go`):
   - Build `ComponentData` (target matrix).
   - For PairWise/SlateWise, also build `SlateData` (slate matrix).
4. Execute DAG components via `executor.Execute(...)`.
5. Build RPC response from matrices (`predict_response.go`):
   - PointWise from target matrix.
   - PairWise/SlateWise from target + slate matrices.
6. Emit metrics (`request.total`, `latency`, `batch.size`).
7. Optionally trigger feature logging asynchronously (`maybeLogInferenceResponse`).

## Matrix Model Used by Predict

- `ComponentData`:
  - Main per-target matrix.
  - Always present.
- `SlateData`:
  - Per-slate matrix.
  - Present for PairWise and SlateWise.
  - Contains `slate_target_indices` and slate-level features.
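
The two-matrix split can be pictured with a minimal sketch. The `matrix` and `componentRequest` types below are hypothetical; the real `ComponentData` and `SlateData` types are not shown in this document, and storing indices as a formatted string is purely an illustrative choice. A pair is treated here as a two-element slate, which is an assumption.

```go
package main

import "fmt"

// matrix is a hypothetical row/column container.
type matrix struct {
	Columns []string
	Rows    [][]string
}

// componentRequest mirrors the split described above (hypothetical shape).
type componentRequest struct {
	ComponentData matrix  // per-target rows, always present
	SlateData     *matrix // per-slate rows, nil for PointWise
}

// newComponentRequest builds one target row per ID and, when slates are
// given, one slate row per slate holding its target indices.
func newComponentRequest(targetIDs []string, slates [][]int) componentRequest {
	req := componentRequest{ComponentData: matrix{Columns: []string{"target_id"}}}
	for _, id := range targetIDs {
		req.ComponentData.Rows = append(req.ComponentData.Rows, []string{id})
	}
	if slates == nil {
		return req // PointWise: target matrix only
	}
	sd := &matrix{Columns: []string{"slate_target_indices"}}
	for _, idxs := range slates {
		sd.Rows = append(sd.Rows, []string{fmt.Sprint(idxs)})
	}
	req.SlateData = sd
	return req
}

func main() {
	pointWise := newComponentRequest([]string{"a", "b", "c"}, nil)
	slateWise := newComponentRequest([]string{"a", "b", "c"}, [][]int{{0, 2}, {1, 2}})
	fmt.Println(pointWise.SlateData == nil, len(slateWise.SlateData.Rows))
}
```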

## Feature Logging: Current Behavior

Feature logging is implemented in `inferflow/handlers/inferflow/feature_logging.go`.

### Trigger Conditions

Logging is attempted only when all of the following are true:

- `conf.ResponseConfig.LoggingPerc > 0`
- Random sampling check passes: `rand.Intn(100)+1 <= LoggingPerc`
- `tracking_id` is non-empty

Reference: `maybeLogInferenceResponse` in `predict_handler.go`.
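
The three conditions combine into a single predicate. The sketch below is not the body of `maybeLogInferenceResponse`; `shouldLog` and its injected `roll` parameter are hypothetical names, introduced so the sampling decision can be tested deterministically. Only the expression `rand.Intn(100)+1 <= LoggingPerc` is taken from the conditions above.

```go
package main

import (
	"fmt"
	"math/rand"
)

// shouldLog reports whether a response should be logged.
// roll is expected to behave like rand.Intn: it returns a value in [0, n).
func shouldLog(loggingPerc int, trackingID string, roll func(n int) int) bool {
	if loggingPerc <= 0 {
		return false // logging disabled for this model config
	}
	if trackingID == "" {
		return false // nothing to correlate the log with
	}
	// roll(100)+1 is uniform over 1..100, so this passes for roughly
	// loggingPerc percent of calls.
	return roll(100)+1 <= loggingPerc
}

func main() {
	fmt.Println(shouldLog(25, "trk-123", rand.Intn))
}
```

Since `rand.Intn(100)+1` is uniform over 1 to 100, a `LoggingPerc` of 100 or more always passes the sampling check, and a value of 1 passes about one time in a hundred.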

### Logging Format

V2 format is selected using:

- `config.GetModelConfigMap().ServiceConfig.V2LoggingType`

Supported format values:

- `proto`
- `arrow`
- `parquet`
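
A minimal sketch of mapping the configured string to a typed value is shown below. `LogFormat`, its constants, and `parseLogFormat` are hypothetical names; trimming and lower-casing the input, and returning an error for unknown values, are assumptions rather than documented Inferflow behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// LogFormat is a hypothetical typed view of V2LoggingType.
type LogFormat int

const (
	FormatProto LogFormat = iota
	FormatArrow
	FormatParquet
)

// parseLogFormat accepts the three documented values and rejects anything else.
func parseLogFormat(s string) (LogFormat, error) {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "proto":
		return FormatProto, nil
	case "arrow":
		return FormatArrow, nil
	case "parquet":
		return FormatParquet, nil
	default:
		return 0, fmt.Errorf("unsupported logging format %q", s)
	}
}

func main() {
	f, err := parseLogFormat("arrow")
	fmt.Println(f == FormatArrow, err)
}
```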

### Logged Message Shape

Logged payload uses `InferflowLog` (`inferflow_logging.proto`):

- `user_id`
- `tracking_id`
- `model_config_id`
- `entities`
- `features` (`PerEntityFeatures.encoded_features`)
- `metadata`
- `parent_entity`

### What Data Is Logged Today

Current logging functions (`logInferflowResponseBytes`, `logInferflowResponseArrow`, `logInferflowResponseParquet`) read from:

- `compRequest.ComponentData` (target matrix)

They do not currently read from:

- `compRequest.SlateData` (slate matrix)

So for PairWise/SlateWise requests, logging currently captures target-matrix features, not slate-matrix rows.

### Metadata and Transport

- Metadata byte packs:
  - compression-enabled bit
  - cache version
  - format type (proto/arrow/parquet)
- Logs are sent to Kafka through prism logger (`PublishInferenceInsightsLog`).
- Event name used: `inferflow_inference_logs`.
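
Packing the three metadata fields into one byte could look like the sketch below. The bit layout is entirely an assumption made for illustration (the document does not specify field widths or positions), as are the `packMetadata` and `unpackMetadata` names and the numeric codes for each format.

```go
package main

import "fmt"

// Hypothetical bit layout, chosen only for illustration:
//   bit 0     compression-enabled flag
//   bits 1-5  cache version (0..31)
//   bits 6-7  format type (0 = proto, 1 = arrow, 2 = parquet)
const (
	maxCacheVersion = 31
	maxFormatType   = 3
)

// packMetadata combines the three fields into one byte under the layout above.
func packMetadata(compressed bool, cacheVersion, formatType uint8) (byte, error) {
	if cacheVersion > maxCacheVersion {
		return 0, fmt.Errorf("cache version %d does not fit in 5 bits", cacheVersion)
	}
	if formatType > maxFormatType {
		return 0, fmt.Errorf("format type %d does not fit in 2 bits", formatType)
	}
	var b byte
	if compressed {
		b |= 1
	}
	b |= cacheVersion << 1
	b |= formatType << 6
	return b, nil
}

// unpackMetadata is the inverse of packMetadata.
func unpackMetadata(b byte) (compressed bool, cacheVersion, formatType uint8) {
	return b&1 == 1, (b >> 1) & 0x1F, b >> 6
}

func main() {
	b, err := packMetadata(true, 5, 2)
	fmt.Printf("%08b %v\n", b, err)
	fmt.Println(unpackMetadata(b))
}
```

A single byte keeps the per-message overhead fixed, at the cost of hard limits on each field; consumers need the exact layout from `feature_logging.go` to decode real logs.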

### Batching

- Logs are batched before Kafka publish.
- Batch size uses `ResponseConfig.LogBatchSize`, default `500`.
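
One way to read the batching rule is as splitting the rows to be logged into fixed-size chunks, falling back to 500 when no size is configured. The `chunk` helper below is a hypothetical sketch of that idea; whether Inferflow chunks within a request or accumulates across requests is not stated here, and treating a non-positive size as "unset" is an assumption.

```go
package main

import "fmt"

// defaultLogBatchSize matches the documented default for LogBatchSize.
const defaultLogBatchSize = 500

// chunk splits items into consecutive batches of at most size elements.
// A non-positive size falls back to defaultLogBatchSize.
func chunk[T any](items []T, size int) [][]T {
	if size <= 0 {
		size = defaultLogBatchSize
	}
	var out [][]T
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	rows := make([]int, 1200)
	for i, b := range chunk(rows, 0) {
		fmt.Println("batch", i, "size", len(b))
	}
}
```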

## Notes for Future Slate Logging

If slate logging is required, the cleanest approach is to add a parallel slate logging path that:

- reads from `compRequest.SlateData`
- builds slate-oriented encoded payloads
- publishes with the same `tracking_id` and `model_config_id` for correlation

This avoids breaking existing target-log consumers.
2 changes: 1 addition & 1 deletion inferflow/cmd/inferflow/Dockerfile
@@ -1,5 +1,5 @@
# Stage 1: Build the Go binary
-FROM golang:1.24.4-bullseye AS builder
+FROM golang:1.24.9-bookworm AS builder

ARG TARGETOS
ARG TARGETARCH
22 changes: 20 additions & 2 deletions inferflow/go.mod
@@ -1,17 +1,20 @@
module github.com/Meesho/BharatMLStack/inferflow

-go 1.24.4
+go 1.24.9

require (
github.com/DataDog/datadog-go/v5 v5.5.0
github.com/Meesho/BharatMLStack/helix-client v1.0.0-alpha-649f16
github.com/apache/arrow/go/v16 v16.1.0
github.com/cockroachdb/cmux v0.0.0-20170110192607-30d10be49292
github.com/coocood/freecache v1.2.4
github.com/dgraph-io/ristretto v0.2.0
github.com/emirpasic/gods v1.18.1
github.com/h2so5/half v1.0.0
github.com/knadh/koanf v1.5.0
github.com/parquet-go/parquet-go v0.27.0
github.com/rs/zerolog v1.34.0
github.com/segmentio/kafka-go v0.4.50
github.com/spaolacci/murmur3 v1.1.0
github.com/spf13/viper v1.19.0
github.com/stretchr/testify v1.11.1
@@ -23,23 +26,32 @@ require (

require (
github.com/Microsoft/go-winio v0.6.2 // indirect
github.com/andybalholm/brotli v1.1.1 // indirect
github.com/cespare/xxhash/v2 v2.3.0 // indirect
github.com/coreos/go-semver v0.3.0 // indirect
github.com/coreos/go-systemd/v22 v22.5.0 // indirect
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/dustin/go-humanize v1.0.1 // indirect
github.com/fsnotify/fsnotify v1.7.0 // indirect
github.com/goccy/go-json v0.10.2 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang/protobuf v1.5.4 // indirect
github.com/google/flatbuffers v24.3.25+incompatible // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/hashicorp/hcl v1.0.0 // indirect
github.com/klauspost/compress v1.17.9 // indirect
github.com/klauspost/cpuid/v2 v2.3.0 // indirect
github.com/magiconair/properties v1.8.7 // indirect
github.com/mattn/go-colorable v0.1.14 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/mitchellh/copystructure v1.2.0 // indirect
github.com/mitchellh/mapstructure v1.5.0 // indirect
github.com/mitchellh/reflectwalk v1.0.2 // indirect
github.com/parquet-go/bitpack v1.0.0 // indirect
github.com/parquet-go/jsonlite v1.0.0 // indirect
github.com/pelletier/go-toml v1.9.5 // indirect
github.com/pelletier/go-toml/v2 v2.2.4 // indirect
github.com/pierrec/lz4/v4 v4.1.21 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
github.com/sagikazarmark/locafero v0.4.0 // indirect
@@ -50,15 +62,21 @@ require (
github.com/spf13/pflag v1.0.5 // indirect
github.com/stretchr/objx v0.5.2 // indirect
github.com/subosito/gotenv v1.6.0 // indirect
github.com/twpayne/go-geom v1.6.1 // indirect
github.com/zeebo/xxh3 v1.0.2 // indirect
go.etcd.io/etcd/api/v3 v3.5.17 // indirect
go.etcd.io/etcd/client/pkg/v3 v3.5.17 // indirect
go.uber.org/atomic v1.9.0 // indirect
go.uber.org/multierr v1.9.0 // indirect
go.uber.org/zap v1.21.0 // indirect
golang.org/x/exp v0.0.0-20250106191152-7588d65b2ba8 // indirect
golang.org/x/mod v0.26.0 // indirect
golang.org/x/net v0.43.0 // indirect
-golang.org/x/sys v0.35.0 // indirect
+golang.org/x/sync v0.16.0 // indirect
+golang.org/x/sys v0.38.0 // indirect
golang.org/x/text v0.28.0 // indirect
golang.org/x/tools v0.35.0 // indirect
golang.org/x/xerrors v0.0.0-20231012003039-104605ab7028 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20250707201910-8d1bb00bc6a7 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20250825161204-c5933d9347a5 // indirect
google.golang.org/grpc/examples v0.0.0-20230318005552-70c52915099a // indirect