Conversation
WalkthroughAdds GCS/model-name helper utilities, protobuf-based config text parsing and model name replacement, instance-count update logic integrated into edit flow, test coverage for helpers and replacements, and migrates local model config types to generated protogen types (proto go_package change). Changes
🚥 Pre-merge checks | ✅ 1 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
horizon/internal/inferflow/handler/inferflow.go (2)
375-437:⚠️ Potential issue | 🟠 MajorFix nil-error dereference in ScaleUp latest-request lookup.
IfGetLatestRequestreturns a non-emptyErrorwitherr == nil,err.Error()will panic. Handle the two cases separately.🛠️ Suggested fix
- latestSourceRequest, err = m.GetLatestRequest(sourceConfigID) - if err != nil || latestSourceRequest.Error != emptyResponse { - return Response{}, errors.New("failed to get latest request for the source configID: " + sourceConfigID + ": " + err.Error()) - } + latestSourceRequest, err = m.GetLatestRequest(sourceConfigID) + if err != nil { + return Response{}, errors.New("failed to get latest request for the source configID: " + sourceConfigID + ": " + err.Error()) + } + if latestSourceRequest.Error != emptyResponse { + return Response{}, errors.New("failed to get latest request for the source configID: " + sourceConfigID + ": " + fmt.Sprint(latestSourceRequest.Error)) + }
579-615:⚠️ Potential issue | 🟡 MinorGuard against empty deployable ID list.
If there are no deployable IDs, callingGetByIdsmay error or waste a DB call.🔧 Suggested guard
- serviceDeployables, err := m.ServiceDeployableConfigRepo.GetByIds(deployableIDs) - if err != nil { - return GetAllRequestConfigsResponse{}, errors.New("failed to get service deployable configs: " + err.Error()) - } - - deployableMap := make(map[int]string) - for _, sd := range serviceDeployables { - deployableMap[sd.ID] = sd.Name - } + deployableMap := make(map[int]string) + if len(deployableIDs) > 0 { + serviceDeployables, err := m.ServiceDeployableConfigRepo.GetByIds(deployableIDs) + if err != nil { + return GetAllRequestConfigsResponse{}, errors.New("failed to get service deployable configs: " + err.Error()) + } + for _, sd := range serviceDeployables { + deployableMap[sd.ID] = sd.Name + } + }
🤖 Fix all issues with AI agents
In @.pre-commit-config.yaml:
- Line 7: Create the missing script referenced by the pre-commit config named
pre-commit-scripts/runner.sh: add a file with a valid shebang (e.g.
#!/usr/bin/env bash), implement a simple runner that forwards arguments to the
underlying hook logic (or invokes the repo's lint/test commands) and returns
appropriate exit codes, and ensure the file is made executable (chmod +x) so the
pre-commit entry in .pre-commit-config.yaml can run successfully.
In `@horizon/internal/externalcall/prometheus_client.go`:
- Around line 178-184: GetPredatorModelTraffic builds the prometheus query_range
URL without the required "/select/100" prefix; update the fmt.Sprintf in
GetPredatorModelTraffic to use "/select/100/prometheus/api/v1/query_range"
(keeping the same parameters: p.BaseURL, escapePrometheusQuery(query), start,
end, step) so the constructed URL matches other methods like GetModelNames and
routes correctly.
In `@horizon/internal/inferflow/handler/inferflow_helpers.go`:
- Around line 8-10: Wrap the call to
m.ServiceDeployableConfigRepo.GetById(deployableID) with the same repo-call
metrics pattern used elsewhere in this module: start a timer/observer before the
call, increment a "repo.get.count" metric (or the module's equivalent) and on
return observe latency; if err != nil increment the repo error counter (and
log/record the error via the module's error metric) before returning the wrapped
error. Use the same metric names and helper functions used for other DB calls in
this file to ensure consistency (reference the call
m.ServiceDeployableConfigRepo.GetById and the returned serviceDeployableConfig
variable when adding the timing/count/error instrumentation).
In `@horizon/internal/inferflow/handler/inferflow_review.go`:
- Around line 39-43: The code calls m.InferFlowRequestRepo.GetRequestByID into
tempRequest but only checks err; update the logic in the inferflow_review.go
block that uses tempRequest (type inferflow_request.Table) to detect a not-found
result (e.g., zero-value/empty Table) after GetRequestByID and return an
appropriate error Response (use the same error-wrapping style as existing
returns) instead of continuing; reference tempRequest, GetRequestByID and
m.InferFlowRequestRepo to locate the change and ensure you treat an empty
tempRequest as "request not found" and return early.
In `@horizon/internal/inferflow/handler/inferflow_validation.go`:
- Around line 16-20: The loop over ComponentConfig.FeatureComponents
dereferences featureComponent.FSRequest.Label without checking FSRequest for
nil, which can panic; update the loop in the inferflow validation (the block
iterating over ComponentConfig.FeatureComponents and using
featureComponent.FSRequest.Label) to first check if featureComponent.FSRequest
== nil and skip (or handle) that component before reading .Label (and optionally
log or record the missing FSRequest for debugging).
In `@horizon/internal/predator/handler/predator_approval.go`:
- Around line 254-313: processEditDBUpdateStage currently updates each model
config inline causing partial commits on mid-loop failures; wrap the entire DB
update phase in a single transaction so all model updates either commit or roll
back together. Modify Predator (processEditDBUpdateStage) to begin a transaction
from the repository/DB (use whatever transactional method your
PredatorConfigRepo exposes or add BeginTx/WithTx), perform all
GetActiveModelByModelName, marshal/update fields, and call
PredatorConfigRepo.Update within that transaction, and then commit at the end;
on any error ensure you rollback the transaction and return the error (so GCS
rollback can be coordinated), and remove early returns that bypass transaction
rollback. Ensure transaction covers the whole for-loop and that Update uses the
transaction context/handle.
In `@horizon/internal/predator/handler/predator_functional_testing.go`:
- Around line 25-44: The loop assumes each row v[batchIdx] has length
featureCount and can panic; add a bounds check before accessing
v[batchIdx][featureIdx] in the nested loop (use lenRow := len(v[batchIdx]) and
verify featureIdx < lenRow) and return a clear error if a row is shorter
(include batchIdx, lenRow and expected featureCount). Apply this check around
the access used by the FP16/FP32 branches (where
serializer.Float32ToFloat16Bytes, math.Float32bits and
binary.LittleEndian.PutUint32 are invoked) so you fail fast instead of risking
an index out of range panic.
In `@horizon/internal/repositories/sql/inferflow/request/repository.go`:
- Around line 128-130: The duplicate-check in DoesConfigIdExistWithRequestType
currently restricts matches to STATUS = 'PENDING APPROVAL', which can allow
duplicates because APPROVED (and other non-REJECTED) requests are ignored;
update the filter in InferflowRequest.DoesConfigIdExistWithRequestType to match
the original semantics used by Promote/ScaleUp handlers (e.g., use "STATUS !=
'REJECTED'" or an equivalent condition that excludes only REJECTED records) so
any active/non-rejected request for the same configID and requestType blocks a
new request; locate and update the SQL predicate in
DoesConfigIdExistWithRequestType accordingly and ensure Promote and ScaleUp
continue to rely on this method for duplicate detection.
In `@horizon/pkg/configschemaclient/client.go`:
- Around line 81-98: processNumerixInput currently splits numerixInput on "@"
and directly uses inputParts[1], which can panic if the split yields fewer than
2 parts; update processNumerixInput to validate inputParts length before
accessing index 1 (for example skip entries where strings.Split(numerixInput,
"@") returns < 2 or set a safe default for FeatureType), referencing the
ScoreMapping loop and the use of inputParts and SchemaComponents (FeatureName,
FeatureType, FeatureSize) to ensure you handle malformed numerixInput values
safely.
In `@horizon/pkg/configschemaclient/types.go`:
- Around line 112-129: The JSON tag for FSRequest.FeatureGroups uses camelCase
("featureGroups") instead of the repo's snake_case convention; update the struct
FSRequest in horizon/pkg/configschemaclient/types.go to use `feature_groups` for
the FeatureGroups field's json tag (i.e., FSRequest.FeatureGroups ->
json:"feature_groups"`) and ensure the same correction is applied to the
duplicate FSRequest definition in
horizon/internal/repositories/sql/inferflow/models.go so unmarshalling matches
expected sample JSON.
🟡 Minor comments (17)
horizon/internal/externalcall/prometheus_client.go-173-176 (1)
173-176:⚠️ Potential issue | 🟡 MinorNormalize
serviceNameto lowercase before building the query.
This keeps service labels consistent even if callers pass mixed-case values and aligns with existing conventions.🔧 Proposed fix
import ( "encoding/json" "fmt" "io" "net/http" "net/url" "strconv" + "strings" "sync" "time" )- query := fmt.Sprintf( + service := strings.ToLower(serviceName) + query := fmt.Sprintf( "sum by (model)(increase(nv_inference_count{service=\"%s\"}[1m]))", - serviceName, + service, )Based on learnings: Within the BharatMLStack horizon codebase, ensure that service names (inferflow, predator, numerix) are stored in lowercase in database tables and are consistently used in lowercase throughout the code.
horizon/internal/inferflow/handler/inferflow_helpers.go-10-10 (1)
10-10:⚠️ Potential issue | 🟡 MinorFix duplicated word in the error message.
Minor typo: “service service”.
Proposed change
- return "", fmt.Errorf("failed to fetch service service deployable config for name generation: %w", err) + return "", fmt.Errorf("failed to fetch service deployable config for name generation: %w", err)horizon/internal/predator/README.md-7-15 (1)
7-15:⚠️ Potential issue | 🟡 MinorSpecify a language for the fenced code block.
markdownlint flags this; adding a language keeps docs clean.
Proposed change
-``` +```text internal/predator/ ├── controller/ (wires routes to handler) ├── handler/ (Config implementation + helpers) ├── route/ ├── proto/ and generated Go ├── init.go └── README.md -``` +```horizon/internal/inferflow/README.md-7-16 (1)
7-16:⚠️ Potential issue | 🟡 MinorSpecify a language for the fenced code block.
markdownlint flags this; adding a language keeps docs clean.
Proposed change
-``` +```text internal/inferflow/ ├── controller/ (wires routes to handler) ├── handler/ (Config implementation + helpers) ├── etcd/ (ETCD config read/write for inferflow) ├── route/ ├── handler/proto/ and generated Go ├── init.go └── README.md -``` +```horizon/internal/repositories/sql/servicedeployableconfig/table.go-31-32 (1)
31-32:⚠️ Potential issue | 🟡 MinorClarify nullability intent for
DeployableTag.
OverrideTestingis correctly configured withdefault:falseas Go's bool type is never null. However,DeployableTaglacks an explicit nullability constraint. If the field is optional, consider addingdefault:nullor use*stringto avoid empty-string ambiguity. If required, addnot nullfor consistency with similar constraints elsewhere in the struct (e.g.,Host).horizon/internal/repositories/sql/inferflow/config/table.go-29-29 (1)
29-29:⚠️ Potential issue | 🟡 MinorUse
*stringto preserve NULL semantics for nullable column.The database column
source_config_idis defined asvarchar(255) NULL, but the GORM struct maps it asstringinstead of*string. This mismatch means NULL values are converted to empty strings on read, and there's no way to distinguish between "never set" (NULL) and "explicitly empty" on write. While the current code shows no explicit NULL checks, this semantic loss could become a latent bug if future queries or validation logic needs to differentiate NULL from empty string.Consider changing to:
SourceConfigID *string `gorm:"column:source_config_id"`horizon/internal/inferflow/controller/controller.go-370-380 (1)
370-380:⚠️ Potential issue | 🟡 MinorInconsistent error response format.
The error response on line 376 returns a plain string
"Error getting feature schema"instead of using the structuredhandler.Responsetype witherr.Error(). This is inconsistent with other handlers in this file (e.g.,Onboard,Promote,Edit) and loses the actual error message.Also,
strings.TrimSpaceis applied toversionbut notmodel_config_id, which may be inconsistent.🔧 Proposed fix for consistent error handling
func (c *V1) GetFeatureSchema(ctx *gin.Context) { response, err := c.Config.GetFeatureSchema(handler.FeatureSchemaRequest{ - ModelConfigId: ctx.Query("model_config_id"), + ModelConfigId: strings.TrimSpace(ctx.Query("model_config_id")), Version: strings.TrimSpace(ctx.Query("version")), }) if err != nil { - ctx.JSON(api.NewBadRequestError(err.Error()).StatusCode, "Error getting feature schema") + ctx.JSON(api.NewBadRequestError(err.Error()).StatusCode, handler.Response{ + Error: err.Error(), + Data: handler.Message{Message: ""}, + }) return } ctx.JSON(200, response) }horizon/internal/inferflow/handler/inferflow_functional_testing.go-71-75 (1)
71-75:⚠️ Potential issue | 🟡 MinorEndpoint port stripping may fail for IPv6 addresses.
The logic
strings.LastIndex(ep, ":")to find and strip ports will incorrectly handle IPv6 addresses like[::1]:8080since colons appear within the address itself. If IPv6 endpoints are possible, consider usingnet.SplitHostPortfor robust parsing.horizon/internal/inferflow/handler/inferflow_functional_testing.go-60-60 (1)
60-60:⚠️ Potential issue | 🟡 MinorTypo in function name:
ExecuteFuncitonalTestRequestThe function name contains a typo ("Funciton" instead of "Function"). This should be corrected to
ExecuteFunctionalTestRequestfor consistency and discoverability.✏️ Proposed fix
-func (m *InferFlow) ExecuteFuncitonalTestRequest(request ExecuteRequestFunctionalTestingRequest) (ExecuteRequestFunctionalTestingResponse, error) { +func (m *InferFlow) ExecuteFunctionalTestRequest(request ExecuteRequestFunctionalTestingRequest) (ExecuteRequestFunctionalTestingResponse, error) {horizon/internal/inferflow/handler/inferflow_fetch.go-62-86 (1)
62-86:⚠️ Potential issue | 🟡 MinorErrors from
GetConfigare silently ignored.If
m.infrastructureHandler.GetConfig()fails or returns an invalid config, no error is propagated. The caller receives a map that may be missing entries without knowing why. Consider collecting errors or at minimum logging failures.🛠️ Suggested improvement
func (m *InferFlow) batchFetchRingMasterConfigs(serviceDeployables map[int]*service_deployable_config.ServiceDeployableConfig) (map[int]infrastructurehandler.Config, error) { ringMasterConfigs := make(map[int]infrastructurehandler.Config) var mu sync.Mutex var wg sync.WaitGroup + var errMu sync.Mutex + var errs []error semaphore := make(chan struct{}, 10) for id, deployable := range serviceDeployables { wg.Add(1) go func(deployableID int, sd *service_deployable_config.ServiceDeployableConfig) { defer wg.Done() semaphore <- struct{}{} defer func() { <-semaphore }() config := m.infrastructureHandler.GetConfig(sd.Name, inferflowPkg.AppEnv) + if config == (infrastructurehandler.Config{}) { + errMu.Lock() + errs = append(errs, fmt.Errorf("failed to get config for deployable %d (%s)", deployableID, sd.Name)) + errMu.Unlock() + return + } mu.Lock() ringMasterConfigs[deployableID] = config mu.Unlock() }(id, deployable) } wg.Wait() + if len(errs) > 0 { + // Log or return aggregated errors based on requirements + } return ringMasterConfigs, nil }horizon/internal/predator/handler/predator_upload.go-187-200 (1)
187-200:⚠️ Potential issue | 🟡 MinorMisleading function name:
validateVersionHasFilesonly checks folder existence.The function name suggests it validates that the version folder contains files, but it only calls
CheckFolderExists. The error message at line 195 mentions "version folder 1/ is empty" but the actual check doesn't verify emptiness—it only checks if the folder path prefix exists.🔧 Proposed fix to align name with behavior
-// validateVersionHasFiles checks if version folder has at least one non-empty file -func (p *Predator) validateVersionHasFiles(srcBucket, versionPath string) error { +// validateVersionFolderExists checks if the version folder exists in GCS +func (p *Predator) validateVersionFolderExists(srcBucket, versionPath string) error { exists, err := p.GcsClient.CheckFolderExists(srcBucket, versionPath) if err != nil { return fmt.Errorf("failed to check version folder contents: %w", err) } if !exists { - return fmt.Errorf("version folder 1/ is empty - must contain model files") + return fmt.Errorf("version folder 1/ not found - must exist with model files") } - log.Info().Msg("Version folder 1/ contains files") + log.Info().Msg("Version folder 1/ exists") return nil }horizon/internal/predator/handler/predator_fetch.go-104-124 (1)
104-124:⚠️ Potential issue | 🟡 MinorMissing key existence check for
deployableConfigsmap lookup.Line 124 accesses
deployableConfigs[serviceDiscovery.ServiceDeployableID]without checking if the key exists. If the infrastructure handler call failed inbatchFetchDeployableConfigs, this entry won't exist, and you'll get a zero-valueConfigsilently.🛡️ Proposed fix
+ // Get infrastructure config (HPA/replica info) + infraConfig, infraExists := deployableConfigs[serviceDiscovery.ServiceDeployableID] + if !infraExists { + continue // Skip if infrastructure config not found + } - // Get infrastructure config (HPA/replica info) - infraConfig := deployableConfigs[serviceDiscovery.ServiceDeployableID]horizon/internal/predator/handler/predator_upload.go-377-395 (1)
377-395:⚠️ Potential issue | 🟡 MinorUse
path.Dir()andpath.Base()for more robust path manipulation.The validation on line 382-384 ensures
srcPathis non-empty, but the path extraction logic at lines 386-388 usingstrings.Split()andstrings.TrimSuffix()is less robust than using standard library functions. The codebase already usespath.Dir()andpath.Base()elsewhere (e.g.,extractGCSDetailsin the same file); apply the same pattern here:srcBasePath := path.Dir(srcPath) srcModelName := path.Base(srcPath)This is more idiomatic and handles edge cases consistently.
horizon/internal/repositories/sql/inferflow/config/repository.go-163-176 (1)
163-176:⚠️ Potential issue | 🟡 MinorRemove the manual
updated_atassignment from the map; theBeforeUpdatehook handles it.GORM's
Updates(map)method triggers theBeforeUpdatehook, and the hook'sSetColumn()call overrides any value in the map. The line"updated_at": time.Now()in the map is therefore redundant and has no effect—the hook will set it totime.Now()regardless.horizon/internal/predator/handler/predator_validation.go-278-307 (1)
278-307:⚠️ Potential issue | 🟡 MinorComment/logic mismatch in health check.
The comment says “at least one pod healthy” but the code requires all deployments healthy. Update the comment or loosen the condition.📝 Comment-only fix
- // Consider deployment healthy if at least one pod is healthy and running + // Deployment considered healthy only when all deployments report Healthy return false, nilhorizon/internal/jobs/bulkdeletestrategy/predator_service.go-181-272 (1)
181-272:⚠️ Potential issue | 🟡 MinorGuard empty discoveryConfigIds to avoid unnecessary repo calls.
If there are no active discovery configs, skipping the DB query keeps behavior predictable.🔧 Suggested guard
- predatorConfigList, err := predatorBulkDeleteRepos.predatorConfigRepo.FindByDiscoveryConfigIdsAndAge(discoveryConfigIds, zeroTrafficDays) + if len(discoveryConfigIds) == 0 { + log.Info().Msg("No active discovery configs found for deployable") + return nil, nil, parentToChildMapping, trafficData, nil + } + predatorConfigList, err := predatorBulkDeleteRepos.predatorConfigRepo.FindByDiscoveryConfigIdsAndAge(discoveryConfigIds, zeroTrafficDays)horizon/internal/predator/handler/predator_validation.go-18-70 (1)
18-70:⚠️ Potential issue | 🟡 MinorDuplicate model detection won’t work with the current map-based check.
deployableModelMapdeduplicates by key, so the later count always ends at 1 and duplicates slip through. Track duplicates before inserting into the map.🛠️ Suggested fix
- // Build maps from requested models + // Build maps from requested models + detect duplicates per deployable + modelNameCount := make(map[int]map[string]int) for _, predatorConfig := range predatorConfigList { // Get service deployable ID for this model discoveryConfig, err := p.ServiceDiscoveryRepo.GetById(predatorConfig.DiscoveryConfigID) if err != nil { log.Error().Err(err).Msgf("failed to fetch discovery config for model %s", predatorConfig.ModelName) return false, fmt.Errorf(errFailedToFetchDiscoveryConfigForModel, predatorConfig.ModelName, err) } + if modelNameCount[discoveryConfig.ServiceDeployableID] == nil { + modelNameCount[discoveryConfig.ServiceDeployableID] = map[string]int{} + } + modelNameCount[discoveryConfig.ServiceDeployableID][predatorConfig.ModelName]++ + if modelNameCount[discoveryConfig.ServiceDeployableID][predatorConfig.ModelName] > 1 { + return false, fmt.Errorf(errDuplicateModelNameInDeployable, predatorConfig.ModelName, discoveryConfig.ServiceDeployableID) + } requestedModelMap[predatorConfig.ModelName] = predatorConfig requestedDeployableMap[discoveryConfig.ServiceDeployableID] = true
🧹 Nitpick comments (22)
horizon/internal/externalcall/prometheus_client.go (2)
14-18: Add a short doc comment clarifyingdaysAgosemantics.✍️ Proposed doc comment
type PrometheusClient interface { GetModelNames(serviceName string) ([]string, error) GetInferflowConfigNames(serviceName string) ([]string, error) GetNumerixConfigNames() ([]string, error) + // GetPredatorModelTraffic returns model traffic for the past N days (daysAgo must be > 0). GetPredatorModelTraffic(serviceName string, daysAgo int) (map[string]PredatorModelTraffic, error) }
96-108: Keep Prometheus decode helpers unexported unless used outside the package.
IfPredatorModelResponseis only used for JSON decoding here, unexport it to avoid widening the public surface.♻️ Proposed change
-type PredatorModelResponse struct { +type predatorModelResponse struct { Status string `json:"status"` IsPartial bool `json:"isPartial"` Data struct { ResultType string `json:"resultType"` Result []struct { Metric struct { Model string `json:"model"` } `json:"metric"` Values [][]interface{} `json:"values"` // [[timestamp, "value"], ...] } `json:"result"` } `json:"data"` }- var pr PredatorModelResponse + var pr predatorModelResponseinferflow/handlers/inferflow/inferflow.go (1)
69-69: Extract the init log message into a constant.This literal was added inside a function; please hoist it to a
constnear the existing constant block.Proposed change
const ( UserId = "USER-ID" appVersionCode = "APP-VERSION-CODE" appVersion = "app_version_code" defaultErrMsg = "something went wrong!" userId = "user_id" modelConfigId = "model-config-id" + inferflowInitLogMsg = "Inferflow handler initialized" ) @@ - logger.Info("Inferflow handler initialized") + logger.Info(inferflowInitLogMsg)As per coding guidelines: "we should not use the literal values directly inside functions, they should be stored into separate variables and the variables should be used".
horizon/internal/externalcall/gcs_client_test.go (1)
65-119: Consolidate the repeated nil-client setup and error assertions into helpers.This block repeats the same setup and error checks; a helper keeps tests shorter and aligned with the duplication guideline.
Proposed change (example)
+func newNilClient() *GCSClient { + return &GCSClient{client: nil, ctx: context.Background()} +} + +func requireNotInitialized(t *testing.T, err error) { + require.Error(t, err) + assert.Contains(t, err.Error(), "not initialized") +} @@ - g := &GCSClient{client: nil, ctx: context.Background()} + g := newNilClient() folders, err := g.ListFolders("bucket", "prefix/") - require.Error(t, err) + requireNotInitialized(t, err) assert.Nil(t, folders) - assert.Contains(t, err.Error(), "not initialized")As per coding guidelines: "if duplicate code exists it should be moved to a common method".
horizon/internal/inferflow/handler/inferflow_helpers.go (1)
21-24: Hoist TTL values into a package-level variable/const.Avoid literals inside the method by defining a shared slice.
Proposed change
+var loggingTTLValues = []int{30, 60, 90} + func (m *InferFlow) GetLoggingTTL() (GetLoggingTTLResponse, error) { return GetLoggingTTLResponse{ - Data: []int{30, 60, 90}, + Data: loggingTTLValues, }, nil }As per coding guidelines: "we should not use the literal values directly inside functions, they should be stored into separate variables and the variables should be used".
horizon/internal/inferflow/etcd/models.go (1)
81-87: Consider consolidating duplicateSeenScoreComponentdefinitions.The
SeenScoreComponentstruct is defined identically in three locations:
horizon/internal/inferflow/etcd/models.go(this file)horizon/internal/inferflow/handler/models.gohorizon/internal/repositories/sql/inferflow/models.goThis duplication could lead to drift if one definition is updated but others are not. Consider defining this type in a shared package and importing it where needed.
horizon/internal/repositories/sql/predatorconfig/sql.go (1)
155-168: Clarify the distinction between similar query methods.This method (
FindByDiscoveryConfigIdsAndAge) queries bycreated_at < cutoffDate AND active = true, whileFindByDiscoveryIDsAndCreatedBefore(lines 126-135) queries byupdated_at < cutoffDatewithout filtering onactive.The naming is similar but behavior differs significantly:
- One filters by
created_atwithactive = true- The other filters by
updated_atwithout an active filterConsider adding documentation to clarify the distinct use cases for each method, or unifying them if they serve the same purpose.
horizon/internal/repositories/sql/inferflow/models.go (1)
253-257: Consider documenting expected types forFeatureSize.Using
anyforFeatureSizeprovides flexibility but loses type safety. If the expected types are known (e.g.,int,[]int,string), consider documenting them or using a more specific type/interface.horizon/internal/predator/handler/predator_functional_testing.go (2)
205-217: Extract magic numbers as named constants.The values
10and128used to replace dynamic dimensions (-1) are magic numbers. Consider extracting them as named constants for clarity and maintainability.♻️ Proposed refactor
+const ( + defaultBatchDimension = 10 + defaultFeatureDimension = 128 +) + for i, dim := range result { if dim == -1 { if i == 0 { - result[i] = 10 + result[i] = defaultBatchDimension } else { - result[i] = 128 + result[i] = defaultFeatureDimension } log.Debug().Msgf("Replaced dynamic dimension -1 at position %d with %d", i, result[i])
156-163: Data truncation is silent when batch size exceeds available data.If
end > len(dataSlice), the batch is silently skipped. This could mask issues where the expected batch count doesn't match the actual data. Consider logging a warning or returning an error.♻️ Proposed fix to add warning
for i := int64(0); i < batchSize; i++ { start := i * elementsPerBatch end := start + elementsPerBatch if end <= int64(len(dataSlice)) { batch := dataSlice[start:end] result = append(result, batch) + } else { + log.Warn().Msgf("Batch %d truncated: expected %d elements but only %d available", i, end, len(dataSlice)) } }horizon/internal/repositories/sql/discoveryconfig/sql.go (1)
124-126:ReactivateByIDTxdoesn't update audit fields.Unlike
DeactivateByIDwhich setsupdated_byandupdated_at,ReactivateByIDTxonly setsactive = true. For audit consistency, consider updating the timestamp and updater.♻️ Proposed fix to add audit fields
-func (r *discoveryConfigRepo) ReactivateByIDTx(tx *gorm.DB, id int) error { - return tx.Model(&DiscoveryConfig{}).Where("id = ?", id).Update("active", true).Error +func (r *discoveryConfigRepo) ReactivateByIDTx(tx *gorm.DB, id int, updatedBy string) error { + return tx.Model(&DiscoveryConfig{}).Where("id = ?", id).Updates(map[string]interface{}{ + "active": true, + "updated_by": updatedBy, + "updated_at": time.Now(), + }).Error }Note: This would require updating the interface signature and all callers.
horizon/internal/repositories/sql/inferflow/request/repository.go (2)
96-106: Redundant error return and inconsistent error handling.Line 105 returns
result.Errorwhich was already checked and is nil at that point. Additionally, returning an error when no records are found differs from other methods likeGetApprovedRequestsByConfigIDwhich return empty slices.♻️ Proposed fix
func (g *InferflowRequest) GetByConfigIDandVersion(configID string, version int) ([]Table, error) { var tables []Table result := g.db.Where("config_id = ? AND version = ? AND status = 'APPROVED'", configID, version).Find(&tables) if result.Error != nil { return nil, result.Error } if len(tables) == 0 { return nil, errors.New("no request found with the given config_id and version") } - return tables, result.Error + return tables, nil }
162-172:GetRequestByIDreturns empty struct on not found instead of an error.Returning an empty
Table{}withnilerror when a record is not found differs from typical repository patterns (which return an error). Callers must check for an empty struct rather than an error, which could lead to subtle bugs.Consider either:
- Returning a pointer (
*Table) withnilwhen not found- Returning
gorm.ErrRecordNotFoundas the errorhorizon/internal/configs/app_config.go (1)
74-74: Inconsistent field alignment.Line 74 has extra whitespace before the
inttype compared to adjacent fields (lines 73, 75, 76). This appears to be a formatting inconsistency.🧹 Proposed formatting fix
- BulkDeleteInferflowMaxInactiveDays int `mapstructure:"bulk_delete_inferflow_max_inactive_days"` + BulkDeleteInferflowMaxInactiveDays int `mapstructure:"bulk_delete_inferflow_max_inactive_days"`horizon/internal/inferflow/handler/inferflow_constants.go (1)
16-17: Consider removing redundant boolean aliases.
activeTrue = trueandactiveFalse = falseprovide no abstraction benefit over usingtrue/falsedirectly. They add indirection without semantic value.horizon/internal/inferflow/handler/inferflow_functional_testing.go (1)
160-161: Inconsistent logging:fmt.Printlninstead ofzerolog.The file imports and uses
zerolog(e.g., line 131, 162), but lines 160 and 177 usefmt.Printlnfor error logging. Use the structured logger consistently.♻️ Proposed fix
if err != nil { - fmt.Println("Error getting inferflow config: ", err) + log.Error().Err(err).Msg("Error getting inferflow config") } else if inferFlowConfig == nil {err = m.InferFlowConfigRepo.Update(inferFlowConfig) if err != nil { - fmt.Println("Error updating inferflow config: ", err) + log.Error().Err(err).Msg("Error updating inferflow config") }Also applies to: 177-178
horizon/internal/inferflow/handler/inferflow_test.go (1)
115-153: Consider consolidating mock repositories.Both
inferflow_test.goandpredator_test.godefine nearly identical mock implementations ofServiceDeployableRepository. Consider extracting a shared mock into a test helper package to reduce duplication.horizon/internal/repositories/sql/inferflow/config/repository.go (1)
137-146: Consider extracting shared delete logic.
DeleteByConfigIDTxduplicates the logic fromDeleteTx(lines 72-81). Consider extracting the common delete-by-config-id logic into a shared helper to reduce duplication.♻️ Proposed refactor
+func deleteByConfigID(tx *gorm.DB, configID string) error { + result := tx.Where("config_id = ?", configID).Delete(&Table{}) + if result.Error != nil { + return result.Error + } + if result.RowsAffected == 0 { + return errors.New("no config found with the given config_id") + } + return nil +} func (g *InferflowConfig) DeleteTx(tx *gorm.DB, configID string) error { - result := tx.Where("config_id = ?", configID).Delete(&Table{}) - if result.Error != nil { - return result.Error - } - if result.RowsAffected == 0 { - return errors.New("no config found with the given config_id") - } - return nil + return deleteByConfigID(tx, configID) } ... func (g *InferflowConfig) DeleteByConfigIDTx(tx *gorm.DB, configID string) error { - result := tx.Where("config_id = ?", configID).Delete(&Table{}) - if result.Error != nil { - return result.Error - } - if result.RowsAffected == 0 { - return errors.New("no config found with the given config_id") - } - return nil + return deleteByConfigID(tx, configID) }horizon/internal/predator/handler/predator_upload.go (1)
497-532: Duplicate implementation ofreplaceModelNameInConfigPreservingFormat.This function duplicates the logic in
horizon/internal/externalcall/gcs_client.go(replaceModelNameInConfig, lines 465-510). Consider extracting to a shared utility to avoid maintaining two implementations.horizon/internal/predator/handler/predator_helpers.go (1)
14-34: Redundant length check inparseGCSURL.Line 25 checks
len(parts) < 1, butstrings.SplitN(trimmed, "/", 2)will always return at least one element (the original string if no separator is found). This check is redundant.♻️ Simplified version
trimmed := strings.TrimPrefix(gcsURL, gcsPrefix) parts := strings.SplitN(trimmed, slashConstant, 2) - if len(parts) < 1 { - return constant.EmptyString, constant.EmptyString, false - } bucket = parts[0] + if bucket == "" { + return constant.EmptyString, constant.EmptyString, false + } if len(parts) == 2 { objectPath = parts[1] } return bucket, objectPath, truehorizon/internal/externalcall/gcs_client.go (1)
465-510: Consider pre-compiling regex patterns.The
replaceModelNameInConfigfunction compiles regex patterns (namePatternandvaluePattern) on every invocation. For frequently called functions, consider compiling these at package level.♻️ Pre-compile regex patterns
+var ( + namePattern = regexp.MustCompile(`name\s*:\s*"([^"]+)"`) + valuePattern = regexp.MustCompile(`"([^"]+)"`) +) func replaceModelNameInConfig(data []byte, destModelName string) []byte { content := string(data) lines := strings.Split(content, "\n") for i, line := range lines { trimmed := strings.TrimSpace(line) if strings.HasPrefix(trimmed, "name:") { leadingWhitespace := len(line) - len(strings.TrimLeft(line, " \t")) if leadingWhitespace >= 2 { continue } - namePattern := regexp.MustCompile(`name\s*:\s*"([^"]+)"`) matches := namePattern.FindStringSubmatch(line) // ... rest of function - valuePattern := regexp.MustCompile(`"([^"]+)"`) valueReplaced := valuePattern.ReplaceAllString(matched, fmt.Sprintf(`"%s"`, destModelName))horizon/internal/predator/handler/predator_fetch.go (1)
27-56: Use batch fetch methods to avoid N+1 query pattern.The current implementation fetches each discovery config and service deployable individually in a loop (lines 30 and 51). The repositories provide batch methods that should be used instead:
GetByDiscoveryIDs()for discovery configsGetByIds()for service deployablesThis will reduce database round-trips from N+1 to a single query per resource type.
|
Links: Reason: 📘 Please check the Turbo Turtle FAQs for more details. 👉 If the reason is not self-explanatory or not covered in the FAQ, please reach out to @devops-oncall in the #devops-tech channel. |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
quick-start/db-init/scripts/init-mysql.sh (1)
687-705:⚠️ Potential issue | 🟡 MinorService value should be lowercase.
This seed insert uses
'INFERFLOW', but service names must be stored lowercase. Please change to'inferflow', and consider a simple assertion in this script to fail if any service value is not lowercase.Based on learnings: “ensure that service names (inferflow, predator, numerix) are stored in lowercase in database tables and are consistently used in lowercase throughout the code.”
🤖 Fix all issues with AI agents
In `@horizon/internal/predator/handler/predator_approval.go`:
- Around line 196-201: The config-source update (call to
updateInstanceCountInConfigSource) is applied before downstream steps (GCS
transfer, DB update, restart) and is not rolled back on failure; either move the
update to after those steps succeed or implement a rollback: read and cache the
original config before calling updateInstanceCountInConfigSource (use the same
configBucket/configPath/modelName context and the local variable
transferredGcsModelData for flow), then if any subsequent operation (the GCS
transfer, DB update, or restart code paths that follow this block) returns an
error, call a restore function to write the cached original config back;
alternatively, defer performing updateInstanceCountInConfigSource until all
downstream operations succeed and then call it as the final step.
- Around line 223-252: The function updateInstanceCountInConfigSource should
validate the instanceCount parameter before touching GCS: add a defensive check
at the top of updateInstanceCountInConfigSource (before reading config) to
return a clear error if instanceCount is <= 0 (or otherwise invalid), so you
never write an invalid Triton config; reference the function name
updateInstanceCountInConfigSource and the downstream helper
replaceInstanceCountInConfigPreservingFormat when adding the guard and error
message.
| if configBucket != "" && configPath != "" && payload.MetaData.InstanceCount > 0 { | ||
| if err := p.updateInstanceCountInConfigSource(configBucket, configPath, modelName, payload.MetaData.InstanceCount); err != nil { | ||
| log.Error().Err(err).Msgf("Failed to update instance count in config-source for model %s", modelName) | ||
| return transferredGcsModelData, err | ||
| } | ||
| } |
There was a problem hiding this comment.
Config-source update lacks rollback on later failures.
If the instance_count update succeeds but the subsequent GCS transfer / DB update / restart fails, the config-source stays mutated while the request is marked failed. Consider deferring this update until success or capturing the original config and restoring it on any downstream failure.
🤖 Prompt for AI Agents
In `@horizon/internal/predator/handler/predator_approval.go` around lines 196 -
201, The config-source update (call to updateInstanceCountInConfigSource) is
applied before downstream steps (GCS transfer, DB update, restart) and is not
rolled back on failure; either move the update to after those steps succeed or
implement a rollback: read and cache the original config before calling
updateInstanceCountInConfigSource (use the same
configBucket/configPath/modelName context and the local variable
transferredGcsModelData for flow), then if any subsequent operation (the GCS
transfer, DB update, or restart code paths that follow this block) returns an
error, call a restore function to write the cached original config back;
alternatively, defer performing updateInstanceCountInConfigSource until all
downstream operations succeed and then call it as the final step.
There was a problem hiding this comment.
Mutation is not harmful and rollback will cause more read writes.
There was a problem hiding this comment.
@paras-agarwal-meesho, understood. I acknowledge your decision to proceed without rollback for the config-source update to minimize read/write operations.
✏️ Learnings added
Learnt from: paras-agarwal-meesho
Repo: Meesho/BharatMLStack PR: 289
File: horizon/internal/predator/handler/predator_approval.go:196-201
Timestamp: 2026-02-17T07:03:43.427Z
Learning: In the Predator handler (horizon/internal/predator/handler), config-source mutations (e.g., instance count updates) are intentionally not rolled back on downstream failures to avoid additional read/write operations, as these mutations are considered non-harmful even if the request ultimately fails.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
| func (p *Predator) updateInstanceCountInConfigSource(bucket, basePath, modelName string, instanceCount int) error { | ||
| if modelName == "" { | ||
| return fmt.Errorf("model name is empty, required to update instance count in config-source") | ||
| } | ||
|
|
||
| configPath := path.Join(basePath, modelName, configFile) | ||
| configData, err := p.GcsClient.ReadFile(bucket, configPath) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to read config.pbtxt from config-source for model %s: %w", modelName, err) | ||
| } | ||
|
|
||
| var modelConfig ModelConfig | ||
| if err := prototext.Unmarshal(configData, &modelConfig); err != nil { | ||
| return fmt.Errorf("failed to parse config.pbtxt from config-source for model %s: %w", modelName, err) | ||
| } | ||
| if len(modelConfig.InstanceGroup) == 0 { | ||
| return fmt.Errorf("%s (model %s)", errNoInstanceGroup, modelName) | ||
| } | ||
|
|
||
| newConfigData, err := replaceInstanceCountInConfigPreservingFormat(configData, instanceCount) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to update instance count in config for model %s: %w", modelName, err) | ||
| } | ||
| if err := p.GcsClient.UploadFile(bucket, configPath, newConfigData); err != nil { | ||
| return fmt.Errorf("failed to upload config.pbtxt to config-source for model %s: %w", modelName, err) | ||
| } | ||
|
|
||
| log.Info().Msgf("Updated instance_count to %d in config-source for model %s", instanceCount, modelName) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Validate instanceCount before writing.
The helper is public and should defensively guard against zero/negative values to avoid invalid Triton configs even if callers misbehave.
✅ Suggested guard
func (p *Predator) updateInstanceCountInConfigSource(bucket, basePath, modelName string, instanceCount int) error {
if modelName == "" {
return fmt.Errorf("model name is empty, required to update instance count in config-source")
}
+ if instanceCount <= 0 {
+ return fmt.Errorf("instance count must be > 0 for model %s", modelName)
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func (p *Predator) updateInstanceCountInConfigSource(bucket, basePath, modelName string, instanceCount int) error { | |
| if modelName == "" { | |
| return fmt.Errorf("model name is empty, required to update instance count in config-source") | |
| } | |
| configPath := path.Join(basePath, modelName, configFile) | |
| configData, err := p.GcsClient.ReadFile(bucket, configPath) | |
| if err != nil { | |
| return fmt.Errorf("failed to read config.pbtxt from config-source for model %s: %w", modelName, err) | |
| } | |
| var modelConfig ModelConfig | |
| if err := prototext.Unmarshal(configData, &modelConfig); err != nil { | |
| return fmt.Errorf("failed to parse config.pbtxt from config-source for model %s: %w", modelName, err) | |
| } | |
| if len(modelConfig.InstanceGroup) == 0 { | |
| return fmt.Errorf("%s (model %s)", errNoInstanceGroup, modelName) | |
| } | |
| newConfigData, err := replaceInstanceCountInConfigPreservingFormat(configData, instanceCount) | |
| if err != nil { | |
| return fmt.Errorf("failed to update instance count in config for model %s: %w", modelName, err) | |
| } | |
| if err := p.GcsClient.UploadFile(bucket, configPath, newConfigData); err != nil { | |
| return fmt.Errorf("failed to upload config.pbtxt to config-source for model %s: %w", modelName, err) | |
| } | |
| log.Info().Msgf("Updated instance_count to %d in config-source for model %s", instanceCount, modelName) | |
| return nil | |
| } | |
| func (p *Predator) updateInstanceCountInConfigSource(bucket, basePath, modelName string, instanceCount int) error { | |
| if modelName == "" { | |
| return fmt.Errorf("model name is empty, required to update instance count in config-source") | |
| } | |
| if instanceCount <= 0 { | |
| return fmt.Errorf("instance count must be > 0 for model %s", modelName) | |
| } | |
| configPath := path.Join(basePath, modelName, configFile) | |
| configData, err := p.GcsClient.ReadFile(bucket, configPath) | |
| if err != nil { | |
| return fmt.Errorf("failed to read config.pbtxt from config-source for model %s: %w", modelName, err) | |
| } | |
| var modelConfig ModelConfig | |
| if err := prototext.Unmarshal(configData, &modelConfig); err != nil { | |
| return fmt.Errorf("failed to parse config.pbtxt from config-source for model %s: %w", modelName, err) | |
| } | |
| if len(modelConfig.InstanceGroup) == 0 { | |
| return fmt.Errorf("%s (model %s)", errNoInstanceGroup, modelName) | |
| } | |
| newConfigData, err := replaceInstanceCountInConfigPreservingFormat(configData, instanceCount) | |
| if err != nil { | |
| return fmt.Errorf("failed to update instance count in config for model %s: %w", modelName, err) | |
| } | |
| if err := p.GcsClient.UploadFile(bucket, configPath, newConfigData); err != nil { | |
| return fmt.Errorf("failed to upload config.pbtxt to config-source for model %s: %w", modelName, err) | |
| } | |
| log.Info().Msgf("Updated instance_count to %d in config-source for model %s", instanceCount, modelName) | |
| return nil | |
| } |
🤖 Prompt for AI Agents
In `@horizon/internal/predator/handler/predator_approval.go` around lines 223 -
252, The function updateInstanceCountInConfigSource should validate the
instanceCount parameter before touching GCS: add a defensive check at the top of
updateInstanceCountInConfigSource (before reading config) to return a clear
error if instanceCount is <= 0 (or otherwise invalid), so you never write an
invalid Triton config; reference the function name
updateInstanceCountInConfigSource and the downstream helper
replaceInstanceCountInConfigPreservingFormat when adding the guard and error
message.
There was a problem hiding this comment.
check is done at caller site and function wont be called anywhere else.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
horizon/internal/predator/handler/predator_upload.go (1)
500-534:⚠️ Potential issue | 🟡 MinorRemove unused
replaceModelNameInConfigPreservingFormatmethod and its associated test.This method is only referenced in unit tests and is no longer called in production code—both
copyConfigToProdConfigSourceandcopyConfigToNewNameInConfigSourcenow delegate toexternalcall.ReplaceModelNameInConfig. Remove the method frompredator_upload.go(lines 500-534) and the corresponding test case frompredator_test.go(line 51).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@horizon/internal/predator/handler/predator_upload.go` around lines 500 - 534, Remove the unused helper replaceModelNameInConfigPreservingFormat from predator_upload.go and delete its corresponding unit test in predator_test.go (the test that targets the old formatter, referenced around the earlier test at line 51), since production paths copyConfigToProdConfigSource and copyConfigToNewNameInConfigSource now use externalcall.ReplaceModelNameInConfig; ensure no other references to replaceModelNameInConfigPreservingFormat remain (search for the function name) and run tests to confirm only the redundant test was removed.
🧹 Nitpick comments (1)
horizon/internal/externalcall/gcs_client_test.go (1)
38-43: Consider adding test case for empty destination model name.The
ReplaceModelNameInConfigfunction validates thatdestModelNamecannot be empty, but there's no test case covering this behavior.💡 Suggested additional test case
{ name: "invalid pbtxt returns error", data: []byte(`invalid_field: "x"`), destModelName: "any", expectError: true, }, + { + name: "empty dest model name returns error", + data: []byte(`name: "old_model"` + "\n"), + destModelName: "", + expectError: true, + },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@horizon/internal/externalcall/gcs_client_test.go` around lines 38 - 43, Add a test case to hcs/internal/externalcall/gcs_client_test.go's table-driven tests that exercises ReplaceModelNameInConfig with an empty destModelName (e.g., name "empty destModelName returns error"), supply a valid config payload (or the invalid pbtxt already used) and set expectError to true; this verifies ReplaceModelNameInConfig enforces non-empty destModelName. Use the same test runner that iterates over the table so the new entry follows the existing pattern and asserts an error is returned.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@horizon/internal/predator/handler/predator_upload.go`:
- Around line 500-534: Remove the unused helper
replaceModelNameInConfigPreservingFormat from predator_upload.go and delete its
corresponding unit test in predator_test.go (the test that targets the old
formatter, referenced around the earlier test at line 51), since production
paths copyConfigToProdConfigSource and copyConfigToNewNameInConfigSource now use
externalcall.ReplaceModelNameInConfig; ensure no other references to
replaceModelNameInConfigPreservingFormat remain (search for the function name)
and run tests to confirm only the redundant test was removed.
---
Duplicate comments:
In `@horizon/internal/predator/handler/predator_approval.go`:
- Around line 224-227: The updateInstanceCountInConfigSource function currently
checks modelName but not instanceCount; add a guard at the start of
updateInstanceCountInConfigSource(bucket, basePath, modelName string,
instanceCount int) to validate instanceCount > 0 (or whatever minimum is
required) and return a clear error (e.g., fmt.Errorf("invalid instanceCount:
%d", instanceCount)) when it's zero or negative so an invalid Triton config is
never written to GCS; place this check alongside the existing modelName check
and use the same error-return pattern.
---
Nitpick comments:
In `@horizon/internal/externalcall/gcs_client_test.go`:
- Around line 38-43: Add a test case to
hcs/internal/externalcall/gcs_client_test.go's table-driven tests that exercises
ReplaceModelNameInConfig with an empty destModelName (e.g., name "empty
destModelName returns error"), supply a valid config payload (or the invalid
pbtxt already used) and set expectError to true; this verifies
ReplaceModelNameInConfig enforces non-empty destModelName. Use the same test
runner that iterates over the table so the new entry follows the existing
pattern and asserts an error is returned.
pavan-adari-meesho
left a comment
There was a problem hiding this comment.
Take a pull from develop and fix conflicts if any
There was a problem hiding this comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@horizon/internal/predator/handler/predator_approval.go`:
- Around line 224-259: The updateInstanceCountInConfigSource function currently
allows zero or negative instanceCount; add an input guard at the start of
updateInstanceCountInConfigSource to validate instanceCount > 0 and return a
clear error (e.g. fmt.Errorf("instanceCount must be positive, got %d",
instanceCount)) before any GCS reads or parsing; ensure the check sits
immediately after the modelName validation so functions like
p.GcsClient.ReadFile, prototext.Unmarshal, and UploadFile are not executed with
invalid counts.
🔁 Pull Request Template – BharatMLStack
📌 Summary
Adds instance-count edit support in predator (prod):
replaceInstanceCountInConfigPreservingFormatinpredator_helpers.goupdates thecountinsideinstance_groupin config.pbtxtupdateInstanceCountInConfigSourcechanges the config.pbtxt file in config-sourceprocessEditGCSCopyStageinpredator_approval.goapplies it to GCS config-source before transfer.Adds unit tests in
predator_test.gofor the helper.📂 Modules Affected
horizon(Real-time systems / networking)online-feature-store(Feature serving infra)trufflebox-ui(Admin panel / UI)infra(Docker, CI/CD, GCP/AWS setup)docs(Documentation updates)___________✅ Type of Change
___________📊 Benchmark / Metrics (if applicable)
Summary by CodeRabbit
New Features
Bug Fixes
Tests