
[RHIDP-12889] Update Lightspeed Deployment and Configs #2645

Merged
openshift-merge-bot[bot] merged 15 commits into redhat-developer:main from Jdubrick:update-lightspeed-deployment on Apr 22, 2026

Conversation


@Jdubrick Jdubrick commented Apr 10, 2026

Description

  • Updates Lightspeed Deployment as we are dropping Llama Stack container and running Lightspeed Core in 'library' mode
  • Adds sync script for fetching the config files for Lightspeed from our upstream
    • We will be cutting a 1.10 branch, adding ability to fetch that in the script
  • Adds rhdh-profile.py as a config map for Lightspeed
  • Updates Lightspeed Core and RAG image to our 0.5.0 versions which will be the same or closely resemble our productized images (in process of building those)
  • Removes MCP plugins as they are dev preview
    • The MCP config is still added to the LCORE config, so the plugins work without config edits if users add them later; the entry can stay and is skipped when no MCP setup is present

Which issue(s) does this PR fix or relate to

https://redhat.atlassian.net/browse/RHIDP-12889

PR acceptance criteria

  • Tests
  • Documentation

How to test changes / Special notes to the reviewer


rhdh-qodo-merge Bot commented Apr 10, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (2) 📎 Requirement gaps (0)



Action required

1. EXIT trap rm -rf unguarded 📘 Rule violation ☼ Reliability
Description
The script enables `set -euo pipefail` but the EXIT trap runs `rm -rf "$TMP_DIR"` without `|| true` or `set +e`, so a cleanup failure can incorrectly fail the script or alter its exit behavior.
Code

hack/sync-lightspeed-configs.sh[R134-135]

+    TMP_DIR="$(mktemp -d)"
+    trap 'rm -rf "$TMP_DIR"' EXIT
Evidence
PR Compliance ID 3 requires cleanup commands executed in an EXIT trap to be best-effort when `set -e` is enabled. The script enables strict mode and defines an EXIT trap that performs `rm -rf` without guarding against failure.

Rule 3: Cleanup commands in EXIT traps must not cause script failure
hack/sync-lightspeed-configs.sh[6-6]
hack/sync-lightspeed-configs.sh[134-135]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`hack/sync-lightspeed-configs.sh` runs with `set -euo pipefail` and registers an `EXIT` trap that calls `rm -rf "$TMP_DIR"` without guarding against failures. Per compliance, cleanup in EXIT traps must be best-effort and must not cause script failure.

## Issue Context
Even though `rm -rf` often succeeds, it can fail (permissions, transient FS issues). Under strict mode this can incorrectly fail the script or change its exit behavior.

## Fix Focus Areas
- hack/sync-lightspeed-configs.sh[134-135]
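
A minimal sketch of a best-effort cleanup trap for this finding. Only `TMP_DIR`, `mktemp -d`, and the `EXIT` trap come from the quoted script; the `cleanup` function name is an assumption for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

TMP_DIR="$(mktemp -d)"

# Hypothetical helper: cleanup is best-effort, so a failing rm
# is swallowed instead of propagating under `set -e`.
cleanup() {
    rm -rf "$TMP_DIR" || true
}
trap cleanup EXIT
```

Wrapping the command in a function keeps the trap string free of nested quoting and leaves room for further cleanup steps.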



2. Deployment omits spec.replicas note 📘 Rule violation ✧ Quality
Description
The modified Deployment manifest snippet omits spec.replicas without an adjacent comment
explaining that replicas are intentionally managed by autoscaling/another controller, which makes
scaling intent ambiguous.
Code

config/profile/rhdh/default-config/flavours/lightspeed/deployment.yaml[R5-8]

    spec:
      initContainers:
        - name: init-rag-data
-          image: 'quay.io/redhat-ai-dev/rag-content:release-1.9-lcs'
+          image: quay.io/redhat-ai-dev/rag-content:release-1.9-lls-0.5.0
Evidence
PR Compliance ID 9 requires an explicit nearby comment when spec.replicas is omitted from a
Deployment/StatefulSet manifest. The updated Deployment snippet starts spec: and proceeds directly
to template: without replicas: or an autoscaling explanation.

Rule 9: Document omitted spec.replicas when relying on autoscaling
config/profile/rhdh/default-config/flavours/lightspeed/deployment.yaml[1-8]
bundle/rhdh/manifests/rhdh-flavour-lightspeed-config_v1_configmap.yaml[241-246]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The Lightspeed `Deployment` manifest snippet omits `spec.replicas` but does not include an adjacent comment stating that replicas are intentionally managed by autoscaling (HPA/KEDA) or another controller.

## Issue Context
This file (and its rendered bundle output) is a Kubernetes workload manifest snippet; without a note, users cannot tell if replicas were forgotten or intentionally omitted.

## Fix Focus Areas
- config/profile/rhdh/default-config/flavours/lightspeed/deployment.yaml[1-8]
- bundle/rhdh/manifests/rhdh-flavour-lightspeed-config_v1_configmap.yaml[241-246]
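
One way such a rule could be checked locally is sketched below. `has_replicas_or_note` is a hypothetical helper, not part of the repository or of Qodo; it only looks for a `replicas:` field or a comment line that mentions replicas, and does not parse YAML.

```shell
#!/usr/bin/env bash

# Hypothetical lint helper: succeeds when the manifest either sets
# replicas explicitly or carries a comment line mentioning replicas.
has_replicas_or_note() {
    local manifest="$1"
    # An explicit replicas field satisfies the rule outright.
    grep -Eq '^[[:space:]]*replicas:' "$manifest" && return 0
    # Otherwise require a comment line that mentions replicas.
    grep -Eiq '^[[:space:]]*#.*replicas' "$manifest"
}
```

A call such as `has_replicas_or_note deployment.yaml || echo "add a replicas note"` would flag a manifest that has neither.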



3. RAG path mismatch 🐞 Bug ≡ Correctness
Description
The init-rag-data container copies the docs vector DB into the volume root (e.g.,
/rag-content/rhdh_product_docs/...), but Lightspeed is configured to read the DB from
/rag-content/vector_db/rhdh_product_docs/..., so the preloaded DB file path won’t exist at runtime.
Code

config/profile/rhdh/default-config/flavours/lightspeed/configmap-files.yaml[R175-177]

+        kv_rag:
          type: kv_sqlite
          db_path: /rag-content/vector_db/rhdh_product_docs/1.9/faiss_store.db
Evidence
Lightspeed config points kv_rag.db_path under /rag-content/vector_db/..., while the init container
copies /rag/vector_db/rhdh_product_docs directly into /data (mounted from the same rag-data-volume),
resulting in /rag-content/rhdh_product_docs/... (no intermediate vector_db directory).

config/profile/rhdh/default-config/flavours/lightspeed/configmap-files.yaml[167-178]
config/profile/rhdh/default-config/flavours/lightspeed/deployment.yaml[6-16]
config/profile/rhdh/default-config/flavours/lightspeed/deployment.yaml[26-32]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The RAG init container copies vector DB content into the shared volume in a location that does not match the `kv_rag.db_path` configured for Lightspeed, so Lightspeed won’t be able to find the preloaded DB.

### Issue Context
- Init container copies: `cp -r /rag/vector_db/rhdh_product_docs /data/`
- Runtime config expects: `/rag-content/vector_db/rhdh_product_docs/...`
- `/data` in init container is the same `rag-data-volume` that is later mounted at `/rag-content` in `lightspeed-core`.

### Fix Focus Areas
Update either the init copy destination to include `vector_db/` or update the configured `db_path` to match the copied layout.
- config/profile/rhdh/default-config/flavours/lightspeed/deployment.yaml[6-16]
- config/profile/rhdh/default-config/flavours/lightspeed/configmap-files.yaml[167-178]
- bundle/rhdh/manifests/rhdh-flavour-lightspeed-config_v1_configmap.yaml[241-279] (regenerate if derived)
- dist/rhdh/install.yaml[3010-3045] (regenerate if derived)
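
The mismatch can be reproduced locally with throwaway directories. In this sketch `$WORK/rag` stands in for the image's `/rag` directory, and `$WORK/before` and `$WORK/after` each stand in for the shared volume; these names are assumptions for illustration, and the fix shown is only the first of the two options above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins: $WORK/rag plays the image's /rag directory, and
# $WORK/before and $WORK/after each play the shared rag-data-volume.
WORK="$(mktemp -d)"
trap 'rm -rf "$WORK" || true' EXIT
DB_REL="rhdh_product_docs/1.9/faiss_store.db"
mkdir -p "$WORK/rag/vector_db/rhdh_product_docs/1.9" "$WORK/before" "$WORK/after/vector_db"
touch "$WORK/rag/vector_db/$DB_REL"

# Copy as quoted in the review: the tree lands at the volume root.
cp -r "$WORK/rag/vector_db/rhdh_product_docs" "$WORK/before/"

# One of the two suggested fixes: copy into a vector_db/ subdirectory instead.
cp -r "$WORK/rag/vector_db/rhdh_product_docs" "$WORK/after/vector_db/"

# The configured db_path expects <volume>/vector_db/<DB_REL>.
for layout in before after; do
    if [[ -f "$WORK/$layout/vector_db/$DB_REL" ]]; then
        echo "$layout: db_path resolves"
    else
        echo "$layout: db_path missing"
    fi
done
```

Changing `db_path` to drop the `vector_db/` segment would be the equivalent fix on the config side.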




Remediation recommended

4. --ref missing value 🐞 Bug ☼ Reliability
Description
sync-lightspeed-configs.sh assigns REF="$2" for --ref without checking that a value is present; with
set -u this fails if --ref is provided as the last argument.
Code

hack/sync-lightspeed-configs.sh[R30-36]

+parse_args() {
+    while [[ $# -gt 0 ]]; do
+        case "$1" in
+            --ref)
+                REF="$2"
+                shift 2
+                ;;
Evidence
The script enables nounset (set -euo pipefail) and then unconditionally reads $2 when parsing
--ref, which will error if the caller forgets to provide the ref value.

hack/sync-lightspeed-configs.sh[6-6]
hack/sync-lightspeed-configs.sh[30-36]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`--ref` parsing assumes a value exists (`$2`). With `set -u`, running `./hack/sync-lightspeed-configs.sh --ref` will terminate due to an unset positional parameter.

### Issue Context
This is a robustness/usability issue for the sync tool.

### Fix Focus Areas
- Add an explicit check that `$2` is set and non-empty before assigning to `REF` (otherwise print usage and exit 1).
- hack/sync-lightspeed-configs.sh[30-46]
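
A sketch of the suggested check, under some assumptions: the `usage` text, the `main` default for `REF`, and the catch-all branch are illustrative and not taken from the actual script. The function returns 1 rather than calling `exit` so it can be exercised in isolation; under `set -e` an unguarded call still stops the script.

```shell
#!/usr/bin/env bash
set -euo pipefail

REF="main"  # hypothetical default; the real script's default is not shown in the review

usage() {
    echo "Usage: $0 [--ref <git-ref>]" >&2
}

parse_args() {
    while [[ $# -gt 0 ]]; do
        case "$1" in
            --ref)
                # "${2:-}" expands to an empty string when $2 is unset,
                # so this test is safe under `set -u`.
                if [[ -z "${2:-}" ]]; then
                    echo "Error: --ref requires a value" >&2
                    usage
                    return 1
                fi
                REF="$2"
                shift 2
                ;;
            *)
                echo "Error: unknown argument: $1" >&2
                usage
                return 1
                ;;
        esac
    done
}
```

With this guard, `./hack/sync-lightspeed-configs.sh --ref` would print the usage text instead of aborting on an unbound positional parameter.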


Comment thread hack/sync-lightspeed-configs.sh Fixed
Comment thread hack/sync-lightspeed-configs.sh Outdated

rhdh-qodo-merge Bot commented Apr 10, 2026

Review Summary by Qodo

(Agentic_describe updated until commit 07caf2b)

Update Lightspeed Deployment to use Core in library mode with 0.5.0 images

✨ Enhancement


Walkthroughs

Description
• Removes Llama Stack container, runs Lightspeed Core in library mode
• Updates Lightspeed Core and RAG images to 0.5.0 versions
• Adds sync script for fetching upstream Lightspeed configs
• Adds rhdh-profile.py config with comprehensive prompt templates
• Removes MCP plugins from default install, keeps config for optional use
• Adds resource limits for Lightspeed Core and init-rag-data containers
Diagram
flowchart LR
  A["Llama Stack 0.1.4"] -->|removed| B["Lightspeed Core 0.5.0"]
  C["RAG Content release-1.9-lcs"] -->|updated| D["RAG Content release-1.9-lls-0.5.0"]
  B -->|library mode| E["config.yaml + lightspeed-stack.yaml"]
  E -->|new| F["rhdh-profile.py"]
  G["sync-lightspeed-configs.sh"] -->|fetches from| H["redhat-ai-dev/lightspeed-configs"]
  H -->|updates| E
  I["MCP Plugins"] -->|removed from default| J["kept in config"]
  B -->|resource limits| K["CPU: 100m-1000m, Memory: 512Mi-2Gi"]

File Changes

1. hack/sync-lightspeed-configs.sh ✨ Enhancement +178/-0

New script to sync upstream Lightspeed configs

hack/sync-lightspeed-configs.sh


2. config/profile/rhdh/default-config/flavours/lightspeed/configmap-files.yaml ✨ Enhancement +427/-36

Update Lightspeed configs with 0.5.0 and profile

config/profile/rhdh/default-config/flavours/lightspeed/configmap-files.yaml


3. config/profile/rhdh/default-config/flavours/lightspeed/deployment.yaml ✨ Enhancement +21/-15

Remove Llama Stack, update images and add resources

config/profile/rhdh/default-config/flavours/lightspeed/deployment.yaml


4. config/profile/rhdh/default-config/flavours/lightspeed/dynamic-plugins.yaml ✨ Enhancement +0/-8

Remove MCP plugins from default installation

config/profile/rhdh/default-config/flavours/lightspeed/dynamic-plugins.yaml


5. integration_tests/rhdh-config_test.go 🧪 Tests +29/-5

Update tests for Lightspeed Core and resource limits

integration_tests/rhdh-config_test.go


6. bundle/rhdh/manifests/rhdh-flavour-lightspeed-config_v1_configmap.yaml ⚙️ Configuration changes +246/-65

Regenerate bundle with updated Lightspeed configs

bundle/rhdh/manifests/rhdh-flavour-lightspeed-config_v1_configmap.yaml


7. bundle/backstage.io/manifests/backstage-operator.clusterserviceversion.yaml ⚙️ Configuration changes +1/-1

Update operator bundle creation timestamp

bundle/backstage.io/manifests/backstage-operator.clusterserviceversion.yaml


8. bundle/rhdh/manifests/backstage-operator.clusterserviceversion.yaml ⚙️ Configuration changes +1/-1

Update operator bundle creation timestamp

bundle/rhdh/manifests/backstage-operator.clusterserviceversion.yaml


9. dist/rhdh/install.yaml ⚙️ Configuration changes +246/-65

Regenerate installer manifests with new configs

dist/rhdh/install.yaml


10. docs/lightspeed.md 📝 Documentation +35/-0

Add documentation for sync script usage

docs/lightspeed.md


11. examples/lightspeed.yaml ⚙️ Configuration changes +14/-9

Update example secret with new environment variables

examples/lightspeed.yaml


12. api/v1alpha3/zz_generated.deepcopy.go Formatting +1/-1

Fix import formatting in generated code

api/v1alpha3/zz_generated.deepcopy.go


13. api/v1alpha4/zz_generated.deepcopy.go Formatting +1/-1

Fix import formatting in generated code

api/v1alpha4/zz_generated.deepcopy.go


14. api/v1alpha5/zz_generated.deepcopy.go Formatting +1/-1

Fix import formatting in generated code

api/v1alpha5/zz_generated.deepcopy.go


15. api/v1alpha6/zz_generated.deepcopy.go Formatting +1/-1

Fix import formatting in generated code

api/v1alpha6/zz_generated.deepcopy.go



@gabemontero gabemontero left a comment


generally LGTM @Jdubrick

one f/up question (I'm sure not surprising ;-) )

given the conversations we've had with @gazarenkov @johnmcollier and others, and all the good progress you've made today analyzing the rolling demo deployment as well as some ad hoc installed but inert deployments, do you plan on including our initial resource requests/limits with this PR, or in a f/up PR?

@Jdubrick
Contributor Author

@gabemontero I actually pushed them right after your comment :D

a03d9ff

@gabemontero
Contributor

> @gabemontero I actually pushed them right after your comment :D
>
> a03d9ff

cool :-)


@gabemontero gabemontero left a comment


LGTM @Jdubrick

though admittedly I was focused on the resource stuff

@gazarenkov fyi

@Jdubrick
Contributor Author

/hold

Co-authored-by: Jdubrick <Jdubrick@users.noreply.github.com>
@github-actions
Contributor

⚠️ Files changed in bundle and installer generation!

Those changes to the operator bundle/installer manifests should have been pushed automatically to your PR branch.

NOTE: If the PR checks are stuck after this additional commit, manually close the PR and immediately reopen it to trigger the checks again.

@openshift-ci openshift-ci Bot removed the lgtm label Apr 22, 2026
@rm3l rm3l closed this Apr 22, 2026
@rm3l rm3l reopened this Apr 22, 2026

rhdh-qodo-merge Bot commented Apr 22, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (2) 📎 Requirement gaps (0)


@rhdh-qodo-merge rhdh-qodo-merge Bot added the documentation Improvements or additions to documentation label Apr 22, 2026
@rm3l
Member

rm3l commented Apr 22, 2026

Re-adding lgtm label as it has already been approved by 2 people.

/lgtm

Comment thread docs/lightspeed.md
Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>

Labels

documentation Improvements or additions to documentation enhancement New feature or request lgtm Tests


6 participants