Secure vLLM runtime (Implementation) by sats-23 · Pull Request #667 · IBM/project-ai-services

sats-23 · 2026-04-24T04:21:32Z

- When the chatbot loads, the user is prompted for an API key.
- With a correct API key, the user can proceed to use the chatbot as usual.
- The key remains valid until the page is refreshed.
- A wrong API key returns the user to the same prompt box, which again expects an API key.

- The Swagger docs are updated similarly.

- Curl requests with and without the auth header when auth is enabled.
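
To illustrate the behaviour listed above, here is a minimal Python sketch of the kind of check a backend could apply to an incoming `Authorization` header. The helper name `is_authorized` and the use of the `Bearer` scheme are assumptions for illustration and are not taken from this PR's code. An empty configured key is treated as "authentication disabled", following the `apiKey` description in the values file.

```python
import hmac
from typing import Optional


def is_authorized(auth_header: Optional[str], expected_key: str) -> bool:
    """Hypothetical check: decide whether a request may proceed."""
    if not expected_key:
        # Empty key configured: authentication is disabled.
        return True
    if not auth_header:
        # Auth is enabled but the request carries no header.
        return False
    scheme, _, token = auth_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(token.strip().encode(), expected_key.encode())
```

With auth enabled, a request without the header or with a wrong key is rejected, which corresponds to the UI returning to the key prompt.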

mkumatag · 2026-04-24T04:35:38Z

 	}

+	// Log vLLM authentication status
+	if instructAPIKey, ok := values["instruct"].(map[string]any); ok {


I think somehow we need to push this info to applications/workload, any application specific content in the orchestration is a big NO!

mkumatag · 2026-04-24T04:35:54Z

@yussufsh @mayuka-c ^^

mayuka-c · 2026-04-24T05:42:03Z

+  # @description API key for vLLM instruct service authentication. If empty, authentication is disabled. Provide a key to enable authentication.
+  apiKey: ""


What is the format the key should be in?

Plain text. As of now there is no validation on length or character type.

mayuka-c · 2026-04-24T05:45:37Z

              value: ibm-granite/granite-3.3-8b-instruct
            - name: RERANKER_ENDPOINT
-              value: http://reranker-predictor:8080
+              value: http://reranker-predictor:8000


Why is this change being made? Without Spyre it runs on 8080.

mayuka-c · 2026-04-24T05:46:19Z

+            {{- if .Values.instruct.apiKey }}
+            - name: VLLM_INSTRUCT_API_KEY
+              valueFrom:
+                secretKeyRef:
+                  name: vllm-instruct-api-key
+                  key: apiKey
+            {{- end }}


Just a query: in the case of OpenShift, don't we need to use a Service Account?

I don't think so, this is a static injection similar to opensearch-credentials.

No, I meant that with RHOAI you could use token-based authentication, right?
This uses a Service Account, and you mount the token on whichever service needs to communicate with the models, right?

I haven't tried it, but I do remember an option to enable auth while deploying models in the RHOAI dashboard.

Hmm, yes, OpenShift AI does support an additional layer through some operators, such as Authorino. Perhaps you could reach out to the OpenShift AI team to learn more about that approach? However, it would certainly involve more work, such as installing those additional operators as part of the application deployment, along with additional specs.

I would vote to get started with this approach in this release and explore the operator approach in the next release (or add OpenShift support in the next release).

mkumatag · 2026-04-24T12:43:34Z

@sats-23 how are we exposing the api key to the user if they want to consume the key post running?

sats-23 · 2026-04-27T03:57:49Z

@sats-23 how are we exposing the api key to the user if they want to consume the key post running?

Currently the API key is passed to, and consumed by, both the backend server and the vLLM pod.
Since /v1/chat/completions and the other user-facing endpoints are exposed via the backend server, the backend logic passes the API key in the auth header by default.

So, currently, the API key is not exposed or shown to the user. Only if they want to call the vLLM pod directly (over the pod network interface) do they have to pass the API key themselves, which is not the general practice in our app.
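
A minimal sketch of the behaviour described above, assuming the backend reads the key from the `VLLM_INSTRUCT_API_KEY` environment variable (the name used in the deployment template in this PR) and that vLLM expects it as a Bearer token. The helper name `build_chat_request` and the example URL are hypothetical. The request is only constructed here, not sent.

```python
import json
import os
import urllib.request


def build_chat_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Hypothetical helper: build a chat request, adding auth if configured."""
    headers = {"Content-Type": "application/json"}
    # Assumption: an unset or empty variable means authentication is disabled.
    api_key = os.environ.get("VLLM_INSTRUCT_API_KEY", "")
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Because the header is attached inside the backend, callers of the backend endpoints never need to see the key in this arrangement.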

dharaneeshvrd · 2026-04-27T04:49:06Z

          value: "{{ .Values.opensearch.auth.username }}"
        - name: OPENSEARCH_PASSWORD
          value: "{{ .Values.opensearch.auth.password }}"
+        {{- if .Values.instruct.apiKey }}


Not required for similarity service, since instruct is not used by this service.

Sure, removed

dharaneeshvrd · 2026-04-27T05:03:30Z

Many methods use the /v1/chat/completions API; please analyse all usages of the instruct LLM and add the auth header.

Table summary

Summarize (stream & non-stream)

Q&A stream query

Added, PTAL

There is now pass-through auth for the digitize and summarize APIs.
Only the chatbot UI and backend require the user to supply the API key; in all other cases the key is propagated automatically.
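
The split described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the function name `resolve_api_key`, its parameters, and the choice of `PermissionError` are all hypothetical.

```python
from typing import Optional


def resolve_api_key(
    configured_key: str,
    user_key: Optional[str] = None,
    user_facing: bool = False,
) -> Optional[str]:
    """Hypothetical helper: pick the key to forward to the instruct LLM."""
    if not configured_key:
        # Auth disabled for this deployment: no header is needed.
        return None
    if user_facing:
        # Chatbot-style paths: the caller must supply the key.
        if not user_key:
            raise PermissionError("API key required")
        return user_key
    # Pass-through paths (e.g. digitize, summarize): use the configured key.
    return configured_key
```

The user-supplied key is forwarded rather than compared locally in this sketch, so a wrong key would be rejected by the downstream service.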

dharaneeshvrd · 2026-04-27T05:08:08Z

how are we exposing the api key to the user if they want to consume the key post running?

There is no neat way to get this now. But it would be great if we can get the values configured for the current deployment via a cli command.

sats-23 · 2026-04-27T06:55:50Z

how are we exposing the api key to the user if they want to consume the key post running?

There is no neat way to get this now. But it would be great if we can get the values configured for the current deployment via a cli command.

I assume we should also explain to the user that they need to use it only when accessing vLLM within pod networking.
Otherwise they may get confused and use it against the backend server.

sats-23 · 2026-04-28T09:01:46Z

@dharaneeshvrd @mkumatag @manju956
The approach has been updated, PTAL

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

dharaneeshvrd · 2026-04-30T09:30:17Z

 import uvicorn

-from fastapi import FastAPI, UploadFile, File, HTTPException, BackgroundTasks, Query, status, Request
+from fastapi import FastAPI, UploadFile, File, HTTPException, BackgroundTasks, Query, status, Request, Header


Remove this

dharaneeshvrd · 2026-04-30T09:30:25Z

Remove this

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

dharaneeshvrd · 2026-04-30T09:51:09Z

Sorry, please remove the changes here too

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

sats-23 force-pushed the vLLMImpl branch from 20d66ad to f12f550 Compare April 24, 2026 04:37

mkumatag requested changes Apr 24, 2026

sats-23 force-pushed the vLLMImpl branch from f12f550 to e99acbe Compare April 24, 2026 04:52

mayuka-c reviewed Apr 24, 2026

sats-23 force-pushed the vLLMImpl branch from 277ea20 to e49b045 Compare April 24, 2026 06:47

dharaneeshvrd added the squad/usecases label Apr 24, 2026

sats-23 force-pushed the vLLMImpl branch 2 times, most recently from 733b81c to a22d60c Compare April 24, 2026 08:31

sats-23 requested review from dharaneeshvrd, manju956, mayuka-c and mkumatag April 24, 2026 09:16

sats-23 marked this pull request as ready for review April 24, 2026 09:16

sats-23 force-pushed the vLLMImpl branch 3 times, most recently from 4d0981b to d706a72 Compare April 24, 2026 09:25

sats-23 force-pushed the vLLMImpl branch from d706a72 to 73246b9 Compare April 27, 2026 03:58

manju956 reviewed Apr 27, 2026

Comment thread spyre-rag/src/common/llm_utils.py Outdated

dharaneeshvrd reviewed Apr 27, 2026

sats-23 requested review from dharaneeshvrd and manju956 April 27, 2026 06:59

sats-23 marked this pull request as draft April 27, 2026 08:16

sats-23 force-pushed the vLLMImpl branch from 9f2a23c to 4383782 Compare April 28, 2026 03:31

sats-23 marked this pull request as ready for review April 28, 2026 09:01

sats-23 added 17 commits April 30, 2026 14:44

Address comments

998d00f

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Yaml Revert

1958bcd

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Backend changes

5f40029

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

UI changes

83191b3

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Allow endpoints

088b892

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Key is valid for session

1cd2ff9

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Fix backend mandating api key if auth enabled

4a1713a

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Pass through for digitize and summarize

87df9ad

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Update Swagger docs and auth req

fe5b474

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Display api key on info command

26b4c77

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Get api_key from settings for Digitize and Summarize methods

0e095ea

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Make endpoints internal

2527d3a

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Remove new endpoints

df8748b

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Fix npm format issues

5e9e222

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Version bump

7b65f90

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Remove /reference & address comments

58f4c8b

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

Fix lint issues

ae6e3c7

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

sats-23 force-pushed the vLLMImpl branch from 4f7d4e2 to ae6e3c7 Compare April 30, 2026 09:15

Bump UI Image Version

1eb3c27

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

dharaneeshvrd reviewed Apr 30, 2026

pranithraoibm previously approved these changes Apr 30, 2026

Nit

627668d

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

sats-23 dismissed pranithraoibm’s stale review via 627668d April 30, 2026 09:49

sats-23 requested review from dharaneeshvrd and pranithraoibm April 30, 2026 09:50

dharaneeshvrd reviewed Apr 30, 2026

mkumatag previously approved these changes Apr 30, 2026

Nit 2

805ac4e

Signed-off-by: Sathvik <Sathvik.S@ibm.com>

sats-23 dismissed mkumatag’s stale review via 805ac4e April 30, 2026 14:31

sats-23 requested a review from dharaneeshvrd April 30, 2026 14:31

6 participants