[proposal] - Allow CORS Headers for intranet / VPN purposes#94

Open
WebReflection wants to merge 2 commits into antirez:main from WebReflection:allow-cors

@WebReflection

@WebReflection WebReflection commented May 12, 2026

This MR has been successfully tested on my local network with a DGX Spark on the WiFi. I need this variant to be able to query the Spark from my localhost or from any other connected device, so that we can all benefit from this project within my house.

Thanks for considering this change/update.

To be discussed

Ideally there should be a --cors flag when starting the server, but I'd like to start with this implementation that "just works" ™️ and hear from others and the maintainer whether there's anything else I can improve or change. Trust me, it works already: I am playing around with a tiny library that would let me reach ds4 from anywhere in my own apartment, as long as the DGX is up and running.

P.S. thanks a lot for this project, I will inevitably try to bring it to ROCm once I have my machine around but so far with Spark it's working wonderfully!

@WebReflection
Author

WebReflection commented May 12, 2026

If anyone is wondering "how can I test this?":

test.js

import Queue from 'https://esm.run/gen-q';

const { parse, stringify } = JSON;
const decoder = new TextDecoder;

const chatOptions = {
  stream: true,
  role: 'user',
};

export default class DS4 {
  #model;
  #url;

  constructor({
    url = 'http://YOUR_MACHINE_IP:8000',
    model = 'deepseek-v4-flash',
    version = 'v1',
  } = {}) { // default to {} so `new DS4()` works without arguments
    this.#model = model;
    this.#url = new URL(`${url}/${version}`);
  }

  async *chat(content, { stream = true, role = 'user' } = chatOptions) {
    const items = new Queue;

    const { body } = await fetch(`${this.#url}/chat/completions`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      body: stringify({
        model: this.#model,
        messages: [
          { role, content }
        ],
        stream,
      }),
    });

    const reader = body.getReader();

    new ReadableStream({
      async start(controller) {
        (function next() {
          reader.read().then(({ done, value }) => {
            if (done) {
              items.splice(0);
              controller.close();
              return;
            }
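            // A chunk may carry several SSE events, so parse each `data:` line on its own.
            // Lines split across two chunks are still skipped by this simple version.
            const text = decoder.decode(value, { stream: true });
            for (const line of text.split('\n')) {
              const match = /^\s*data:\s*(\{.*\})\s*$/.exec(line);
              if (match) items.push(...parse(match[1]).choices);
            }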
            next();
          });
        }());
      },
    });

    for await (const item of items) {
      const { finish_reason, delta } = item;

      if (finish_reason != null) break;

      const { content, reasoning_content } = delta;
      if (content || reasoning_content)
        yield { content, reasoning: reasoning_content };
    }
  }

}

index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>DS4</title>
    <style>
        body {
          font-family: sans-serif;
          &.thinking {
            opacity: 0.5;
          }
        }
    </style>
    <script type="module">
        import DS4 from './test.js';

        const ds4 = new DS4({
          url: 'http://YOUR_MACHINE_IP:8000',
          model: 'deepseek-v4-flash',
          version: 'v1',
        });

        let thinking = true;
        document.body.classList.add('thinking');
        for await (const { content, reasoning } of ds4.chat('List three Redis design principles.')) {
          if (thinking && reasoning && !content) {
            document.body.textContent += reasoning;
          }
          else if (thinking && content) {
            thinking = false;
            document.body.classList.remove('thinking');
            document.body.textContent = content;
          }
          else if (content) {
            document.body.textContent += content;
          }
        }

        console.log('done');
    </script>
</head>
<body>
    
</body>
</html>

The testing library is a WIP and it will be able to consume all channels and do more, but with that and this patch you'll see results from a localhost without breaking a sweat 🥳
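
For anyone extending test.js: a network chunk is not guaranteed to hold exactly one complete `data:` event, so matching a whole chunk against a single regular expression can miss or mangle events. Below is a minimal sketch of a buffering parser, under stated assumptions: `createSSEParser` and `feed` are hypothetical names (not part of test.js, gen-q, or ds4), and the `[DONE]` end marker is the OpenAI-style convention, which the server may or may not send.

```javascript
// Hypothetical helper: buffers decoded text and returns the parsed JSON
// payload of every complete `data:` line seen so far.
function createSSEParser() {
  let buffer = '';
  return function feed(text) {
    buffer += text;
    const lines = buffer.split('\n');
    // The last element is an incomplete line (or ''): keep it for the next chunk.
    buffer = lines.pop();
    const payloads = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith('data:')) continue;
      const data = trimmed.slice(5).trim();
      // Assumption: an OpenAI-style stream may end with a literal [DONE] marker.
      if (data === '' || data === '[DONE]') continue;
      payloads.push(JSON.parse(data));
    }
    return payloads;
  };
}
```

In the read loop, each chunk would go through `feed(decoder.decode(value, { stream: true }))`, and the `choices` of every returned payload would be pushed to the queue. The `stream: true` option of `TextDecoder.decode` keeps multi-byte characters intact when they are split across chunks.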

@WebReflection
Author

this might be a duplicate of #70, which I've just realized was already submitted ... my thoughts:

  • that MR is way more permissive but ...
  • one needs to explicitly use --host 0.0.0.0 instead of the default 127.0.0.1 to make the machine reachable from other devices
  • accordingly, I am not sure the --cors flag is even needed ... it might be better as explicit intent, yet it's of little use if the host is not 0.0.0.0 or bound to something reachable from the intranet. We could eventually just merge #70 and call it a day; I wouldn't be upset at all, as long as CORS on the intranet is possible

thank you!
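
As background for the points above: test.js sends a POST with `Content-Type: application/json`, which is not a CORS-safelisted content type, so a browser on another origin first sends an `OPTIONS` preflight and only proceeds if the response carries matching `Access-Control-Allow-*` headers. A minimal sketch of that logic follows. It is not the code from this MR or from #70; `corsResponse` is a hypothetical helper and the header values are assumptions chosen for a home network.

```javascript
// Hypothetical helper, not taken from any ds4 patch.
// Returns the headers (and, for a preflight, the status) a permissive
// CORS handler might send for a given request.
function corsResponse(method, requestHeaders = {}) {
  const headers = {
    // "*" allows any origin: acceptable on a home network, too broad for a public host.
    'Access-Control-Allow-Origin': '*',
  };
  // Regular requests only need the origin header added to the normal response.
  if (method !== 'OPTIONS') return { status: null, headers };
  // Preflight: echo the headers the browser asked for, falling back to Content-Type.
  headers['Access-Control-Allow-Methods'] = 'GET, POST, OPTIONS';
  headers['Access-Control-Allow-Headers'] =
    requestHeaders['access-control-request-headers'] || 'Content-Type';
  headers['Access-Control-Max-Age'] = '3600';
  return { status: 204, headers };
}
```

A server would merge the returned headers into every response and answer `OPTIONS` requests with the 204 status and an empty body.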

@calvinrp

Alternatively, #44

@WebReflection
Author

@calvinrp answered in here #70 (comment)

@d3y4n

d3y4n commented May 12, 2026

Why not add a proxy on top for all the shenanigans? IMHO this should be kept as simple as possible.

@WebReflection
Author

@d3y4n having CORS options baked in is the "as simple as possible" idea indeed ... anything else is not simple anymore.

@d3y4n

d3y4n commented May 12, 2026

@WebReflection not my call, just saying you're hardcoding values and tomorrow someone might need different ones (even you).
What I suggest is adding Caddy on top, as this is not a "real" production server anyway.

api.example.com {

    @preflight method OPTIONS

    handle @preflight {
        header {
            Access-Control-Allow-Origin   "{http.request.header.Origin}"
            Access-Control-Allow-Methods  "GET, POST, PUT, PATCH, DELETE, OPTIONS"
            Access-Control-Allow-Headers  "Authorization, Content-Type, Accept, X-Requested-With, X-CSRF-Token"
            Access-Control-Max-Age        "3600"
            Vary                          "Origin"
        }
        respond "" 204
    }

    header {
        Access-Control-Allow-Origin   "{http.request.header.Origin}"
        Access-Control-Expose-Headers "Content-Length, Content-Range"
        Vary                          "Origin"
    }

    reverse_proxy localhost:8080
}

Cheers

@WebReflection
Author

WebReflection commented May 13, 2026

@d3y4n PR #70 is easier and simpler, I'll try that and amend this one.

The issue is that requiring another tool just to have CORS means goodbye to ease of portability and to a basic flag in the docs that enables the feature (if we want the flag at all; otherwise it's about enabling CORS on 0.0.0.0, because that's already an explicit intent). There are 3 related PRs already: it's not something only a few need, it's something everyone expects to be possible at some point.

Both devices and remotely run automated tests will need that if browsers are used instead of curl. So yes, everyone could fix it one way or another, but the changes are so tiny and simple that I am not sure why making it more complicated for users would be desirable.

@fry69

fry69 commented May 13, 2026

FYI/related: There is also the idea to create a separate web proxy that then talks to the inference engine with a generalized protocol:

There are already two competing protocols, classic stateless chat completion and the more modern, stateful "Responses". It might make sense to separate the protocol frontend from the engine, as suggested here -> #91 (comment)

CORS would then only need to be implemented in this combined frontend proxy, and the engine would not have to worry about any of that.

Otherwise I am a bit skeptical about making the current ds4-server friendlier to use from remote places, as it is extremely bare-bones and likely contains a gazillion security-related problems (personal opinion).

@WebReflection
Author

WebReflection commented May 13, 2026

To whom it may concern, the latest edit of this file does the following:

  • it uses minimal simplified headers from support browser CROS #70
  • it adds a --cors or --CORS argument like in Make CORS support opt-in #44
  • it uses 0.0.0.0 instead of 127.0.0.1 if the flag is passed and the host is localhost
  • it doesn't change anything else if the flag is not passed
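
The flag and host behaviour listed above can be sketched as follows. This is an illustration only, not the patch itself: `resolveOptions` is a hypothetical function, and treating both `localhost` and `127.0.0.1` as "the host is localhost" is an assumption on my side.

```javascript
// Hypothetical sketch of the option handling described in the list above.
function resolveOptions(argv, defaultHost = '127.0.0.1') {
  let host = defaultHost;
  let cors = false;
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] === '--cors' || argv[i] === '--CORS') cors = true;
    else if (argv[i] === '--host' && i + 1 < argv.length) host = argv[++i];
  }
  // Assumption: both spellings of the loopback host count as "localhost".
  const isLoopback = host === 'localhost' || host === '127.0.0.1';
  // With the flag, a loopback host is widened so other devices can connect.
  if (cors && isLoopback) host = '0.0.0.0';
  // Without the flag nothing changes.
  return { host, cors };
}
```

With these rules, `--cors --host localhost` binds to `0.0.0.0`, while an explicit non-loopback host is left as given.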

Tested via:

# example
./ds4-server --cors --host localhost --ctx 100000 --kv-disk-dir /tmp/ds4-kv --kv-disk-space-mb 8192

It is working like a charm, glad to hear where/how I could improve this MR, thank you.

@WebReflection WebReflection force-pushed the allow-cors branch 2 times, most recently from 433b6bd to a0ab701 Compare May 13, 2026 08:32
This variant uses simpler headers approach from antirez#70 and it adds
`--cors` flag like it was suggested in antirez#44.
@WebReflection
Author

FYI the MR is much cleaner now. I can't find anything particularly cumbersome and the tests pass, but I can test only on CUDA.

4 participants