Commit 0066c08

Bihan Rana authored and committed
Add Replica Groups Docs
1 parent b7f637b commit 0066c08

1 file changed

Lines changed: 53 additions & 0 deletions

docs/docs/concepts/services.md

@@ -164,6 +164,59 @@ Setting the minimum number of replicas to `0` allows the service to scale down t

> The `scaling` property requires creating a [gateway](gateways.md).

### Replica Groups

Replica groups let you define multiple groups of replicas within a single service. Each group can set its own replica count, autoscaling rules, resource requirements, and commands.

<div editor-title="service.dstack.yml">

```yaml
type: service
name: replica-groups
image: lmsysorg/sglang:latest

env:
  - MODEL_ID=deepseek-ai/DeepSeek-R1-Distill-Llama-8B

replicas:
  # Group 1: 1 to 2 replicas, each requesting a 48GB GPU
  - count: 1..2
    scaling:
      metric: rps
      target: 10
    commands:
      - |
        python -m sglang.launch_server \
          --model-path $MODEL_ID \
          --port 8000 \
          --trust-remote-code

    resources:
      gpu: 48GB

  # Group 2: 1 to 4 replicas, each requesting a 24GB GPU
  - count: 1..4
    scaling:
      metric: rps
      target: 5
    commands:
      - |
        python -m sglang.launch_server \
          --model-path $MODEL_ID \
          --port 8000 \
          --trust-remote-code
    resources:
      gpu: 24GB

port: 8000
model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
```

</div>
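
To make the arithmetic behind the `count` ranges concrete, here is a minimal Python sketch that sums the per-group bounds from the example. It assumes that `count` uses the `min..max` form shown above, that a bare number would mean a fixed count, and that each group scales independently. `parse_count` and `total_bounds` are hypothetical helper names for illustration only, not part of dstack.

```python
def parse_count(count: str) -> tuple[int, int]:
    """Parse a replica count such as "1..2" or "3" into (min, max).

    Assumption: a bare number is treated as a fixed count (min == max).
    """
    if ".." in count:
        low, high = count.split("..", 1)
        return int(low), int(high)
    value = int(count)
    return value, value


def total_bounds(counts: list[str]) -> tuple[int, int]:
    """Sum the per-group minimums and maximums across all replica groups."""
    bounds = [parse_count(c) for c in counts]
    return sum(lo for lo, _ in bounds), sum(hi for _, hi in bounds)


# The two groups from the example configuration: 1..2 and 1..4.
print(total_bounds(["1..2", "1..4"]))  # (2, 6)
```

Under these assumptions, the example service as a whole would run between 2 and 6 replicas.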

> Support for configuring `port`, `image`, `env`, `docker`, and other properties per replica group is coming soon.

!!! info
    Replica groups are intended to enable prefill-decode disaggregation, where prefill and decode workers run as separate replica groups within a single service. This capability is planned for an upcoming release. See the [issue](https://github.com/dstackai/dstack/issues/3363) for more details.

### Model

If the service is running a chat model with an OpenAI-compatible interface,
