[Flyte-7005] Supporting Ray autoscalerOptions#7111
[Flyte-7005] Supporting Ray autoscalerOptions#71110yukali0 wants to merge 15 commits intoflyteorg:masterfrom
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #7111 +/- ##
==========================================
+ Coverage 56.95% 56.97% +0.01%
==========================================
Files 931 931
Lines 58188 58283 +95
==========================================
+ Hits 33143 33208 +65
- Misses 22004 22021 +17
- Partials 3041 3054 +13
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
951fbeb to
8a99445
Compare
| string runtime_env_yaml = 5; | ||
| } | ||
|
|
||
| message Resources { |
There was a problem hiding this comment.
Can we use core.Resources?
| message EnvVar { | ||
| string name = 1; | ||
| string value = 2; | ||
| EnvValueFrom valueFrom = 3; |
There was a problem hiding this comment.
Why do we need EnvValueFrom here? Can we use core.KeyValuePair as the type for env?
flyte/flyteidl/protos/flyteidl/core/literals.proto
Lines 192 to 198 in e5905e7
|
|
||
| message AutoscalerOptions { | ||
| // "Default", "Aggressive", "Conservative" | ||
| string upscaling_mode = 1; |
There was a problem hiding this comment.
Could we make this as enum?
enum UpscalingMode {
UPSCALING_MODE_UNSPECIFIED = 0;
UPSCALING_MODE_DEFAULT = 1;
UPSCALING_MODE_AGGRESSIVE = 2;
UPSCALING_MODE_CONSERVATIVE = 3;
}| return rayjob, err | ||
| } | ||
|
|
||
| func NewAutoscalerOptions(options *plugins.AutoscalerOptions) *rayv1.AutoscalerOptions { |
There was a problem hiding this comment.
Let's name it as buildAutoscalerOptions to follow the naming convention (e.g. like buildHeadPodTemplate), and make it private
| }, | ||
| WorkerGroupSpecs: []rayv1.WorkerGroupSpec{}, | ||
| EnableInTreeAutoscaling: &rayJob.RayCluster.EnableAutoscaling, | ||
| AutoscalerOptions: autoScalarOptions, |
There was a problem hiding this comment.
While protobuf generated GetXXX() has nil check, we can directly pass:
AutoscalerOptions: NewAutoscalerOptions(rayJob.GetRayCluster().GetAutoscalerOptions())|
Hi @0yukali0, |
|
Hi @machichima , i updated the setup steps and screenshot in the PR descriptions. |
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
| resourceRequirements := v1.ResourceRequirements{} | ||
| if requests := res.GetRequests(); len(requests) > 0 { | ||
| resourceRequirements.Requests = convertResourceEntriesToResourceList(requests) | ||
| } | ||
| if limits := res.GetLimits(); len(limits) > 0 { | ||
| resourceRequirements.Limits = convertResourceEntriesToResourceList(limits) | ||
| } |
There was a problem hiding this comment.
Could we use flytek8s.ToK8sResourceRequirements here instead?
flyte/flyteplugins/go/tasks/pluginmachinery/flytek8s/utils.go
Lines 52 to 68 in 6e84ac2
| autoScalarOptions.Env = []v1.EnvVar{} | ||
| for _, env := range envs { | ||
| name := env.GetKey() | ||
| if val := env.GetValue(); val != "" { | ||
| autoScalarOptions.Env = append(autoScalarOptions.Env, v1.EnvVar{ | ||
| Name: name, | ||
| Value: val, | ||
| }) | ||
| } | ||
| } |
There was a problem hiding this comment.
Could we use flytek8s.ToK8sEnvVar instead?
| idleTimeoutTime := options.GetIdleTimeoutSeconds() | ||
| autoScalarOptions.IdleTimeoutSeconds = &idleTimeoutTime |
There was a problem hiding this comment.
If the idleTimeoutTime is not set (= 0), this part will overwrite the default value (60 sec) set in kuberay. we should only set autoScalarOptions.IdleTimeoutSeconds if it's not 0 (not empty)
There was a problem hiding this comment.
Hi @machichima
I will update it and the following test case to check the setting.
| } | ||
|
|
||
| func buildAutoscalerOptions(options *plugins.AutoscalerOptions) *rayv1.AutoscalerOptions { | ||
| var autoScalarOptions *rayv1.AutoscalerOptions |
There was a problem hiding this comment.
| var autoScalarOptions *rayv1.AutoscalerOptions | |
| var autoScalerOptions *rayv1.AutoscalerOptions |
nit
| } | ||
| } | ||
|
|
||
| func TestNewAutoscalerOptions(t *testing.T) { |
There was a problem hiding this comment.
| func TestNewAutoscalerOptions(t *testing.T) { | |
| func TestBuildAutoscalerOptions(t *testing.T) { |
nit
| assert.Equal(t, resource.MustParse("1Gi"), result.Resources.Limits[corev1.ResourceMemory]) | ||
| }) | ||
|
|
||
| t.Run("env literal value", func(t *testing.T) { |
There was a problem hiding this comment.
Could we add a test for env with empty value (should be skipped)?
There was a problem hiding this comment.
flytek8s.ToK8sEnvVar allows empty value, so i will skip the test with empty value.
| assert.Nil(t, buildAutoscalerOptions(nil)) | ||
| }) | ||
|
|
||
| t.Run("idle timeout propagated", func(t *testing.T) { |
There was a problem hiding this comment.
Could we also add a test like this:
t.Run("idle timeout zero should not be set", func(t *testing.T) {
result := buildAutoscalerOptions(&plugins.AutoscalerOptions{})
require.NotNil(t, result)
assert.Nil(t, result.IdleTimeoutSeconds)
})Which is related to my comment above: https://github.com/flyteorg/flyte/pull/7111/changes#r3114611720
If the IdleTimeoutSeconds is not set, it should remain empty so that Ray can apply its default value
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>
Signed-off-by: Yuteng Chen <a08h0283@gmail.com>


Tracking issue
Related #7005
Why are the changes needed?
Ray autoScalerOptions is not supportted in flyte v1
What changes were proposed in this pull request?
How was this patch tested?
Labels
Please add one or more of the following labels to categorize your PR:
This is important to improve the readability of release notes.
Setup process
2.1 flytectl demo start --dev
2.2 make compile && POD_NAMESPACE=flyte ./flyte start --config flyte-single-binary-local.yaml
3.1 uv venv -p 3.11 && source .venv/bin/activate && uv pip install -e ~/flytekit ~/flytekit/plugins/flytekit-ray && uv pip install -e ~/flyte/flyteidl
3.2 "pyflyte run --remote ray_example.py ray_workflow --n 10"
Screenshots
Related PRs
Docs link