For workloads such as AI inference, we want to scale GPUs dynamically according to load.

The current operator can attach multiple GPUs to a node that has none, or detach all GPUs at once from a node that has several, but it cannot adjust the count incrementally.
Therefore, I propose a feature that increases or decreases the number of attached GPUs one at a time.
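As a rough sketch of what one-by-one scaling could look like, consider a hypothetical custom resource that exposes a desired GPU count per node. The resource kind, API group, and field names below are illustrative only and are not part of the current operator:

```yaml
# Hypothetical custom resource; all names here are illustrative.
apiVersion: example.com/v1alpha1
kind: NodeGPU
metadata:
  name: worker-1
spec:
  # Desired number of GPUs attached to this node.
  # Incrementing or decrementing this value by 1 would let the
  # operator attach or detach a single GPU, instead of the current
  # all-or-nothing attach/detach behavior.
  gpuCount: 2
```

With an API like this, an autoscaler could step the desired count up or down in response to inference load, rather than attaching or detaching every GPU at once.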