Fix shell output for 'verifying partition creation' #80
Follow-up to #36
gabrpham left a comment
Now that I looked at it more closely, there are a few other updates that should be made. I noted them in the comments below. Thanks!
information. This is to be expected for security reasons and will be
addressed in a later feature update to ``amd-smi``.
If we can also leave it at 'This is to be expected for security reasons.' and chop the rest off, I think that would be completely correct. Thanks!
Upon a successful set, AMD SMI will then initiate an action to restart AMD GPU driver.
This action will change all GPU's in the hive to the requested memory (NPS) partition mode.
We've changed this too. Users must now run `sudo modprobe -r amdgpu` themselves to unload the driver, and then `sudo modprobe amdgpu` to reload it. We've removed the automatic reset because it was interfering with users' already running workloads. Users should initiate the driver reset as described above when they are ready to do so.
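The manual reload described above could be wrapped in a small helper. This is only a sketch: the `run` and `reload_amdgpu` function names and the `DRY_RUN` switch are hypothetical and exist only for this illustration, and the `modprobe` commands are the two quoted in the comment.

```shell
# Sketch: reload the amdgpu driver manually after a partition change.
# DRY_RUN is a hypothetical switch for this example only; when set to 1,
# commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reload_amdgpu() {
    # Unload the driver; modprobe refuses if the module is still in use,
    # so all GPU workloads must be stopped first.
    run sudo modprobe -r amdgpu || return 1
    # Load the driver again so the requested partition mode is applied.
    run sudo modprobe amdgpu
}

# Preview the commands without touching the driver.
DRY_RUN=1 reload_amdgpu
```

With `DRY_RUN=1` the helper only prints the two `modprobe` commands; calling `reload_amdgpu` without it runs them for real, so it should be used only once all GPU workloads have been stopped.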
GPU: 5
MEMORY_PARTITION: Successfully set memory partition to NPS4
Trying again - Updating memory partition for gpu 0: [██████████████..........................] 50/140 secs remain
You won't be seeing this particular progress bar anymore since the driver reset is now user initiated. Here's the current output for setting memory partitions:
$ sudo amd-smi set -M NPS4
******WARNING******
After changing memory (NPS) partition modes, users MUST restart (reload) the AMD GPU driver. This command NO LONGER AUTOMATICALLY reloads the driver, see `amd-smi reset -h` and `sudo amd-smi reset -r` for more information. This change is intended to allow users the ability to control when is the best time to restart the AMD GPU driver, as it may not be desired to restart the AMD GPU driver immediately after changing the memory (NPS) partition mode. Please use `sudo amd-smi reset -r` AFTER successfully changing the memory (NPS) partition mode. A successful driver reload is REQUIRED in order to complete updating ALL GPUs in the hive to the requested partition mode.
******REMINDER******
In order to reload the AMD GPU driver, users MUST quit all GPU workloads across all devices.

Do you accept these terms? [Y/N] y

GPU: 0
MEMORY_PARTITION: Successfully set memory partition to NPS4, reload driver when ready
GPU: 1
MEMORY_PARTITION: Successfully set memory partition to NPS4, reload driver when ready
GPU: 2
MEMORY_PARTITION: Successfully set memory partition to NPS4, reload driver when ready
GPU: 3
MEMORY_PARTITION: Successfully set memory partition to NPS4, reload driver when ready
GPU: 4
MEMORY_PARTITION: Successfully set memory partition to NPS4, reload driver when ready
GPU: 5
MEMORY_PARTITION: Successfully set memory partition to NPS4, reload driver when ready
GPU: 6
MEMORY_PARTITION: Successfully set memory partition to NPS4, reload driver when ready
GPU: 7
MEMORY_PARTITION: Successfully set memory partition to NPS4, reload driver when ready
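Before reloading the driver, it may help to confirm that every GPU reported success. The sketch below counts the success lines in captured output; it assumes the line format shown above, and the `count_partition_success` function name and the shortened two-GPU sample are hypothetical, used only for illustration.

```shell
# Sketch: count GPUs that reported a successful partition change.
# Reads captured amd-smi output on stdin; assumes the line format above.
count_partition_success() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # exit status at 0 even when there are no matches.
    grep -c 'MEMORY_PARTITION: Successfully set memory partition to' || true
}

# Sample input copied from the output above (two GPUs only, for brevity).
sample='GPU: 0
MEMORY_PARTITION: Successfully set memory partition to NPS4, reload driver when ready
GPU: 1
MEMORY_PARTITION: Successfully set memory partition to NPS4, reload driver when ready'

printf '%s\n' "$sample" | count_partition_success
```

On the eight-GPU system shown in the output, a count of 8 would indicate that every device accepted the new mode and the driver can be reloaded.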
(cherry picked from commit 6c9cc02)
Motivation
Technical Details
Test Plan
Test Result
Submission Checklist