[Newton] Migrate more envs and mdps to warp #4690
Draft
hujc7 wants to merge 9 commits into isaac-sim:dev/newton
Introduces ManagerBasedEnvWarp and ManagerBasedRLEnvWarp with ManagerCallSwitch for per-manager stable/warp/captured mode selection. Includes Newton articulation data extensions, RL wrapper adaptations, and train script integration.
Warp-first implementations of all 7 managers (action, observation, reward, termination, event, command, recorder) with mask-based reset for CUDA graph compatibility. Includes MDP term library (observations, rewards, terminations, events, actions), IO descriptors, and utility modules (noise, modifiers, circular buffers, warp kernels).
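A minimal pure-Python sketch of why a mask-based reset fits CUDA graph capture better than an index-based one: the index-based form does work proportional to a variable-length id list, while the mask-based form always visits every env and so keeps a fixed work size. The function names and data here are hypothetical and not taken from the PR; the actual managers use warp kernels.

```python
from typing import List


def reset_by_ids(state: List[float], default: List[float], env_ids: List[int]) -> None:
    """Index-based reset: the amount of work depends on len(env_ids)."""
    for i in env_ids:
        state[i] = default[i]


def reset_by_mask(state: List[float], default: List[float], mask: List[bool]) -> None:
    """Mask-based reset: every env is visited, so the work size never changes."""
    for i in range(len(state)):
        if mask[i]:
            state[i] = default[i]


state_a = [0.3, -1.2, 0.7, 2.0]
state_b = list(state_a)
default = [0.0, 0.0, 0.0, 0.0]

# Both calls reset envs 1 and 3; only the second has a step-independent shape.
reset_by_ids(state_a, default, env_ids=[1, 3])
reset_by_mask(state_b, default, mask=[False, True, False, True])
```

Both variants produce the same state; the difference is only that the masked version can be recorded once and replayed with a different mask each step.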
Cartpole env config for the warp manager-based RL env, registered as Isaac-Cartpole-Warp-v0. Includes a warp-first custom reward term (joint_pos_target_l2).
… updates
- Add manager_call_max_mode field for per-env capture ceiling (min(mode, cap))
- Support dict input for manager_call_config (in addition to JSON string)
- Add "Scene" to MANAGER_NAMES for configurable Scene_write_data_to_sim mode
- Remove hardcoded WARP_NOT_CAPTURED override from Scene_write_data_to_sim
- Add warp_capturable decorator and is_warp_capturable check for mode=2 fallback
- Update managers: action, observation, event with warp-first improvements
- Update scene_entity_cfg with body_ids_wp resolution
- Update train.py CLI arg handling
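The config handling described above (dict or JSON string input, plus the min(mode, cap) ceiling) could be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the enum numbering for modes 0 and 1, the helper names `parse_call_config` and `effective_mode`, and the "RewardManager" key are hypothetical. Only "Scene", WARP_NOT_CAPTURED, and mode 2 being the captured mode come from the commit message.

```python
import json
from enum import IntEnum
from typing import Dict, Optional, Union


class CallMode(IntEnum):
    # Assumed numbering; the PR only states that mode 2 is the captured mode.
    STABLE = 0
    WARP_NOT_CAPTURED = 1
    WARP_CAPTURED = 2


def parse_call_config(cfg: Union[str, Dict[str, int], None]) -> Dict[str, int]:
    """Accept either a JSON string or a dict mapping manager name to mode."""
    if cfg is None:
        return {}
    if isinstance(cfg, str):
        cfg = json.loads(cfg)
    return {name: int(mode) for name, mode in cfg.items()}


def effective_mode(configured: int, max_mode: Optional[int]) -> CallMode:
    """Apply the per-env ceiling: min(configured_mode, cap)."""
    if max_mode is None:
        return CallMode(configured)
    return CallMode(min(configured, max_mode))


# A JSON string and a dict are treated the same way.
config = parse_call_config('{"Scene": 2, "RewardManager": 2}')
# An env that caps capture at mode 1 downgrades every mode-2 request.
capped = {name: effective_mode(mode, max_mode=1) for name, mode in config.items()}
```

With a ceiling of 1, a manager configured for captured execution runs in the non-captured warp mode instead, while lower configured modes pass through unchanged.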
Warp-first observation, reward, termination, event, and action terms referenced by the 14 verified training-parity envs.
- Observations: base_pos_z, base_lin_vel, base_ang_vel, projected_gravity, joint_pos, joint_pos_rel, joint_pos_limit_normalized, joint_vel, joint_vel_rel, last_action, generated_commands
- Rewards: is_alive, is_terminated, lin_vel_z_l2, ang_vel_xy_l2, flat_orientation_l2, joint_torques_l2, joint_vel_l1, joint_vel_l2, joint_acc_l2, joint_deviation_l1, joint_pos_limits, action_rate_l2, action_l2, undesired_contacts, track_lin_vel_xy_exp, track_ang_vel_z_exp
- Terminations: time_out, root_height_below_minimum, joint_pos_out_of_manual_limit, illegal_contact
- Events: randomize_rigid_body_com, apply_external_force_torque, reset_root_state_uniform, reset_joints_by_scale, reset_joints_by_offset, push_by_setting_velocity
- Actions: JointPositionAction, JointEffortAction
Terms accessing lazy TimestampedWarpBuffer properties (Tier 2) are marked @warp_capturable(False) to prevent stale data under CUDA graph capture.
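One way such a marker and its check might work is sketched below. Only the names warp_capturable and is_warp_capturable come from the PR; the attribute name, the default for untagged terms, the fallback target (mode 1), and the example terms are assumptions made for illustration.

```python
from typing import Callable, Iterable


def warp_capturable(flag: bool) -> Callable[[Callable], Callable]:
    """Tag a term function with whether it is safe under CUDA graph capture."""
    def decorate(func: Callable) -> Callable:
        func._warp_capturable = flag  # hypothetical attribute name
        return func
    return decorate


def is_warp_capturable(func: Callable) -> bool:
    """Untagged terms count as capturable in this sketch (an assumption)."""
    return getattr(func, "_warp_capturable", True)


def resolve_mode(requested: int, terms: Iterable[Callable]) -> int:
    """Fall back from captured (2) to non-captured warp (assumed 1) if needed."""
    if requested == 2 and not all(is_warp_capturable(t) for t in terms):
        return 1
    return requested


@warp_capturable(False)
def lazy_term() -> None:
    """Hypothetical term that reads a lazy Tier 2 property."""


def plain_term() -> None:
    """Hypothetical term that reads only directly available data."""


mode_mixed = resolve_mode(2, [plain_term, lazy_term])
mode_clean = resolve_mode(2, [plain_term])
```

A single non-capturable term is enough to pull the whole manager out of captured mode, which matches the "mode=2 fallback" wording but is otherwise a guess at the granularity.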
Env configs and task-local MDP terms for 14 training-parity verified envs:
- Classic: Cartpole, Humanoid, Ant
- Locomotion velocity (flat): Anymal-B/C/D, G1-v0/v1, H1, Cassie, Unitree A1/Go1/Go2
- Manipulation: Reach-Franka
Per-robot config registrations (gym IDs) and flat env cfgs for all tested locomotion and reach variants.
Task-specific MDP terms:
- Humanoid: base_yaw_roll, base_up_proj, base_heading_proj, base_angle_to_target, progress_reward, upright_posture_bonus, move_to_target_bonus, power_consumption, joint_pos_limits_penalty_ratio
- Velocity: feet_air_time, feet_air_time_positive_biped, feet_slide, track_lin_vel_xy_yaw_frame_exp, track_ang_vel_z_world_exp, stand_still_joint_deviation_l1, terrain_out_of_bounds, terrain_levels_vel
- Reach: position_command_error, position_command_error_tanh, orientation_command_error
Also includes:
- Warp parity tests (3 test files)
- WARP_MIGRATION_GAP_ANALYSIS.md (MDP term catalog and per-task usage)
- MANAGER_TEST_COVERAGE.md (capturability analysis)
- GRAPH_CAPTURE_MIGRATION.md (ArticulationData Tier 1/2/3 property analysis)
Contributor
Too many files changed for review.
Author
Not sure if there's a way to show only the file changes that are not in PR #4480. It is probably best to merge the dependency first.
Author
Migrated envs show training results similar to the baseline. Convergence speed is not a meaningful comparison here, because it is inconsistent from run to run due to the added noise.
Final training stats: Warp-only vs baseline
Final training stats: Warp-capture vs baseline
Convergence speed: Warp-only vs baseline
Convergence speed: Warp-capture vs baseline
Author
Time performance gain
Warp-capture vs baseline (repeat=5 average, timer-only
| Task | Base env_step (us) | Capture env_step (avg us) | % change |
|---|---|---|---|
| Isaac-Ant-Warp-v0 (0) | 12450.25 | 5384.53 | -56.8% |
| Isaac-Cartpole-Warp-v0 (1) | 9038.00 | 1357.52 | -85.0% |
| Isaac-Humanoid-Warp-v0 (2) | 20600.74 | 13653.34 | -33.7% |
| Isaac-Reach-Franka-Warp-v0 (3) | 12202.51 | 5863.21 | -52.0% |
| Isaac-Reach-UR10-Warp-v0 (4) | - | - | - |
| Isaac-Velocity-Flat-Anymal-B-Warp-v0 (5) | 38029.59 | 27247.39 | -28.4% |
| Isaac-Velocity-Flat-Anymal-C-Warp-v0 (6) | 37881.29 | 27281.55 | -28.0% |
| Isaac-Velocity-Flat-Anymal-D-Warp-v0 (7) | 39227.52 | 27860.87 | -29.0% |
| Isaac-Velocity-Flat-Cassie-Warp-v0 (8) | 22765.51 | 11213.25 | -50.7% |
| Isaac-Velocity-Flat-G1-Warp-v0 (9) | 39951.73 | 27201.89 | -31.9% |
| Isaac-Velocity-Flat-G1-Warp-v1 (10) | 55177.31 | 42330.15 | -23.3% |
| Isaac-Velocity-Flat-H1-Warp-v0 (11) | 28866.51 | 16818.43 | -41.7% |
| Isaac-Velocity-Flat-Unitree-A1-Warp-v0 (12) | 20112.75 | 10467.07 | -48.0% |
| Isaac-Velocity-Flat-Unitree-Go1-Warp-v0 (13) | 20738.22 | 11996.80 | -42.2% |
| Isaac-Velocity-Flat-Unitree-Go2-Warp-v0 (14) | 18656.75 | 9831.00 | -47.3% |
| Isaac-Velocity-Rough-Anymal-D-Warp-v0 (15) | - | - | - |
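For reference, the "% change" column is consistent with (capture - base) / base * 100 on the env_step timings. A small sketch that reproduces two rows from the table (the helper names are hypothetical):

```python
def pct_change(base_us: float, capture_us: float) -> float:
    """Relative change of the captured env_step time versus the baseline."""
    return (capture_us - base_us) / base_us * 100.0


def speedup(base_us: float, capture_us: float) -> float:
    """Same comparison expressed as a speedup factor."""
    return base_us / capture_us


# (base env_step us, capture env_step avg us), copied from the table above.
rows = {
    "Isaac-Cartpole-Warp-v0": (9038.00, 1357.52),
    "Isaac-Ant-Warp-v0": (12450.25, 5384.53),
}

changes = {task: round(pct_change(b, c), 1) for task, (b, c) in rows.items()}
factors = {task: round(speedup(b, c), 2) for task, (b, c) in rows.items()}
```

Read as a speedup factor, the Cartpole row corresponds to roughly 6.7x and the Ant row to roughly 2.3x.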
- Rewrite obs/reward kernels to consume Tier 1 compound types directly, bypassing lazy Tier 2 properties that break CUDA graph capture
- Update GRAPH_CAPTURE_MIGRATION.md and WARP_MIGRATION_GAP_ANALYSIS.md
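A toy model of how a lazily refreshed, timestamp-guarded property could yield stale data under graph capture is sketched below. It assumes the refresh decision is a host-side branch that runs once while recording and is skipped on replay; that mechanism, the LazyBuffer class, and the list-of-callables "graph" are all illustrative assumptions rather than the PR's implementation.

```python
from typing import Callable, List


class LazyBuffer:
    """Toy stand-in for a timestamp-guarded lazy property (not the real class)."""

    def __init__(self, source: List[float]) -> None:
        self.source = source
        self.value = 0.0
        self.timestamp = -1.0

    def _refresh(self) -> None:
        # Stands in for the device kernel that recomputes the derived value.
        self.value = 2.0 * self.source[0]

    def read(self, sim_time: float, graph: List[Callable[[], None]]) -> None:
        # Host-side branch: evaluated while recording, never during replay.
        if self.timestamp < sim_time:
            self._refresh()
            graph.append(self._refresh)
            self.timestamp = sim_time


source = [1.0]
lazy = LazyBuffer(source)
lazy.read(sim_time=0.0, graph=[])        # refreshed eagerly before capture starts

captured: List[Callable[[], None]] = []
lazy.read(sim_time=0.0, graph=captured)  # buffer looks fresh, so nothing is recorded

source[0] = 5.0                          # simulation state changes
for kernel in captured:                  # "replay" runs only what was recorded
    kernel()
stale_value = lazy.value                 # still derived from the old source value

# Reading the raw source inside the recorded work avoids the host-side branch.
direct = [0.0]
captured_direct = [lambda: direct.__setitem__(0, 2.0 * source[0])]
for kernel in captured_direct:
    kernel()
```

In this model the lazy path replays nothing and keeps the old value, while the direct path recomputes from the current source on every replay, which is the motivation for consuming Tier 1 data directly.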
Summary
Warp-first manager-based RL environment infrastructure and MDP term migration for Newton.
Infrastructure (commits 1-5, from dependency branch)
- ManagerBasedRLEnvWarp with ManagerCallSwitch for per-manager execution mode control (stable / warp / warp-captured)
- SceneEntityCfg with body_ids_wp, joint_ids_wp, joint_mask for warp kernel dispatch
- warp_capturable decorator and is_warp_capturable check for automatic CUDA graph capture fallback
- manager_call_max_mode per-env capture ceiling (min(configured_mode, cap))
- Scene_write_data_to_sim capture mode (was hardcoded non-captured)
MDP terms (commit 6)
Warp-first observation, reward, termination, event, and action terms verified against torch baselines:
Terms accessing ArticulationData lazy TimestampedWarpBuffer properties (Tier 2) are marked @warp_capturable(False) to prevent stale data under CUDA graph capture.
Tested env configs (commit 7)
14 envs with training parity verified (warp-only and warp-capture vs torch baseline):
Per-robot gym registrations, flat env cfgs, and task-specific MDP terms (humanoid observations/rewards, velocity rewards/terminations/curriculums, reach rewards).
Disabled envs (included but registration commented out)
- Isaac-Velocity-Rough-Anymal-D-Warp-v0: requires isaaclab_physx (not yet on dev/newton)
- Isaac-Reach-UR10-Warp-v0: USD asset composition error (broken asset)
Documentation
- WARP_MIGRATION_GAP_ANALYSIS.md: Full MDP term catalog, per-task usage matrix, migration patterns
- GRAPH_CAPTURE_MIGRATION.md: ArticulationData Tier 1/2/3 property analysis, capture failure mechanism, proposed materialize_derived() fix
- MANAGER_TEST_COVERAGE.md: Capturability analysis
Test plan
@warp_capturable(False) fix
isaaclab_physx dependency)
Dependencies
- dev-newton-warp-mig-manager-based (pending merge into dev/newton)