
feat(deployment): centerpoint deployment integration#181

Open
vividf wants to merge 35 commits into tier4:feat/new_deployment_and_evaluation_pipeline from vividf:feat/centerpoint_deployment_integration

Conversation


@vividf vividf commented Feb 2, 2026

Summary

Integrates CenterPoint into the unified deployment framework, enabling deployment and evaluation of ONNX and TensorRT models.

Note: this PR includes the changes from #180.

Changes

  • Integrated CenterPoint with deployment framework:
    • Moved deployment code from projects/CenterPoint to deployment/projects/centerpoint
    • Implemented component-based export pipeline for ONNX and TensorRT
    • Added runtime inference support with PyTorch, ONNX Runtime, and TensorRT backends
  • Deployment capabilities:
    • Export CenterPoint models to ONNX format
    • Export CenterPoint models to TensorRT engines
    • Component-based architecture (voxel encoder, backbone+head) for flexible deployment
  • Evaluation capabilities:
    • Evaluate ONNX models using ONNX Runtime
    • Evaluate TensorRT engines
    • Integrated metrics evaluation with deployment pipeline
  • Updated CLI: Replaced old deploy.py script with new unified CLI (deployment.cli.main)
  • Added Docker support: Created Dockerfile for deployment environment with TensorRT dependencies
  • Updated documentation: Added deployment and evaluation instructions in README
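The component-based split above (voxel encoder vs. backbone+head) can be sketched as a per-component export plan; all names here (component keys, file names, tensor names) are illustrative, not the framework's actual API:

```python
# Hedged sketch of a component-based export plan for a CenterPoint-like model.
# Component keys, file names, and tensor names are assumptions for illustration.
from dataclasses import dataclass


@dataclass
class ComponentSpec:
    """Describes one ONNX component to export."""
    onnx_file: str
    input_names: list
    output_names: list


def build_export_plan() -> dict:
    """Return a mapping from component name to its export spec."""
    return {
        "voxel_encoder": ComponentSpec(
            onnx_file="voxel_encoder.onnx",
            input_names=["voxel_features"],
            output_names=["voxel_embeddings"],
        ),
        "backbone_head": ComponentSpec(
            onnx_file="backbone_head.onnx",
            input_names=["spatial_features"],
            # One output per detection head branch.
            output_names=["heatmap", "reg", "height", "dim", "rot", "vel"],
        ),
    }
```

Each spec would then be fed to `torch.onnx.export` (and on to the TensorRT builder) for the corresponding submodule.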

Migration Notes

  • Old deployment script (projects/CenterPoint/scripts/deploy.py) is removed
  • Use new CLI: python -m deployment.cli.main centerpoint <deploy_config> <model_config>
  • ONNX model variants are now registered via deployment.projects.centerpoint.onnx_models
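One common way such registration works is a module-level registry populated on import; this is a hedged sketch of the pattern, not the actual mechanism in deployment.projects.centerpoint.onnx_models:

```python
# Hedged sketch of import-time model-variant registration. The registry name,
# decorator, and class are illustrative, not the framework's real API.
ONNX_MODEL_REGISTRY: dict = {}


def register_onnx_model(name: str):
    """Decorator that records an ONNX model variant under a lookup name."""
    def wrapper(cls):
        ONNX_MODEL_REGISTRY[name] = cls
        return cls
    return wrapper


@register_onnx_model("centerpoint_voxel_encoder")
class VoxelEncoderONNX:
    """Illustrative ONNX-exportable wrapper for the voxel encoder."""
```

Importing the module runs the decorators, so variants become available to the CLI without explicit wiring.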

How to run

python -m deployment.cli.main centerpoint \
  deployment/projects/centerpoint/config/deploy_config.py \
  projects/CenterPoint/configs/t4dataset/Centerpoint/second_secfpn_8xb16_121m_j6gen2_base_amp_t4metric_v2.py \
  --rot-y-axis-reference

Exported ONNX (unchanged)

Voxel Encoder

Backbone Head

@vividf vividf changed the title Feat/centerpoint deployment integration feat(deployment): centerpoint deployment integration Feb 2, 2026
@vividf vividf requested review from KSeangTan and yamsam February 2, 2026 16:33
@vividf vividf self-assigned this Feb 2, 2026
@vividf vividf marked this pull request as ready for review February 3, 2026 04:31
@vividf vividf force-pushed the feat/centerpoint_deployment_integration branch 2 times, most recently from bfb778f to 441d06e Compare February 16, 2026 06:08
@KSeangTan KSeangTan left a comment

Done with the first round of review. Please consider using dataclasses or pydantic for the configs and doing the type checking there; then we can remove all the type checking from the code.
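A minimal sketch of that suggestion using a stdlib dataclass (pydantic's `BaseModel` would add automatic type coercion on top); the field names are illustrative:

```python
# Hedged sketch of a typed config object that validates once at construction,
# so call sites no longer need scattered isinstance checks. Field names are
# assumptions for illustration.
from dataclasses import dataclass


@dataclass(frozen=True)
class VerificationConfig:
    enabled: bool = False
    tolerance: float = 1.0

    def __post_init__(self) -> None:
        # Validate here instead of repeating checks throughout the code.
        if not isinstance(self.enabled, bool):
            raise TypeError("enabled must be a bool")
        if self.tolerance <= 0:
            raise ValueError("tolerance must be positive")
```

Constructing the config with a bad value fails immediately, which is the "remove all the type checking in the code" payoff.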

verification = dict(
    enabled=False,
-   tolerance=1e-1,
+   tolerance=1,
Collaborator:

Explain what tolerance means here, and why it was updated from 0.1 to 1.

vividf (Author) commented Mar 4, 2026:

The value was originally set for calibration classification and later copied to CenterPoint, but it does not work correctly for CenterPoint.

INFO:deployment.core.evaluation.verification_mixin:  tensorrt (cuda:0) latency: 205.08 ms
INFO:deployment.core.evaluation.verification_mixin:  output[heatmap]: shape=(1, 5, 510, 510), max_diff=0.070197, mean_diff=0.007674
INFO:deployment.core.evaluation.verification_mixin:  output[reg]: shape=(1, 2, 510, 510), max_diff=0.007944, mean_diff=0.001120
INFO:deployment.core.evaluation.verification_mixin:  output[height]: shape=(1, 1, 510, 510), max_diff=0.025401, mean_diff=0.002122
INFO:deployment.core.evaluation.verification_mixin:  output[dim]: shape=(1, 3, 510, 510), max_diff=0.031920, mean_diff=0.001143
INFO:deployment.core.evaluation.verification_mixin:  output[rot]: shape=(1, 2, 510, 510), max_diff=0.075215, mean_diff=0.004582
INFO:deployment.core.evaluation.verification_mixin:  output[vel]: shape=(1, 2, 510, 510), max_diff=0.221999, mean_diff=0.004940
INFO:deployment.core.evaluation.verification_mixin:  Overall Max difference: 0.221999
INFO:deployment.core.evaluation.verification_mixin:  Overall Mean difference: 0.004347
WARNING:deployment.core.evaluation.verification_mixin:  tensorrt (cuda:0) verification FAILED ✗ (max diff: 0.221999 > tolerance: 0.100000)
INFO:deployment.core.evaluation.verification_mixin:

Collaborator:

Do you know why it fails? Since this is a verification, it's better to find the reason than to update the tolerance.

vividf (Author):

It doesn't necessarily indicate a failure.
When converting from PyTorch to TensorRT, some numerical differences are expected due to different kernels, precision handling, and TensorRT optimizations.

The verification is mainly used as a safeguard to detect major issues (e.g., incorrect conversion settings) rather than to enforce exact numerical equivalence.
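That safeguard role can be sketched as a simple max-difference check over the named outputs (this is an illustration, not the actual verification_mixin implementation):

```python
# Hedged sketch of cross-backend output verification: compare each named
# output elementwise and flag only deviations beyond a tolerance.
import numpy as np


def verify_outputs(reference: dict, candidate: dict, tolerance: float) -> bool:
    """Return True if every output's max abs difference is within tolerance."""
    overall_max = 0.0
    for name, ref in reference.items():
        diff = float(np.max(np.abs(ref - candidate[name])))
        overall_max = max(overall_max, diff)
    return overall_max <= tolerance
```

A loose tolerance catches broken conversions (wrong layouts, wrong settings) while tolerating the expected kernel- and precision-level noise.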

vividf (Author):

Since 1e-1 was the value we set for ResNet18 in calibration classification, the cases are different.

Collaborator:

Btw, this is the verification result for TensorRT FP16, right? If that's the case, it makes sense.

Collaborator:

Anyway, 5e-1 could be a better value.

vividf (Author):

Running onnx (cuda:0) reference...
2026-03-10 15:20:07.511273431 [V:onnxruntime:, execution_steps.cc:103 Execute] stream 0 activate notification with index 0
2026-03-10 15:20:07.567219724 [V:onnxruntime:, execution_steps.cc:47 Execute] stream 0 wait on Notification with id: 0
INFO:deployment.core.evaluation.verification_mixin:  onnx (cuda:0) latency: 1423.80 ms
INFO:deployment.core.evaluation.verification_mixin:
Running tensorrt (cuda:0) test...
INFO:deployment.core.evaluation.verification_mixin:  tensorrt (cuda:0) latency: 1141.26 ms
INFO:deployment.core.evaluation.verification_mixin:  output[heatmap]: shape=(1, 5, 510, 510), max_diff=0.464849, mean_diff=0.056135
INFO:deployment.core.evaluation.verification_mixin:  output[reg]: shape=(1, 2, 510, 510), max_diff=0.056639, mean_diff=0.006198
INFO:deployment.core.evaluation.verification_mixin:  output[height]: shape=(1, 1, 510, 510), max_diff=0.227012, mean_diff=0.065522
INFO:deployment.core.evaluation.verification_mixin:  output[dim]: shape=(1, 3, 510, 510), max_diff=0.336713, mean_diff=0.028087
INFO:deployment.core.evaluation.verification_mixin:  output[rot]: shape=(1, 2, 510, 510), max_diff=0.515039, mean_diff=0.023962
INFO:deployment.core.evaluation.verification_mixin:  output[vel]: shape=(1, 2, 510, 510), max_diff=0.932002, mean_diff=0.034206
INFO:deployment.core.evaluation.verification_mixin:  Overall Max difference: 0.932002
INFO:deployment.core.evaluation.verification_mixin:  Overall Mean difference: 0.037279
WARNING:deployment.core.evaluation.verification_mixin:  tensorrt (cuda:0) verification FAILED ✗ (max diff: 0.932002 > tolerance: 0.500000)

On a different computer the values can differ, so I will leave it at 1 for now.

Collaborator:

Did you set a random seed for this validation? Randomness (for example, shuffling point clouds) significantly affects the results; otherwise, I believe the difference between computers is too large.
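Seeding every randomness source before the verification run, as suggested, might look like this sketch (the torch part is optional and only matters if any shuffling happens through torch):

```python
# Hedged sketch of making the verification input deterministic by seeding all
# common randomness sources before the run. Helper name is illustrative.
import random

import numpy as np


def seed_everything(seed: int = 42) -> None:
    """Seed Python, NumPy, and (if available) torch RNGs."""
    random.seed(seed)
    np.random.seed(seed)
    try:  # torch is optional for this helper
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass
```

With a fixed seed, the same point-cloud ordering is fed to every backend, so remaining diffs reflect kernels and precision rather than input randomness.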

vividf (Author):

Note that the reported difference corresponds to the maximum deviation; the mean difference is actually quite small.

Additionally, the magnitude of the difference depends heavily on the hardware. For example, on Blackwell GPUs (ONNX CUDA vs. TensorRT), the discrepancy is minimal. In contrast, on my laptop, the difference between ONNX CUDA and TensorRT is around 1. Even when forcing ONNX Runtime to use CUDA only, it still initializes a default CPU executor and executes some operations on the CPU, which can introduce discrepancies.

Interestingly, when comparing ONNX CPU with TensorRT on my laptop, the difference becomes very small. However, on Blackwell, the ONNX CPU vs. TensorRT comparison shows a larger gap.
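For reference, pinning ONNX Runtime to the CUDA execution provider looks roughly like this; as noted above, ORT always keeps a CPU fallback registered, so the provider list only sets the preference order (helper names here are illustrative):

```python
# Hedged sketch of preferring the CUDA execution provider in ONNX Runtime.
# ORT may still place unsupported ops on the CPU fallback, which is one source
# of the cross-backend discrepancies discussed above.
def cuda_first_providers() -> list:
    """Provider preference: CUDA first, CPU as the unavoidable fallback."""
    return ["CUDAExecutionProvider", "CPUExecutionProvider"]


def make_session(onnx_path: str):
    # Imported lazily so the helper above is usable without onnxruntime.
    import onnxruntime as ort
    return ort.InferenceSession(onnx_path, providers=cuda_first_providers())
```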

@vividf vividf force-pushed the feat/centerpoint_deployment_integration branch from caa92a6 to 93e5558 Compare March 5, 2026 17:24
@vividf vividf changed the base branch from feat/new_deployment_and_evaluation_pipeline to main March 5, 2026 17:27
@vividf vividf changed the base branch from main to feat/new_deployment_and_evaluation_pipeline March 5, 2026 17:27
@vividf vividf force-pushed the feat/centerpoint_deployment_integration branch 3 times, most recently from de7020e to 6470ac5 Compare March 10, 2026 14:40
@KSeangTan (Collaborator):

Some of the modules, for example the dataloader, should be reusable across detection3d tasks, right?

model_cfg = Config.fromfile(args.model_cfg)
config = BaseDeploymentConfig(deploy_cfg)

_validate_required_components(config.components_cfg)
Collaborator:

move _validate_required_components to BaseDeploymentConfig

vividf (Author):

This only validates the component names needed for CenterPoint.


context = CenterPointExportContext(rot_y_axis_reference=bool(getattr(args, "rot_y_axis_reference", False)))
runner.run(context=context)
return 0
Collaborator:

Do we need to return a status code here?

vividf (Author):

run() is annotated as -> int and documented as returning an exit code for the unified CLI (main.py)
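That convention can be sketched as follows (handler and function names are illustrative):

```python
# Hedged sketch of the exit-code convention: subcommand handlers return an
# int, and the unified CLI forwards it to the process exit status via
# sys.exit(main()) in the entry point. Names are illustrative.
def run_centerpoint(argv) -> int:
    """Handle the `centerpoint` subcommand; return a process exit code."""
    # ... export / verification would happen here ...
    return 0


def main(argv=None) -> int:
    # Dispatch to the subcommand handler and propagate its exit code.
    return run_centerpoint(argv)
```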

def _release_gpu_resources(self) -> None:
    """Release TensorRT resources (engines and contexts) and CUDA events."""
    # Destroy CUDA events
    if hasattr(self, "_backbone_start_event"):
Collaborator:

Use a for-loop to achieve this.
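The suggested for-loop could look like this sketch; all event attribute names besides `_backbone_start_event` are hypothetical:

```python
# Hedged sketch of releasing CUDA events via a loop over attribute names
# rather than one hasattr branch per attribute. Names other than
# _backbone_start_event are hypothetical.
class ResourceHolder:
    _EVENT_ATTRS = (
        "_backbone_start_event",
        "_backbone_end_event",
        "_encoder_start_event",
        "_encoder_end_event",
    )

    def _release_gpu_resources(self) -> None:
        """Destroy CUDA events and drop the references."""
        for attr in self._EVENT_ATTRS:
            event = getattr(self, attr, None)
            if event is not None:
                # a real implementation would destroy the CUDA event here
                delattr(self, attr)
```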

vividf (Author):

fixed in f48f5f7

@vividf vividf requested a review from KSeangTan March 11, 2026 04:01
}

for component_name, engine_path in engine_files.items():
    if not osp.exists(engine_path):
Collaborator:

This error validation should be done in resolve_artifact_path
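Centralizing the check could look like this sketch of a `resolve_artifact_path` helper (its real signature may differ):

```python
# Hedged sketch of folding the existence check into resolve_artifact_path so
# callers never repeat osp.exists. The signature is an assumption.
import os.path as osp


def resolve_artifact_path(work_dir: str, filename: str) -> str:
    """Resolve an artifact path and fail fast if it does not exist."""
    path = osp.join(work_dir, filename)
    if not osp.exists(path):
        raise FileNotFoundError(f"Artifact not found: {path}")
    return path
```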

vividf (Author):

Thanks, it was actually duplicated code! Fixed in 90d1404.

@vividf vividf force-pushed the feat/new_deployment_and_evaluation_pipeline branch from 5256306 to 2b28f60 Compare March 11, 2026 04:27
@vividf vividf force-pushed the feat/centerpoint_deployment_integration branch from 1ca0e1c to a6b9840 Compare March 11, 2026 04:28
@vividf vividf force-pushed the feat/centerpoint_deployment_integration branch 2 times, most recently from 715bf79 to a209d2b Compare March 25, 2026 13:38
vividf added 4 commits March 26, 2026 00:23
Signed-off-by: vividf <yihsiang.fang@tier4.jp>
vividf and others added 26 commits March 26, 2026 00:23
@vividf vividf force-pushed the feat/centerpoint_deployment_integration branch from a209d2b to 90d1404 Compare March 25, 2026 15:23
@vividf vividf requested a review from KSeangTan March 25, 2026 16:23

vividf commented Mar 25, 2026

@KSeangTan
Thanks for the detailed review!!

> Some of the modules, for example the dataloader, should be reusable across detection3d tasks, right?

Regarding this, I would like to rename the modules that can be reused for BEVFusion in a separate PR.
