Add FPS benchmark metric to influx#214

Open
tjb7 wants to merge 19 commits into main from tjb_fps_graphs

Conversation

@tjb7
Collaborator

@tjb7 tjb7 commented Apr 2, 2026

Adds InfluxDB pushing of FPS results. This is BEST EFFORT and depends on oakctl being installed on the running system.
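The best-effort flow described here could look roughly like this minimal sketch (the oakctl JSON field names, the `--format json` flag, and the single-device assumption are mine, not confirmed by the PR):

```python
import json
import subprocess

PLACEHOLDER = "unknown"  # explicit placeholder when oakctl data is unavailable


def parse_oakctl_metadata(raw: str) -> dict:
    """Parse (assumed) `oakctl list --format json` stdout, with fallbacks.

    On any parse failure we keep the benchmark runnable by returning
    placeholder values instead of raising.
    """
    try:
        devices = json.loads(raw)
        first = devices[0]  # assumes a single attached device
        return {
            "mxid": first.get("mxid", PLACEHOLDER),
            "os_version": first.get("os_version", PLACEHOLDER),
        }
    except (ValueError, IndexError, KeyError, TypeError):
        return {"mxid": PLACEHOLDER, "os_version": PLACEHOLDER}


def collect_metadata() -> dict:
    """Best effort: if oakctl is missing or fails, return placeholders."""
    try:
        out = subprocess.run(
            ["oakctl", "list", "--format", "json"],
            capture_output=True, text=True, timeout=30,
        ).stdout
    except (OSError, subprocess.TimeoutExpired):
        out = ""
    return parse_oakctl_metadata(out)
```

Splitting the parsing from the subprocess call keeps the fallback path testable without oakctl present.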

@tjb7 tjb7 requested a review from a team as a code owner April 2, 2026 05:27
@tjb7 tjb7 requested review from conorsim, klemen1999, kozlov721 and tersekmatija and removed request for a team April 2, 2026 05:27
@tjb7 tjb7 marked this pull request as draft April 2, 2026 05:27
@tjb7
Collaborator Author

tjb7 commented Apr 2, 2026

NOTE: it's not really ready for review yet.

The basic idea is just to get FPS metrics into influx so we can watch them across all versions of everything, so nothing slips by.

@tjb7 tjb7 requested a review from danilo-pejovic April 2, 2026 05:33
@tjb7 tjb7 force-pushed the tjb_fps_graphs branch from e5bc2d2 to 9ae4b83 Compare April 7, 2026 07:40
@tjb7 tjb7 force-pushed the tjb_fps_graphs branch from 69f7149 to 45768b9 Compare April 8, 2026 11:00
@@ -0,0 +1,85 @@
name: HIL FPS Benchmark Influx
Collaborator Author

@danilo-pejovic not sure if this is needed. Would you rather have a separate workflow, or instrument all the tests directly (in rvc4_test.yaml)? My preference is to instrument all tests and gather all data, but it's your call.

Collaborator

The tests that run on our HIL are the bom-test.yaml ones, not rvc4_test; those seem to be internal tests for the ML team. I would just add the influx logic to bom-tests rather than create a new workflow.

Collaborator Author

ok

Collaborator

We are still creating a new, unnecessary workflow. Can we edit the one that is already part of the BOM to also pass info to influx?

https://github.com/luxonis/modelconverter/blob/main/.github/workflows/bom-test.yaml

@tjb7 tjb7 marked this pull request as ready for review April 8, 2026 11:10
# Cache device metadata once for the whole run using oakctl. If oakctl is not
# available, keep the benchmark runnable and record explicit placeholder values
# for the missing oakctl-derived metadata.
oakctl_output=$(oakctl list --format json 2>/dev/null || printf '')
Collaborator

Oakctl is not part of these tests. Why are we adding more moving parts that can fail?

The HIL framework has built-in functions to get all the necessary info from the camera:

danilo@danilo:~/Downloads$ camera -t slo316-4pro -n test oak4_pro info -v
+----------+----------+----------+----------+------------+----------+------------+--------------+----------+----------------+-------------------------------------------------+----------+
| name     | model    | platform | revision | mxid       | username | password   | hostname     | adb      | state          | os_version                                      | adb_id   |
+==========+==========+==========+==========+============+==========+============+==============+==========+================+=================================================+==========+
| oak4_pro | oak4_pro | rvc4     | 9        | 1583798758 | root     | oelinux123 | 10.12.143.89 | 5e66d9e6 | setup-complete | 1.30.0+955791f9014cfe160419deae2f64980f180f4d9c | 5e66d9e6 |
+----------+----------+----------+----------+------------+----------+------------+--------------+----------+----------------+-------------------------------------------------+----------+

danilo@danilo:~/Downloads$ camera -t slo316-4pro -n test oak4_pro info -v -j
[{"name": "oak4_pro", "model": "oak4_pro", "platform": "rvc4", "revision": 9, "mxid": "1583798758", "username": "root", "password": "oelinux123", "hostname": "10.12.143.89", "adb": "5e66d9e6", "state": "setup-complete", "os_version": "1.30.0+955791f9014cfe160419deae2f64980f180f4d9c", "adb_id": "5e66d9e6"}]

Alternatively, the camera object in Python can give all the necessary info in an even more readable/maintainable way. That is probably how we should supply that info.

Collaborator Author

ok

Collaborator Author

I will sub in the JSON version of the above:

(.hil) hil@slo130-fl:~$ camera -t slo316-4pro -n test oak4_pro info -j
[{"name": "oak4_pro", "model": "oak4_pro", "platform": "rvc4", "revision": 9, "mxid": "1583798758", "hostname": "10.12.143.89", "os_version": "1.27.1+9950fe7353d76fec53a99019be9e1d6130e8f69a", "adb_id": "5e66d9e6"}]

Collaborator Author

Q: how do we determine which camera was used if there are multiple cameras?

Collaborator Author

For now I will just use camera -t ${HIL_TESTBED} -n test all info -j and take the first entry:

(.hil) hil@slo238-4s:~$ camera -t ${HIL_TESTBED} -n test all info -j
use_ssh
2026-04-13 07:21:36,832 [INFO] ssh command failed with the following output:
2026-04-13 07:21:36,832 [INFO]
2026-04-13 07:21:36,832 [INFO] Error message (stderr):
2026-04-13 07:21:36,832 [INFO]
2026-04-13 07:21:36,832 [INFO] Exit code: 5
[{"name": "oak4_s", "model": "oak4_s", "platform": "rvc4", "revision": 9, "mxid": "2671960531", "hostname": "10.12.140.235", "os_version": "1.27.1+9950fe7353d76fec53a99019be9e1d6130e8f69a", "adb_id": "9f42e1d3"}]
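That "take the first entry" step could be sketched like this (a sketch only; it assumes the JSON array lands on its own line after the log noise, which the real output format may not guarantee):

```python
import json
import subprocess


def extract_first_entry(raw: str) -> dict:
    """Pull the first camera dict out of mixed log + JSON stdout.

    Log lines (e.g. `... [INFO] ...`) can be interleaved with the JSON
    array, so scan lines from the end for one that parses as a JSON list.
    """
    for line in reversed(raw.splitlines()):
        line = line.strip()
        if line.startswith("["):
            try:
                cameras = json.loads(line)
            except ValueError:
                continue  # e.g. a stray "[INFO] ..." fragment, not JSON
            if isinstance(cameras, list) and cameras:
                return cameras[0]
    raise RuntimeError("no camera JSON found in output")


def first_camera_info(testbed: str) -> dict:
    """Run the `camera ... all info -j` command shown above and parse it."""
    out = subprocess.run(
        ["camera", "-t", testbed, "-n", "test", "all", "info", "-j"],
        capture_output=True, text=True, check=True,
    ).stdout
    return extract_first_entry(out)
```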

Collaborator

As stated in DMs via the example provided, and in my first comment here, I would prefer camera metadata to be gathered inside the script:

Alternatively - camera object in python can give all the necessary info in even more readable/maintainable way. That is probably the way we should supply that info.

It would be more readable, consistent and maintainable.

return f'"{escaped}"'


def _write_fps_result_to_influx(
Collaborator

Can this be implemented in hil_framework? We can probably hardcode the influx variables there and then create a simple function like camera/testbed.log_to_influx(arguments); it would be pretty similar to this implementation and to how sanity checks already work for HIL. That way it is more maintainable: if we change where we host our database, for example, it is one change instead of changing every repo.

Collaborator Author

Agreed, but let's do it in a follow-up ticket, just so I can get this work finished and immediately useful.

Collaborator

I would prefer to do things properly from the start if we know it is something we are going to be building on; it shouldn't take long.


default="rvc4",
help="Target platform to benchmark (default: rvc4).",
)
parser.addoption(
Collaborator

@danilo-pejovic danilo-pejovic Apr 10, 2026

Comments above/below add more info.

This should probably be implemented in a more maintainable way, to avoid piping info we already have multiple times.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I understand correctly, I should just remove this here and fail hard if the env is not properly configured. I will make the change; please comment again if you meant something else.

Collaborator

@danilo-pejovic danilo-pejovic Apr 13, 2026

We are passing the camera mxid, OS version, etc. as arguments to the script when there is no need to: that info is readily available inside the script via the HIL framework.

@danilo-pejovic
Collaborator

danilo-pejovic commented Apr 10, 2026

Also, here is a screenshot of the final data in grafana:
[screenshot: Grafana dashboard with the FPS results]

And the direct link: http://10.12.139.254:3000/d/ad4vt7g/fps-model-converter-tests-v3?orgId=1&from=2026-04-07T10:26:51.553Z&to=2026-04-07T17:48:13.745Z&timezone=browser&var-test_name=luxonis%2Fyolov8-nano-pose-estimation:coco-512x288&var-test_name=luxonis%2Ffastsam-s:512x288&var-test_name=luxonis%2Fyolov6-nano:r2-coco-512x288&var-depthai_version=3.3.0&var-depthai_version=3.4.0&var-depthai_version=3.5.0&var-camera_os_version=$__all&var-run_id=$__all&var-camera_mxid=$__all&var-testbed_name=$__all&var-min_fps_threshold=

In the graph, the max and min values for models don't seem to be consistent? IIRC they are just some expected value with a buffer on both sides, so they should always be the same for the same model. Are we maybe mixing up models in the tables?

Also, as long as those min and max values are set constants, we probably don't need to track them for every test; it just adds clutter.

@tjb7
Collaborator Author

tjb7 commented Apr 13, 2026

Changes made, ready for review again.
