Skip to content

update autin tag#139

Merged
yasen5 merged 73 commits intomainfrom
update-austin-tag
Apr 4, 2026
Merged

update autin tag#139
yasen5 merged 73 commits intomainfrom
update-austin-tag

Conversation

@charliehuang09
Copy link
Copy Markdown
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings March 21, 2026 23:13
@charliehuang09 charliehuang09 changed the title builds, not need to test and fix bugs update autin tag Mar 21, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the third_party/971apriltag GPU AprilTag pipeline by renaming the public namespace/API (frc971::apriltagfrc::apriltag), restructuring the CUDA build (moving from .cu sources to .cc compiled as CUDA), and introducing new abstractions/types for thresholding and sizing.

Changes:

  • Replaced the old 971apriltag.h/apriltag.c-style interface with a new C++ wrapper header apriltag.h and updated downstream includes/usages.
  • Introduced a Threshold interface (threshold.h + threshold.cc) and updated the detector to use it (including a --use_neon flag path).
  • Updated CUDA utilities (cuda.h/.cc) and build rules (third_party/971apriltag/CMakeLists.txt) to support the new compilation/linking structure.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
third_party/971apriltag/transform_output_iterator.h Include guard + namespace rename to frc::apriltag.
third_party/971apriltag/threshold.h New thresholding interface + vision::ImageFormat + factory declarations.
third_party/971apriltag/threshold.cu Removed legacy CUDA threshold implementation.
third_party/971apriltag/threshold.cc New templated threshold implementation compiled as CUDA.
third_party/971apriltag/postscript_utils.h Removed debug postscript rendering helper.
third_party/971apriltag/points.h Include guard + namespace rename; changed blob limit constant.
third_party/971apriltag/points.cc Namespace rename and formatting adjustments.
third_party/971apriltag/line_fit_filter.h Include guard + namespace rename; changed numeric types/precision.
third_party/971apriltag/line_fit_filter.cc Namespace rename + formatting + minor logic cleanup.
third_party/971apriltag/labeling_allegretti_2019_BKE.h Include guard + modernized integer includes.
third_party/971apriltag/labeling_allegretti_2019_BKE.cc Switched to absl CHECK/LOG and apriltag types.
third_party/971apriltag/gpu_image.h Include guard + include <cstddef>.
third_party/971apriltag/cuda.h Namespace rename + new memory helpers and event/stream additions.
third_party/971apriltag/cuda.cc Added --sync flag and CudaStream::Wait.
third_party/971apriltag/apriltag_types.h New apriltag_size_t typedef for GPU code.
third_party/971apriltag/apriltag_detect.cc Updated decoding flow + distortion model updates + namespace rename.
third_party/971apriltag/apriltag.h New public GPU detector wrapper API.
third_party/971apriltag/apriltag.cc Updated detector implementation to use new thresholding and memory/event structure.
third_party/971apriltag/apriltag.c Removed legacy C implementation.
third_party/971apriltag/CMakeLists.txt Reworked build to compile .cc as CUDA and split dependencies.
third_party/971apriltag/971apriltag.h Removed old header/API.
src/utils/constants_from_json.cc Updated JSON conversions to new frc::apriltag types.
src/localization/nvidia_apriltag_detector.h Removed unused include of deleted 971apriltag header.
src/localization/gpu_apriltag_detector.h Updated include and detector type to new namespace/API.
src/localization/gpu_apriltag_detector.cc Updated construction + call sites to new GpuDetector::Detect API.
src/localization/CMakeLists.txt Commented-out alternative library line.
Comments suppressed due to low confidence (1)

third_party/971apriltag/cuda.h:121

  • HostMemory::MemcpyTo takes const T* other as the destination and passes it to memcpy, which requires a non-const destination pointer. This should be T* other (and ideally MemcpyTo(void* dst, size_t bytes)/span overloads) so the header compiles and the API matches intent.
  // Copies data to other (host memory) from this's memory.
  void MemcpyTo(const T* other) {
    memcpy(other, span_.data(), sizeof(T) * size());
  }

Comment on lines +18 to 23
frc::apriltag::DistCoeffs distortion_coefficients = {.k1 = intrinsics["k1"],
.k2 = intrinsics["k2"],
.p1 = intrinsics["p1"],
.p2 = intrinsics["p2"],
.k3 = intrinsics["k3"]};

Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DistCoeffs now includes k4..k6 and num_params, but the JSON conversion only initializes k1..k3,p1,p2. The new fields will be left uninitialized and are used by UnDistort/ReDistort, leading to undefined behavior. Initialize the remaining fields (e.g. to 0) and set num_params based on what the JSON provides.

Suggested change
frc::apriltag::DistCoeffs distortion_coefficients = {.k1 = intrinsics["k1"],
.k2 = intrinsics["k2"],
.p1 = intrinsics["p1"],
.p2 = intrinsics["p2"],
.k3 = intrinsics["k3"]};
frc::apriltag::DistCoeffs distortion_coefficients{}; // Zero-initialize all fields, including k4..k6 and num_params.
// Required coefficients: assume these are always provided in the JSON.
distortion_coefficients.k1 = intrinsics["k1"];
distortion_coefficients.k2 = intrinsics["k2"];
distortion_coefficients.p1 = intrinsics["p1"];
distortion_coefficients.p2 = intrinsics["p2"];
distortion_coefficients.k3 = intrinsics["k3"];
// Start with the 5 standard distortion parameters.
int num_params = 5;
// Optional higher-order radial distortion coefficients.
if (intrinsics.contains("k4")) {
distortion_coefficients.k4 = intrinsics["k4"];
++num_params;
}
if (intrinsics.contains("k5")) {
distortion_coefficients.k5 = intrinsics["k5"];
++num_params;
}
if (intrinsics.contains("k6")) {
distortion_coefficients.k6 = intrinsics["k6"];
++num_params;
}
distortion_coefficients.num_params = num_params;

Copilot uses AI. Check for mistakes.
Comment on lines +13 to +26
add_library(971apriltag_threshold_dep STATIC threshold.cc)
target_link_libraries(971apriltag_threshold_dep PRIVATE apriltag absl::log absl::check)

add_library(971apriltag STATIC
apriltag.cc
apriltag_detect.cc
labeling_allegretti_2019_BKE.cc
line_fit_filter.cc
points.cc
)
set_target_properties(971apriltag_dep PROPERTIES LINKER_LANGUAGE C)

target_compile_options(971apriltag PRIVATE -w)
target_link_libraries(971apriltag PRIVATE apriltag 971apriltag_dep absl::log absl::check)
target_link_libraries(971apriltag PRIVATE apriltag 971apriltag_cuda_dep 971apriltag_threshold_dep absl::log absl::check)
set_source_files_properties(labeling_allegretti_2019_BKE.cc apriltag_detect.cc apriltag.cc line_fit_filter.cc points.cc PROPERTIES LANGUAGE CUDA)
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

971apriltag_threshold_dep uses symbols from cuda.cc (e.g. MaybeCheckAndSynchronize), but it does not link against 971apriltag_cuda_dep, and the final target_link_libraries(971apriltag ...) lists 971apriltag_cuda_dep before 971apriltag_threshold_dep. With static libraries, link order matters on many toolchains and this can lead to unresolved symbols. Link 971apriltag_threshold_dep against 971apriltag_cuda_dep (or reorder / use an OBJECT library) to make linking robust.

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,349 @@
#include "threshold.h"

#include <stdint.h>
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file uses std::min/std::max but doesn't include <algorithm>. It may compile accidentally due to transitive includes, but that's not guaranteed; explicitly include <algorithm> to make the TU self-contained.

Suggested change
#include <stdint.h>
#include <stdint.h>
#include <algorithm> // For std::min/std::max

Copilot uses AI. Check for mistakes.
Comment on lines +163 to +170
CHECK_CUDA(cudaMemcpyAsync(memory_, host_memory,
sizeof(T) * host_memory->size(),
cudaMemcpyHostToDevice, stream->get()));
}
void MemcpyAsyncFrom(const HostMemory<T>* host_memory, const size_t size,
CudaStream* stream) {
CHECK_LE(size, size_);
CHECK_CUDA(cudaMemcpyAsync(memory_, host_memory, sizeof(T) * size,
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GpuMemory::MemcpyAsyncFrom(const HostMemory<T>* ...) passes the HostMemory object pointer directly to cudaMemcpyAsync instead of host_memory->get(), so it will copy from the wrong address and likely crash/corrupt data. Use host_memory->get() (and apply the same fix to the sized overload).

Suggested change
CHECK_CUDA(cudaMemcpyAsync(memory_, host_memory,
sizeof(T) * host_memory->size(),
cudaMemcpyHostToDevice, stream->get()));
}
void MemcpyAsyncFrom(const HostMemory<T>* host_memory, const size_t size,
CudaStream* stream) {
CHECK_LE(size, size_);
CHECK_CUDA(cudaMemcpyAsync(memory_, host_memory, sizeof(T) * size,
CHECK_CUDA(cudaMemcpyAsync(memory_, host_memory->get(),
sizeof(T) * host_memory->size(),
cudaMemcpyHostToDevice, stream->get()));
}
void MemcpyAsyncFrom(const HostMemory<T>* host_memory, const size_t size,
CudaStream* stream) {
CHECK_LE(size, size_);
CHECK_CUDA(cudaMemcpyAsync(memory_, host_memory->get(), sizeof(T) * size,

Copilot uses AI. Check for mistakes.
#include <cassert>
#include <chrono>
#include <iostream>
#include <span>
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header uses memcpy and std::vector but does not include <cstring> or <vector>. Relying on transitive includes is brittle and can break compilation depending on include order; add the missing standard headers here.

Suggested change
#include <span>
#include <span>
#include <cstring> // For memcpy
#include <vector> // For std::vector

Copilot uses AI. Check for mistakes.
Comment on lines 167 to 171
// The blob index is 12 bits, the angle is 28 bits, and the point is 24 bits.
struct IndexPoint {
// Max number of blob IDs we can hold.
static constexpr size_t kMaxBlobs = 4096; //2048; RJS
static constexpr size_t kMaxBlobs = 2048;

Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IndexPoint stores a 12-bit blob index (blob_index & 0xfff), and the comment states the blob index is 12 bits, but kMaxBlobs was reduced to 2048 (11 bits). Either keep kMaxBlobs aligned with the 12-bit encoding (4096) or update the encoding/comment to avoid silent truncation/invalid assumptions elsewhere.

Copilot uses AI. Check for mistakes.
Comment on lines +85 to +89
class GpuDetector {
public:
// The number of blobs we will consider when counting april tags.
static constexpr size_t kMaxBlobs = 64; // TODO Charlie

Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GpuDetector::kMaxBlobs is set to 64, but selected_extents_device_, peak_extents_device_, and fit_quads_device_ are sized based on this constant while num_quads is computed from the image at runtime. There are no checks ensuring num_quads <= kMaxBlobs, so an image with >64 qualifying blobs can cause device memory overwrites. Add a hard check before using num_quads, or allocate these buffers based on the computed count.

Copilot uses AI. Check for mistakes.
Comment on lines +33 to +50
gpu_detector_ = std::make_unique<frc::apriltag::GpuDetector>(
image_width, image_height, apriltag_detector_,
utils::CameraMatrixFromJson<frc971::apriltag::CameraMatrix>(intrinsics),
utils::DistortionCoefficientsFromJson<frc971::apriltag::DistCoeffs>(
intrinsics));
utils::CameraMatrixFromJson<frc::apriltag::CameraMatrix>(intrinsics),
utils::DistortionCoefficientsFromJson<frc::apriltag::DistCoeffs>(
intrinsics),
vision::ImageFormat::BGR8);
}
auto GPUAprilTagDetector::GetTagDetections(
camera::timestamped_frame_t& timestamped_frame)
-> std::vector<tag_detection_t> {
try {
if (timestamped_frame.frame.channels() == 1) {
gpu_detector_->DetectGrayHost(
(unsigned char*)timestamped_frame.frame.ptr());
gpu_detector_->Detect((unsigned char*)timestamped_frame.frame.ptr(),
nullptr);
} else if (timestamped_frame.frame.channels() == 3) {
cv::Mat gray;
cv::cvtColor(timestamped_frame.frame, gray, cv::COLOR_BGR2GRAY);
gpu_detector_->DetectGrayHost((unsigned char*)gray.ptr());
gpu_detector_->Detect((unsigned char*)gray.ptr(), nullptr);
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The detector is constructed with vision::ImageFormat::BGR8, but both call sites pass a single-channel grayscale buffer to GpuDetector::Detect. The thresholding code will treat the input as 3 bytes/pixel and read out-of-bounds / produce incorrect results. Either pass the original BGR frame when using BGR8, or construct the detector with MONO8 (and skip the extra cvtColor).

Copilot uses AI. Check for mistakes.
Comment on lines 4 to 9
#include <stdint.h>

#include "apriltag_types.h"
#include "cuda.h"
// #include "vision_generated.h"

Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

threshold.h declares std::unique_ptr but does not include <memory>, which will fail to compile in translation units that don't already include it. Add #include <memory> (and consider including <cstddef> explicitly if you want to rely less on transitive includes for size_t).

Copilot uses AI. Check for mistakes.
Comment on lines +69 to +70
const u_int32_t in_row = out_row * 2;
const u_int32_t in_col = out_col * 2;
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InternalCudaToGreyscaleAndDecimate uses u_int32_t for in_row/in_col, which is non-standard and not guaranteed to be defined on all platforms/toolchains. Use uint32_t or apriltag_size_t here for portability/consistency.

Suggested change
const u_int32_t in_row = out_row * 2;
const u_int32_t in_col = out_col * 2;
const apriltag_size_t in_row = out_row * 2;
const apriltag_size_t in_col = out_col * 2;

Copilot uses AI. Check for mistakes.
yasen5 and others added 28 commits March 28, 2026 00:59
Signed-off-by: Charlie Huang <charliehuang09@gmail.com>
Signed-off-by: Charlie Huang <charliehuang09@gmail.com>
Signed-off-by: Charlie Huang <charliehuang09@gmail.com>
Signed-off-by: Charlie Huang <charliehuang09@gmail.com>
Signed-off-by: Charlie Huang <charliehuang09@gmail.com>
Signed-off-by: Charlie Huang <charliehuang09@gmail.com>
@yasen5 yasen5 merged commit 06a9db0 into main Apr 4, 2026
1 check passed
@yasen5 yasen5 deleted the update-austin-tag branch April 4, 2026 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants