
Dev/uncertainty #2121

Draft
servantftransperfect wants to merge 2 commits into develop from dev/uncertainty

Conversation

@servantftransperfect
Contributor

Description

Features list

Implementation remarks

Comment thread meshroom/aliceVision/ComputeUncertainty.py Fixed

Copilot AI left a comment


Pull request overview

This PR replaces the previous external UncertaintyTE-based uncertainty computation with a new in-repo aliceVision_uncertainty CUDA/MAGMA implementation, wiring it into aliceVision_computeUncertainty and adding a Meshroom node.

Changes:

  • Add new aliceVision_uncertainty library (CUDA kernels + MAGMA-based Hessian inversion) and a CUDA unit test for kernels.
  • Update aliceVision_computeUncertainty to build a BA Jacobian via Ceres and call the new uncertainty API (a generic Ceres sketch follows this list).
  • Remove UncertaintyTE build plumbing and add MAGMA discovery in top-level CMake.
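For reference on the Jacobian bullet above, here is a minimal sketch of how a bundle-adjustment Jacobian can be extracted from a ceres::Problem using the stock Problem::Evaluate API. The helper actually added to BundleAdjustmentCeres in this PR, and the pose/landmark ordering information it returns, may differ; evaluateJacobian is a hypothetical name.

#include <ceres/ceres.h>
#include <ceres/crs_matrix.h>

// Hypothetical helper: evaluate the Jacobian of an already-built BA problem.
// Row/column ordering follows whichever parameter-block order is passed in
// EvaluateOptions::parameter_blocks (or Ceres' internal order if left empty).
ceres::CRSMatrix evaluateJacobian(ceres::Problem& problem)
{
    ceres::Problem::EvaluateOptions options;
    options.apply_loss_function = false;  // raw residual Jacobian, no robustifier

    ceres::CRSMatrix jacobian;            // compressed-row sparse output
    problem.Evaluate(options, /*cost=*/nullptr, /*residuals=*/nullptr,
                     /*gradient=*/nullptr, &jacobian);
    return jacobian;
}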

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 15 comments.

Summary per file:

File | Description
src/software/utils/main_computeUncertainty.cpp | Switched to new uncertainty library + Jacobian generation; CLI pared down.
src/software/utils/CMakeLists.txt | Link aliceVision_computeUncertainty against new aliceVision_uncertainty.
src/cmake/FindUncertaintyTE.cmake | Removed UncertaintyTE find-module.
src/cmake/ConfigureOptionsMerger.cmake | Removed UncertaintyTE configure option forwarding.
src/aliceVision/utils/scope.hpp | New ScopeGuard helper used for cleanup in uncertainty code.
src/aliceVision/uncertainty/uncertainty.hpp | New public uncertainty API + Schur/Hessian construction templates.
src/aliceVision/uncertainty/uncertainty.cpp | MAGMA-based camera Hessian inverse via Neumann-series correction (generic form noted below the table).
src/aliceVision/uncertainty/uncertainty_kernels.* | CUDA reduction + conditional AXPY used for GPU-side convergence guard.
src/aliceVision/uncertainty/uncertainty_kernels_test.cpp | New Boost test coverage for CUDA kernels.
src/aliceVision/uncertainty/magma.hpp | RAII wrapper for MAGMA device allocations.
src/aliceVision/uncertainty/CMakeLists.txt | Build new uncertainty library and unit test.
src/aliceVision/sfm/bundle/BundleAdjustmentCeres.{hpp,cpp} | Change Jacobian creation API to return pose/landmark positioning info.
src/aliceVision/CMakeLists.txt | Add uncertainty/ subdir when CUDA is available.
src/CMakeLists.txt | Add MAGMA dependency discovery; remove UncertaintyTE option & include dirs.
meshroom/aliceVision/ComputeUncertainty.py | Add Meshroom node wrapper for the CLI tool.
INSTALL.md | Update dependency/options documentation to remove UncertaintyTE references.
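A note on the uncertainty.cpp row above: a Neumann-series correction, in its generic form, refines an approximate inverse. If $X_0$ approximates $H^{-1}$ (for instance from a factorization of a damped or block-approximated Hessian) and the spectral radius of $I - X_0 H$ is below one, then

$$H^{-1} = \sum_{k=0}^{\infty} (I - X_0 H)^k \, X_0 \approx X_0 + (I - X_0 H)\,X_0 + (I - X_0 H)^2 X_0,$$

and truncating after a few terms yields the corrected camera-block inverse. Whether uncertainty.cpp uses exactly this formulation is not visible from the quoted hunks.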


Comment on lines +13 to +19
    SOURCES ${uncertainty_files_headers} ${uncertainty_files_sources}
    PUBLIC_LINKS
        aliceVision_sfmData
        ${MAGMA_LIBRARIES}
    PRIVATE_LINKS
        aliceVision_system
        CUDA::cudart

Copilot AI Apr 29, 2026


aliceVision_uncertainty compiles CUDA sources (uncertainty_kernels.cu) but the target isn't created with USE_CUDA and CUDA::cudart isn't a PUBLIC link. This can cause missing CUDA target properties and (for static libs) unresolved cudart symbols in downstream targets. Consider adding USE_CUDA and making CUDA::cudart a PUBLIC link, consistent with other CUDA libraries in this repo.

Suggested change
-    SOURCES ${uncertainty_files_headers} ${uncertainty_files_sources}
-    PUBLIC_LINKS
-        aliceVision_sfmData
-        ${MAGMA_LIBRARIES}
-    PRIVATE_LINKS
-        aliceVision_system
-        CUDA::cudart
+    USE_CUDA
+    SOURCES ${uncertainty_files_headers} ${uncertainty_files_sources}
+    PUBLIC_LINKS
+        aliceVision_sfmData
+        ${MAGMA_LIBRARIES}
+        CUDA::cudart
+    PRIVATE_LINKS
+        aliceVision_system
Comment on lines 86 to 88
// command-line parameters
std::string verboseLevel = system::EVerboseLevel_enumToString(system::Logger::getDefaultVerboseLevel());
std::string sfmDataFilename;

Copilot AI Apr 29, 2026


verboseLevel is initialized but not registered as a CLI option anymore, yet the Meshroom node passes verboseLevel. Add the option back (and apply it) or remove it from Meshroom to prevent runtime failures.
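A minimal sketch of re-registering (and applying) the option, assuming the tool still uses the plain boost::program_options pattern found in other aliceVision CLI tools; 'allParams' stands for whatever top-level options_description the tool already declares:

// Log parameters group, next to the other option declarations in main():
po::options_description logParams("Log parameters");
logParams.add_options()
    ("verboseLevel,v",
     po::value<std::string>(&verboseLevel)->default_value(verboseLevel),
     "Verbosity level (fatal, error, warning, info, debug, trace).");
allParams.add(logParams);

// After parsing, apply it so the flag actually has an effect:
system::Logger::get()->setLogLevel(verboseLevel);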

Comment thread meshroom/aliceVision/ComputeUncertainty.py
Comment thread src/software/utils/main_computeUncertainty.cpp
Comment thread src/aliceVision/utils/scope.hpp
Comment on lines +63 to +118
// Conditional axpy kernel
// ---------------------------------------------------------------------------
// Thread 0 in block 0 decides whether to execute the axpy.
// The decision is broadcast via shared memory so all threads in the grid
// read it from L1 cache rather than global memory.
//
// Layout: one kernel launch with enough blocks to cover n elements.
// Block 0 additionally runs the comparison and update logic in thread 0.

__global__ void conditionalAxpyKernel(double k,
                                      const double* __restrict__ src,
                                      double* __restrict__ dst,
                                      const double* d_cur_maxabs,
                                      double* d_prev_norm,
                                      double lambda,
                                      int n)
{
    // Block 0, thread 0: compare and update d_prev_norm
    __shared__ int do_axpy;
    if (blockIdx.x == 0 && threadIdx.x == 0)
    {
        double cur = lambda * (*d_cur_maxabs);
        do_axpy = (cur <= *d_prev_norm) ? 1 : 0;
        if (do_axpy)
        {
            *d_prev_norm = cur;
        }
    }

    // All threads in block 0 wait for the decision; other blocks read it
    // from global memory via the shared flag after block 0 has written it.
    // Since blocks can execute in any order we use a device-side broadcast:
    // block 0 writes do_axpy to d_cur_maxabs[1] (we repurpose a spare slot)
    // — but that requires an extra device buffer.
    //
    // Simpler correct approach: re-read d_prev_norm comparison in every block.
    // (One extra global read per block, negligible vs the axpy bandwidth.)
    __syncthreads(); // sync within block 0

    // Non-block-0 threads recompute the flag from global memory
    int local_do_axpy;
    if (blockIdx.x == 0)
    {
        local_do_axpy = do_axpy;
    }
    else
    {
        // d_prev_norm was updated by block 0 only if do_axpy; check the
        // pre-update value by comparing directly (re-read is safe: block 0
        // has already written d_prev_norm before this block can read it
        // because grids execute sequentially per stream).
        local_do_axpy = (lambda * (*d_cur_maxabs) <= *d_prev_norm) ? 1 : 0;
    }

    if (!local_do_axpy) return;


Copilot AI Apr 29, 2026


Within a single kernel launch, blocks can execute in any order; using block 0/thread 0 to decide and update *d_prev_norm does not guarantee other blocks see the updated value before they compute local_do_axpy. This can make some blocks execute the axpy when it should be skipped. Consider a separate decision kernel that writes a device flag, then a second kernel for the axpy.

Suggested change
-// Conditional axpy kernel
-// ---------------------------------------------------------------------------
-// Thread 0 in block 0 decides whether to execute the axpy.
-// The decision is broadcast via shared memory so all threads in the grid
-// read it from L1 cache rather than global memory.
-//
-// Layout: one kernel launch with enough blocks to cover n elements.
-// Block 0 additionally runs the comparison and update logic in thread 0.
-__global__ void conditionalAxpyKernel(double k,
-                                      const double* __restrict__ src,
-                                      double* __restrict__ dst,
-                                      const double* d_cur_maxabs,
-                                      double* d_prev_norm,
-                                      double lambda,
-                                      int n)
-{
-    // Block 0, thread 0: compare and update d_prev_norm
-    __shared__ int do_axpy;
-    if (blockIdx.x == 0 && threadIdx.x == 0)
-    {
-        double cur = lambda * (*d_cur_maxabs);
-        do_axpy = (cur <= *d_prev_norm) ? 1 : 0;
-        if (do_axpy)
-        {
-            *d_prev_norm = cur;
-        }
-    }
-    // All threads in block 0 wait for the decision; other blocks read it
-    // from global memory via the shared flag after block 0 has written it.
-    // Since blocks can execute in any order we use a device-side broadcast:
-    // block 0 writes do_axpy to d_cur_maxabs[1] (we repurpose a spare slot)
-    // — but that requires an extra device buffer.
-    //
-    // Simpler correct approach: re-read d_prev_norm comparison in every block.
-    // (One extra global read per block, negligible vs the axpy bandwidth.)
-    __syncthreads(); // sync within block 0
-    // Non-block-0 threads recompute the flag from global memory
-    int local_do_axpy;
-    if (blockIdx.x == 0)
-    {
-        local_do_axpy = do_axpy;
-    }
-    else
-    {
-        // d_prev_norm was updated by block 0 only if do_axpy; check the
-        // pre-update value by comparing directly (re-read is safe: block 0
-        // has already written d_prev_norm before this block can read it
-        // because grids execute sequentially per stream).
-        local_do_axpy = (lambda * (*d_cur_maxabs) <= *d_prev_norm) ? 1 : 0;
-    }
-    if (!local_do_axpy) return;
+// Conditional axpy kernels
+// ---------------------------------------------------------------------------
+// The decision whether to execute the axpy must be made in a separate kernel,
+// because blocks within a single kernel launch cannot synchronize with each
+// other. Launch decideConditionalAxpyKernel<<<1, 1, 0, stream>>> first, then
+// launch conditionalAxpyKernel<<<grid, block, 0, stream>>> in the same stream.
+__global__ void decideConditionalAxpyKernel(const double* d_cur_maxabs,
+                                            double* d_prev_norm,
+                                            double lambda,
+                                            int* d_do_axpy)
+{
+    if (blockIdx.x == 0 && threadIdx.x == 0)
+    {
+        const double cur = lambda * (*d_cur_maxabs);
+        const int do_axpy = (cur <= *d_prev_norm) ? 1 : 0;
+        *d_do_axpy = do_axpy;
+        if (do_axpy)
+        {
+            *d_prev_norm = cur;
+        }
+    }
+}
+__global__ void conditionalAxpyKernel(double k,
+                                      const double* __restrict__ src,
+                                      double* __restrict__ dst,
+                                      const int* d_do_axpy,
+                                      int n)
+{
+    if (*d_do_axpy == 0) return;
Comment thread src/software/utils/main_computeUncertainty.cpp
Comment on lines +371 to +375
magma_queue_create(err, &queue);
if (err)
{
ALICEVISION_LOG_ERROR("Error on magma_magma_queue_createinit: " << magma_strerror(err));
return false;

Copilot AI Apr 29, 2026


The error check after magma_queue_create reuses 'err' from magma_init, so a failed queue creation is never detected, and 'err' is also what is passed as the queue's device argument. Pass an explicit device id (typically 0) and verify that the queue was actually created, so failures are correctly detected and logged.
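A minimal sketch of the queue setup this asks for, assuming MAGMA 2.x, where magma_queue_create takes an explicit device id and returns void, so the practical failure check is that the queue pointer came back non-null:

magma_device_t device = 0;           // explicit device id instead of reusing 'err'
magma_queue_t  queue  = nullptr;
magma_queue_create(device, &queue);  // MAGMA 2.x: no status is returned
if (queue == nullptr)
{
    ALICEVISION_LOG_ERROR("magma_queue_create failed");
    return false;
}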

Comment thread src/CMakeLists.txt Outdated
Comment thread src/aliceVision/sfm/bundle/BundleAdjustmentCeres.hpp Outdated
Co-authored-by: Copilot <copilot@github.com>
@sonarqubecloud



2 participants