fix computation of ν in TRDH and LMTR #295

Open
MaxenceGollier wants to merge 4 commits into JuliaSmoothOptimizers:master from MaxenceGollier:patch-step

Conversation

@MaxenceGollier
Collaborator

It looks like it was a bug.

See the last point of #231.

Copilot AI review requested due to automatic review settings February 20, 2026 18:12
@MaxenceGollier
Collaborator Author

@dpo, I am sure for TRDH, but is this correct for LMTR?

To make things clear for you: this PR fixes the bug that we discovered here.

Contributor

Copilot AI left a comment


Pull request overview

This pull request fixes a bug in the computation of the step length parameter ν in the TRDH (Trust Region with Diagonal Hessian) and LMTR (Levenberg-Marquardt Trust Region) algorithms, as referenced in issue #231. The formulas were previously incorrect and have been updated to use a new pattern that properly incorporates the trust region radius and regularization terms.

Changes:

  • Updated ν computation in TRDH to use 1 / (DNorm + 1 / (α * Δk)) instead of (α * Δk)/(DNorm + 1)
  • Updated ν computation in LMTR to use 1 / (σmax^2 + 1 / (α * Δk)) instead of α * Δk / (1 + σmax^2 * (α * Δk + 1))
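For intuition, a minimal numeric sketch of the two formula patterns (the values below are arbitrary, chosen only for illustration; they do not come from the PR):

```python
# Arbitrary illustrative values: alpha * Delta_k plays the role of the
# trust-region scale, DNorm stands in for norm(D.d, Inf) in TRDH.
alpha, Delta_k, DNorm = 2.0, 4.0, 5.0

nu_old = (alpha * Delta_k) / (DNorm + 1)      # previous formula
nu_new = 1 / (DNorm + 1 / (alpha * Delta_k))  # corrected formula

# The corrected step length is bounded above by both 1/DNorm and
# alpha * Delta_k, whereas the old formula can exceed 1/DNorm
# when the trust region is large.
print(nu_old, nu_new)  # 1.3333333333333333 0.1951219512195122
```

The same pattern applies in LMTR with `σmax^2` in place of `DNorm`.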

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

Files reviewed:

  • src/TRDH_alg.jl: Updated both occurrences of ν computation (lines 329 and 479) to use the corrected formula pattern with proper scaling by trust region radius and diagonal norm
  • src/LMTR_alg.jl: Updated both occurrences of ν computation (lines 278 and 450) to use the corrected formula pattern with proper scaling by trust region radius and Jacobian operator norm squared


for (mod, mod_name) ∈ ((SpectralGradientModel, "spg"),)
  # ((DiagonalPSBModel, "psb"), (DiagonalAndreiModel, "andrei")) work but do not always terminate
  for (h, h_name) ∈ ((NormL0(λ), "l0"), (NormL1(λ), "l1"))
    continue # FIXME
Collaborator Author


I added this because the TRDH tests with bounds now fail; see this run: https://github.com/JuliaSmoothOptimizers/RegularizedOptimization.jl/actions/runs/22235640816.
@dpo, I am not sure what to do here. Should we just accept that TRDH fails? I think #281 might actually fix this.

Member


Maybe the value α = 1 / eps(T) is too extreme?! Do the tests pass with, e.g., α = 1 / √eps(T)?

Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to investigate; I found another bug.
The documentation says:

Alternatively, if `reduce_TR = true`, then ξₖ₁ := f(xₖ) + h(xₖ) - φ(sₖ₁; xₖ) - ψ(sₖ₁; xₖ) is used instead of ξₖ, where sₖ₁ is the Cauchy point.

where φ is defined as

where φ(s; xₖ) = f(xₖ) + ∇f(xₖ)ᵀs + ½ sᵀ Dₖ s is a quadratic approximation of f about xₖ,

but then we compute ξ₁ as

if reduce_TR
  prox!(s, ψ, mν∇fk, ν)
  mks = mk1(s)
  ξ1 = hk - mks + max(1, abs(hk)) * 10 * eps()

i.e. with a first-order model, not the quadratic model.
I am confused. This whole reduce_TR thing is a bit odd in my opinion too; shouldn't we decide which is best and keep just one, as in R2DH?
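A tiny one-dimensional sketch of the discrepancy (all values hypothetical, with ψ treated as a constant for simplicity): the decrease measured against the first-order model differs from the decrease against the docstring's quadratic model whenever the curvature term is nonzero.

```python
# Hypothetical 1-D data: f(x_k), h(x_k), gradient g, diagonal entry D, step s.
f, h, g, D, s = 1.0, 0.5, 2.0, 4.0, -0.25

# Decrease against the first-order model (what the code computes):
xi_first_order = (f + h) - (f + g * s + h)
# Decrease against the quadratic model phi from the docstring:
xi_quadratic = (f + h) - (f + g * s + 0.5 * D * s**2 + h)

# The two criticality measures disagree by 0.5 * D * s^2 > 0.
print(xi_first_order, xi_quadratic)  # 0.5 0.375
```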

@codecov

codecov bot commented Feb 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.62%. Comparing base (e0f214d) to head (0118a02).
⚠️ Report is 272 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master     #295       +/-   ##
===========================================
+ Coverage   61.53%   86.62%   +25.08%     
===========================================
  Files          11       13        +2     
  Lines        1292     1652      +360     
===========================================
+ Hits          795     1431      +636     
+ Misses        497      221      -276     


@MaxenceGollier MaxenceGollier requested a review from dpo February 23, 2026 14:54
DNorm = norm(D.d, Inf)

- ν = (α * Δk)/(DNorm + one(T))
+ ν = 1 / (DNorm + 1 / (α * Δk))
Member


I agree, thank you. In addition, DNorm can be computed as norm(max.(D.d, 0), Inf) (although that allocates). That was a late update to the convergence theory; it looks like it didn't make it into the paper.
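As a sketch of that variant (a hypothetical diagonal, not values from the code): clamping negative entries to zero before taking the infinity norm can only shrink DNorm, and hence enlarge ν.

```python
# Hypothetical diagonal of D containing a negative entry.
d = [-3.0, 1.5, 0.5]

dnorm_plain = max(abs(x) for x in d)         # norm(D.d, Inf)
dnorm_clamped = max(max(x, 0.0) for x in d)  # norm(max.(D.d, 0), Inf)

print(dnorm_plain, dnorm_clamped)  # 3.0 1.5
```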

Collaborator Author


Can be, or should be?

@MaxenceGollier MaxenceGollier requested a review from dpo February 27, 2026 19:25
