
[AI] Add RAW denoise (RawNIND, Bayer + Linear) to neural restore module #20854

Merged
TurboGit merged 9 commits into darktable-org:master from andriiryzhkov:nr_rawdenoise
Apr 28, 2026

Conversation

@andriiryzhkov
Collaborator

@andriiryzhkov andriiryzhkov commented Apr 23, 2026

Heads up – this is a fairly big PR. It adds a fast, AI-based raw denoiser that runs before the rest of the pipeline. Judging by the pixls.us threads asking for a real pre-demosaic AI denoiser, this is a much-requested feature: existing RGB denoisers run late in the pipeline, after demosaic / tonemapping / lens correction, which limits how well the noise can be modeled. A sensor-space denoiser sees the original photon-shot-noise distribution and can clean it up before any of those lossy steps.

Requires companion model package update: darktable-org/darktable-ai#21

Collaboration with the model author

I reached out to Benoit Brummer (author of NIND, RawNIND, and the UtNet2 model family) early in this work, and he has been actively supporting the implementation.

What the feature does

  • A new tab in the neural_restore module: raw denoise.
  • Processes selected raws through a RawNIND UtNet2 model and writes a denoised DNG next to the source file.
  • Output DNG variant depends on sensor type:
    • Bayer sensors → Bayer CFA DNG (mosaiced; darktable re-runs your normal demosaic + color stack on import)
    • X-Trans / niche sensors → LinearRaw DNG (demosaic baked in; only color + tone stack re-applies)
  • The denoised DNG gets auto-imported into the same filmroll and can be edited like any other raw.
  • Inline preview shows a before/after crop with a user-placeable patch picker. Switching images, tabs, or the patch is debounced and non-blocking.
  • Per-image strength slider (0–100%) blends source ↔ denoised at preview time without re-running inference.
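The strength blend is a plain linear interpolation at preview time, which is why moving the slider never re-runs inference. A minimal sketch with illustrative names (not darktable's actual code):

```c
#include <stddef.h>

/* Hypothetical sketch of the preview-time strength blend: the preview
   mixes the cached source and denoised buffers linearly, so a slider
   change is just a re-blend, never a new inference pass. */
static void blend_strength(const float *src, const float *denoised,
                           float *out, size_t n, float strength /* 0..1 */)
{
  for(size_t i = 0; i < n; i++)
    out[i] = src[i] + strength * (denoised[i] - src[i]);
}
```

At strength = 0 the preview shows the untouched source; at 1.0, the fully denoised result.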

Benoit has also indicated that a dedicated X-Trans model can be trained – separate weights trained specifically for Fuji's 6×6 CFA pattern rather than the generic-demosaic fallback used today. That's not part of this release: the xtrans_v1 contract label and the dt_restore_load_rawdenoise_xtrans loader are reserved in this PR so the dedicated model can plug in later with minimal (or no) darktable code changes.

Why it's fast

  • Full-image inference on a 24 MP Bayer raw: ~2–5 s on a decent GPU, ~10–20 s on CPU.
  • Preview: single centered tile, sub-second on GPU.
  • The model itself is ~16 MB. RawNIND is CFA-native (4ch R/G1/G2/B packed at half-resolution), so inference sees a quarter the pixel count of equivalent-quality RGB denoisers.
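To illustrate why the pixel count drops by 4×: each 2×2 RGGB quad becomes one 4-channel sample at half resolution. A hedged sketch (hypothetical helper, assuming an RGGB layout and even dimensions, not the actual packing code):

```c
/* Pack a full-resolution RGGB mosaic into half-resolution 4-channel
   samples (R, G1, G2, B). The packed buffer holds a quarter as many
   pixels, each carrying all four CFA sites of its 2x2 quad. */
static void pack_rggb(const float *cfa, int width, int height,
                      float *packed /* (height/2)*(width/2)*4 floats */)
{
  const int pw = width / 2, ph = height / 2;
  for(int y = 0; y < ph; y++)
    for(int x = 0; x < pw; x++)
    {
      float *px = packed + 4 * (y * pw + x);
      px[0] = cfa[(2 * y) * width + 2 * x];         /* R  */
      px[1] = cfa[(2 * y) * width + 2 * x + 1];     /* G1 */
      px[2] = cfa[(2 * y + 1) * width + 2 * x];     /* G2 */
      px[3] = cfa[(2 * y + 1) * width + 2 * x + 1]; /* B  */
    }
}
```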

Where it sits in the pipeline

Conceptually: not in the pipeline at all. The denoised result is a new DNG that replaces the noisy source. Users pick the denoised DNG as their working file, apply any modules they'd normally apply, and never touch the pre-denoise path. This differs from post-pipeline RGB denoisers, which stack a second noise-reduction pass on already-processed pixels.

Model packaging

Models are distributed as .dtmodel packages via the existing AI model catalog (data/ai_models.json). RawNIND ships with two ONNX variants:

  • variants.bayer – for standard Bayer CFAs (RGGB / BGGR / GRBG / GBRG)
  • variants.linear – for X-Trans + anything else without a dedicated path

The manifest declares preprocessing policy explicitly (WB normalization, input colorspace, output-scale handling, exposure target, channel orientation, edge-padding) so darktable doesn't bake RawNIND-specific assumptions into the C code. Future models can swap in with manifest-only changes, or add new contract labels (e.g. a forthcoming dedicated X-Trans model) with a small code patch.
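Hypothetically, the parsed policy could be a small struct like the following (all names are illustrative assumptions – the real definitions live in restore_common.h and will differ):

```c
/* Illustrative shape of a per-variant preprocessing policy parsed from
   the .dtmodel manifest. Enum and field names are assumptions, not
   darktable's actual identifiers. */
typedef enum { WB_NONE, WB_DAYLIGHT, WB_AS_SHOT } wb_policy_t;
typedef enum { PAD_ZERO, PAD_MIRROR, PAD_MIRROR_CROPPED } edge_pad_t;

typedef struct
{
  wb_policy_t wb_normalization; /* is WB pre-applied before inference?   */
  edge_pad_t edge_pad;          /* e.g. mirror_cropped, per the PR text  */
  float exposure_target;        /* exposure normalization target         */
  int channels;                 /* 4 = packed Bayer, 3 = linear rec2020  */
} prep_policy_t;
```

Keeping this in the manifest rather than in C is what lets future models swap in without code changes.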

Architecture

src/common/ai/
├── restore.h              Public API + per-variant input contracts (bayer_v1, linear_v1, xtrans_v1 reserved)
├── restore_common.h       Internal struct defs + policy enums + inline helpers (_mirror, black/range)
├── restore.c              Env + model loader. Reads variant attributes from the manifest
│                          and fails fast on contract mismatches.
├── restore_rgb.c/h        RGB denoise + upscale (existing, lightly refactored out of restore.c)
├── restore_raw_bayer.c/h  NEW – RawNIND Bayer pipeline (CFA-native, 4ch packed half-res)
└── restore_raw_linear.c/h NEW – RawNIND linear pipeline (3ch lin_rec2020, used for X-Trans + Foveon)

src/common/dng_writer.c/h  NEW – minimal DNG writer for denoised raws (CFA + LinearRaw variants)

src/libs/neural_restore.c  Existing lighttable module, significantly extended:
                           - new raw denoise tab
                           - parallel preview/batch paths for raw
                           - 3-way sensor classifier (Bayer / X-Trans / generic-demosaic)
                           - per-task preview cache + debounced tab/selection switching

Sensor coverage

Sensor class                        Path
Bayer (RGGB / BGGR / GRBG / GBRG)   Native Bayer pipeline
Fuji X-Trans (filters == 9u)        Linear pipeline (dedicated model reserved for a future release)
Foveon / monochrome-with-pattern    Linear pipeline
Pure monochrome                     Rejected with dt_control_log
Non-raw (JPEG/TIFF)                 Rejected
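A minimal sketch of the 3-way classification in the table (hypothetical function – the real module inspects more image flags; `filters == 9u` is how darktable marks X-Trans sensors):

```c
typedef enum { SENSOR_BAYER, SENSOR_XTRANS, SENSOR_LINEAR_FALLBACK,
               SENSOR_REJECT } sensor_class_t;

/* Toy classifier mirroring the coverage table above. */
static sensor_class_t classify_sensor(unsigned filters, int is_raw,
                                      int is_monochrome)
{
  if(!is_raw || is_monochrome) return SENSOR_REJECT;   /* JPEG/TIFF, mono */
  if(filters == 9u) return SENSOR_XTRANS;              /* linear path today */
  if(filters != 0u) return SENSOR_BAYER;               /* RGGB/BGGR/GRBG/GBRG */
  return SENSOR_LINEAR_FALLBACK;                       /* Foveon etc. */
}
```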

Verification

  • Built clean with both Release and --asan. No heap-use-after-free, no buffer overflows.
  • End-to-end tested with Canon R7 (RGGB), Nikon D7200 (RGGB), Nikon D200 (GRBG), OM System OM-1 (RGGB).
  • Non-RGGB packing verified by direct byte-level comparison against Benoit Brummer's reference Python implementation in RawNIND: EXACT MATCH for tile interiors, corner-tile discrepancy traced to mirror-padding convention and fixed via the edge_pad: mirror_cropped default.
  • Preview/batch parity: both paths share _compute_bayer_prep, _pack_bayer_tile, _bayer_gain_match, _bayer_remosaic_raw, _resolve_linear_wb, _build_cam_matrices, _linear_exposure_boost, _linear_gain_match_3ch. No separate implementations to drift.
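The corner-tile discrepancy above came down to the mirror-padding convention. A hedged sketch of reflect-style index folding (the actual `_mirror` helper in restore_common.h may use a different convention):

```c
/* Reflect an out-of-range coordinate back into [0, n) without repeating
   the edge sample ("mirror" padding). Sketch only. */
static int mirror_index(int i, int n)
{
  if(n == 1) return 0;
  const int period = 2 * (n - 1);
  i = ((i % period) + period) % period; /* fold into [0, 2n-3] */
  return (i < n) ? i : period - i;      /* reflect the tail    */
}
```

For a row of 5 samples, index -1 maps to 1 and index 5 maps to 3, i.e. the sequence reflects around the border samples; matching this convention between C and the Python reference is exactly what the byte-level comparison verified.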

Credits

  • Benoit Brummer (UCLouvain) – author of RawNIND dataset + UtNet2 model, author of the ONNX conversion script, and reviewed/validated darktable's preprocessing pipeline against training.
  • Published paper: Noise-Aware Raw-to-RGB Restoration.
  • RawNIND training code: https://github.com/trougnouf/rawnind_jddc (GPL-3.0).
  • Training data: RawNIND (CC BY 4.0 / CC0 per-image, hosted on Wikimedia Commons + UCLouvain Dataverse).

@TurboGit
Member

TurboGit commented Apr 23, 2026

@andriiryzhkov : Nice work! Do you have some screenshots before/after to share to see where we stand with this? TIA.

@andriiryzhkov
Collaborator Author

andriiryzhkov commented Apr 23, 2026

Sure.

It basically adds a new "raw denoise" tab in the neural restore module, with similar semantics to the existing tabs.
rawdenoise

In the "denoise" tab there's also change in slider - consistent naming and direction of action. Algorithm behind is the same, but a bit stronger.
denoise

Before it looked like this:
denoise-before

Comment thread src/imageio/imageio_dng.c
@da-phil
Copy link
Copy Markdown
Contributor

da-phil commented Apr 24, 2026

Nice work, thank you!

I was reading the following paragraph

Where it sits in the pipeline

Conceptually: not in the pipeline at all. The denoised result is a new DNG that replaces the noisy source.

And I was wondering why you did not want to make it an iop such as the other raw denoising module. I have to say I'm not a big fan of having auxiliary DNG files flying around on my HDD, if they are strictly not needed.
Or asked differently: what makes this denoiser different from the "raw denoise" module conceptionally? Both would use "linear RAW" data as input and the same as output.
As this module would be very early in the pipeline, its output would be cached for most of the workflows after the initial pipeline run, right?

@KarlMagnusLarsson

KarlMagnusLarsson commented Apr 25, 2026

Thank you @andriiryzhkov for the PR. It works for me: neural restore -> raw denoise + strength produces a denoised DNG with varying degrees of noise suppression.

I have one issue. If I use this file (the poor mallard subjected to so many tests ...), a Canon CR3 raw from a Canon R5mk2, then I get a DNG file which is too large. The DNG is padded with black rectangles.
https://drive.google.com/file/d/1JU9srElIri_l19l7eQ6DvqjtuFQDlbNw/view?usp=drive_link

If I use the same file for the non-raw neural restore -> denoise then I do not have this issue.

Example: neural restore -> raw denoise
Screenshot From 2026-04-25 08-12-40-mark

If I check the raw file in image information I get two sizes:

  • width: 8222 (8480)
  • size: 5488 (5650)
Screenshot From 2026-04-25 08-31-52-mark

It seems that the neural restore -> raw denoise DNG output uses the (8480) and (5650) dimensions.

This is the image information statement for the output dng file.

Screenshot From 2026-04-25 08-37-04-mark

@andriiryzhkov
Collaborator Author

@KarlMagnusLarsson, fix pushed. The CFA DNG writer was setting DEFAULTCROPSIZE = full buffer, so the importer didn't crop the optical-black margins (the 258×162 difference between your 8480×5650 raw and 8222×5488 visible). It now writes proper ACTIVEAREA + DEFAULTCROP* tags from the source image's visible-area metadata.
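For illustration, the tag arithmetic can be sketched as follows (hypothetical helper, assuming for simplicity that the optical-black margins sit at the top-left; the real writer takes the offsets from the source image's visible-area metadata). ACTIVEAREA is {top, left, bottom, right} in full-buffer coordinates:

```c
/* Sketch of computing the DNG ActiveArea rectangle from the visible-area
   offsets and dimensions. Illustrative only. */
typedef struct { int top, left, bottom, right; } active_area_t;

static active_area_t dng_active_area(int margin_left, int margin_top,
                                     int visible_w, int visible_h)
{
  active_area_t a = { margin_top, margin_left,
                      margin_top + visible_h, margin_left + visible_w };
  return a;
}
```

With the R5 II numbers (8222×5488 visible inside an 8480×5650 buffer, margins assumed top-left) the rectangle's bottom-right lands at (5650, 8480), i.e. the importer now knows to crop the 258×162 margin away.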

Could you retest with the same Canon R5II raw? Should now produce a clean DNG without the black padding.

Thanks for the catch.

@KarlMagnusLarsson

Works for me. Thank you @andriiryzhkov .

@andriiryzhkov
Collaborator Author

@da-phil : Thanks for the question – fair point, worth explaining the design choice.

Every task in neural_restore (RGB denoise, raw denoise, upscale) is an AI reconstruction that produces a new image, not an in-pipeline filter. Inference takes seconds to tens of seconds even on a decent GPU – orders of magnitude slower than a typical IOP, and not the kind of operation that can re-run on every slider tweak. So each task wraps as "produce a new file once, edit it from then on", and the new file enters the catalog as a normal raw you can work with using all the standard IOPs.

It could be an IOP architecturally – the inputs/outputs are pipeline-shaped. What's missing is infrastructure to make a 10–30 s inference pass practical inside the pipeline:

  • Persistent inference cache. Pipeline cache is in-memory and per-session, so reopening an image would re-trigger inference every time. Needs an on-disk cache keyed on source-raw checksum + model id + version + preprocessing params.
  • Cache invalidation rules. Upstream changes (e.g. WB) would invalidate – in the current DNG design they don't, the result is frozen. Strength slider changes also need a re-blend path that doesn't re-run the model.
  • Backend coexistence. Inference runs on ONNX Runtime (CPU / CUDA / CoreML / DirectML); the pipeline runs OpenCL or CPU. Sharing a GPU between the two during a single render needs care.
  • Tiling alignment. AI tile size + overlap is tuned to the model's receptive field; the pipeline tiles differently. Mismatched seams could produce artifacts that don't show today because we hand back a finished image.

The "produce a DNG" wrapper is the pragmatic 5.6 shape – the architectural questions above would need to land before AI tasks could become first-class IOPs.

@da-phil
Contributor

da-phil commented Apr 25, 2026


All valid points, indeed, thanks for addressing them.

Will try out the functionality soon.

@KarlMagnusLarsson

KarlMagnusLarsson commented Apr 25, 2026

I get artifacts with neural restore -> raw denoise -> strength = 100% in some cases.

These artifacts are not present in the raw file after add to library and they are not present if I do the non-raw neural restore -> denoise -> strength = 100% .

If I use this Canon R5mk2 CR3 raw file (102A6405.CR3):
https://drive.google.com/file/d/1ljl4JQwuJTSaQlcWQJtx6fMT94UmJf7F/view?usp=drive_link

Test:

  1. lighttable -> add to library
  2. no other edits
  3. neural restore -> raw denoise -> strength = 100%

then I get:

  1. A vertical line indicating a block.
  2. Some color bleeding or chromatic aberration effect in high contrast area between the bird and the background
Screenshot From 2026-04-25 18-18-57-mark
I also get the same color bleeding or chromatic aberration effect in high contrast area between the hand and the background. Screenshot From 2026-04-25 18-26-39-mark

EDIT: Nvidia CUDA, NVIDIA Quadro RTX 4000 8 GB

@andriiryzhkov
Collaborator Author

andriiryzhkov commented Apr 25, 2026

@KarlMagnusLarsson :
For the first issue you reported

A vertical line indicating a block.

fix is already pushed with the last commit.

As for the second one

Some color bleeding or chromatic aberration effect in high contrast area between the bird and the background

This needs a bit more time to investigate. Probably it will require another variant of the model.

Anyway, thank you for testing and reporting. This is a very important finding.

@KarlMagnusLarsson

fix is already pushed with the last commit.

Thanks. Works.

This needs a bit more time to investigate. Probably it will require another variant of the model.

Yes, the effect is rather pronounced. I mean, strength = 100% is perhaps pushing it in many cases, but in this picture the color bleeding or chromatic aberration effect is there also at strength = 50% and even lower.

@KarlMagnusLarsson

My focus in the frame was the blue tit. In the final rendition most of the hand was not part of the crop.

I do not know if it matters for the raw denoise NIND model, but darkroom -> raw overexposure shows that a lot of the raw data is saturated. The bird is fine though; the overexposure on the bird is minimal.

Screenshot From 2026-04-25 21-50-39

@TurboGit TurboGit added this to the 5.6 milestone Apr 26, 2026
@TurboGit TurboGit added priority: medium, feature: new, difficulty: hard, scope: image processing, documentation: pending, and release notes: pending labels Apr 26, 2026
@andriiryzhkov
Collaborator Author

@KarlMagnusLarsson I've updated the model package and pushed a few changes that should significantly reduce the chromatic fringing:

  1. Switched the active Bayer checkpoint to the "preupsample" variant (Benoit's DenoiserTrainingBayerToProfiledRGB_4ch_2024-03-11-...-preupsample-...-2). It runs the convolutions at full sensor resolution instead of packed half-res, so R/G/B stay spatially coupled through the network – eliminates most of the channel-decoupling artifacts at high-contrast edges.

  2. Aligned preprocessing to the upstream RawNIND inference scripts (simple_denoiser.py, denoise_image.py): no WB pre-applied, scalar match_gain, no exposure boost. The model was trained on raw black/white-normalized RGGB; applying a daylight WB multiplier before inference was pushing inputs ~2× outside the training distribution and produced exactly the chromatic edge artifacts you reported. This turned out to be the dominant fix – the preupsample model on its own gives only a marginal additional improvement.
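For concreteness, "raw black/white-normalized, no WB" amounts to something like this (illustrative helper, not the PR's code; the 512/15360 levels in the test are the kind of 14-bit values that appear in the debug logs):

```c
#include <stddef.h>

/* Normalize CFA samples to [0,1] using black and white levels only.
   Crucially, no white-balance multipliers are applied before inference,
   which keeps inputs inside the model's training distribution. */
static void normalize_cfa(float *v, size_t n, float black, float white)
{
  const float range = white - black;
  for(size_t i = 0; i < n; i++)
    v[i] = (v[i] - black) / range;
}
```

Pre-applying a ~2× daylight WB multiplier on top of this is what was pushing channel values outside [0,1] and producing the chromatic edge artifacts.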

You'll need to re-fetch the rawdenoise-nind model package manually to pick up the new Bayer checkpoint.

Testing on your 102A6405.CR3 I still see a faint flare around the bottom finger, but it's much less pronounced. I'd appreciate it if you could rerun on your end and confirm.

@andriiryzhkov
Collaborator Author

Pushed a refactor that consolidates the DNG writers under src/imageio/.

What changed

Three DNG writers existed in two places – the legacy header-only float-CFA writer at src/imageio/imageio_dng.h (used by HDR merge) and the libtiff-based Bayer + LinearRaw pair I added in src/common/dng_writer.{c,h} (used by the raw denoise round-trip). All three now live in a single TU at src/imageio/imageio_dng.{c,h}, with the public API renamed to follow the dt_imageio_<fmt>_<verb> convention used by imageio_png_*, imageio_tiff_*, etc.:

Before                   After
dt_imageio_write_dng     dt_imageio_dng_write_float
dt_dng_write_cfa_bayer   dt_imageio_dng_write_cfa_bayer
dt_dng_write_linear      dt_imageio_dng_write_linear

What didn't change

  • The legacy float writer body is byte-for-byte identical – only the public name changed and the function moved from static inline in the header to extern in the .c. HDR merge uses the same code path it always has.
  • The two libtiff writers' implementations are unchanged; only the names and file location moved.

Side benefits

  • TIFF type-code macros (BYTE/ASCII/SHORT/LONG/RATIONAL/SRATIONAL/HEADBUFFSIZE) and the byte-assembly helpers are now file-static in the .c, so they no longer leak into every TU that includes the DNG header.
  • imageio_dng.c is unconditional now (was AI-gated), matching the rest of imageio/. Verified USE_AI=ON and USE_AI=OFF both build clean on Linux.
  • Public header pulls in common/dttypes.h instead of the much heavier common/darktable.h (only dt_aligned_pixel_t is needed at the API surface).

@KarlMagnusLarsson

Hello @andriiryzhkov,

The new model does exactly what you state. I see the same thing. There is faint flare around the bottom finger, but it's much less pronounced.

You'll need to re-fetch the rawdenoise-nind model package manually to pick up the new Bayer checkpoint.

OK

testing on your 102A6405.CR3 I still see a faint flare around the bottom finger, but it's much less pronounced. Would appreciate if you can rerun on your end and confirm.

New model: neural restore -> raw denoise -> strength = 100%
bild

Old model: neural restore -> raw denoise -> strength = 100%
Screenshot From 2026-04-27 20-57-24


@andriiryzhkov
Collaborator Author

The new model does exactly what you state. I see the same thing. There is faint flare around the bottom finger, but it's much less pronounced.

Right now I don't see what else can be done without touching the model itself. I will continue testing, but I would need more examples of this behavior to generalize the case better.

Model improvement is also possible, but requires much more time.

@TurboGit
Member

@andriiryzhkov : If OK by you, I'd recommend merging now to get more field testing.

Member

@TurboGit TurboGit left a comment


Looks good, thanks!

@andriiryzhkov
Collaborator Author

@TurboGit : I just noticed one possible bug in the raw denoise preview of X-Trans images. Let me check it and we can merge after.

@andriiryzhkov
Collaborator Author

@TurboGit : Preview is fixed. Ready to merge.

Member

@TurboGit TurboGit left a comment


Thanks!

@TurboGit TurboGit merged commit 45ca7c9 into darktable-org:master Apr 28, 2026
5 checks passed
@andriiryzhkov andriiryzhkov deleted the nr_rawdenoise branch April 28, 2026 18:51
@da-phil
Contributor

da-phil commented Apr 28, 2026

I finally had time to test it on one of my astro photos, after it was merged to master.

I'm able to use the new function and generate a DNG file, however I cannot read it back into darktable, see log:

[dt starting] as : darktable -d ai
     0,2350 [ai_models] initialized: models_dir=/home/phil/.local/share/darktable/models, cache_dir=/home/phil/.cache/darktable/ai_downloads
     0,2351 [ai_models] using repository: darktable-org/darktable-ai
     0,2351 [ai_models] registered model: mask sam2.1 hiera small (mask-object-sam21-small)
     0,2351 [ai_models] registered model: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq)
     0,2351 [ai_models] registered model: denoise nind (denoise-nind)
     0,2351 [ai_models] registered model: denoise nafnet small (denoise-nafnet)
     0,2351 [ai_models] registered model: raw denoise nind (rawdenoise-nind)
     0,2351 [ai_models] registered model: upscale bsrgan (upscale-bsrgan)
     0,2351 [ai_models] registry loaded: 6 models from /opt/darktable/share/darktable/ai_models.json
     2.4087 [darktable_ai] dt_ai_env_init start.
     2.4088 [darktable_ai] discovered: mask sam2.1 hiera small (mask-object-sam21-small, backend=onnx)
     2.4089 [darktable_ai] discovered: denoise nind (denoise-nind, backend=onnx)
     2.4089 [darktable_ai] discovered: upscale bsrgan (upscale-bsrgan, backend=onnx)
     2.4089 [darktable_ai] discovered: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq, backend=onnx)
     2.4090 [darktable_ai] discovered: raw denoise nind (rawdenoise-nind, backend=onnx)
     2.4091 [darktable_ai] discovered: denoise nafnet small (denoise-nafnet, backend=onnx)
    51.0321 [neural_restore] raw preview: imgid=62955 bayer patch=(0.500,0.500) widget=752x556 filters=0x94949494
    51.0322 [restore] variant 'bayer': file=model_bayer.onnx input_kind=bayer_v1
    51.0323 [restore] model rawdenoise-nind: coreml_cpu_only=true (ep_flags=1)
    51.0334 [darktable_ai] loaded ORT 1.23.2 from '/home/phil/.local/lib/onnxruntime-migraphx/libonnxruntime.so.1.23.2'
The requested API version [24] is not available, only API versions [1, 23] are supported in this build. Current ORT Version is: 1.23.2
    51.0334 [darktable_ai] ORT 1.23.2: using API version 23 (compiled for 24)
    51.0334 [darktable_ai] execution provider: MIGraphX
    51.0334 [darktable_ai] MIOpen cache: /home/phil/.cache/darktable/ai/amd/miopen
    51.0334 [darktable_ai] MIGraphX cache: /home/phil/.cache/darktable/ai/amd/migraphx
    51.0451 [darktable_ai] loading: /home/phil/.local/share/darktable/models/rawdenoise-nind/model_bayer.onnx
    51.0451 [darktable_ai] attempting to enable AMD MIGraphX...
    51.1381 [darktable_ai] AMD MIGraphX enabled successfully.
2026-04-28 22:15:19.506092189 [W:onnxruntime:DarktableAI, migraphx_execution_provider.cc:167 MIGraphXExecutionProvider] [MIGraphX EP] MIGraphX ENV Override Variables Set:
    60.2867 [neural_restore] raw preview: full=9728x6656 ori=0x0 patch_center=(0.500,0.500) -> sensor=(4488,3050 752x556) bayer
    61.5713 [neural_restore] raw preview: inference returned err=0 src=0x78d769a6e010 denoised=0x78d769f37010 requested=752x556 actual=752x556
    86.4780 [neural_restore] job started: task=raw denoise, scale=1, images=1
    86.4781 [neural_restore] processing imgid 62955 -> /home/phil/Pictures/DSC01857_raw-denoise.dng
    86.4782 [neural_restore] imgid 62955: flags=0x10640 channels=1 filters=0x94949494 (bayer)
    86.6649 [restore_raw_bayer] 9728x6656 sensor (CFA origin 0,0), working 4864x3328 packed, tile T=2048, 3x2 grid (6 tiles)
    86.6697 [restore_raw_bayer] raw CFA range [0.0, 16383.0], black=[512,512,512,512] white=15360 wb_coeffs=[2328.000,1024.000,2248.000,0.000] wb_norm=[1.000,1.000,1.000]
    86.7331 [restore_raw_bayer] tile0 model_input range R=[-0.027,0.266] G1=[-0.026,1.069] G2=[-0.017,1.069] B=[-0.027,0.852]
    87.7319 [restore_raw_bayer] tile0 model_output range R=[-0.002,0.300] G=[-0.026,1.091] B=[-0.011,0.796] in_mean=0.003 out_mean=-2753298.879 gain=-1.241e-09
    92.5519 [restore_raw_bayer] cfa_out u16 range [0, 15360] mean=545 (DNG will advertise black~512 white=15360)
    92.5912 [neural_restore] embedded JPEG preview from source 9504x6336 (5800694 bytes)
    92.5925 [tiff_open] error: TIFFSetField: /home/phil/Pictures/DSC01857_raw-denoise.dng: Unknown tag 33421
    92.5926 [tiff_open] error: TIFFSetField: /home/phil/Pictures/DSC01857_raw-denoise.dng: Unknown tag 33422
    95.4548 [neural_restore] imported imgid=63591: /home/phil/Pictures/DSC01857_raw-denoise.dng
   103.5559 [rawspeed] DSC01857_raw-denoise.dng corrupt: rawspeed::RawImage rawspeed::RawDecoder::decodeRaw(), line 334: rawspeed::TiffEntry* rawspeed::TiffIFD::getEntry(rawspeed::TiffTag) const, line 316: Entry 0x828d not found.
   103.6485 [rawspeed] DSC01857_raw-denoise.dng corrupt: rawspeed::RawImage rawspeed::RawDecoder::decodeRaw(), line 334: rawspeed::TiffEntry* rawspeed::TiffIFD::getEntry(rawspeed::TiffTag) const, line 316: Entry 0x828d not found.

Is there anything I'm missing?

You can try it for yourself with this image:
https://raw.pixls.us/data/Sony/ILCE-7RM5/7RM5-LosslessCompressedLarge.ARW

When I'm trying Olympus RAWs I occasionally get crashes:

[dt starting] as : darktable -d ai
     0,2469 [ai_models] initialized: models_dir=/home/phil/.local/share/darktable/models, cache_dir=/home/phil/.cache/darktable/ai_downloads
     0,2470 [ai_models] using repository: darktable-org/darktable-ai
     0,2471 [ai_models] registered model: mask sam2.1 hiera small (mask-object-sam21-small)
     0,2471 [ai_models] registered model: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq)
     0,2471 [ai_models] registered model: denoise nind (denoise-nind)
     0,2471 [ai_models] registered model: denoise nafnet small (denoise-nafnet)
     0,2471 [ai_models] registered model: raw denoise nind (rawdenoise-nind)
     0,2471 [ai_models] registered model: upscale bsrgan (upscale-bsrgan)
     0,2471 [ai_models] registry loaded: 6 models from /opt/darktable/share/darktable/ai_models.json
     2.4171 [darktable_ai] dt_ai_env_init start.
     2.4172 [darktable_ai] discovered: mask sam2.1 hiera small (mask-object-sam21-small, backend=onnx)
     2.4172 [darktable_ai] discovered: denoise nind (denoise-nind, backend=onnx)
     2.4172 [darktable_ai] discovered: upscale bsrgan (upscale-bsrgan, backend=onnx)
     2.4172 [darktable_ai] discovered: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq, backend=onnx)
     2.4173 [darktable_ai] discovered: raw denoise nind (rawdenoise-nind, backend=onnx)
     2.4174 [darktable_ai] discovered: denoise nafnet small (denoise-nafnet, backend=onnx)

(org.darktable.darktable:1833999): GVFS-WARNING **: 23:05:42.580: can't init metadata tree /home/phil/.local/share/gvfs-metadata/home: open: Permission denied

(org.darktable.darktable:1833999): GVFS-WARNING **: 23:05:42.580: can't init metadata tree /home/phil/.local/share/gvfs-metadata/home: open: Permission denied
    21.9851 [neural_restore] raw preview: imgid=63596 bayer patch=(0.500,0.500) widget=752x556 filters=0x94949494
    21.9852 [restore] variant 'bayer': file=model_bayer.onnx input_kind=bayer_v1
    21.9852 [restore] model rawdenoise-nind: coreml_cpu_only=true (ep_flags=1)
    21.9866 [darktable_ai] loaded ORT 1.23.2 from '/home/phil/.local/lib/onnxruntime-migraphx/libonnxruntime.so.1.23.2'
The requested API version [24] is not available, only API versions [1, 23] are supported in this build. Current ORT Version is: 1.23.2
    21.9866 [darktable_ai] ORT 1.23.2: using API version 23 (compiled for 24)
    21.9866 [darktable_ai] execution provider: MIGraphX
    21.9866 [darktable_ai] MIOpen cache: /home/phil/.cache/darktable/ai/amd/miopen
    21.9866 [darktable_ai] MIGraphX cache: /home/phil/.cache/darktable/ai/amd/migraphx
    21.9982 [darktable_ai] loading: /home/phil/.local/share/darktable/models/rawdenoise-nind/model_bayer.onnx
    21.9982 [darktable_ai] attempting to enable AMD MIGraphX...
    22.0870 [darktable_ai] AMD MIGraphX enabled successfully.
2026-04-28 23:05:45.936194779 [W:onnxruntime:DarktableAI, migraphx_execution_provider.cc:167 MIGraphXExecutionProvider] [MIGraphX EP] MIGraphX ENV Override Variables Set:
    30.5445 [neural_restore] raw preview: full=5240x3912 ori=0x0 patch_center=(0.500,0.500) -> sensor=(2244,1678 752x556) bayer
    31.7270 [neural_restore] raw preview: inference returned err=0 src=0x7b7ec8765010 denoised=0x7b7ec604b010 requested=752x556 actual=752x556
    36.4724 [neural_restore] job started: task=raw denoise, scale=1, images=1
    36.4725 [neural_restore] processing imgid 63596 -> /home/phil/Downloads/PB290154_raw-denoise.dng
    36.4725 [neural_restore] imgid 63596: flags=0x641 channels=1 filters=0x94949494 (bayer)
    36.5320 [restore_raw_bayer] 5240x3912 sensor (CFA origin 0,0), working 2620x1956 packed, tile T=2048, 2x1 grid (2 tiles)
    36.5363 [restore_raw_bayer] raw CFA range [253.0, 953.0], black=[253,253,253,253] white=3996 wb_coeffs=[560.000,256.000,420.000,0.000] wb_norm=[1.000,1.000,1.000]
    36.6061 [restore_raw_bayer] tile0 model_input range R=[-0.001,0.139] G1=[0.001,0.311] G2=[0.000,0.400] B=[0.000,0.206]
    37.5618 [restore_raw_bayer] tile0 model_output range R=[0.000,0.104] G=[0.001,0.209] B=[0.000,0.115] in_mean=0.033 out_mean=-26818124.916 gain=-1.220e-09
    38.5691 [restore_raw_bayer] cfa_out u16 range [255, 1036] mean=372 (DNG will advertise black~253 white=3996)
    38.5863 [neural_restore] embedded JPEG preview from source 3200x2400 (978222 bytes)
darktable: magick/error.c:1048: ThrowLoggedException: Assertion `exception->signature == MagickSignature' failed.
[1]    1833999 IOT instruction (core dumped)  darktable -d ai

I was also able to crash it under gdb and got the following backtrace:

#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff744527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff74288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff742881b in __assert_fail_base
    (fmt=0x7ffff75d01e8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7ffff4a492b0 "exception != (ExceptionInfo *) NULL", file=file@entry=0x7ffff4a3d77e "magick/error.c", line=line@entry=1046, function=function@entry=0x7ffff4a6af60 <__PRETTY_FUNCTION__.0.lto_priv.11> "ThrowLoggedException") at ./assert/assert.c:96
#6  0x00007ffff743b517 in __assert_fail
    (assertion=assertion@entry=0x7ffff4a492b0 "exception != (ExceptionInfo *) NULL", file=file@entry=0x7ffff4a3d77e "magick/error.c", line=line@entry=1046, function=function@entry=0x7ffff4a6af60 <__PRETTY_FUNCTION__.0.lto_priv.11> "ThrowLoggedException") at ./assert/assert.c:105
#7  0x00007ffff4899592 in ThrowLoggedException
    (exception=0x0, severity=CorruptImageError, reason=0x7fff8e227fb0 "/home/phil/Downloads/PB290154_raw-denoise_1.dng: Unknown tag 33421.", description=0x7ffff58e8048 "TIFFSetField", module=0x7ffff4a47b76 "coders/tiff.c", function=0x7ffff4aac760 <__func__.5.lto_priv.5> "TIFFReadErrors", line=952) at magick/error.c:1046
#8  0x00007ffff49f7012 in TIFFReadErrors (module=0x7ffff58e8048 "TIFFSetField", format=0x7ffff58e8033 "%s: Unknown %stag %u", warning=0x7fff8e2287f0) at coders/tiff.c:952
#9  0x00007ffff58a88c7 in TIFFErrorExtR () at /lib/x86_64-linux-gnu/libtiff.so.6
#10 0x00007ffff589a0d7 in TIFFVSetField () at /lib/x86_64-linux-gnu/libtiff.so.6
#11 0x00007ffff589a18a in TIFFSetField () at /lib/x86_64-linux-gnu/libtiff.so.6
#12 0x00007ffff7b62587 in dt_imageio_dng_write_cfa_bayer
    (filename=filename@entry=0x7fff8e22b670 "/home/phil/Downloads/PB290154_raw-denoise_1.dng", cfa=cfa@entry=0x7ffd590e6010, width=width@entry=5240, height=height@entry=3912, img=img@entry=0x7fff8e228e10, exif_blob=0x7--Type <RET> for more, q to quit, c to continue without paging--c
fff80102cc0, exif_len=17144, preview=0x7fff8e228b70) at /home/phil/code/darktable/src/imageio/imageio_dng.c:211
#13 0x00007fff60329792 in _process_raw_denoise_bayer
    (img_meta=0x7fff8e228e10, src_path=0x7fff8e229670 "/home/phil/Downloads/PB290154.ORF", out_filename=0x7fff8e22b670 "/home/phil/Downloads/PB290154_raw-denoise_1.dng", imgid=63596, j=<optimized out>)
    at /home/phil/code/darktable/src/libs/neural_restore.c:1136
#14 _process_raw_denoise_one (src_path=0x7fff8e229670 "/home/phil/Downloads/PB290154.ORF", out_filename=0x7fff8e22b670 "/home/phil/Downloads/PB290154_raw-denoise_1.dng", imgid=63596, j=<optimized out>)
    at /home/phil/code/darktable/src/libs/neural_restore.c:1250
#15 _process_job_run (job=0x555558a1e4e0) at /home/phil/code/darktable/src/libs/neural_restore.c:1427
#16 0x00007ffff79fce4b in _control_job_execute (job=job@entry=0x555558a1e4e0) at /home/phil/code/darktable/src/control/jobs.c:321
#17 0x00007ffff79fdd68 in _control_run_job (control=0x555555cfedb0) at /home/phil/code/darktable/src/control/jobs.c:336
#18 _control_work (ptr=<optimized out>) at /home/phil/code/darktable/src/control/jobs.c:565
#19 0x00007ffff749caa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#20 0x00007ffff7529c6c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

I could reproduce it with this Olympus RAW file:
https://raw.pixls.us/data/OLYMPUS/E-M5%20Mark%20III/PB290154.ORF

This also happens on the CPU path:

[dt starting] as : darktable -d ai
     0,2381 [ai_models] initialized: models_dir=/home/phil/.local/share/darktable/models, cache_dir=/home/phil/.cache/darktable/ai_downloads
     0,2381 [ai_models] using repository: darktable-org/darktable-ai
     0,2382 [ai_models] registered model: mask sam2.1 hiera small (mask-object-sam21-small)
     0,2382 [ai_models] registered model: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq)
     0,2382 [ai_models] registered model: denoise nind (denoise-nind)
     0,2382 [ai_models] registered model: denoise nafnet small (denoise-nafnet)
     0,2382 [ai_models] registered model: raw denoise nind (rawdenoise-nind)
     0,2382 [ai_models] registered model: upscale bsrgan (upscale-bsrgan)
     0,2382 [ai_models] registry loaded: 6 models from /opt/darktable/share/darktable/ai_models.json
     2.4124 [darktable_ai] dt_ai_env_init start.
     2.4125 [darktable_ai] discovered: mask sam2.1 hiera small (mask-object-sam21-small, backend=onnx)
     2.4125 [darktable_ai] discovered: denoise nind (denoise-nind, backend=onnx)
     2.4126 [darktable_ai] discovered: upscale bsrgan (upscale-bsrgan, backend=onnx)
     2.4126 [darktable_ai] discovered: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq, backend=onnx)
     2.4127 [darktable_ai] discovered: raw denoise nind (rawdenoise-nind, backend=onnx)
     2.4127 [darktable_ai] discovered: denoise nafnet small (denoise-nafnet, backend=onnx)
     7.8452 [ai_models] found compatible release: 5.5.0.13
     9.2575 [preferences_ai] refreshing model list, count=6
     9.2575 [preferences_ai] adding model: mask-object-sam21-small
     9.2576 [preferences_ai] adding model: mask-object-segnext-b2hq
     9.2576 [preferences_ai] adding model: denoise-nind
     9.2576 [preferences_ai] adding model: denoise-nafnet
     9.2577 [preferences_ai] adding model: rawdenoise-nind
     9.2577 [preferences_ai] adding model: upscale-bsrgan
    30.8636 [neural_restore] raw preview: imgid=63596 bayer patch=(0.500,0.500) widget=752x556 filters=0x94949494
    30.8637 [restore] variant 'bayer': file=model_bayer.onnx input_kind=bayer_v1
    30.8638 [restore] tile size 2048 (scale=1, need 4896MB, budget 5468MB)
    30.8638 [restore] model rawdenoise-nind: coreml_cpu_only=true (ep_flags=1)
    30.8648 [darktable_ai] loaded ORT 1.23.2 from '/home/phil/.local/lib/onnxruntime-migraphx/libonnxruntime.so.1.23.2'
The requested API version [24] is not available, only API versions [1, 23] are supported in this build. Current ORT Version is: 1.23.2
    30.8648 [darktable_ai] ORT 1.23.2: using API version 23 (compiled for 24)
    30.8648 [darktable_ai] execution provider: CPU
    30.8648 [darktable_ai] MIOpen cache: /home/phil/.cache/darktable/ai/amd/miopen
    30.8648 [darktable_ai] MIGraphX cache: /home/phil/.cache/darktable/ai/amd/migraphx
    30.8743 [darktable_ai] loading: /home/phil/.local/share/darktable/models/rawdenoise-nind/model_bayer.onnx
    30.8743 [darktable_ai] using CPU only (no hardware acceleration)
    31.4773 [neural_restore] raw preview: full=5240x3912 ori=0x0 patch_center=(0.500,0.500) -> sensor=(2244,1678 752x556) bayer
    60.4117 [neural_restore] raw preview: inference returned err=0 src=0x7a9579ccc010 denoised=0x7a95775b2010 requested=752x556 actual=752x556
    66.0921 [neural_restore] job started: task=raw denoise, scale=1, images=1
    66.0924 [neural_restore] processing imgid 63596 -> /home/phil/Downloads/PB290154_raw-denoise_2.dng
    66.0924 [neural_restore] imgid 63596: flags=0x641 channels=1 filters=0x94949494 (bayer)
    66.1455 [restore_raw_bayer] 5240x3912 sensor (CFA origin 0,0), working 2620x1956 packed, tile T=2048, 2x1 grid (2 tiles)
    66.1489 [restore_raw_bayer] raw CFA range [253.0, 953.0], black=[253,253,253,253] white=3996 wb_coeffs=[560.000,256.000,420.000,0.000] wb_norm=[1.000,1.000,1.000]
    66.2084 [restore_raw_bayer] tile0 model_input range R=[-0.001,0.139] G1=[0.001,0.311] G2=[0.000,0.400] B=[0.000,0.206]
   118.4737 [restore_raw_bayer] tile0 model_output range R=[0.000,0.104] G=[0.001,0.209] B=[0.000,0.115] in_mean=0.033 out_mean=-26818139.773 gain=-1.220e-09
   144.8590 [restore_raw_bayer] cfa_out u16 range [255, 1036] mean=372 (DNG will advertise black~253 white=3996)
   144.8930 [neural_restore] embedded JPEG preview from source 3200x2400 (978222 bytes)
darktable: magick/error.c:1046: ThrowLoggedException: Assertion `exception != (ExceptionInfo *) NULL' failed.
[1]    1836158 IOT instruction (core dumped)  darktable -d ai

@dtrtuser

I gave it a try and the results are very promising, thank you Andrii!

But the processing time is very long (3+ minutes for a 24-megapixel DNG image from a Ricoh GR IV camera). The other neural restore denoise only takes a couple of seconds on the same image.

I checked that it was running on the GPU. As you can see, the CPU is idle and the GPU at 100% for more than 3 minutes. I guess I must have something configured wrongly.

Screenshot 2026-04-28 195050

@andriiryzhkov
Collaborator Author

@da-phil :

I'm able to use the new function and generate a DNG file, however I cannot read it back into darktable, see log:

This is interesting. I was not able to reproduce this, either on my build from master or on the latest nightly build. I have a couple of questions to understand the situation better:

  • what OS are you running?
  • are you building master, or did you use a nightly build?
  • if you are building from source, can you also try the nightly build?
  • can you check your libtiff version with `pkg-config --modversion libtiff-4`?

@andriiryzhkov
Collaborator Author

@dtrtuser :

I checked that it was running on GPU

What execution provider are you using?
On Windows I'd recommend starting from DirectML.

@da-phil
Contributor

da-phil commented Apr 29, 2026

@da-phil :

I'm able to use the new function and generate a DNG file, however I cannot read it back into darktable, see log:

This is interesting. I was not able to reproduce this neither on my build from master, nor on last nightly build. I have couple of questions to understand situation better:

* what OS are you running?

* are you building master or you used nightly build?

* if you are building, can you try on nightly build?

* can you check libtiff version `pkg-config --modversion libtiff-4`?

Sorry, should have mentioned that in the first place...

  • OS: Ubuntu 24.04.4 LTS
  • building from master, the results from above were with tag 5.5.0+1104~gac64457fcf-dirty (I rebased my touchscreen PR on top of latest master).
  • libtiff version: 4.5.1

I'll give the nightly a shot too later.

@andriiryzhkov
Collaborator Author

@da-phil : Please also try PR #20898

@da-phil
Contributor

da-phil commented Apr 29, 2026

FYI: on my laptop with an AMD Ryzen 7 8845HS and a Radeon 780M iGPU, it took three attempts before the model eventually compiled: the first run crashed my Wayland session, and the second failed like this:

[dt starting] as : darktable -d ai
     0,8606 [ai_models] initialized: models_dir=/home/phil/.local/share/darktable/models, cache_dir=/home/phil/.cache/darktable/ai_downloads
     0,8634 [ai_models] using repository: darktable-org/darktable-ai
     0,8634 [ai_models] registered model: mask sam2.1 hiera small (mask-object-sam21-small)
     0,8635 [ai_models] registered model: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq)
     0,8635 [ai_models] registered model: denoise nind (denoise-nind)
     0,8635 [ai_models] registered model: denoise nafnet small (denoise-nafnet)
     0,8635 [ai_models] registered model: raw denoise nind (rawdenoise-nind)
     0,8635 [ai_models] registered model: upscale bsrgan (upscale-bsrgan)
     0,8635 [ai_models] registry loaded: 6 models from /opt/darktable/share/darktable/ai_models.json
     7.4071 [darktable_ai] dt_ai_env_init start.
     7.4073 [darktable_ai] discovered: raw denoise nind (rawdenoise-nind, backend=onnx)
     7.4074 [darktable_ai] discovered: upscale bsrgan (upscale-bsrgan, backend=onnx)
     7.4075 [darktable_ai] discovered: mask sam2.1 hiera small (mask-object-sam21-small, backend=onnx)
     7.4076 [darktable_ai] discovered: denoise nind (denoise-nind, backend=onnx)
     7.4076 [darktable_ai] discovered: denoise nafnet small (denoise-nafnet, backend=onnx)
     7.4077 [darktable_ai] discovered: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq, backend=onnx)
    21.2567 [neural_restore] raw preview: imgid=31758 bayer patch=(0.500,0.500) widget=741x556 filters=0x94949494
    21.2570 [restore] variant 'bayer': file=model_bayer.onnx input_kind=bayer_v1
    21.2571 [restore] model rawdenoise-nind: coreml_cpu_only=true (ep_flags=1)
    21.2648 [darktable_ai] loaded ORT 1.23.2 from '/home/phil/.local/lib/onnxruntime-migraphx/libonnxruntime.so.1.23.2'
The requested API version [24] is not available, only API versions [1, 23] are supported in this build. Current ORT Version is: 1.23.2
    21.2649 [darktable_ai] ORT 1.23.2: using API version 23 (compiled for 24)
    21.2649 [darktable_ai] execution provider: MIGraphX
    21.2652 [darktable_ai] MIOpen cache: /home/phil/.cache/darktable/ai/amd/miopen
    21.2652 [darktable_ai] MIGraphX cache: /home/phil/.cache/darktable/ai/amd/migraphx
    21.2863 [darktable_ai] loading: /home/phil/.local/share/darktable/models/rawdenoise-nind/model_bayer.onnx
    21.2863 [darktable_ai] attempting to enable AMD MIGraphX...
    21.5321 [darktable_ai] AMD MIGraphX enabled successfully.
2026-04-29 17:31:41.466596677 [W:onnxruntime:DarktableAI, migraphx_execution_provider.cc:167 MIGraphXExecutionProvider] [MIGraphX EP] MIGraphX ENV Override Variables Set:
2026-04-29 17:31:51.820432260 [W:onnxruntime:DarktableAI, migraphx_execution_provider.cc:1309 compile_program] Model Compile: Begin
Error gpu::compile_ops: /longer_pathname_so_that_rpms_can_support_packaging_the_debug_info_for_all_os_profiles/src/AMDMIGraphX/src/targets/gpu/include/migraphx/gpu/context.hpp:153: wait: Failed to wait: unspecified launch failure
Dump: "/tmp/migraphx/gpu::compile_ops157723120009636.mxr"
migraphx_program_compile: Error: /longer_pathname_so_that_rpms_can_support_packaging_the_debug_info_for_all_os_profiles/src/AMDMIGraphX/src/targets/gpu/include/migraphx/gpu/context.hpp:153: wait: Failed to wait: unspecified launch failure
2026-04-29 17:34:42.286451257 [E:onnxruntime:, inference_session.cc:2544 operator()] Exception during initialization: Failed to call function
   202.8756 [darktable_ai] session failed: Exception during initialization: Failed to call function - retrying with provider + basic opt
   202.8756 [darktable_ai] attempting to enable AMD MIGraphX...
   202.8756 [darktable_ai] AMD MIGraphX enabled successfully.
   202.8958 [darktable_ai] session failed: /onnxruntime/onnxruntime/core/providers/migraphx/migraphx_call.cc:66 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::RocmCall(ERRTYPE, std::string_view, std::string_view, ERRTYPE, std::string_view, std::string_view, int) [with ERRTYPE = hipError_t; bool THRW = true; std::conditional_t<THRW, void, common::Status> = void; std::string_view = std::basic_string_view<char>] /onnxruntime/onnxruntime/core/providers/migraphx/migraphx_call.cc:59 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::RocmCall(ERRTYPE, std::string_view, std::string_view, ERRTYPE, std::string_view, std::string_view, int) [with ERRTYPE = hipError_t; bool THRW = true; std::conditional_t<THRW, void, common::Status> = void; std::string_view = std::basic_string_view<char>] HIP failure 719: unspecified launch failure; GPU=30948; hostname=phil-InfinityBook
 - retrying with CPU + full opt

@da-phil
Contributor

da-phil commented Apr 29, 2026

On the same laptop I keep getting these out-of-memory errors when running the model:

2026-04-29 18:48:44.500202732 [E:onnxruntime:, inference_session.cc:2544 operator()] 
Exception during initialization: /onnxruntime/onnxruntime/core/providers/migraphx/
migraphx_call.cc:66 std::conditional_t<THRW, void, onnxruntime::common::Status> 
onnxruntime::RocmCall(ERRTYPE, std::string_view, std::string_view, ERRTYPE, 
std::string_view, std::string_view, int) [with ERRTYPE = hipError_t; bool THRW = true; 
std::conditional_t<THRW, void, common::Status> = void; std::string_view = 
std::basic_string_view<char>] /onnxruntime/onnxruntime/core/providers/migraphx/
migraphx_call.cc:59 std::conditional_t<THRW, void, onnxruntime::common::Status> 
onnxruntime::RocmCall(ERRTYPE, std::string_view, std::string_view, ERRTYPE, 
std::string_view, std::string_view, int) [with ERRTYPE = hipError_t; bool THRW = true; 
std::conditional_t<THRW, void, common::Status> = void; std::string_view = 
std::basic_string_view<char>] HIP failure 2: out of memory; GPU=0; hostname=phil-InfinityBook

I have assigned 10 GB of RAM as GTT (pageable GPU memory). When running the model on a RAW from a 24 MP Olympus E-M1 Mark II, darktable consumes 26 of my 32 GB of available RAM. Something is really weird 🤔


I also get this behavior with the fix PR you mentioned above.

@dtrtuser

dtrtuser commented Apr 29, 2026

@dtrtuser :

I checked that it was running on GPU

What execution provider are you using? On Windows I'd recommend starting from DirectML.

Yes this is what I am using:

Screenshot 2026-04-29 113510

Using CPU, I am getting significantly better performance.

and here is the output of -d ai with GPU:

========================================
version: darktable 5.5.0+1120~g76988b1e45
start: 2026-04-29 11:51:33

darktable 5.5.0+1120~g76988b1e45
Copyright (C) 2012-2026 Johannes Hanika and other contributors.

Compile options:
Bit depth -> 64 bit
Exiv2 -> 0.28.8
Lensfun -> 0.3.4
Debug -> DISABLED
SSE2 optimizations -> ENABLED
OpenMP -> ENABLED
OpenCL -> ENABLED
Lua -> DISABLED
Colord -> DISABLED
gPhoto2 -> DISABLED - Camera tethering is NOT available
OSMGpsMap -> DISABLED - Map view is NOT available
GMIC -> DISABLED - Compressed LUTs are NOT supported
GraphicsMagick -> DISABLED
ImageMagick -> DISABLED
libavif -> DISABLED
libheif -> DISABLED
libjxl -> DISABLED
LibRaw -> ENABLED - Version 0.22.0-Release
OpenJPEG -> DISABLED
OpenEXR -> DISABLED
WebP -> DISABLED
AI -> ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.

[dt starting] as : bin\darktable.exe -d ai
0.5225 [ai_models] initialized: models_dir=C:\Users\dtrtuser\AppData\Local\darktable\models, cache_dir=C:\Users\dtrtuser\AppData\Local\Microsoft\Windows\INetCache\darktable\ai_downloads
0.5227 [ai_models] using repository: darktable-org/darktable-ai
0.5227 [ai_models] registered model: mask sam2.1 hiera small (mask-object-sam21-small)
0.5227 [ai_models] registered model: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq)
0.5227 [ai_models] registered model: denoise nind (denoise-nind)
0.5227 [ai_models] registered model: denoise nafnet small (denoise-nafnet)
0.5227 [ai_models] registered model: raw denoise nind (rawdenoise-nind)
0.5227 [ai_models] registered model: upscale bsrgan (upscale-bsrgan)
0.5227 [ai_models] registry loaded: 6 models from C:\Program Files\darktable\share\darktable\ai_models.json
0.5239 [ai_models] discovered local model: embed openclip vitb32 (embed-openclip-vitb32)
0.5240 [ai_models] discovered local model: mask sam2.1 hiera base plus (mask-object-sam21-base-plus)
12.8817 [darktable_ai] dt_ai_env_init start.
12.8823 [darktable_ai] discovered: denoise nafnet small (denoise-nafnet, backend=onnx)
12.8826 [darktable_ai] discovered: denoise nind (denoise-nind, backend=onnx)
12.8828 [darktable_ai] discovered: embed openclip vitb32 (embed-openclip-vitb32, backend=onnx)
12.8830 [darktable_ai] discovered: mask sam2.1 hiera base plus (mask-object-sam21-base-plus, backend=onnx)
12.8831 [darktable_ai] discovered: mask sam2.1 hiera small (mask-object-sam21-small, backend=onnx)
12.8833 [darktable_ai] discovered: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq, backend=onnx)
12.8835 [darktable_ai] discovered: raw denoise nind (rawdenoise-nind, backend=onnx)
12.8837 [darktable_ai] discovered: upscale bsrgan (upscale-bsrgan, backend=onnx)
18.2591 [neural_restore] raw preview: imgid=95225 bayer patch=(0.500,0.500) widget=560x575 filters=0x94949494
18.2592 [restore] variant 'bayer': file=model_bayer.onnx input_kind=bayer_v1
18.2593 [restore] model rawdenoise-nind: coreml_cpu_only=true (ep_flags=1)
18.2594 [darktable_ai] loaded ORT 1.24.4 (bundled)
18.2594 [darktable_ai] execution provider: DirectML
18.2881 [darktable_ai] loading: C:\Users\dtrtuser\AppData\Local\darktable\models\rawdenoise-nind\model_bayer.onnx
18.2882 [darktable_ai] attempting to enable Windows DirectML...
18.4462 [darktable_ai] Windows DirectML enabled successfully.
19.5258 [neural_restore] raw preview: full=6112x4064 ori=0x0 patch_center=(0.500,0.500) -> sensor=(2776,1744 560x574) bayer
85.1696 [neural_restore] raw preview: inference returned err=0 src=000001d5c5f9c040 denoised=000001d5c5bdb040 requested=560x574 actual=560x574
107.8540 [neural_restore] job started: task=raw denoise, scale=1, images=1
107.8545 [neural_restore] processing imgid 95225 -> H:\Pictures\2026\2026-03-18/20260318-GX000474_raw-denoise.dng
107.8545 [neural_restore] imgid 95225: flags=0x641 channels=1 filters=0x94949494 (bayer)
107.8969 [restore_raw_bayer] 6112x4064 sensor (CFA origin 0,0), working 3056x2032 packed, tile T=2048, 2x2 grid (4 tiles)
107.9005 [restore_raw_bayer] raw CFA range [0.0, 16383.0], black=[1024,1024,1024,1024] white=15330 wb_coeffs=[2.633,1.000,1.824,0.000] wb_norm=[1.000,1.000,1.000]
107.9515 [restore_raw_bayer] tile0 model_input range R=[-0.029,1.074] G1=[-0.026,1.074] G2=[-0.035,1.074] B=[-0.030,1.074]
170.5406 [restore_raw_bayer] tile0 model_output range R=[-0.010,1.350] G=[-0.080,1.507] B=[-0.018,1.385] in_mean=0.017 out_mean=-14935454.608 gain=-1.160e-009
358.4885 [restore_raw_bayer] cfa_out u16 range [599, 15330] mean=1217 (DNG will advertise black~1024 white=15330)
358.5024 [neural_restore] no embedded preview in source (rc=1) — writing DNG without thumbnail
358.6665 [neural_restore] imported imgid=95687: H:\Pictures\2026\2026-03-18/20260318-GX000474_raw-denoise.dng

end: 2026-04-29 11:51:33
========================================

And here is the log with CPU:

[dt starting] as : C:\Program Files\darktable\bin\darktable.exe -d ai
0.5351 [ai_models] initialized: models_dir=C:\Users\dtrtuser\AppData\Local\darktable\models, cache_dir=C:\Users\dtrtuser\AppData\Local\Microsoft\Windows\INetCache\darktable\ai_downloads
0.5354 [ai_models] using repository: darktable-org/darktable-ai
0.5354 [ai_models] registered model: mask sam2.1 hiera small (mask-object-sam21-small)
0.5354 [ai_models] registered model: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq)
0.5354 [ai_models] registered model: denoise nind (denoise-nind)
0.5354 [ai_models] registered model: denoise nafnet small (denoise-nafnet)
0.5354 [ai_models] registered model: raw denoise nind (rawdenoise-nind)
0.5354 [ai_models] registered model: upscale bsrgan (upscale-bsrgan)
0.5354 [ai_models] registry loaded: 6 models from C:\Program Files\darktable\share\darktable\ai_models.json
0.5366 [ai_models] discovered local model: embed openclip vitb32 (embed-openclip-vitb32)
0.5367 [ai_models] discovered local model: mask sam2.1 hiera base plus (mask-object-sam21-base-plus)
11.5755 [darktable_ai] dt_ai_env_init start.
11.5763 [darktable_ai] discovered: denoise nafnet small (denoise-nafnet, backend=onnx)
11.5765 [darktable_ai] discovered: denoise nind (denoise-nind, backend=onnx)
11.5767 [darktable_ai] discovered: embed openclip vitb32 (embed-openclip-vitb32, backend=onnx)
11.5770 [darktable_ai] discovered: mask sam2.1 hiera base plus (mask-object-sam21-base-plus, backend=onnx)
11.5771 [darktable_ai] discovered: mask sam2.1 hiera small (mask-object-sam21-small, backend=onnx)
11.5773 [darktable_ai] discovered: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq, backend=onnx)
11.5774 [darktable_ai] discovered: raw denoise nind (rawdenoise-nind, backend=onnx)
11.5775 [darktable_ai] discovered: upscale bsrgan (upscale-bsrgan, backend=onnx)
24.1765 [restore] variant 'bayer': file=model_bayer.onnx input_kind=bayer_v1
24.1766 [restore] model rawdenoise-nind: coreml_cpu_only=true (ep_flags=1)
24.1768 [darktable_ai] loaded ORT 1.24.4 (bundled)
24.1768 [darktable_ai] execution provider: CPU
24.2156 [darktable_ai] loading: C:\Users\dtrtuser\AppData\Local\darktable\models\rawdenoise-nind\model_bayer.onnx
24.2157 [darktable_ai] using CPU only (no hardware acceleration)
24.2750 [neural_restore] job started: task=raw denoise, scale=1, images=1
24.2754 [neural_restore] processing imgid 95225 -> H:\Pictures\2026\2026-03-18/20260318-GX000474_raw-denoise.dng
24.2755 [neural_restore] imgid 95225: flags=0x641 channels=1 filters=0x94949494 (bayer)
24.3105 [restore_raw_bayer] 6112x4064 sensor (CFA origin 0,0), working 3056x2032 packed, tile T=2048, 2x2 grid (4 tiles)
24.3139 [restore_raw_bayer] raw CFA range [0.0, 16383.0], black=[1024,1024,1024,1024] white=15330 wb_coeffs=[2.633,1.000,1.824,0.000] wb_norm=[1.000,1.000,1.000]
24.3651 [restore_raw_bayer] tile0 model_input range R=[-0.029,1.074] G1=[-0.026,1.074] G2=[-0.035,1.074] B=[-0.030,1.074]
40.1077 [restore_raw_bayer] tile0 model_output range R=[-0.010,1.350] G=[-0.080,1.507] B=[-0.018,1.385] in_mean=0.017 out_mean=-14935454.224 gain=-1.160e-009
90.2847 [restore_raw_bayer] cfa_out u16 range [599, 15330] mean=1217 (DNG will advertise black~1024 white=15330)
90.2994 [neural_restore] no embedded preview in source (rc=1) — writing DNG without thumbnail
90.4785 [neural_restore] imported imgid=95690: H:\Pictures\2026\2026-03-18/20260318-GX000474_raw-denoise.dng

end: 2026-04-29 22:49:20
========================================

@AndDiSa

AndDiSa commented Apr 30, 2026

I get an "out of memory" error when using it on my GPU (NVIDIA RTX 3060 laptop GPU with 6 GB VRAM):

 34.6930 [neural_restore] raw preview: full=6036x4020 ori=0x0 patch_center=(0.500,0.500) -> sensor=(2782,1910 470x200) bayer
2026-04-30 09:35:11.860348978 [E:onnxruntime:, sequential_executor.cc:572 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'/convs2/convs2.2/Conv' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:358 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*) Failed to allocate memory for requested buffer of size 604127488

    35.0168 [darktable_ai] run error: Non-zero status code returned while running Conv node. Name:'/convs2/convs2.2/Conv' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:358 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*) Failed to allocate memory for requested buffer of size 604127488

    35.0187 [neural_restore] raw preview: inference returned err=1 src=(nil) denoised=(nil) requested=470x200 actual=0x0

Is there anything I can do here?

@andriiryzhkov
Collaborator Author

andriiryzhkov commented Apr 30, 2026

FYI: On my Laptop with an AMD Ryzen 7 8845HS with a Radeon 780M iGPU I needed three runs to eventually be able to compile the model on the first run

AMD iGPUs are not officially supported by ROCm, so the fact that it eventually works is a big win in itself.

I have assigned 10 GB RAM as GTT (pageable GPU memory). When running the model on a 24 MP Olympus E-M1 Mk2 camera RAW darktable consumes 26 out of my 32 GB of available RAM. Something is really weird 🤔

Neural restore tasks run inference on image tiles. The tile size is determined by the amount of available VRAM. Since it is very hard to calculate how much VRAM is actually available, a try-and-fail mechanism is implemented, with a tile-size ladder defined in the model config: if initializing the model with a larger tile size fails with an OOM error, a smaller size is selected and a new attempt is made. The final tile size is cached for subsequent runs. Given that, you may see OOM logs as part of this tile-size selection mechanism.
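To illustrate, the try-and-fail selection can be sketched roughly like this. This is a minimal illustration in plain C, not the actual darktable code: `tile_ladder`, `try_init_session`, and `select_tile_size` are hypothetical names, and the byte estimate (4 float planes per tile) stands in for a real ONNX session initialization that may throw OOM.

```c
#include <stddef.h>

/* Hypothetical names for illustration only -- not the actual darktable API. */

/* A tile-size ladder as it might appear in a model config, largest first. */
static const int tile_ladder[] = { 2048, 1536, 1024, 512 };
#define N_LADDER (sizeof(tile_ladder) / sizeof(tile_ladder[0]))

/* Stand-in for real session initialization: report "OOM" when a tile
 * would need more bytes than the VRAM budget (4 planes of float32). */
static int try_init_session(const int tile, const size_t vram_budget)
{
  const size_t need = (size_t)tile * tile * 4 * sizeof(float);
  return need <= vram_budget; /* 1 = initialized, 0 = OOM */
}

/* Walk the ladder until a size initializes; 0 means nothing fits and
 * the caller has to fall back (e.g. to the CPU provider). */
static int select_tile_size(const size_t vram_budget)
{
  for(size_t i = 0; i < N_LADDER; i++)
    if(try_init_session(tile_ladder[i], vram_budget))
      return tile_ladder[i];
  return 0;
}
```

The point of the ladder is that each failed rung only costs one (fast-failing) initialization attempt, and the winning rung can be cached so subsequent runs skip the search entirely.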

@AndDiSa

AndDiSa commented Apr 30, 2026

For me it's not only the OOM: no preview is shown either ("preview initialization failed"), and nvidia-smi shows that darktable is already using 5656 MiB of VRAM:

Thu Apr 30 13:28:02 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.142                Driver Version: 580.142        CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060 ...    Off |   00000000:01:00.0  On |                  N/A |
| N/A   43C    P8             11W /   60W |    5779MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            3928      G   /usr/lib/xorg/Xorg                       83MiB |
|    0   N/A  N/A           39250      C   /opt/darktable/bin/darktable           5656MiB |
+-----------------------------------------------------------------------------------------+

@AndDiSa

AndDiSa commented Apr 30, 2026

@andriiryzhkov
It looks as if the automatic adaptation of the tile size is missing on this path. I've implemented a fix on https://github.com/AndDiSa/darktable/tree/cuda-oom-fix and it's working for me now.
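For reference, the kind of retry loop such a fix needs might look like the sketch below. This is a hypothetical illustration, not the actual patch: `run_preview_inference` is a stub that pretends any tile above 1024 px exhausts VRAM, and the real code would have to distinguish a genuine OOM status from other inference errors before retrying.

```c
/* Hypothetical sketch only -- these names are not darktable code. */
typedef enum { AI_OK = 0, AI_OOM = 1, AI_FAIL = 2 } ai_status_t;

/* Stand-in for the preview inference call: pretend that on this
 * machine any tile above 1024 px runs out of VRAM. */
static ai_status_t run_preview_inference(const int tile)
{
  return (tile > 1024) ? AI_OOM : AI_OK;
}

/* Halve the tile on OOM until inference succeeds or we hit the floor;
 * returns the tile size that worked, or -1 on failure. */
static int preview_with_retry(const int start_tile, const int min_tile)
{
  for(int t = start_tile; t >= min_tile; t /= 2)
  {
    const ai_status_t s = run_preview_inference(t);
    if(s == AI_OK)  return t;  /* success: report the tile used */
    if(s != AI_OOM) return -1; /* unrelated error: do not retry */
  }
  return -1; /* even min_tile ran out of memory */
}
```

Geometric halving is a simpler alternative to the config-defined tile-size ladder used by the batch path; either way, the result should be cached so the preview does not re-probe on every patch move.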

@andriiryzhkov
Collaborator Author

@AndDiSa : PR #20907 should fix OOM retry on preview


Labels

  • difficulty: hard (big changes across different parts of the code base)
  • documentation: pending (documentation work is required)
  • feature: new (new features to add)
  • priority: medium (core features are degraded in a way that is still mostly usable, software stutters)
  • release notes: pending
  • scope: image processing (correcting pixels)


6 participants