fix(keras_v3): route depthwise Conv1D/2D (groups == channels) to depthwise codegen by LarocheC · Pull Request #1486 · fastmachinelearning/hls4ml

LarocheC · 2026-06-04T10:30:04Z

Description

I have been converting a network that uses depthwise convolutions and ran into a problem I would like to share, along with the local patch I am using, in case it is useful and to learn whether there is a better way to handle it.

I expressed the depthwise layers as Conv1D/Conv2D with groups == in_channels (which is what Keras produces for a depthwise conv written that way). The model converted and compiled without any error, but the hardware output did not match Keras at all (correlation near zero). After digging in, the cause is in the keras_v3 ConvHandler: a grouped Conv falls through to the dense Conv branch, so the (*kernel_size, 1, n_chan) depthwise kernel is loaded into a dense matmul that expects (*kernel_size, n_chan, n_chan) weights. It runs, but it computes the wrong thing. As far as I can tell this is not specific to HGQ, a plain Keras grouped Conv hits it too.

The patch I am proposing detects the depthwise case (groups == in_channels == filters, i.e. depth_multiplier == 1), reshapes the kernel to the (*kernel_size, n_chan, 1) layout the existing depthwise codegen expects, and emits a DepthwiseConv. With that change my model matches Keras. For grouped shapes that I could not find a correct hls4ml kernel for, I made it raise NotImplementedError at conversion instead of silently producing a wrong result (see Known limitations).

I am not sure this is the best place or the best way to fix it, so I would really appreciate feedback:

Is routing the grouped Conv to DepthwiseConv in the converter the right approach, or is there a preferred pattern in hls4ml for this that I missed?
Is raising on the unsupported shapes the behavior you want, or would you rather warn, or is there appetite for a real grouped convolution kernel down the line?

I also hit a related issue in the bit_exact precision-propagation pass while doing this, which I opened separately as #1485. This PR is the converter side and is independent: it is enough on its own for a plain Keras depthwise conv, and a quantized (HGQ2) depthwise model needs both. They touch different files and do not conflict.

Type of change

Bug fix (non-breaking change that fixes an issue)

Known limitations

These are the cases my patch does not handle. I could not find a correct hls4ml kernel for them, so rather than emit a wrong result they now raise NotImplementedError at conversion. Happy to revisit any of these if there is a better option:

depth_multiplier > 1 (more than one filter per input channel). The depthwise kernels seem to support only a depth multiplier of 1, and I saw the same behavior with a native DepthwiseConv (io_parallel asserts, io_stream gives a wrong result), so this looks like a pre-existing limitation rather than something this change introduces.
General grouped convolutions with 1 < groups < in_channels. I did not find a grouped convolution kernel in hls4ml.

Two more things I noticed but did not change here, in case they are useful context:

For a quantized (HGQ2) depthwise model the full path also needs the bit_exact fix in fix(bit_exact): support grouped/depthwise Conv1D/2D in produce_kif #1485; without it the precision-propagation pass raises before reaching codegen.
HGQ2 depthwise on io_stream is separately blocked by the existing "heterogeneous quantization for activations is only supported with IOType=io_parallel" limitation.

Tests

Added test/pytest/test_depthwise_via_groups.py:

A grouped Conv1D/Conv2D with groups == in_channels is emitted as DepthwiseConv1D/DepthwiseConv2D and matches Keras, across the Vivado and Vitis backends, io_parallel and io_stream, and same/valid/causal padding.
General grouped convolutions and depth_multiplier > 1 raise NotImplementedError.

I also ran the existing keras_v3 convolution tests to check I did not regress the standard or native depthwise/separable paths:

pytest test/pytest/test_depthwise_via_groups.py            # 19 passed
pytest test/pytest/test_keras_v3_api.py -k "conv or depthwise or sepconv"   # Vivado/Vitis: 30 passed

Test Configuration: Keras 3.14 (torch backend), Python 3.13, g++ csim.

Checklist

I have read the guidelines for contributing.
I have commented my code, particularly in hard-to-understand areas.
I have made corresponding changes to the documentation. (not applicable, converter bug fix)
My changes generate no new warnings.
I have installed and run pre-commit on the files I edited or added.
I have added tests that prove my fix is effective or that my feature works.

…hwise codegen A keras Conv1D/Conv2D with groups == in_channels is a depthwise convolution, but ConvHandler routed it to the dense Conv path: it loaded the (*kernel_size, 1, n_chan) depthwise kernel into a dense matmul that expects (*kernel_size, n_chan, n_chan) weights, so the layer converted and compiled but produced wrong output. Detect groups == in_channels == filters (depthwise, depth_multiplier 1), reshape the kernel to the (*kernel_size, n_chan, 1) layout the depthwise codegen expects, and emit a DepthwiseConv. General grouped convolutions (1 < groups < in_channels) and depth_multiplier > 1 have no correct hls4ml kernel (the latter is unsupported for a native DepthwiseConv too) and now raise NotImplementedError instead of being silently miscomputed. Standard and native depthwise/separable convs are unchanged. Add test/pytest/test_depthwise_via_groups.py covering the depthwise routing and the numerical match against Keras for Conv1D and Conv2D, io_parallel and io_stream, same/valid/causal padding, plus rejection of the unsupported grouped shapes.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(keras_v3): route depthwise Conv1D/2D (groups == channels) to depthwise codegen#1486

fix(keras_v3): route depthwise Conv1D/2D (groups == channels) to depthwise codegen#1486
LarocheC wants to merge 1 commit into
fastmachinelearning:mainfrom
LarocheC:fix/depthwise-conv-via-groups

LarocheC commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

LarocheC commented Jun 4, 2026

Description

Type of change

Known limitations

Tests

Checklist

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant