Skip to content

fix(keras_v3): route depthwise Conv1D/2D (groups == channels) to depthwise codegen#1486

Open
LarocheC wants to merge 1 commit into
fastmachinelearning:mainfrom
LarocheC:fix/depthwise-conv-via-groups
Open

fix(keras_v3): route depthwise Conv1D/2D (groups == channels) to depthwise codegen#1486
LarocheC wants to merge 1 commit into
fastmachinelearning:mainfrom
LarocheC:fix/depthwise-conv-via-groups

Conversation

@LarocheC
Copy link
Copy Markdown
Contributor

@LarocheC LarocheC commented Jun 4, 2026

Description

I have been converting a network that uses depthwise convolutions and ran into a problem I would like to share, along with the local patch I am using, in case it is useful and to learn whether there is a better way to handle it.

I expressed the depthwise layers as Conv1D/Conv2D with groups == in_channels (which is what Keras produces for a depthwise conv written that way). The model converted and compiled without any error, but the hardware output did not match Keras at all (correlation near zero). After digging in, the cause is in the keras_v3 ConvHandler: a grouped Conv falls through to the dense Conv branch, so the (*kernel_size, 1, n_chan) depthwise kernel is loaded into a dense matmul that expects (*kernel_size, n_chan, n_chan) weights. It runs, but it computes the wrong thing. As far as I can tell this is not specific to HGQ, a plain Keras grouped Conv hits it too.

The patch I am proposing detects the depthwise case (groups == in_channels == filters, i.e. depth_multiplier == 1), reshapes the kernel to the (*kernel_size, n_chan, 1) layout the existing depthwise codegen expects, and emits a DepthwiseConv. With that change my model matches Keras. For grouped shapes that I could not find a correct hls4ml kernel for, I made it raise NotImplementedError at conversion instead of silently producing a wrong result (see Known limitations).

I am not sure this is the best place or the best way to fix it, so I would really appreciate feedback:

  • Is routing the grouped Conv to DepthwiseConv in the converter the right approach, or is there a preferred pattern in hls4ml for this that I missed?
  • Is raising on the unsupported shapes the behavior you want, or would you rather warn, or is there appetite for a real grouped convolution kernel down the line?

I also hit a related issue in the bit_exact precision-propagation pass while doing this, which I opened separately as #1485. This PR is the converter side and is independent: it is enough on its own for a plain Keras depthwise conv, and a quantized (HGQ2) depthwise model needs both. They touch different files and do not conflict.

Type of change

  • Bug fix (non-breaking change that fixes an issue)

Known limitations

These are the cases my patch does not handle. I could not find a correct hls4ml kernel for them, so rather than emit a wrong result they now raise NotImplementedError at conversion. Happy to revisit any of these if there is a better option:

  • depth_multiplier > 1 (more than one filter per input channel). The depthwise kernels seem to support only a depth multiplier of 1, and I saw the same behavior with a native DepthwiseConv (io_parallel asserts, io_stream gives a wrong result), so this looks like a pre-existing limitation rather than something this change introduces.
  • General grouped convolutions with 1 < groups < in_channels. I did not find a grouped convolution kernel in hls4ml.

Two more things I noticed but did not change here, in case they are useful context:

  • For a quantized (HGQ2) depthwise model the full path also needs the bit_exact fix in fix(bit_exact): support grouped/depthwise Conv1D/2D in produce_kif #1485; without it the precision-propagation pass raises before reaching codegen.
  • HGQ2 depthwise on io_stream is separately blocked by the existing "heterogeneous quantization for activations is only supported with IOType=io_parallel" limitation.

Tests

Added test/pytest/test_depthwise_via_groups.py:

  • A grouped Conv1D/Conv2D with groups == in_channels is emitted as DepthwiseConv1D/DepthwiseConv2D and matches Keras, across the Vivado and Vitis backends, io_parallel and io_stream, and same/valid/causal padding.
  • General grouped convolutions and depth_multiplier > 1 raise NotImplementedError.

I also ran the existing keras_v3 convolution tests to check I did not regress the standard or native depthwise/separable paths:

pytest test/pytest/test_depthwise_via_groups.py            # 19 passed
pytest test/pytest/test_keras_v3_api.py -k "conv or depthwise or sepconv"   # Vivado/Vitis: 30 passed

Test Configuration: Keras 3.14 (torch backend), Python 3.13, g++ csim.

Checklist

  • I have read the guidelines for contributing.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation. (not applicable, converter bug fix)
  • My changes generate no new warnings.
  • I have installed and run pre-commit on the files I edited or added.
  • I have added tests that prove my fix is effective or that my feature works.

…hwise codegen

A keras Conv1D/Conv2D with groups == in_channels is a depthwise convolution, but
ConvHandler routed it to the dense Conv path: it loaded the (*kernel_size, 1, n_chan)
depthwise kernel into a dense matmul that expects (*kernel_size, n_chan, n_chan)
weights, so the layer converted and compiled but produced wrong output.

Detect groups == in_channels == filters (depthwise, depth_multiplier 1), reshape the
kernel to the (*kernel_size, n_chan, 1) layout the depthwise codegen expects, and emit
a DepthwiseConv. General grouped convolutions (1 < groups < in_channels) and
depth_multiplier > 1 have no correct hls4ml kernel (the latter is unsupported for a
native DepthwiseConv too) and now raise NotImplementedError instead of being silently
miscomputed. Standard and native depthwise/separable convs are unchanged.

Add test/pytest/test_depthwise_via_groups.py covering the depthwise routing and the
numerical match against Keras for Conv1D and Conv2D, io_parallel and io_stream,
same/valid/causal padding, plus rejection of the unsupported grouped shapes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant