fix(keras_v3): route depthwise Conv1D/2D (groups == channels) to depthwise codegen#1486
Open
LarocheC wants to merge 1 commit into
Open
fix(keras_v3): route depthwise Conv1D/2D (groups == channels) to depthwise codegen#1486LarocheC wants to merge 1 commit into
LarocheC wants to merge 1 commit into
Conversation
…hwise codegen A keras Conv1D/Conv2D with groups == in_channels is a depthwise convolution, but ConvHandler routed it to the dense Conv path: it loaded the (*kernel_size, 1, n_chan) depthwise kernel into a dense matmul that expects (*kernel_size, n_chan, n_chan) weights, so the layer converted and compiled but produced wrong output. Detect groups == in_channels == filters (depthwise, depth_multiplier 1), reshape the kernel to the (*kernel_size, n_chan, 1) layout the depthwise codegen expects, and emit a DepthwiseConv. General grouped convolutions (1 < groups < in_channels) and depth_multiplier > 1 have no correct hls4ml kernel (the latter is unsupported for a native DepthwiseConv too) and now raise NotImplementedError instead of being silently miscomputed. Standard and native depthwise/separable convs are unchanged. Add test/pytest/test_depthwise_via_groups.py covering the depthwise routing and the numerical match against Keras for Conv1D and Conv2D, io_parallel and io_stream, same/valid/causal padding, plus rejection of the unsupported grouped shapes.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
I have been converting a network that uses depthwise convolutions and ran into a problem I would like to share, along with the local patch I am using, in case it is useful and to learn whether there is a better way to handle it.
I expressed the depthwise layers as
Conv1D/Conv2Dwithgroups == in_channels(which is what Keras produces for a depthwise conv written that way). The model converted and compiled without any error, but the hardware output did not match Keras at all (correlation near zero). After digging in, the cause is in the keras_v3ConvHandler: a groupedConvfalls through to the dense Conv branch, so the(*kernel_size, 1, n_chan)depthwise kernel is loaded into a dense matmul that expects(*kernel_size, n_chan, n_chan)weights. It runs, but it computes the wrong thing. As far as I can tell this is not specific to HGQ, a plain Keras grouped Conv hits it too.The patch I am proposing detects the depthwise case (
groups == in_channels == filters, i.e.depth_multiplier == 1), reshapes the kernel to the(*kernel_size, n_chan, 1)layout the existing depthwise codegen expects, and emits aDepthwiseConv. With that change my model matches Keras. For grouped shapes that I could not find a correct hls4ml kernel for, I made it raiseNotImplementedErrorat conversion instead of silently producing a wrong result (see Known limitations).I am not sure this is the best place or the best way to fix it, so I would really appreciate feedback:
DepthwiseConvin the converter the right approach, or is there a preferred pattern in hls4ml for this that I missed?I also hit a related issue in the bit_exact precision-propagation pass while doing this, which I opened separately as #1485. This PR is the converter side and is independent: it is enough on its own for a plain Keras depthwise conv, and a quantized (HGQ2) depthwise model needs both. They touch different files and do not conflict.
Type of change
Known limitations
These are the cases my patch does not handle. I could not find a correct hls4ml kernel for them, so rather than emit a wrong result they now raise
NotImplementedErrorat conversion. Happy to revisit any of these if there is a better option:depth_multiplier > 1(more than one filter per input channel). The depthwise kernels seem to support only a depth multiplier of 1, and I saw the same behavior with a nativeDepthwiseConv(io_parallel asserts, io_stream gives a wrong result), so this looks like a pre-existing limitation rather than something this change introduces.1 < groups < in_channels. I did not find a grouped convolution kernel in hls4ml.Two more things I noticed but did not change here, in case they are useful context:
io_streamis separately blocked by the existing "heterogeneous quantization for activations is only supported with IOType=io_parallel" limitation.Tests
Added
test/pytest/test_depthwise_via_groups.py:Conv1D/Conv2Dwithgroups == in_channelsis emitted asDepthwiseConv1D/DepthwiseConv2Dand matches Keras, across the Vivado and Vitis backends,io_parallelandio_stream, andsame/valid/causalpadding.depth_multiplier > 1raiseNotImplementedError.I also ran the existing keras_v3 convolution tests to check I did not regress the standard or native depthwise/separable paths:
Test Configuration: Keras 3.14 (torch backend), Python 3.13, g++ csim.
Checklist
pre-commiton the files I edited or added.