XLS backend #1475
… pass, merge of dense_relu written as an opt pass
This fixes two test cases in test_softmax.py (one of them still fails due to another error) TODO: check layer.class_name == 'Input' instead of taking layers[0]?
This fixes DSLX compilation error in test_softmax.py
# Conflicts: # docs/requirements.txt # hls4ml/backends/__init__.py # hls4ml/model/graph.py # hls4ml/report/__init__.py # test/pytest/test_activations.py # test/pytest/test_keras_api.py # test/pytest/test_softmax.py
These things were removed in fastmachinelearning#1321
See fastmachinelearning#1443 Setting 'strategy' for Softmax layer did not affect anything, and the code always chose the default implementation=stable. TODO: all backends fail when implementation=latency (low accuracy, probably due to overflow).
Fixes test_sepconv2d.py for XLS
…mpile(). This avoids reparsing .opt.ir file on subsequent model.predict() calls.
Updated by running docs/attr_doc_gen.py. Added XLS backend and other things added to hls4ml since the last update of attributes.rst (Dec 2024):
- Libero backend
- New layers, e.g.: BipolarQuant, Cropping1D, Cropping2D
- New attributes, e.g.: n_inner and n_outer for Softmax.
… speed up tests. This should fix timeout failure on CI: https://gitlab.cern.ch/fastmachinelearning/hls4ml/-/jobs/75309630 Note that XLS tests can be slow due to big (fully unrolled) IR size.
This fixes test failure introduced by b689c7b (fix for fastmachinelearning#1443), see e.g. https://gitlab.cern.ch/fastmachinelearning/hls4ml/-/jobs/75627941
Now each XLS test case takes less than 2 minutes on my machine. All XLS tests take ~1hr. Previously, some test cases could take more than 30 minutes. Note that XLS tests are slower than C++ due to big (fully unrolled) IR size.

I reduced XLS dimensions in many tests to make them fast enough, but the test https://gitlab.cern.ch/fastmachinelearning/hls4ml/-/jobs/75768174 failed due to 2hr timeout. That's surprising - on my laptop,

This timeout looks like a CI issue, not something real in the code.

If it is not doing synthesis (and it seems that it skips that path), what does it do that takes a long time?
I feel that the slowness is not due to the tests themselves; more likely the runner itself stalls on random tests once in a while... https://gitlab.cern.ch/fastmachinelearning/hls4ml/-/jobs/75680057

Hmm, it seems that all tests always run only on -05; the other nodes are never used. Gitlab reports them as "stale". It turns out they are not configured right: they always report as -05, but only the real -05 is used. I've reconfigured 06 and 08 to be separate, and they are in the pool now. Sadly, 07 doesn't respond; I will need to ping an openlab admin to configure that one. Hopefully it will then be a bit less crowded and tests will run faster.
Description
This PR adds an XLS backend. It is based on PR #1343, with most of the code rewritten and new features added.
Google XLS is an open-source (Apache 2) High-Level Synthesis toolchain that produces an RTL (Verilog or SystemVerilog) design from a high-level description (DSLX or C++).
Adding XLS as a new hls4ml backend makes it possible to generate RTL without vendor-specific dependencies and to benefit from the developments that XLS brings to the HLS field.
XLS workflow
The XLS backend performs the following transformations:
- write(): hls4ml representation -> DSLX project
- compile(): DSLX -> XLS IR -> Optimized XLS IR
- build(): Optimized XLS IR -> (System)Verilog -> IP

DSLX -> IR -> (System)Verilog conversion is done by XLS.
IP is generated by Vivado. One can choose another vendor and generate IP from the Verilog file manually.
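The three steps above map onto the usual hls4ml model methods. The following is only a sketch of how a user might drive them: the function name `run_xls_flow` is hypothetical, and the backend string `"XLS"` is an assumption about how the new backend is registered, not something stated in this description. The hls4ml import is done inside the function so the snippet can be loaded without hls4ml installed.

```python
def run_xls_flow(keras_model, sample_input, output_dir="hls4ml_prj_xls", backend="XLS"):
    """Convert a Keras model and walk through write() and compile().

    Hypothetical helper; backend="XLS" is an assumed registration name.
    """
    # Lazy import: the sketch can be loaded even where hls4ml is missing.
    import hls4ml

    config = hls4ml.utils.config_from_keras_model(keras_model, granularity="name")
    hls_model = hls4ml.converters.convert_from_keras_model(
        keras_model,
        hls_config=config,
        output_dir=output_dir,
        backend=backend,        # assumption: name of the backend added by this PR
        io_type="io_parallel",  # the only IOType this PR implements for XLS
    )

    hls_model.write()    # hls4ml representation -> DSLX project
    hls_model.compile()  # DSLX -> XLS IR -> optimized XLS IR (also re-runs write())
    prediction = hls_model.predict(sample_input)  # evaluated from the optimized IR

    # hls_model.build() would continue to (System)Verilog and a Vivado IP;
    # it is left out here because it requires a Vivado installation.
    return hls_model, prediction
```

Calling `write()` before `compile()` is redundant in practice, since `compile()` writes the project first; it is spelled out here only to mirror the list above.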
XLS features
The XLS backend supports the following layers:
Input, ApplyAlpha, BatchNormalization, Dense, Conv1D, DepthwiseConv1D, Conv2D, DepthwiseConv2D, Pooling1D, Pooling2D, GlobalPooling1D, GlobalPooling2D, Merge, Concatenate, Dot, Activation, HardActivation, ParametrizedActivation, PReLU, Reshape, Softmax, Transpose, TernaryTanh.

You can override default codegen options as follows:

The DSLX standard library has only a signed FixedPoint type (similar to ap_fixed). Thus, unsigned types are not supported.

Currently, the XLS backend implements only IOType: io_parallel. Strategy is ignored. All operations are fully unrolled.

io_stream could be implemented via DSLX procs. @calad0i and I are going to work on that after finishing this PR.

Other changes
I made some minor changes in non-XLS code:
- Fixed "test_softmax.py does not test argmax and latency implementations; latency fails" (#1443), since it was needed to test all softmax implementations in XLS.
- Changed ModelGraph to call a custom backend.get_top_function(). This is needed for XLS because it uses an optimized XLS IR file instead of the .so library generated by other backends.
- Updated docs/ir/attributes.rst. Aside from adding XLS, this commit adds some other missing layers and attributes.

Dependencies
The XLS backend uses xls-python to access the XLS API. It is enabled by dependency group xls:

xls-python comes with batteries (libxls.so and DSLX standard library) included; no separate XLS installation is required.

The code has been tested for the version xls-python=0.1.9875.

Known issues
XLS doesn't work with a Dense layer imported from a PyTorch Linear layer because of a shape mismatch: PyTorch stores Linear weights as (out_features, in_features), while hls4ml Dense layers use the Keras-style layout (in_features, out_features).

Repro: add XLS backend to test_pytorch_api.py/test_squeeze and run the test.
Note that the weights in this test are constant, and other backends flatten them without checking shape.
So, it is unclear whether they handle this situation correctly or not.
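The layout mismatch can be illustrated without hls4ml or PyTorch. The sketch below uses plain nested lists; `dense`, `transpose`, `flatten` and the weight values are all hypothetical and are not the backend code. It shows that a shape-checking dense kernel rejects the (out_features, in_features) layout, that transposing first fixes it, and that constant weights flatten to the same data in either layout, which is why a flatten-only path cannot reveal the problem.

```python
def transpose(matrix):
    """Swap the two axes of a row-major nested list."""
    return [list(column) for column in zip(*matrix)]


def flatten(matrix):
    """Row-major flattening, ignoring the original shape."""
    return [value for row in matrix for value in row]


def dense(x, kernel):
    """Bias-free dense layer; kernel is Keras-style (in_features, out_features)."""
    if len(kernel) != len(x):
        raise ValueError(f"kernel has {len(kernel)} input rows, expected {len(x)}")
    n_out = len(kernel[0])
    return [sum(x[i] * kernel[i][j] for i in range(len(x))) for j in range(n_out)]


# Hypothetical Linear(in_features=3, out_features=2) weights in PyTorch layout.
torch_weight = [[1, 2, 3],
                [4, 5, 6]]  # shape (out_features=2, in_features=3)
x = [1, 2, 3]

try:
    dense(x, torch_weight)  # wrong layout: the shape check fails
except ValueError as err:
    print("shape mismatch:", err)

print(dense(x, transpose(torch_weight)))  # [14, 32] once the layout is converted

# Constant weights look identical after flattening in either layout.
constant = [[7, 7, 7], [7, 7, 7]]
print(flatten(constant) == flatten(transpose(constant)))  # True
```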
Type of change
Tests
XLS has been added to the following tests:
test_activations.py, test_auto_precision.py, test_binary_cnn.py, test_causalpadding.py, test_depthconv1d.py, test_depthconv2d.py, test_keras_api.py, test_keras_v3_api.py, test_merge.py, test_multi_dense.py, test_pointwiseconv.py, test_pooling.py, test_pytorch_api.py, test_reshape.py, test_sepconv1d.py, test_sepconv2d.py, test_softmax.py.

Test Configuration
Add xls dependency, e.g.

and run tests, e.g.:
Notes on performance
Some test cases are very slow for XLS (e.g. ~30 minutes vs ~10 seconds on other backends).
This happens because XLS generates (in model.compile()) and uses (in model.predict()) optimized XLS IR code, where all loops are fully unrolled. The resulting file can be huge and thus slow for the likes of Conv2D.

During development, I made tests faster by reducing dimensions in some of them.
For example, in test_keras_api.py/test_conv2d I replaced
with
I haven't pushed such changes, but that could be one of the ways of speeding things up.
UPD: I reduced XLS dimensions in many tests, see 8578a62 and 75a04fb.
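As a rough illustration of why reducing dimensions helps, the sketch below counts the multiply-accumulate operations a Conv2D layer expands to when every loop is unrolled. The helper `conv2d_unrolled_macs` and both sets of dimensions are hypothetical (they are not the values used in the tests), and the operation count is only a proxy for IR size, not a measurement of it.

```python
def conv2d_unrolled_macs(height, width, channels, filters, kernel=(3, 3), stride=(1, 1)):
    """Multiply-accumulates in a 'valid'-padded Conv2D with all loops unrolled."""
    out_h = (height - kernel[0]) // stride[0] + 1
    out_w = (width - kernel[1]) // stride[1] + 1
    # One MAC per output pixel, per filter, per kernel tap, per input channel.
    return out_h * out_w * filters * kernel[0] * kernel[1] * channels


# Hypothetical "large" and "reduced" test shapes.
large = conv2d_unrolled_macs(height=28, width=28, channels=3, filters=32)
small = conv2d_unrolled_macs(height=8, width=8, channels=2, filters=4)

print(large)  # 584064
print(small)  # 2592
print(f"reduction factor: {large / small:.0f}x")
```

Because the count is a product of all dimensions, shrinking several of them at once reduces the unrolled size multiplicatively.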
Checklist
pre-commit on the files I edited or added.