Releases: MyrtleSoftware/vollo-sdk
Vollo SDK 26.2.0
- Support for FP8 (E4M3) weights on Versal devices using `vollo_torch.Fp8Weights` context manager
- Support for `torch.exp`, `torch.exp2` at FP32 precision using `vollo_torch.Fp32Activations` context manager
- Support for `matmul` operations where both inputs are dynamic (non-constant) tensors, when using `allow_dynamic_weights`
- Optimize accumulations on Versal devices, improving performance of layers such as `LayerNorm` and `RMSNorm`, and `Linear` layers with small output features
- Add support for multiple state tensors in `vollo_torch.nn.Scan`
- Add `allow_unserializable` flag to `vollo_compiler.NNIR.to_program` for testing programs which can't be serialized
- Fix multi-model programs that use dynamic weights
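
For readers unfamiliar with the FP8 (E4M3) format named above, the following is a minimal pure-Python sketch of how an E4M3 byte maps to a real value. It assumes the common OCP-style variant (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7, no infinities, a single NaN pattern per sign); the release notes do not say which variant Vollo uses or how it rounds. The helper names `decode_e4m3` and `quantize_e4m3` are hypothetical and are not part of the Vollo SDK.

```python
import math


def decode_e4m3(byte: int) -> float:
    """Decode one byte as an FP8 E4M3 value (assumed OCP-style variant)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF
    mantissa = byte & 0x7
    if exponent == 0xF and mantissa == 0x7:
        # The only NaN encoding; this variant has no infinities.
        return float("nan")
    if exponent == 0:
        # Subnormal: no implicit leading 1, exponent fixed at 1 - bias.
        return sign * (mantissa / 8.0) * 2.0 ** (1 - 7)
    # Normal: implicit leading 1, exponent bias of 7.
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


# Every finite value the format can represent, in ascending order.
_E4M3_VALUES = sorted(
    {decode_e4m3(b) for b in range(256) if not math.isnan(decode_e4m3(b))}
)


def quantize_e4m3(x: float) -> float:
    """Return the representable E4M3 value closest to x.

    Simplification (an assumption of this sketch): ties go to the lower
    value rather than round-to-nearest-even, and out-of-range inputs
    saturate at the largest finite magnitude.
    """
    return min(_E4M3_VALUES, key=lambda v: abs(v - x))


if __name__ == "__main__":
    print(decode_e4m3(0x7E))   # largest finite value in this variant
    print(quantize_e4m3(0.3))  # nearest representable value to 0.3
```

With only three mantissa bits, each power-of-two interval holds eight values, so weights stored this way carry roughly one significant decimal digit.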
Vollo SDK 26.1.2
- Optimize handling of biases in `Linear` layers when using `allow_dynamic_weights`
- Speed up model compilation
- Add `random_seeds` argument to `vollo_compiler.NNIR.to_program`
Vollo SDK 26.1.1
- Fix V80LL initialization bitstream so that the V80LL memory can be flashed over JTAG
- Optimize handling of biases in `Linear` layers when using `allow_dynamic_weights`
- Add support for multiple outputs to `vollo_torch.nn.Scan`
- Speed up loading `.vollo` programs
Vollo SDK 26.1.0
- Fix DMA bug introduced on V80 in Vollo SDK 26.0.0
- Add Alveo V80LL bitstream and `vollo_compiler.Config.v80ll_c6b32` hardware config
- Add support for `Linear` layers where the contracted dimension is not the data dimension via the `allow_dynamic_weights` flag for `vollo_compiler.NNIR.to_program`
- Add support for multiple inputs to `vollo_torch.nn.Scan`
- Add support for indexing with negative indices in: `torch.stack`, `torch.sum`, `torch.permute`, `torch.squeeze`, `torch.unsqueeze`
- Add support for `torch.nn.functional.linear`
- Add optional `bias` argument to `vollo_torch.nn.PaddedConv1d`
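
As a reminder of what negative indices mean for the PyTorch operations listed above, here is a minimal pure-Python sketch of the usual normalization rule: a negative dimension counts back from the end. It illustrates the PyTorch convention only and is not the Vollo compiler's implementation; `normalize_dim` and its `adds_dim` parameter are hypothetical names.

```python
def normalize_dim(dim: int, ndim: int, adds_dim: bool = False) -> int:
    """Map a possibly negative dimension index to its non-negative form.

    ndim is the rank of the input tensor. Set adds_dim=True for operations
    whose result has one more dimension than the input (such as stack and
    unsqueeze), where the index refers to a position in the output.
    """
    rank = ndim + 1 if adds_dim else ndim
    if not -rank <= dim < rank:
        raise IndexError(
            f"dimension {dim} out of range for rank {rank} "
            f"(expected {-rank} <= dim < {rank})"
        )
    return dim + rank if dim < 0 else dim


if __name__ == "__main__":
    # Summing over the last axis of a rank-3 tensor: -1 means axis 2.
    print(normalize_dim(-1, 3))
    # Unsqueezing a rank-3 tensor at -1 appends a new trailing axis (3).
    print(normalize_dim(-1, 3, adds_dim=True))
```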
Vollo SDK 26.0.2
- Update example/partial_update.c to allow multiple inputs and mixed precision
- Fix bug in FP32/multi-input partial updates
- Speed up model compilation
Vollo SDK 26.0.1
- Make `vollo-tool license` use the system's CA certificates
- Fix bug in FP32 partial updates
Vollo SDK 26.0.0
- V80 DMA optimizations
- Support for a subset of operations at FP32
  - `vollo-torch`
    - Add `vollo_torch.Fp32Activations` context manager
    - Add `inputs_precisions` and `output_precisions` arguments to `vollo_torch.fx.nnir.to_nnir`
  - `vollo-compiler`:
    - Add `model_input_number_format` and `model_output_number_format` methods to `vollo_compiler.Program`
    - Add `vollo_compiler.NumberFormat` enum
  - `vollo-rt` C/C++ API
    - Add `vollo_rt_add_job`, `vollo_rt_add_job_partial_update`, `vollo_rt_model_input_format`, `vollo_rt_model_output_format`, `vollo_rt_get_raw_buffer_bytes` functions and `number_format` enum
  - `vollo-rt` Python bindings
    - Add `add_job`, `add_job_f32`, `model_output_format` methods to `vollo_rt.VolloRTContext`
- Memory usage and compilation time improvements in the compiler
- Add `quick_compile` flag to `vollo_compiler.NNIR.to_program` for faster compilation
- Add `max_sparse_entries` option to `vollo_compiler.NNIR.to_program` to configure the number of nonzero entries allowed in weights for non-standard memory format MatMuls
- Add `token-info` subcommand to `vollo-tool license` to show information about a purchase token
- Add info message to `vollo-tool license redeem-device` if the device being redeemed for has been redeemed on an expired or nearly expiring token
Vollo SDK 25.1.2
- Reduced memory usage of the compiler during compilation of large models
- Improve `ami-tool`'s detection of bitstream UUIDs
- Improve AMI driver's compatibility with Linux kernel versions
Vollo SDK 25.1.1
- Fix bug in `vollo-tool` where `vollo-tool fpga-config` did not enumerate the V80 management physical function
- Fix bug in `load-kernel-driver.sh` where the Vollo driver was loaded for the V80 management physical function instead of the AMI driver
Vollo SDK 25.1.0
- Early access support for Mamba models
- Add support for SiLU
- Add support for Softplus
- Add support for Exp, Exp2
- Add support for Sigmoid
- Add support for Softmax
- Speed up model compilation
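
For reference, the activation functions named in this release have the standard mathematical definitions sketched below in plain Python. This is a scalar reference for the formulas only, assuming PyTorch's default parameters (for example Softplus with `beta=1`); it says nothing about how Vollo evaluates them on the accelerator, and the function names are illustrative.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """1 / (1 + e^-x), written so that exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def silu(x: float) -> float:
    """SiLU (also called swish): x * sigmoid(x)."""
    return x * sigmoid(x)


def softplus(x: float) -> float:
    """log(1 + e^x), using the stable form max(x, 0) + log1p(e^-|x|)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def exp2(x: float) -> float:
    """Base-2 exponential, 2^x."""
    return 2.0 ** x


def softmax(xs: Sequence[float]) -> List[float]:
    """Normalized exponentials; subtracting the max avoids overflow."""
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(silu(1.0), softplus(0.0), exp2(3.0))
    print(softmax([1.0, 2.0, 3.0]))
```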