Skip to content

fix: add bounds check before memcpy in hisi_comp.c#764

Open
orbisai0security wants to merge 230 commits into
Linaro:developfrom
orbisai0security:fix-hisi-comp-dst-buf-overflow-v002
Open

fix: add bounds check before memcpy in hisi_comp.c#764
orbisai0security wants to merge 230 commits into
Linaro:developfrom
orbisai0security:fix-hisi-comp-dst-buf-overflow-v002

Conversation

@orbisai0security
Copy link
Copy Markdown

Summary

Fix high severity security issue in drv/hisi_comp.c.

Vulnerability

Field Value
ID V-002
Severity HIGH
Scanner multi_agent_ai
Rule V-002
File drv/hisi_comp.c:284
Assessment Confirmed exploitable

Description: The compression driver performs multiple memcpy operations copying data into req->dst buffers where copy_len and offset values are derived from hardware responses. These values are not validated against the destination buffer capacity before use. Sequential writes at increasing offsets (STORE_BLOCK_SIZE, STORE_BLOCK_SIZE + sizeof(checksum)) compound the risk as total write size is never verified against req->dst_len.

Changes

  • drv/hisi_comp.c

Verification

  • Build passes
  • Scanner re-scan confirms fix
  • LLM code review passed

Automated security fix by OrbisAI Security

Liulongfang and others added 30 commits December 30, 2024 12:02
Add the algorithm hmac(sm3)-cbc(sm4) to the nosva scene,
the following fileds of the session setup need to be set,
the calg(WCRYPTO_CIPHER_SM4), the cmode(WCRYPTO_CIPHER_CBC),
the dalg(WCRYPTO_SM3) and the dmode(WCRYPTO_DIGEST_HMAC).

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Currently, the algorithm name of the aead cbc mode
is designed only for sha256, but it is not suitable
any more when other algorithms are added, such as
hmac(sm3)-cbc(aes).
Now a common name is used, authenc(generic,cbc(aes)),
the actual algorithm and mode are still specified
by dalg and dmode in the session setup.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
In stream processing encryption mode, a long file
needs to be encrypted. When the accelerator is invoked,
the encryption result of each block is assembled.
The assembled result is the same as the result of
encrypting the entire file at a time.
For hisi_sec, the AAD is filled to the first message,
plaintext are done with the middle and the end message.
In an encrypted stream, the first and the end message
are unique and must be delivered to hardware.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
For the gcm stream mode, assoc bytes should not be 0,
check it to avoid hardware error.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
The hardware only uses the block mode, so set the aead
message state to the block mode first.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
The hardware supports only 16-byte alignment for the aead
middle messages, the invalid length check is added now.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
In common digest stream mode, io_bytes and iv_bytes need to
be set to 0 when the final bd is calculated. Therefore, in the
appending tag scenario, need to restore the values of io_bytes
and iv_bytes to the values before they are set to 0.

Therefore, the hardware can compute the overall hash value of
the appending packet and the previously calculated packet,
and reduce the repeated calculation.

Signed-off-by: Qi Tao <taoqi10@huawei.com>
uadk: support appending tag for digest stream model
uadk: support aead stream mode and sm4-sm3 alg
When a combined algorithm is used, the authsize
should not be 0, so add check for it.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
According to the HMAC rfc, the auth key could be 0 bytes,
so remove the wrong judgment.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
According to the HMAC rfc, the auth key could be 0 bytes,
so remove the wrong judgment.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
The auth key could be 0 bytes, remove the wrong judgment.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
The ctx key may be null if the user use the
normal mode, it should return an error before
copy data to the key.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
First, move the algorithm check to the right level,
then we modified the alignment to 4 bytes from 16
bytes according to the hardware specification.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
The alignment of authsize should be 4 bytes not 16 bytes
according to the hardware specification.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Add print help when dfx/benchmark/test input
empty parameters.

Signed-off-by: Junchong Pan <panjunchong@h-partners.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
When soft computing is required, an invalid BD
is used to ensure the integrity of the sending
and receiving process, it is more efficient.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
1. Use 'intptr_t' or 'uintptr_t' to convert pointer into integer.
2. Forbidden exit function 'pthread_exit' is called.
3. Use 1 blank space(' ') instead of TAB('\t') between the right
   comment and the previous code.
4. Do not add blank lines on the start of a code block defined by braces.

Signed-off-by: Qi Tao <taoqi10@huawei.com>
When wd_do_comp_strm returns error ret, status in req is not
updated to that in strm_req. As a result, when the internal
status of strm_req is WD_IN_EPARA, the req obtained by the user
is not WD_IN_EPARA.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Replace the user output buffer with the internal buffer when
the size of the user-provided buffer is less than 200 bytes.

It need to copy data in either of the following scenarios:
1.There is data in the internal buffer before hardware processing.
2.After the hardware uses the internal buffer to process.

The following two situations may occur during each copy process:
a.Data is completely copied and output.
b.Part of the data is copied and output.

1 and 2 are combined with a and b to implement the code logic
of check_store_buf and copy_from_hw.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Modify the output size to ensure that the output size is 1.125
times of the input size.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
The resp_msg in wd_comp_poll_ctx is used before being
initialized. The null pointer check in recv function is invalid.
As a result, the memory is empty.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Replace the user output buffer with the internal buffer when
the size of the user-provided buffer is less than 200 bytes.

When cached data needs to be copied for multiple times, hardware
processing is skipped. The way to skip hardware is to change the
output size to 16.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Only stateful compression need to adjust the output size to
ensure that the output size is 1.125 times of the input size.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Check whether the pointer is null before the ctx_buf is reset.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Upstream: Yes
Feature or Bugfix: Bugfix
DTS: DTS2025022803997

There are some problems in the public framework code of v1:
1. Some memory judgments are not comprehensive.
2. Some function return values are not checked
3. Some function input parameter verification is incomplete
4. Add some comments to improve code readability
5. Modify some log description errors

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
The full life cycle of the shared memory is shmget(create memory),
shmat(attach to the process), shmdt(detach to the process)
and shmctl(delete memory), so fix for the uadk code according
to this rule.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
The variable 'pos' is enumeration type, and the case branches
cover all values. So remove default branch.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Qi Tao <taoqi10@huawei.com>
Liulongfang and others added 30 commits March 20, 2026 11:00
    to ensure the clarity and completeness of the README document, it is
necessary to reformat it into markdown type and refine its content

Signed-off-by: Liulongfang <liulongfang@huawei.com>
When the last decompression block is fully consumed and the stream
context still holds cached data, the driver asks the user to send
a zero-input request to flush the remaining data from hardware.

This case is not caused by insufficient output space. The hardware
treats the cached data as possibly incomplete and needs another
request to continue processing, so qm_parse_zip_sqe() and
qm_parse_zip_sqe_v3() should return WCRYPTO_DECOMP_BLK_NOSTART
instead of WCRYPTO_DECOMP_END_NOSPACE.

Signed-off-by: yuzh1105 <yuzh.aurora@gmail.com>
uadk: use blk_nostart for zip tail flush status
All LZ77 algorithms need to clear the literal length in the context
to avoid coupling of literal data between data segments in the
stream mode.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
When tail data is being compress, if the buffer data is not
cleared, a nospace or again status needs to be returned to notify
the user to add output space.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
The tail packet padding function depends on the driver. Therefore,
this function is moved to the driver layer for implementation, so
as to remove unnecessary processes at the algorithm layer.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
When the output space is too large, the function is not affected.
Therefore, the printf of displaying a large number of messages
is deleted.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
… device failure

1. Add max timeout control and adaptive backoff to prevent long usleep
delays on device failure
2. Implement exponential backoff to reduce context switching overhead
under high load.

Signed-off-by: lizhi <lizhi206@huawei.com>
The key align size of hashjoin is changed now, make it
match the hardware.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Some clean code to improve the readability of the wd_agg code.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Fix parameter validation during the coding process.
Adopt constant-time programming techniques for handling sensitive
data to enhance code resilience against attacks.

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Add WD_DEFLATE branch in append_store_block() to output a 5-byte store
block when stream compression receives an empty last packet

Signed-off-by: ZongYu Wu <wuzongyu1@huawei.com>
In the HPRE module's big number comparison, the previous
implementation used memcmp to compare big numbers of the same length.
On little-endian platforms, this approach compares data starting from
the Least Significant Bit (LSB), which can lead to incorrect
results—such as a larger value in the lower bits being misinterpreted
as greater overall, even if the higher bits are smaller.
To address this, the comparison logic must be modified to consistently
start from the Most Significant Bit (MSB).

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
The HPRE driver must store and process data in big-endian (MSB)
format as specified by the chip design.
Therefore, the comparison logic for this data should also be described
and implemented following the big-endian convention.

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
This patch set primarily fixes multiple issues related to compression
The 'ret' value is assigned in the case of 'req->op_type == WD_RSA_GENKEY'.
When 'op_type' is not 'WD_RSA_GENKEY', 'ret' is not initialized. If
'ret' is directly returned, the caller may incorrectly determine that
the operation fails. Therefore, 'WD_SUCCESS' instead of 'ret' is
directly returned.

Signed-off-by: Weili Qian <qianweili@huawei.com>
uadk: modify the return value of rsa_prepare_key() in hpre
The internal asynchronous polling interface is not
used in any scenario, remove it.

Signed-off-by: Weili Qian <qianweili@huawei.com>
When wd_get_alg_type() fails, the memory allocated
by strdup() is not freed. fix it.

Signed-off-by: Weili Qian <qianweili@huawei.com>
strdup () may fail to allocate memory. Therefore, the return value
needs to be checked, fix it.

Signed-off-by: Weili Qian <qianweili@huawei.com>
The header file <sys/poll.h> is deprecated in modern Linux
systems(e.g., glibc). Current builds may trigger a compilation
warning when -Werror is enabled:

 #warning redirecting incorrect #include <sys/poll.h> to <poll.h>

Therefore, use poll.h instead of sys/poll.h.

Signed-off-by: Weili Qian <qianweili@huawei.com>
In the UADK no-SVA mode memory pool, queues are used to create
memory pools, and algorithms are employed to find the required queues.
However, a compression algorithm queue cannot simultaneously support both
compression and decompression modes—a queue must be defined as either
compression or decompression. Therefore, when allocating a memory pool,
a queue of the corresponding type must be specified for use.

To facilitate queue lookup for the memory pool, the algorithm name is
distinguished by appending a "-comp" or "-decomp" suffix. Similarly,
when users request a memory pool, the compression algorithm must also
include the corresponding suffix for querying.

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
In the ZIP algorithm compression testing tool, when using no-SVA mode,
the creation of the memory pool requires the use of different device queues
based on compression and decompression operations.
We address this distinction through a convention of appending suffixes.
Therefore, when querying queues, suffixes need to be added for processing.

Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Multiplying two u32 values and assigning the result to
a left-hand u64 value carries a risk of overflow,
so the right-hand value needs to be explicitly cast to u64.

Signed-off-by: Zhushuai Yin <yinzhushuai@huawei.com>
Since sess in cleanup_session() cannot
be null, the check on sess can be removed.

Signed-off-by: Weili Qian <qianweili@huawei.com>
Include missing header file: v1/wd_ecc.h v1/wd_comp.h.
Remove internal header file v1/uacce.h.

Signed-off-by: Qi Tao <taoqi10@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
The -lpthread flags should be placed in LIBADD instead of LDFLAGS to
follow libtool conventions. And add -lm for libwd_dae_la_LIBADD.

Signed-off-by: Weili Qian <qianweili@huawei.com>
The log frequency limit function iuses the alarm function.
If users use the alarm function, the log frequency limiting
function may be affected. Therefore, the log frequency
limiting function is changed to timer_settime.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
uadk: Accumulated fixes and optimizations patchset
Automated security fix generated by OrbisAI Security
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants