Skip to content

Bug Fix: Incorrect Uncompressed Length Calculation in Buffer Operations #3

@fufuok

Description

@fufuok

Bug Fix: Incorrect Uncompressed Length Calculation in Buffer Operations

Problem Description

The TestGoCompressUncompressBufferWithLenLessThanCap test is failing with the following error:

func TestGoCompressUncompressBufferWithLenLessThanCap(t *testing.T) {
	var i int = 1
	input := []byte(strconv.Itoa(i))
	// t.Logf("input: %s, len: %d, cap: %d", input, len(input), cap(input)) // input: 1, len: 1, cap: 8

	inputSize := len(input)
	outputSize := inputSize + 64
	output := make([]byte, 0, outputSize)

	compLen, compError := GoGZipCompressBuffer(CompressionLevelBestSpeed, input, output)
	assert.NoError(t, compError)

	compressed := output[:compLen]
	uncompressed := make([]byte, 0, inputSize+64)

	uncompLen, uncompErr := GoUncompressBuffer(compressed, uncompressed)

	assert.NoError(t, uncompErr)
	assert.Equal(t, uint64(inputSize), uncompLen)
	assert.Equal(t, input, uncompressed[:uncompLen])
}
gozlib_buffer_test.go:157:
                Error Trace:    /mnt/c/Work/Go/Code/gozlib-main/gozlib_buffer_test.go:157
                Error:          Not equal:
                                expected: 0x1
                                actual  : 0x8
gozlib_buffer_test.go:158:
                Error Trace:    /mnt/c/Work/Go/Code/gozlib-main/gozlib_buffer_test.go:158
                Error:          Not equal:
                                expected: []byte{0x31}
                                actual  : []byte{0x31, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}

The test is trying to compress and decompress a single character "1", but the decompress function is returning a length of 8 instead of 1, and the decompressed data contains extra zero bytes.

Root Cause Analysis

After analyzing the code, I found two critical issues:

  1. Incorrect Input Buffer Size Calculation: Both GoGZipCompressBuffer and GoUncompressBuffer functions were using cap(input) as the input buffer size instead of len(input). This caused the functions to read beyond the actual input data length, potentially reading zero bytes beyond the valid data.

  2. Index Out of Range Error During Fix Attempt: During the initial fix attempt, trying to use &output[0] to get the output buffer pointer caused an index out of range error when output was a zero-length slice.

Fix Solution

I've fixed both issues:

  1. Use len(input) instead of cap(input): Modified both functions to use the actual length of the input slice instead of its capacity, ensuring they only read valid input data.

  2. Use reflect.SliceHeader for Output Buffer Pointer: Maintained the use of reflect.SliceHeader to safely get the underlying array pointer of the output slice, even when it's zero-length.

Code Changes

1. GoGZipCompressBuffer Function

// Before
func GoGZipCompressBuffer(level CompressionLevel, input []byte, output []byte) (uint64, error) {
	inputCap := cap(input)
	outputCap := cap(output)
	if outputCap == 0 {
		return 0, OutputBufferTooSmallError
	}

	var inputPtr unsafe.Pointer = nil
	if inputCap > 0 {
		inputPtr = unsafe.Pointer(&input[0])
	}

	outputHdr := (*reflect.SliceHeader)(unsafe.Pointer(&output))
	outputPtr := unsafe.Pointer(outputHdr.Data)

	var errorCode C.int = C.Z_OK

	compLen := C.gzip_compress_buffer(C.int(level), inputPtr, C.uInt(inputCap), outputPtr, C.uInt(outputCap), &errorCode)

	// ... rest of the function
}

// After
func GoGZipCompressBuffer(level CompressionLevel, input []byte, output []byte) (uint64, error) {
	inputLen := len(input)
	outputCap := cap(output)
	if outputCap == 0 {
		return 0, OutputBufferTooSmallError
	}

	var inputPtr unsafe.Pointer = nil
	if inputLen > 0 {
		inputPtr = unsafe.Pointer(&input[0])
	}

	outputHdr := (*reflect.SliceHeader)(unsafe.Pointer(&output))
	outputPtr := unsafe.Pointer(outputHdr.Data)

	var errorCode C.int = C.Z_OK

	compLen := C.gzip_compress_buffer(C.int(level), inputPtr, C.uInt(inputLen), outputPtr, C.uInt(outputCap), &errorCode)

	// ... rest of the function
}

2. GoUncompressBuffer Function

// Before
func GoUncompressBuffer(input []byte, output []byte) (uint64, error) {
	inputCap := cap(input)
	outputCap := cap(output)
	if outputCap == 0 {
		return 0, OutputBufferTooSmallError
	}

	var inputPtr unsafe.Pointer = nil
	if inputCap > 0 {
		inputPtr = unsafe.Pointer(&input[0])
	}

	outputHdr := (*reflect.SliceHeader)(unsafe.Pointer(&output))
	outputPtr := unsafe.Pointer(outputHdr.Data)

	var errorCode C.int = C.Z_OK

	uncompLen := C.uncompress_buffer_any(inputPtr, C.uInt(inputCap), outputPtr, C.uInt(outputCap), &errorCode)

	// ... rest of the function
}

// After
func GoUncompressBuffer(input []byte, output []byte) (uint64, error) {
	inputLen := len(input)
	outputCap := cap(output)
	if outputCap == 0 {
		return 0, OutputBufferTooSmallError
	}

	var inputPtr unsafe.Pointer = nil
	if inputLen > 0 {
		inputPtr = unsafe.Pointer(&input[0])
	}

	outputHdr := (*reflect.SliceHeader)(unsafe.Pointer(&output))
	outputPtr := unsafe.Pointer(outputHdr.Data)

	var errorCode C.int = C.Z_OK

	uncompLen := C.uncompress_buffer_any(inputPtr, C.uInt(inputLen), outputPtr, C.uInt(outputCap), &errorCode)

	// ... rest of the function
}

Fix Rationale

  1. Using len(input) instead of cap(input): This ensures that the functions only read the actual valid input data, not beyond it. When using output[:compLen] to create a slice of the compressed data, the length is compLen but the capacity remains outputSize. Using cap(input) would cause the decompress function to read beyond the actual compressed data.

  2. Using reflect.SliceHeader for output pointer: This approach safely gets the underlying array pointer of the output slice, even when it's zero-length. Trying to use &output[0] directly would cause an index out of range error for zero-length slices.

Verification

After applying the fix, the TestGoCompressUncompressBufferWithLenLessThanCap test should pass, as well as all other buffer operation tests. The fix ensures that:

  1. Only actual input data is compressed
  2. Only actual compressed data is decompressed
  3. The decompressed length is correctly calculated
  4. No extra zero bytes are included in the decompressed data

Impact

This fix has minimal impact on the codebase, only changing the input buffer size calculation in two functions. It ensures more accurate compression and decompression results, especially for small data sizes.

The fix is backward compatible and should not break any existing functionality, as it only changes the behavior to be more correct.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions