Clarification on Image Size and cuDNN Limitations for Backlit Enhancement #11

@rokopi-byte

Hi,

I read in the paper:

Following CLIP-LIT [41], we select the BAID [48] dataset for training the network with an image size of 256 × 256.

I am trying to use much larger images for backlit enhancement (only for inference, with batch_size = 1) and encountered the following error:

RuntimeError: Expected canUse32BitIndexMath(input) && canUse32BitIndexMath(output) to be true, but got false.  
(Could this error message be improved? If so, please report an enhancement request to PyTorch.)

After some investigation, I understand that this happens because large images can produce an output tensor with more than 2^31 elements, which cuDNN versions earlier than 9.3 do not support (see the related PyTorch issue and PR). Since I am running inference with batch_size = 1, there is no batch to split, so I cannot work around the limitation that way.
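
To make the arithmetic concrete, here is a minimal sketch of the element-count check in plain Python. It is a simplification: it only compares the total number of elements against 2^31 - 1, whereas the real PyTorch check also considers strides, and the exact boundary condition is not asserted here. The shapes (64 channels, 6000 × 6000 pixels) are hypothetical and not taken from the paper or the model.

```python
# Largest value representable by a signed 32-bit index.
INT32_MAX = 2**31 - 1


def numel(shape):
    """Return the number of elements in a tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


def fits_32bit_indexing(shape):
    """Approximate check: True if the element count fits in a signed 32-bit index."""
    return numel(shape) <= INT32_MAX


# Hypothetical activation shapes: (batch, channels, height, width).
small = (1, 64, 256, 256)    # 4,194,304 elements
large = (1, 64, 6000, 6000)  # 2,304,000,000 elements

print(fits_32bit_indexing(small))  # True
print(fits_32bit_indexing(large))  # False
```

With batch_size = 1 the only remaining factors are the channel count and the spatial size, which is why a single large image can already cross the limit.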

Additionally, I noticed that PyTorch will only support cuDNN > 9.1 starting from version 2.7, which is expected to be released soon.

This raises a few questions:

  1. Does this mean that in the paper, images of size 256x256 (or slightly larger) were used for both training and testing, but not large enough to produce tensors exceeding 2^31 elements?
  2. The BAID dataset, as referenced in the paper, contains much larger images. How were these larger images handled during testing? Were they resized to 256x256 or cropped to avoid this issue?

I am a bit confused because the paper suggests the use of the BAID dataset, which contains much larger images, but this seems incompatible with the current cuDNN limitations for large tensors.
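
For context, one workaround I could imagine when the batch cannot be split is to split the image spatially instead. The sketch below only computes overlapping tile coordinates in plain Python; `tile_boxes`, the tile size of 1024, and the overlap of 32 are my own hypothetical choices, not something described in the paper or the repository. Results near tile borders may differ from full-image inference, so this is not a claim about how the authors evaluated BAID.

```python
def tile_boxes(height, width, tile=1024, overlap=32):
    """Return (top, left, bottom, right) boxes covering a height x width image.

    Adjacent tiles overlap by `overlap` pixels so seams can be cropped or blended.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap
    boxes = []
    top = 0
    while True:
        bottom = min(top + tile, height)
        left = 0
        while True:
            right = min(left + tile, width)
            boxes.append((top, left, bottom, right))
            if right == width:  # reached the right edge of the image
                break
            left += step
        if bottom == height:  # reached the bottom edge of the image
            break
        top += step
    return boxes


# Hypothetical 3000 x 4000 image split into tiles of at most 1024 x 1024.
boxes = tile_boxes(3000, 4000)
print(len(boxes))  # 12
```

Each box could then be cropped from the input, passed through the network on its own, and pasted back into an output buffer, keeping every intermediate tensor well below the 32-bit indexing limit.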
