Description
Hi,
I read in the paper:
Following CLIP-LIT [41], we select the BAID [48] dataset for training the network with an image size of 256 × 256.
I am trying to use much larger images for backlit enhancement (only for inference, with batch_size = 1) and encountered the following error:
```
RuntimeError: Expected canUse32BitIndexMath(input) && canUse32BitIndexMath(output) to be true, but got false.
(Could this error message be improved? If so, please report an enhancement request to PyTorch.)
```
After some investigation, I understand that this happens because large images can produce an intermediate or output tensor with more than 2^31 elements, which cuDNN versions earlier than 9.3 do not support (see the related PyTorch issue and PR). Since I am running inference with batch_size = 1, I cannot split the batch to work around the limitation.
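To make the limit concrete, here is a rough back-of-the-envelope check for when a single feature map crosses the 2^31-element threshold. The channel count is a hypothetical assumption for illustration; the actual network may use a different width:

```python
# Sketch: estimate whether an NCHW feature map exceeds cuDNN's 32-bit
# indexing limit (< 2**31 elements for cuDNN versions before 9.3).
# The default channel count of 64 is an assumption, not the real model's.
def exceeds_32bit_indexing(height, width, channels=64, batch=1):
    return batch * channels * height * width >= 2**31

# 256x256 (the paper's training size) is far below the limit,
# while a very large inference image with the same channel width is not.
print(exceeds_32bit_indexing(256, 256))    # False
print(exceeds_32bit_indexing(8192, 8192))  # True: 64*8192*8192 = 2**32
```

This suggests the error depends on the widest intermediate activation, not just the input image itself.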
Additionally, I noticed that PyTorch will only support cuDNN > 9.1 starting from version 2.7, which is expected to be released soon.
This raises a few questions:
- Does this mean that in the paper, images of size 256 × 256 (or slightly larger) were used for both training and testing, i.e. never large enough to produce tensors exceeding 2^31 elements?
- The BAID dataset, as referenced in the paper, contains much larger images. How were these handled during testing? Were they resized to 256 × 256 or cropped to avoid this issue?
I am a bit confused: the paper suggests using the BAID dataset, which contains much larger images, yet that seems incompatible with the current cuDNN limitation on large tensors.
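In case it helps the discussion, one workaround I have been considering is tiled inference: running the network on overlapping crops and stitching the outputs, so no single forward pass creates a tensor near the 2^31 limit. This is only a sketch under the assumption that the network is fully convolutional; `model` is a placeholder callable, real seams would likely need overlap blending, and in a PyTorch pipeline the arrays would be tensors:

```python
import numpy as np

def tiled_inference(model, image, tile=1024, overlap=32):
    # image: (C, H, W) array. model: any callable mapping a (C, h, w)
    # patch to an output of the same shape (assumed fully convolutional).
    # Overlapping tiles are simply overwritten here; a real pipeline
    # would blend the overlaps to hide seams.
    c, h, w = image.shape
    out = np.zeros_like(image)
    step = tile - overlap
    for y in range(0, h, step):
        for x in range(0, w, step):
            y1, x1 = min(y + tile, h), min(x + tile, w)
            out[:, y:y1, x:x1] = model(image[:, y:y1, x:x1])
    return out
```

Of course, whether this matches the enhancement quality of a single full-resolution pass depends on the network's receptive field, which is part of why I am asking how the authors evaluated on large BAID images.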