AttributeError: 'NoneType' object has no attribute 'cquantize_blockwise_fp16_fp4'

### System Info

Working on Google Colab.
torch is installed using:
torch==1.13.0
torchvision==0.14.0
torchaudio==0.13.0
pytorchvideo @ git+https://github.com/facebookresearch/pytorchvideo.git@28fe037d212663c6a24f373b94cc5d478c8c1a1d


### Reproduction

I am trying to load a model with 4-bit quantization while another model in the pipeline requires a specific version of torch (1.13.0). I try to load it using:

```
import torch
from transformers import AutoModelForCausalLM

model_name = "Hessa/MMTQA-merged2"

# Load the model in 4-bit precision
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_4bit=True,  # Enable 4-bit quantization
    device_map="auto",  # Automatically place the model on available devices (GPU/CPU)
    torch_dtype=torch.float16,  # Keep the non-quantized weights in fp16
)
```

But I got the error:
```
ERROR:bitsandbytes.cextension:Could not load bitsandbytes native library: libcusparse.so.11: cannot open shared object file: No such file or directory
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/bitsandbytes/cextension.py", line 85, in <module>
    lib = get_native_library()
  File "/usr/local/lib/python3.10/dist-packages/bitsandbytes/cextension.py", line 72, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
  File "/usr/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcusparse.so.11: cannot open shared object file: No such file or directory
WARNING:bitsandbytes.cextension:
CUDA Setup failed despite CUDA being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues

config.json: 100%
 5.27k/5.27k [00:00<00:00, 416kB/s]
The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
model.safetensors.index.json: 100%
 89.4k/89.4k [00:00<00:00, 6.24MB/s]
Downloading shards: 100%
 5/5 [08:37<00:00, 90.49s/it]
model-00001-of-00005.safetensors: 100%
 4.99G/4.99G [01:58<00:00, 42.7MB/s]
model-00002-of-00005.safetensors: 100%
 4.97G/4.97G [01:57<00:00, 38.8MB/s]
model-00003-of-00005.safetensors: 100%
 4.92G/4.92G [02:01<00:00, 42.4MB/s]
model-00004-of-00005.safetensors: 100%
 5.00G/5.00G [02:01<00:00, 36.1MB/s]
model-00005-of-00005.safetensors: 100%
 1.47G/1.47G [00:36<00:00, 42.2MB/s]
Loading checkpoint shards:   0%
 0/5 [00:00<?, ?it/s]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-6-fffde6e39f01>](https://localhost:8080/#) in <cell line: 9>()
      7 
      8 # Load the model in 4-bit precision
----> 9 model = AutoModelForCausalLM.from_pretrained(
     10     model_name,
     11     load_in_4bit=True,  # Enable 4-bit quantization

7 frames
[/usr/local/lib/python3.10/dist-packages/bitsandbytes/functional.py](https://localhost:8080/#) in quantize_4bit(A, absmax, out, blocksize, compress_statistics, quant_type, quant_storage)
   1232         elif A.dtype == torch.float16:
   1233             if quant_type == "fp4":
-> 1234                 lib.cquantize_blockwise_fp16_fp4(*args)
   1235             else:
   1236                 lib.cquantize_blockwise_fp16_nf4(*args)

AttributeError: 'NoneType' object has no attribute 'cquantize_blockwise_fp16_fp4'
```

Is this a problem in bitsandbytes?
If I load it again and restart the session, I will lose the other model in the pipeline (ImageBind). How can I solve the problem?
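
To narrow this down, here is a small check of whether the dynamic loader can find the library named in the `OSError` above. This is only a sketch of my own using the standard `ctypes` module, not a bitsandbytes API. The helper name `can_load_shared_library` is hypothetical, and `libcusparse.so.12` is an assumption about what a CUDA 12 runtime would ship instead of `libcusparse.so.11`.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open `name`, else False."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Same failure class as in the traceback: the file is not on the
        # loader's search path (or one of its dependencies is missing).
        return False
    return True


if __name__ == "__main__":
    # libcusparse.so.11 is the name from the error message;
    # libcusparse.so.12 is an assumed name for a CUDA 12 install.
    for lib_name in ("libcusparse.so.11", "libcusparse.so.12"):
        print(lib_name, "loadable:", can_load_shared_library(lib_name))
```

If the `.so.11` line reports `False`, the bitsandbytes binary that was selected would seem to expect CUDA 11 libraries that are not on the loader path in this session, which would explain why `lib` ends up as `None`.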

### Expected behavior

I want the model to be loaded with 4-bit quantization.
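
For reference, the same load can be written with the `quantization_config` argument that the deprecation warning in the log asks for. This is a sketch under assumptions, not a fix for the missing library: the wrapper name `load_model_4bit` is hypothetical, and `bnb_4bit_quant_type="fp4"` is my guess at the effective setting, since the traceback goes through the `fp4` branch.

```python
def load_model_4bit(model_name: str):
    """Load a causal LM in 4-bit using BitsAndBytesConfig (sketch)."""
    # Imports are kept inside the function so that defining it does not
    # trigger the torch / bitsandbytes import by itself.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,                     # enable 4-bit quantization
        bnb_4bit_quant_type="fp4",             # assumption: matches the fp4 path in the traceback
        bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
    )
    return AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=quant_config,
        device_map="auto",  # place the model on available devices (GPU/CPU)
        torch_dtype=torch.float16,
    )
```

Calling `load_model_4bit("Hessa/MMTQA-merged2")` should be equivalent to the snippet in the reproduction, so I would expect it to hit the same `AttributeError` as long as the bitsandbytes native library fails to load.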

AttributeError: 'NoneType' object has no attribute 'cquantize_blockwise_fp16_fp4' #1467
