Open
Description
The latest version of HuggingFace MultiModal fails when loading a model. The cause appears to be a version incompatibility with either llama-index core or the HuggingFace LLM base class that it builds on.
from llama_index.multi_modal_llms.huggingface import HuggingFaceMultiModal
mm_llm = HuggingFaceMultiModal.from_model_name("Qwen/Qwen2-VL-2B-Instruct")

Traceback (most recent call last):
File "/workspaces/raggify/temp/test.py", line 4, in <module>
mm_llm = HuggingFaceMultiModal.from_model_name("Qwen/Qwen2-VL-2B-Instruct")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/raggify/.venv/lib/python3.12/site-packages/llama_index/multi_modal_llms/huggingface/base.py", line 255, in from_model_name
return Qwen2VisionMultiModal(model_name=model_name, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/raggify/.venv/lib/python3.12/site-packages/llama_index/multi_modal_llms/huggingface/base.py", line 96, in __init__
super().__init__(**kwargs)
File "/workspaces/raggify/.venv/lib/python3.12/site-packages/llama_index/llms/huggingface/base.py", line 212, in __init__
model = model or AutoModelForCausalLM.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/raggify/.venv/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 607, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.qwen2_vl.configuration_qwen2_vl.Qwen2VLConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of ApertusConfig, ArceeConfig, AriaTextConfig, BambaConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BitNetConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, BltConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, DeepseekV2Config, DeepseekV3Config, DiffLlamaConfig, DogeConfig, Dots1Config, ElectraConfig, Emu3Config, ErnieConfig, Ernie4_5Config, Ernie4_5_MoeConfig, Exaone4Config, FalconConfig, FalconH1Config, FalconMambaConfig, FlexOlmoConfig, FuyuConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, Gemma3nConfig, Gemma3nTextConfig, GitConfig, GlmConfig, Glm4Config, Glm4MoeConfig, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GptOssConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeHybridConfig, GraniteMoeSharedConfig, HeliumConfig, HunYuanDenseV1Config, HunYuanMoEV1Config, JambaConfig, JetMoeConfig, Lfm2Config, LlamaConfig, Llama4Config, Llama4TextConfig, LongcatFlashConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MiniMaxConfig, MinistralConfig, MistralConfig, MixtralConfig, MllamaConfig, ModernBertDecoderConfig, MoshiConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NemotronConfig, OlmoConfig, Olmo2Config, Olmo3Config, OlmoeConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, Phi4MultimodalConfig, PhimoeConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, Qwen3Config, Qwen3MoeConfig, Qwen3NextConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, SeedOssConfig, SmolLM3Config, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, 
VaultGemmaConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, xLSTMConfig, XmodConfig, ZambaConfig, Zamba2Config.
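The traceback shows `AutoModelForCausalLM` rejecting `Qwen2VLConfig` because that config is not in its list of supported types. A minimal sketch for checking this up front is below. It assumes the installed transformers exposes the internal table `MODEL_FOR_CAUSAL_LM_MAPPING_NAMES` (keyed by `model_type` strings such as `qwen2` and `qwen2_vl`); the helper names `causal_lm_class_for` and `load_causal_lm_mapping` are hypothetical and exist only for this illustration.

```python
from typing import Mapping, Optional


def causal_lm_class_for(model_type: str, mapping: Mapping[str, str]) -> Optional[str]:
    """Return the causal-LM class name registered for model_type, or None."""
    return mapping.get(model_type)


def load_causal_lm_mapping() -> Mapping[str, str]:
    """Read the model_type -> class-name table from the installed transformers.

    Assumption: the internal table MODEL_FOR_CAUSAL_LM_MAPPING_NAMES is
    importable from this path. Returns an empty mapping when it is not
    (for example, when transformers is not installed).
    """
    try:
        from transformers.models.auto.modeling_auto import (
            MODEL_FOR_CAUSAL_LM_MAPPING_NAMES,
        )
    except ImportError:
        return {}
    return MODEL_FOR_CAUSAL_LM_MAPPING_NAMES


if __name__ == "__main__":
    mapping = load_causal_lm_mapping()
    # "qwen2" is the text-only model type; "qwen2_vl" is the vision-language one.
    for model_type in ("qwen2", "qwen2_vl"):
        print(model_type, "->", causal_lm_class_for(model_type, mapping))
```

If `qwen2_vl` prints `None` while `qwen2` prints a class name, the installed transformers cannot load this model through `AutoModelForCausalLM`, which matches the error above.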
Downgrading HuggingFace MultiModal to version 0.4.2 resolved this error, but the downgrade also pulled the torch dependency down to version 2.4.1 (torchvision 0.19.1).
In this state, using HuggingFaceLLM for text summarization fails because the installed torch version is too old.
from llama_index.llms.huggingface import HuggingFaceLLM
llm = HuggingFaceLLM(model_name="StabilityAI/stablelm-tuned-alpha-3b")

ValueError: Due to a serious vulnerability issue in torch.load, even with weights_only=True, we now require users to upgrade torch to at least v2.6 in order to use the function. This version restriction does not apply when loading files with safetensors.
See the vulnerability report here https://nvd.nist.gov/vuln/detail/CVE-2025-32434
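The error message requires torch v2.6 or newer, while the downgrade left torch at 2.4.1. A minimal sketch for checking the installed versions against that minimum is below. It uses only the standard library; the helper names (`release_tuple`, `meets_minimum`, `installed_version`) are hypothetical, and the parser reads only the leading numeric part of a version string, so pre-release tags such as `rc1` are ignored.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def release_tuple(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release, e.g. '2.4.1+cu121' -> (2, 4, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when installed >= minimum, comparing numeric parts only."""
    a, b = release_tuple(installed), release_tuple(minimum)
    # Pad the shorter tuple with zeros so that '2.6' compares equal to '2.6.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for package in ("torch", "torchvision", "transformers"):
        print(package, installed_version(package))
    torch_version = installed_version("torch")
    if torch_version is not None:
        # 2.6 is the minimum named in the error message above.
        print("torch >= 2.6:", meets_minimum(torch_version, "2.6"))
```

Running this before and after the downgrade shows which packages changed. For stricter comparisons that handle pre-releases, the third-party `packaging.version` module is an option.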
With no other option, I gave up on stablelm and used Qwen for text summarization as well.