Dna dev/run more models #5

Draft
yuqiannemo wants to merge 17 commits into Xtra-Computing:dev from yuqiannemo:dna_dev/run_more_models

Conversation

@yuqiannemo
Contributor

No description provided.

@JerryLife
Collaborator

Thanks for the work here. Before this can land, the branch needs a rebase onto current dev — PR #4 already merged most of these commits, and there have been post-review fixes on dev since then. The remaining dev..pr-5 diff is now mostly a reversion of work that's already in:

| File | Effective change vs dev |
| --- | --- |
| configs/huggingface_llm_list_left.jsonl | +11813 lines (the one genuinely new thing) |
| pyproject.toml | Downgrades version from 0.2.3 back to 0.2.2 |
| src/llm_dna/api.py | Reverts commit b6c7f1f (post-review fixes): removes load_dotenv() in _resolve_hf_token, removes the hoisted model_meta / responses init that guards against unbound variables, swaps the verbose deprecation error for the terser old one |
| src/llm_dna/models/ModelLoader.py | Drops the comment explaining why openai/gpt-3 / openai/gpt-4 are version-pinned (added in b6c7f1f) |
| tests/test_model_detection.py | Deletes 71 lines of tests, including the cases that pin openai/gpt-oss, openai/whisper-\*, openai/clip-\* ≠ OpenRouter |

So merging as-is (even after manually resolving the api.py conflict the wrong way) would silently un-bump the version, undo the v0.2.3 hardening, and delete the model-type-detection regression tests.

Requested fixes

  1. Rebase dna_dev/run_more_models onto current dev and drop everything already merged via PR #4 (Adjust/Improve pipeline to be compatible with more models). Only 4dff95c ("Add remaining list for HF models", 2026-04-11) is dated after the PR #4 merge, so the rebase should collapse to essentially the JSONL addition plus the README extras paragraph.
  2. Do not change pyproject.toml's version — leave it at 0.2.3.
  3. Keep tests/test_model_detection.py as-is on dev.
  4. Keep the b6c7f1f post-review fixes in api.py and ModelLoader.py (load_dotenv in _resolve_hf_token, hoisted variable init, verbose deprecation error, and the version-pinned-prefix NOTE comment).
  5. Minor: the README claims the quantization extra is required for bitsandbytes, but bitsandbytes is already in the base dependencies on dev. Either drop bitsandbytes from the quantization extra, or reword the README so the extra is described as adding autoawq / auto-gptq / optimum / compressed-tensors only.
  6. Optional: the 11.8k-line JSONL bloats the repo by ~1 MB. Consider hosting it as a release asset or in a separate data repo instead of committing it — not blocking, just a heads-up.

After the rebase, the diff should be tiny (data file + README section), at which point this is straightforward to merge.
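
A quick way to confirm the rebased branch really is that small is to list the files that differ from dev and flag anything outside the expected set. This is a rough sketch, not project tooling: `unexpected_changes` and `changed_vs_base` are hypothetical helpers, and `README.md` as the README's filename is an assumption.

```python
import subprocess

# Assumption: after the rebase only these paths should differ from dev.
# "README.md" is a guess at the README's actual filename.
EXPECTED_PATHS = frozenset({
    "configs/huggingface_llm_list_left.jsonl",
    "README.md",
})


def unexpected_changes(changed_paths, expected=EXPECTED_PATHS):
    """Return the changed paths that fall outside the expected set, sorted."""
    return sorted(p for p in set(changed_paths) if p and p not in expected)


def changed_vs_base(base="dev", head="HEAD"):
    """List files changed on `head` since its merge base with `base`."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...{head}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

Run from the repository root, `unexpected_changes(changed_vs_base())` should come back empty once the rebase is done; any path it returns (for example pyproject.toml or src/llm_dna/api.py) points at a change that is still reverting work already on dev.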
