Command-line client for interacting with a local LLM through the KoboldCpp API
Copied from the original repo: https://codeberg.org/hum_ma/LLM-scripts
Start a KoboldCpp server instance loaded with an LLM (or a VLM with mmproj). These scripts will connect to it on the default port 5001 unless set otherwise.
$ python claim.py -c config-qwen.json -t chatml -n "What was the latest data included in your training?"
<|im_start|>user
What was the latest data included in your training?
<|im_end|>
<|im_start|>assistant
My training data cutoff date is October 2024. This^C
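The transcript above shows the prompt as the chatml template prints it. A minimal sketch of that layout follows. `render_chatml` is a hypothetical helper, not a function from claim.py, and putting `<|im_end|>` on its own line simply mirrors the printed output; the script may build the actual request differently.

```python
def render_chatml(user_text, system_text=None):
    """Render one user turn in the layout printed in the transcript above.

    Hypothetical helper for illustration only; claim.py may differ.
    """
    parts = []
    if system_text:
        parts.append(f"<|im_start|>system\n{system_text}\n<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_text}\n<|im_end|>\n")
    # The open assistant header asks the model to generate the reply next.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(render_chatml("What was the latest data included in your training?"), end="")
```

Leaving the assistant turn open is what lets an interrupted reply (the `^C` above) be continued later from the same text.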
Chat logs are written to the current directory one token at a time, and by default the latest file matching the log file pattern is resumed.
$ python claim.py -c config-qwen.json -t chatml
<|im_start|>user
What was the latest data included in your training?
<|im_end|>
<|im_start|>assistant
My training data cutoff date is October 2024. This means that all the information and knowledge I have been trained on up to that point is included. If^C
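The resume behaviour shown above can be sketched as a lookup over numbered log files. This is only an illustration under assumptions: the file name pattern `<name>-<host>-<port>-0001.txt` is taken from the `-O` help text, the `prefix` value is hypothetical, and reading "latest" as "highest sequence number" (rather than newest modification time) is a guess about claim.py.

```python
import re
from pathlib import Path


def find_logs(directory, prefix, host="localhost", port=5001):
    """Return sorted (number, path) pairs for '<prefix>-<host>-<port>-NNNN.txt' files."""
    pattern = re.compile(
        rf"^{re.escape(prefix)}-{re.escape(host)}-{port}-(\d{{4}})\.txt$"
    )
    found = []
    for path in Path(directory).iterdir():
        match = pattern.match(path.name)
        if match and path.is_file():
            found.append((int(match.group(1)), path))
    return sorted(found)


def pick_log(directory, prefix, host="localhost", port=5001, new=False):
    """Resume the highest-numbered log, or start the next one when new=True (like -n)."""
    logs = find_logs(directory, prefix, host, port)
    if logs and not new:
        return logs[-1][1]
    next_number = logs[-1][0] + 1 if logs else 1
    return Path(directory) / f"{prefix}-{host}-{port}-{next_number:04d}.txt"
```

With no matching file present, `pick_log` returns the `0001` name, so the first run and a `-n` run take the same code path.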
$ python claim.py -c config-qwen.json -t chatml -i ~/Downloads/workflow.png "What is this?" -n
<|im_start|>user
What is this?
<|im_end|>
<|im_start|>assistant
This is a **Node-based workflow** (specifically, a **ComfyUI** workflow) for generating^C
Include the image in subsequent requests of the same chat to avoid reprocessing messages.
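As a rough sketch of what attaching an image to a request might look like, the helper below base64-encodes image files into a request body. `build_vision_payload` is a hypothetical name, and the `images` field is an assumption about the KoboldCpp generate API rather than something this README states; check the API documentation of your server version.

```python
import base64
from pathlib import Path


def build_vision_payload(prompt, image_paths):
    """Build a request body with base64-encoded images attached.

    The 'images' field name is an assumption, not confirmed by this README.
    """
    images = [
        base64.b64encode(Path(p).expanduser().read_bytes()).decode("ascii")
        for p in image_paths
    ]
    return {"prompt": prompt, "images": images}
```

Passing the same list of image paths on every request of a chat keeps the request bodies consistent from turn to turn, which is the point of the note above.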
$ python claim.py -h
usage: claim.py [-h] [-a A] [-S SYS] [-b B] [-r [R]] [-t T] [-d DIR] [-c [CFG_FILE]] [-e [E]] [-i I] [-o O] [-O O] [-D] [-E] [-m]
[-n] [-q] [-Q] [-R] [-T] [-l LOOP] [-p PORT] [-s HOST]
[prompts ...]
Command-line AI, Multi* assembles context from optional elements in this order: 1. context file, 2. positional 'prompts', 3. stdin,
4. -a argument.
options:
-h, --help show this help message and exit
Context and prompt options:
prompts each separate argument will appear on its own line in the request
-a A append A after all other context elements
-S, --sys SYS system prompt, not used for some prompt templates
Generation properties:
-b B banned strings to remove from vocabulary, B is a word or a newline-separated string
-r [R] recover last seed from server, or reuse your favorite; makes -R ineffective
-t, --template T prompt template format, one of ['alpaca', 'chatml', 'deepseek', 'gemma', 'llama2', 'llama3', 'phi3',
'zephyr']
Filesystem related:
-d, --dir DIR directory for files, default '.'
-c, --cfg_file [CFG_FILE]
load/save named file with API properties, without file name use 'config-claim.json'
-e [E] print errors to file instead of stderr, without file name use 'claim-error.txt'
-i I image file(s) for vision, I is a single file path or a newline-separated string of file names
-o O read and append prompt context in named file, this overrides -n and -O
-O O set name for file sequence, the string '-<host>-<port>-0001.txt' (auto-increased) will be added to this
Boolean flags of operation:
-D, --debug enable debug output to 'claim-debug.txt'
-E, --events enable logging of timestamped SSE data to 'claim-tokens.txt'
-m, --min_p use Min_P and disable other samplers
-n, --new create a new context file in sequence instead of continuing last one found
-q, --quiet_input context and prompt not printed
-Q, --quiet_output AI response not printed to stdout
-R revert to the API default of new seed for each request part, default is one seed per run
-T triple-backtick template for any text read from stdin
API connection:
-l, --loop LOOP limit number of requests to given number or until Ctrl-C (< -1), default is to fill context (=-1)
-p, --port PORT port of API instance, default '5001'
-s, --host HOST server with API endpoint, default 'localhost'
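The help text above says the context is assembled from optional elements in a fixed order: context file, positional prompts, stdin, then the -a argument. A minimal sketch of that ordering follows. `assemble_context` is a hypothetical function, and joining the parts with newlines is an assumption; claim.py may separate them differently.

```python
def assemble_context(file_text=None, prompts=(), stdin_text=None, append=None):
    """Join the optional context elements in the documented order.

    Order: 1. context file, 2. positional prompts, 3. stdin, 4. -a argument.
    """
    parts = []
    if file_text:
        parts.append(file_text)
    # Each positional prompt appears on its own line.
    parts.extend(prompts)
    if stdin_text:
        parts.append(stdin_text)
    if append:
        parts.append(append)
    return "\n".join(parts)
```

For example, `assemble_context("log", ["a", "b"], "piped", "tail")` places the resumed log first and the -a text last, whichever elements happen to be present.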
$ python tokens.py "How many tokens in this question?"
8
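A token count like the one above could be requested from the server roughly as follows. This is a sketch under assumptions: the `/api/extra/tokencount` path and the `value` response field reflect the KoboldCpp API as I understand it and are not confirmed by this README, and tokens.py may work differently. Building the request is kept separate from sending it so the former can be inspected without a running server.

```python
import json
import urllib.request


def tokencount_request(prompt, host="localhost", port=5001):
    """Build (but do not send) a POST request for a token count.

    The endpoint path is an assumption about the KoboldCpp API.
    """
    url = f"http://{host}:{port}/api/extra/tokencount"
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )


def count_tokens(prompt, host="localhost", port=5001, timeout=10):
    """Send the request and return the count; assumes a JSON reply with a 'value' field."""
    request = tokencount_request(prompt, host, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["value"]
```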
$ python modelname.py
koboldcpp/Qwen3VL-8B-Instruct-Q4_K_M
$ python modelname.py -h
usage: modelname.py [-h] [-c] [-n] [-s S] [-p P]
options:
-h, --help show this help message and exit
-c show max context size
-n split name and return only model without author
-s S server hosting Kobold API
-p P port of API instance
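For orientation, querying the model name might look like the sketch below. The `/api/v1/model` path and the `{"result": ...}` response shape are assumptions about the Kobold API, not taken from this README, and the function names are hypothetical; modelname.py may do this differently.

```python
import json
import urllib.request


def api_url(path, host="localhost", port=5001):
    """Build a URL for an API path on the given server."""
    return f"http://{host}:{port}{path}"


def parse_model_name(body):
    """Extract the model name from a JSON body shaped like {"result": "<name>"}."""
    return json.loads(body)["result"]


def fetch_model_name(host="localhost", port=5001, timeout=10):
    """Ask the server for its loaded model; endpoint path is an assumption."""
    url = api_url("/api/v1/model", host, port)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_model_name(response.read())
```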