Add jbang script and configuration to make it easy to run (#90)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged: mikepapadim merged 9 commits into beehive-lab:main from mikepapadim:claude/java-cli-llama-tornado-b1SUZ on Dec 16, 2025
Commits:
- 7e57e5a Add pure Java CLI with JBang support for llama-tornado (claude)
- 7494855 Fix: Remove JBang directives from Maven-compiled CLI version (claude)
- 5f03970 Fix: Remove shebang from JBang CLI to match TornadoVM pattern (claude)
- 5e5c0b1 Add TornadoFlags.java for proper TornadoVM JBang configuration (claude)
- 80040bf Fix jbang (mikepapadim)
- 5ff078f Update GPU Llama3 dependency to version 0.3.2-dev and enable usage he… (mikepapadim)
- f539fe8 Remove deprecated JBang example script from repository (mikepapadim)
- b624d57 Update Makefile to change default target from 'package' to 'install' (mikepapadim)
- 7b8bd75 Remove JBang vs llama-tornado comparison section from README.md (mikepapadim)
New file, `LlamaTornadoCli.java` (145 lines):

```java
//JAVA 21
//PREVIEW
//DEPS io.github.beehive-lab:gpu-llama3:0.3.2-dev
//DEPS io.github.beehive-lab:tornado-api:2.1.0
//DEPS io.github.beehive-lab:tornado-runtime:2.1.0

//SOURCES TornadoFlags.java
// === Set to not get annoying warnings about annotation processing
//JAVAC_OPTIONS -proc:full

// Compiler options
//JAVAC_OPTIONS --enable-preview
//JAVAC_OPTIONS --add-modules=jdk.incubator.vector

// JVM options for basic setup
//JAVA_OPTIONS --enable-preview
//JAVA_OPTIONS --add-modules=jdk.incubator.vector

package org.beehive.gpullama3.cli;

import org.beehive.gpullama3.Options;
import org.beehive.gpullama3.auxiliary.LastRunMetrics;
import org.beehive.gpullama3.inference.sampler.Sampler;
import org.beehive.gpullama3.model.Model;

import java.io.IOException;

import static org.beehive.gpullama3.inference.sampler.Sampler.createSampler;
import static org.beehive.gpullama3.model.loader.ModelLoader.loadModel;

/**
 * LlamaTornadoCli - Pure Java CLI for running llama-tornado models
 *
 * This class provides a standalone command-line interface for running LLaMA models
 * with TornadoVM acceleration. It can be executed directly with JBang or as a
 * compiled Java application.
 *
 * Usage with JBang:
 *   jbang LlamaTornadoCli.java --model path/to/model.gguf --prompt "Your prompt here"
 *
 * Usage as compiled application:
 *   java --enable-preview --add-modules jdk.incubator.vector \
 *       -cp target/gpu-llama3-0.3.1.jar \
 *       org.beehive.gpullama3.cli.LlamaTornadoCli \
 *       --model path/to/model.gguf --prompt "Your prompt here"
 *
 * Examples:
 *   # Interactive chat mode
 *   jbang LlamaTornadoCli.java -m model.gguf --interactive
 *
 *   # Single instruction mode
 *   jbang LlamaTornadoCli.java -m model.gguf -p "Explain quantum computing"
 *
 *   # With TornadoVM acceleration
 *   jbang LlamaTornadoCli.java -m model.gguf -p "Hello" --use-tornadovm true
 *
 *   # Custom temperature and sampling
 *   jbang LlamaTornadoCli.java -m model.gguf -p "Tell me a story" \
 *       --temperature 0.7 --top-p 0.9 --max-tokens 512
 */
public class LlamaTornadoCli {

    // Configuration flags
    public static final boolean USE_VECTOR_API = Boolean.parseBoolean(
            System.getProperty("llama.VectorAPI", "true"));
    public static final boolean SHOW_PERF_INTERACTIVE = Boolean.parseBoolean(
            System.getProperty("llama.ShowPerfInteractive", "true"));

    /**
     * Run a single instruction and display the response
     */
    private static void runSingleInstruction(Model model, Sampler sampler, Options options) {
        String response = model.runInstructOnce(sampler, options);
        System.out.println(response);
        if (SHOW_PERF_INTERACTIVE) {
            LastRunMetrics.printMetrics();
        }
    }

    /**
     * Main entry point for the CLI application
     *
     * @param args command-line arguments (see Options.parseOptions for details)
     * @throws IOException if model loading fails
     */
    public static void main(String[] args) throws IOException {
        // Print banner
        printBanner();

        // Check if help requested
        if (args.length == 0 || hasHelpFlag(args)) {
            Options.printUsage(System.out);
            System.exit(0);
        }

        try {
            // Parse options
            Options options = Options.parseOptions(args);

            // Load model
            Model model = loadModel(options);

            // Create sampler
            Sampler sampler = createSampler(model, options);

            // Run in interactive or single-instruction mode
            if (options.interactive()) {
                System.out.println("Starting interactive chat mode...");
                System.out.println("Type your messages below (Ctrl+C to exit):");
                System.out.println();
                model.runInteractive(sampler, options);
            } else {
                runSingleInstruction(model, sampler, options);
            }
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            e.printStackTrace();
            System.exit(1);
        }
    }

    /**
     * Check if help flag is present in arguments
     */
    private static boolean hasHelpFlag(String[] args) {
        for (String arg : args) {
            if (arg.equals("--help") || arg.equals("-h")) {
                return true;
            }
        }
        return false;
    }

    /**
     * Print ASCII banner
     */
    private static void printBanner() {
        System.out.println("""
                ╔══════════════════════════════════════════════════════════╗
                ║         Llama-Tornado CLI - GPU-Accelerated LLM          ║
                ║              Powered by TornadoVM & Java 21              ║
                ╚══════════════════════════════════════════════════════════╝
                """);
    }
}
```
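The `USE_VECTOR_API` and `SHOW_PERF_INTERACTIVE` fields above follow a common Java pattern: read a JVM system property with a string default, so behavior can be toggled with `-D` flags at launch without recompiling. A minimal, self-contained sketch of that pattern (the `FlagDemo` class and `flag` helper are hypothetical, not part of this PR):

```java
public class FlagDemo {
    // Read a boolean flag from a JVM system property, falling back to a default
    // when the property is unset. Boolean.parseBoolean never throws: anything
    // other than "true" (case-insensitive) yields false.
    static boolean flag(String property, String defaultValue) {
        return Boolean.parseBoolean(System.getProperty(property, defaultValue));
    }

    public static void main(String[] args) {
        // Equivalent to launching with -Dllama.VectorAPI=false on the command line.
        System.setProperty("llama.VectorAPI", "false");
        System.out.println("VectorAPI enabled: " + flag("llama.VectorAPI", "true"));
        // Unset property: the default "true" applies.
        System.out.println("Show perf: " + flag("llama.ShowPerfInteractive", "true"));
    }
}
```

With JBang, the same toggle would be passed as `jbang -Dllama.VectorAPI=false LlamaTornadoCli.java ...`.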
Review comment: There is inconsistent spacing in the banner text block. The line containing "Powered" has extra trailing spaces compared to the other lines, which breaks the vertical alignment of the banner ASCII art.
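One way to avoid this class of misalignment is to pad each banner line's interior text to a fixed width programmatically rather than hand-aligning spaces inside a text block. A hedged sketch of that approach (the `Banner` class and `line` helper are hypothetical, not code from this PR):

```java
public class Banner {
    // Pad the interior text to a fixed width so the right-hand border
    // character lands in the same column on every line. "%-Ns" left-justifies
    // the text and pads with spaces up to N characters.
    static String line(String text, int width) {
        return "║ " + String.format("%-" + width + "s", text) + " ║";
    }

    public static void main(String[] args) {
        int width = 56;
        System.out.println("╔" + "═".repeat(width + 2) + "╗");
        System.out.println(line("Llama-Tornado CLI - GPU-Accelerated LLM", width));
        System.out.println(line("Powered by TornadoVM & Java 21", width));
        System.out.println("╚" + "═".repeat(width + 2) + "╝");
    }
}
```

Every printed line then has identical length regardless of the text, so trailing-space typos cannot skew the box.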