Fix LlamaDemo temperature, lifecycle, and LLM memory support by psiddh · Pull Request #239 · meta-pytorch/executorch-examples

psiddh · 2026-05-24T08:55:27Z

Summary

fbjni 0.7.0: Required for compatibility with main_exp AAR builds.
largeHeap="true": Models with a 4096-token context window can use 2.5 GB+ of RSS. This requests Android's large heap for the app process.

Context

These fixes were discovered while validating multi-turn conversation support with Qwen3 4B (INT8-INT4) on a Samsung Galaxy S23 with max_context_length=4096. Note that the temperature had to be set to a nonzero value (for example, 0.5); otherwise every model appeared broken, generating only 0-1 tokens.
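To illustrate why a temperature of 0.0 can produce such short outputs, here is a minimal sketch of temperature sampling. It is not the ExecuTorch sampler: `TemperatureSampler` and its methods are hypothetical names, and treating `temperature <= 0` as greedy decoding is an assumption of this sketch. With greedy decoding the highest-logit token always wins, so if that token happens to be EOS at the first step, generation stops immediately; a nonzero temperature gives the other tokens a chance.

```java
import java.util.Random;

// Hypothetical helper for illustration; not part of ExecuTorch or the LlamaDemo app.
final class TemperatureSampler {
    private TemperatureSampler() {}

    // Index of the largest logit (greedy decoding).
    static int argmax(float[] logits) {
        int best = 0;
        for (int i = 1; i < logits.length; i++) {
            if (logits[i] > logits[best]) best = i;
        }
        return best;
    }

    // Softmax over logits / temperature; the max is subtracted for numerical stability.
    static double[] softmax(float[] logits, double temperature) {
        double max = logits[argmax(logits)] / temperature;
        double[] probs = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            probs[i] = Math.exp(logits[i] / temperature - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) probs[i] /= sum;
        return probs;
    }

    // Assumption of this sketch: temperature <= 0 means greedy decoding.
    static int sample(float[] logits, double temperature, Random rng) {
        if (temperature <= 0.0) return argmax(logits);
        double[] probs = softmax(logits, temperature);
        double r = rng.nextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < probs.length; i++) {
            cumulative += probs[i];
            if (r < cumulative) return i;
        }
        return probs.length - 1; // guard against floating-point rounding
    }
}
```

If index 0 stands for EOS in a toy logit vector such as `{2.0f, 1.5f, 0.5f}`, `sample(logits, 0.0, rng)` returns 0 on every call, while `sample(logits, 0.5, rng)` returns other indices some of the time.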

Test plan

Build APK with ./gradlew assembleDebug
Load Llama 3.2 1B or Qwen3 4B model with XNNPACK backend
Verify model generates multi-token responses (temperature fix)
Verify multi-turn conversation works with KV cache accumulation
Verify no resource leaks on app close/rotate (onCleared fix)

Authored with Claude.

…support

- Change DEFAULT_TEMPERATURE from 0.0 to 0.6 to prevent greedy decoding from hitting the EOS token immediately on the first generated token
- Add onCleared() lifecycle handler to ChatViewModel for proper Module resource cleanup when the ViewModel is destroyed
- Update fbjni to 0.7.0 for main_exp AAR compatibility
- Enable largeHeap in AndroidManifest for large LLM model support

Authored with Claude.

…mperature

- Remove onCleared() override: LlmModule.close() was added in the Kotlin conversion (#19211), but the published Maven AAR still ships the old Java version without it
- Revert DEFAULT_TEMPERATURE to 0.0: users can set temperature via the UI settings, so there is no need to change the default

Authored with Claude.
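The cleanup idea behind the removed onCleared() override can be sketched without depending on a close() method that the published AAR lacks. The following is a rough illustration only: `ModuleHolder` is a hypothetical class, and the sketch assumes that a module type with cleanup support would implement `AutoCloseable`, which the PR does not state. It is not the ChatViewModel code or the ExecuTorch API.

```java
// Hypothetical wrapper for illustration; not the LlamaDemo ChatViewModel or the ExecuTorch API.
final class ModuleHolder {
    private Object module;

    ModuleHolder(Object module) {
        this.module = module;
    }

    // Releases the held module at most once.
    // Returns true only if close() was actually called and succeeded.
    boolean release() {
        Object m = module;
        module = null; // drop the reference either way so it can be garbage collected
        if (m instanceof AutoCloseable) {
            try {
                ((AutoCloseable) m).close();
                return true;
            } catch (Exception e) {
                return false;
            }
        }
        return false; // nothing to close, or already released
    }
}
```

A lifecycle callback such as onCleared() could then call `release()` unconditionally: with a module type that has no cleanup support, the call only drops the reference.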

meta-cla Bot added the CLA Signed label May 24, 2026

psiddh requested a review from kirklandsign May 24, 2026 23:17

psiddh wants to merge 2 commits into main from fix/llama-demo-improvements
