LLM Task Planner


Natural language control for robotic manipulation through LLM-generated code

Demo

A robotic task planner that translates natural language commands into executable manipulation tasks using large language models and physics simulation.

Overview

The system enables intuitive robot control through conversational commands. Users describe tasks in plain English, which are converted into executable Python code by an LLM, reviewed, and executed in a physics-accurate simulation environment.

Key capabilities:

  • Natural language interface via Streamlit chat
  • Real-time MuJoCo physics simulation with visual feedback
  • Code generation and review workflow for safe execution
  • Robust pick-and-place primitives with error handling
  • Rich perception API optimized for LLM reasoning

Installation

Prerequisites:

  • Python 3.11 or higher
  • uv package manager
  • Ollama with qwen2.5-coder:14b model

Setup:

git clone https://github.com/kalaiselvan-t/llm-task-planner.git
cd llm-task-planner
uv sync
ollama pull qwen2.5-coder:14b

Quick Start

# Start Ollama (if not already running)
ollama serve

# Launch application
uv run streamlit run app.py

The Streamlit interface opens at http://localhost:8501, and a separate MuJoCo viewer window displays the simulation.

Example commands:

  • "Pick up the red cube"
  • "Place the blue cube next to the green cube"
  • "Stack all cubes by size, largest at the bottom"
  • "Move all red objects to the left side of the table"

Workflow:

  1. Enter natural language prompt
  2. Review generated code (regenerate if needed)
  3. Execute and observe in real-time simulation
  4. Iterate with additional commands

API Reference

Core Manipulation

# Pick object by name
sim.pick("red_cube")

# Place at target position [x, y, z]
sim.place([0.5, 0.0, 0.45])

# Return to home configuration
sim.move_to_home()

# Gripper control
sim.open_gripper()
sim.close_gripper()
sim.set_gripper(128)  # 0=closed, 255=open
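These primitives compose directly into complete routines. A minimal sketch, assuming sim exposes the methods shown above (the move_cube helper itself is hypothetical, not part of the API):

```python
def move_cube(sim, name, target):
    """Pick the named object and place it at target [x, y, z].

    Hypothetical helper composing the core primitives documented above;
    sim is any object exposing pick / place / move_to_home.
    """
    sim.pick(name)        # grasp the object by name
    sim.place(target)     # release at the target position
    sim.move_to_home()    # return to a known configuration

# Example: move the red cube to the table center
# move_cube(sim, "red_cube", [0.5, 0.0, 0.45])
```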

Perception

# Query all movable objects
objects = sim.get_graspable_objects()
# Returns: [{"name": str, "pos": [x,y,z], "color": str, "size": float, "shape": str}, ...]

# Find object by properties
obj = sim.find_graspable_object(color="red", shape="cube")

# Get position by name
pos = sim.get_object_position("red_cube")

# Count objects with filters
count = sim.count_graspable_objects(color="red")
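To illustrate how the perception API supports LLM reasoning, here is a sketch of the kind of code the model might generate for "Stack all cubes by size, largest at the bottom". The stack_by_size helper, base position, and 0.05 m layer offset are illustrative assumptions, not project constants:

```python
def stack_by_size(sim, base_pos=(0.5, 0.0, 0.45)):
    """Stack all graspable cubes at base_pos, largest at the bottom.

    Illustrative sketch only; the 0.05 m per-layer offset is a guess.
    """
    cubes = [o for o in sim.get_graspable_objects() if o["shape"] == "cube"]
    cubes.sort(key=lambda o: o["size"], reverse=True)  # largest first
    x, y, z = base_pos
    for i, cube in enumerate(cubes):
        sim.pick(cube["name"])
        sim.place([x, y, z + i * 0.05])  # raise the target for each layer
    sim.move_to_home()
```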

Spatial Reasoning

# Calculate offset from object
target = sim.get_position_offset("red_cube", dx=0.1, dy=0.0, dz=0.05)

# Validate reachability
if sim.is_position_reachable([0.5, 0.2, 0.45]):
    sim.place([0.5, 0.2, 0.45])

# Get workspace bounds
bounds = sim.get_workspace_bounds()
# Returns: {"x": [0.3, 0.8], "y": [-0.3, 0.3], "z": [0.0, 0.8]}
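The offset and reachability helpers combine naturally for relative placement, e.g. "Place the blue cube next to the green cube". A sketch under the API above (the place_next_to helper is hypothetical):

```python
def place_next_to(sim, obj_name, anchor_name, dx=0.1):
    """Place obj_name beside anchor_name, offset dx metres along x.

    Returns False without picking anything when the computed target
    lies outside the reachable workspace.
    """
    target = sim.get_position_offset(anchor_name, dx=dx, dy=0.0, dz=0.0)
    if not sim.is_position_reachable(target):
        return False
    sim.pick(obj_name)
    sim.place(target)
    return True
```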

Architecture

The system consists of two threads synchronized via event-based communication:

Main Thread (Streamlit UI):

  • Chat interface for user interaction
  • LLM code generation and streaming
  • Code review and execution control

Background Thread (MuJoCo Viewer):

  • Persistent simulation loop
  • Task execution in viewer context
  • Real-time visualization with wall-clock sync

Synchronization: TaskExecutor mediates between threads using threading.Event for simple, blocking task submission without queues.
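A minimal sketch of this event-based hand-off pattern (illustrative only; the project's actual TaskExecutor may differ):

```python
import threading

class TaskExecutor:
    """Blocking task hand-off between two threads via threading.Event.

    The UI thread calls submit() and blocks until the task finishes;
    the viewer thread calls poll() inside its simulation loop and runs
    any pending task in its own context. No queue is needed because
    only one task is in flight at a time.
    """

    def __init__(self):
        self._task = None
        self._result = None
        self._pending = threading.Event()  # set: a task is waiting
        self._done = threading.Event()     # set: the task has finished

    def submit(self, fn):
        """Called from the UI thread; blocks until fn has run."""
        self._task = fn
        self._done.clear()
        self._pending.set()
        self._done.wait()
        return self._result

    def poll(self):
        """Called from the viewer loop; runs a pending task, if any."""
        if self._pending.is_set():
            self._pending.clear()
            self._result = self._task()
            self._done.set()
```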

Project Structure

llm-task-planner/
├── src/llm_task_planner/
│   ├── __main__.py              # UI, viewer thread, task executor
│   ├── core/
│   │   ├── simulator.py         # Main simulator (combines mixins)
│   │   ├── perception.py        # Object queries and spatial reasoning
│   │   ├── motion.py            # Inverse kinematics and gripper control
│   │   ├── tasks.py             # Pick-and-place primitives
│   │   └── llm_interface.py     # Ollama integration
│   └── config/
│       └── simulation_params.yaml
├── franka_emika_panda/
│   └── panda_with_table.xml     # Scene definition with objects
├── scripts/
│   └── spawn_cubes.py           # Cube object generator
└── tests/

Configuration

Workspace bounds (defined in perception.py):

  • X: 0.3 to 0.8 m | Y: -0.3 to 0.3 m | Z: 0.0 to 0.8 m
  • Table surface height: 0.165 m

Task execution constants (defined in tasks.py):

  • Approach height: 0.1m
  • Place height: 0.01m
  • Motion timeout: 2000 steps (~4s)

LLM configuration:

llm = LLM(model="qwen2.5-coder:14b", base_url="http://localhost:11434")

Troubleshooting

Ollama connection error:

ollama serve  # Ensure Ollama is running

Robot not responding:

  • Verify object names are exact matches (case-sensitive)
  • Check positions are within workspace bounds
  • Ensure MuJoCo viewer window is responsive

Invalid generated code:

  • Click "Regenerate" for an alternative solution
  • Check Ollama model is loaded: ollama list

Development

Direct API usage example:

from llm_task_planner.core.simulator import Simulator

sim = Simulator()
sim.move_to_home()

# Test perception
objects = sim.get_graspable_objects()
print(f"Found {len(objects)} objects")

# Test manipulation
red_cube = sim.find_graspable_object(color="red")
if red_cube:
    sim.pick(red_cube["name"])
    sim.place([0.5, 0.0, 0.20])

License

MIT
