Natural language control for robotic manipulation through LLM-generated code
A robotic task planner that translates natural language commands into executable manipulation tasks using large language models and physics simulation.
The system enables intuitive robot control through conversational commands. Users describe tasks in plain English, which are converted into executable Python code by an LLM, reviewed, and executed in a physics-accurate simulation environment.
Key capabilities:
- Natural language interface via Streamlit chat
- Real-time MuJoCo physics simulation with visual feedback
- Code generation and review workflow for safe execution
- Robust pick-and-place primitives with error handling
- Rich perception API optimized for LLM reasoning
Prerequisites:
- Git and uv (used by the setup commands below)
- Ollama (for running the local LLM)
Setup:
git clone https://github.com/kalaiselvan-t/llm-task-planner.git
cd llm-task-planner
uv sync
ollama pull qwen2.5-coder:14b
# Start Ollama (if not already running)
ollama serve
# Launch application
uv run streamlit run app.py
The Streamlit interface opens at http://localhost:8501 while a MuJoCo viewer window displays the simulation.
Example commands:
- "Pick up the red cube"
- "Place the blue cube next to the green cube"
- "Stack all cubes by size, largest at the bottom"
- "Move all red objects to the left side of the table"
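To make the stacking command concrete, here is a sketch of the kind of plan the LLM might generate for "Stack all cubes by size, largest at the bottom". The cube data, stack location, and the `plan` list are hypothetical stand-ins so the ordering logic runs standalone; in the app the data would come from `sim.get_graspable_objects()` and each step would call `sim.pick()` / `sim.place()`.

```python
TABLE_Z = 0.165               # table surface height (see configuration section)
STACK_X, STACK_Y = 0.5, 0.0   # arbitrary stack location inside the workspace

# Hypothetical data; in the app this comes from sim.get_graspable_objects()
cubes = [
    {"name": "red_cube", "size": 0.03},
    {"name": "blue_cube", "size": 0.05},
    {"name": "green_cube", "size": 0.04},
]

# Largest cube first, so it ends up at the bottom of the stack
cubes.sort(key=lambda c: c["size"], reverse=True)

plan = []
z = TABLE_Z
for cube in cubes:
    # A real run would call sim.pick(cube["name"]) and then
    # sim.place([STACK_X, STACK_Y, z]) for each step
    plan.append((cube["name"], [STACK_X, STACK_Y, round(z, 3)]))
    z += cube["size"]  # next cube rests on top of this one

print(plan)
```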
Workflow:
- Enter natural language prompt
- Review generated code (regenerate if needed)
- Execute and observe in real-time simulation
- Iterate with additional commands
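The review step gates execution: generated code only runs after the user approves it. A minimal sketch of that gate, assuming a `generate_code()` stand-in for the Ollama call and a fake `sim` object for demonstration (both hypothetical, not the actual implementation):

```python
def generate_code(prompt):
    # Hypothetical stand-in: in the app this streams code from the LLM
    return 'sim.pick("red_cube")'

def review_and_execute(prompt, approved, namespace):
    """Run generated code only after explicit user approval."""
    code = generate_code(prompt)
    if not approved:
        return None          # user chose "Regenerate" instead of approving
    exec(code, namespace)    # executed only after review
    return code

# Demonstration with a fake simulator that records calls
calls = []

class FakeSim:
    def pick(self, name):
        calls.append(("pick", name))

review_and_execute("Pick up the red cube", True, {"sim": FakeSim()})
print(calls)
```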
# Pick object by name
sim.pick("red_cube")
# Place at target position [x, y, z]
sim.place([0.5, 0.0, 0.45])
# Return to home configuration
sim.move_to_home()
# Gripper control
sim.open_gripper()
sim.close_gripper()
sim.set_gripper(128)  # 0=closed, 255=open
# Query all movable objects
objects = sim.get_graspable_objects()
# Returns: [{"name": str, "pos": [x,y,z], "color": str, "size": float, "shape": str}, ...]
# Find object by properties
obj = sim.find_graspable_object(color="red", shape="cube")
# Get position by name
pos = sim.get_object_position("red_cube")
# Count objects with filters
count = sim.count_graspable_objects(color="red")
# Calculate offset from object
target = sim.get_position_offset("red_cube", dx=0.1, dy=0.0, dz=0.05)
# Validate reachability
if sim.is_position_reachable([0.5, 0.2, 0.45]):
    sim.place([0.5, 0.2, 0.45])
# Get workspace bounds
bounds = sim.get_workspace_bounds()
# Returns: {"x": [0.3, 0.8], "y": [-0.3, 0.3], "z": [0.0, 0.8]}
The system consists of two threads synchronized via event-based communication:
Main Thread (Streamlit UI):
- Chat interface for user interaction
- LLM code generation and streaming
- Code review and execution control
Background Thread (MuJoCo Viewer):
- Persistent simulation loop
- Task execution in viewer context
- Real-time visualization with wall-clock sync
Synchronization: TaskExecutor mediates between threads using threading.Event for simple, blocking task submission without queues.
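A minimal sketch of this event-based handoff, assuming illustrative method names (`submit`, `poll`) rather than the exact implementation: the UI thread blocks in `submit()` until the viewer loop has run the task.

```python
import threading

class TaskExecutor:
    """Sketch of event-based task handoff between UI and viewer threads."""

    def __init__(self):
        self._task = None
        self._pending = threading.Event()  # UI -> viewer: a task is ready
        self._done = threading.Event()     # viewer -> UI: task finished

    def submit(self, fn):
        """Called from the UI thread; blocks until the viewer ran the task."""
        self._task = fn
        self._done.clear()
        self._pending.set()
        self._done.wait()

    def poll(self):
        """Called each iteration of the viewer loop; runs a queued task."""
        if self._pending.is_set():
            self._pending.clear()
            self._task()
            self._done.set()

# Demonstration: a background "viewer loop" services submitted tasks
executor = TaskExecutor()
results = []
stop = threading.Event()

def viewer_loop():
    while not stop.is_set():
        executor.poll()

t = threading.Thread(target=viewer_loop, daemon=True)
t.start()
executor.submit(lambda: results.append("picked red_cube"))  # blocks until done
stop.set()
t.join()
print(results)
```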
llm-task-planner/
├── src/llm_task_planner/
│ ├── __main__.py # UI, viewer thread, task executor
│ ├── core/
│ │ ├── simulator.py # Main simulator (combines mixins)
│ │ ├── perception.py # Object queries and spatial reasoning
│ │ ├── motion.py # Inverse kinematics and gripper control
│ │ ├── tasks.py # Pick-and-place primitives
│ │ └── llm_interface.py # Ollama integration
│ └── config/
│ └── simulation_params.yaml
├── franka_emika_panda/
│ └── panda_with_table.xml # Scene definition with objects
├── scripts/
│ └── spawn_cubes.py # Cube object generator
└── tests/
Workspace bounds (defined in perception.py):
- X: 0.3 to 0.8m | Y: -0.3 to 0.3m | Z: 0.0 to 0.8m
- Table surface: 0.165m height
Task execution constants (defined in tasks.py):
- Approach height: 0.1m
- Place height: 0.01m
- Motion timeout: 2000 steps (~4s)
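For reference, a reachability check against these bounds might look like the sketch below; the actual logic lives in perception.py and may differ.

```python
# Workspace limits from the configuration section above
WORKSPACE = {"x": (0.3, 0.8), "y": (-0.3, 0.3), "z": (0.0, 0.8)}

def is_position_reachable(pos):
    """Return True if [x, y, z] lies inside the workspace bounds."""
    x, y, z = pos
    return (WORKSPACE["x"][0] <= x <= WORKSPACE["x"][1]
            and WORKSPACE["y"][0] <= y <= WORKSPACE["y"][1]
            and WORKSPACE["z"][0] <= z <= WORKSPACE["z"][1])

print(is_position_reachable([0.5, 0.2, 0.45]))  # True
print(is_position_reachable([1.0, 0.2, 0.45]))  # False: x exceeds 0.8 m
```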
LLM configuration:
llm = LLM(model="qwen2.5-coder:14b", base_url="http://localhost:11434")
Ollama connection error:
ollama serve  # Ensure Ollama is running
Robot not responding:
- Verify object names are exact matches (case-sensitive)
- Check positions are within workspace bounds
- Ensure MuJoCo viewer window is responsive
Invalid generated code:
- Click "Regenerate" for alternative solution
- Check Ollama model is loaded:
ollama list
Direct API usage example:
from llm_task_planner.core.simulator import Simulator
sim = Simulator()
sim.move_to_home()
# Test perception
objects = sim.get_graspable_objects()
print(f"Found {len(objects)} objects")
# Test manipulation
red_cube = sim.find_graspable_object(color="red")
if red_cube:
    sim.pick(red_cube["name"])
    sim.place([0.5, 0.0, 0.20])
License: MIT
