vararth/MoveNet-HumanPose-Estimation

Human Pose Estimation with MoveNet (MultiPose Lightning) — Images · Video · GIF · COCO Export

A production-ready, script-based Python project that runs Google's MoveNet MultiPose Lightning model (via TensorFlow Hub) for multi-person 2D human pose estimation on images, videos, and animated GIFs. It includes an ergonomic CLI, reliable GIF I/O, and a COCO-style exporter (with optional annotated overlay output).

Highlights: multi-person keypoints (17 COCO joints), robust GIF handling, clean src/ layout, TF-Hub model loading, COCO JSON export, Windows-friendly, and WSL2 GPU notes.


✨ Features

  • CLI for quick inference on image | video | gif
  • Clean overlays (keypoints + skeleton) with OpenCV
  • COCO-style JSON export (optionally also writes the annotated media in the same run)
  • Reliable GIF support (read/write) using imageio
  • Reproducible project layout (separate modules, ready for VS Code)
  • Windows / WSL2 friendly (GPU on WSL2 Linux; CPU on native Windows)

🔎 Keywords (for discoverability)

MoveNet, MultiPose Lightning, TensorFlow Hub, human pose estimation, multi-person pose, COCO keypoints, 2D pose, OpenCV, imageio, GIF, WSL2, Windows, Python, CLI, computer vision


🧱 Project Structure

pose_project/
├─ data/                             # put your inputs/outputs here
│  └─ .gitkeep
├─ src/
│  └─ movenet_runner/
│     ├─ __init__.py
│     ├─ config.py
│     ├─ model.py
│     ├─ draw.py
│     ├─ io_utils.py
│     ├─ infer_core.py
│     ├─ cli.py
│     └─ export_coco.py
├─ tests/
│  ├─ test_io_utils.py
│  └─ test_infer_core.py
├─ requirements.txt
├─ pyproject.toml
├─ README.md
└─ .gitignore

🚀 Quickstart

1) Create environment & install

Windows (Command Prompt / PowerShell):

python -m venv .venv
.\.venv\Scripts\activate
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

WSL2 Ubuntu (recommended for GPU with TensorFlow):

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

Note: On native Windows, TensorFlow runs CPU-only. For GPU, use WSL2 (Ubuntu).

2) Run inference

GIF → annotated GIF

pose-run --input data\input.gif --output data\output.gif --kind gif --size 256 --threshold 0.11 --fps 15

Image → annotated PNG

pose-run --input data\person.jpg --output data\person_out.png --kind image

Video → annotated MP4

pose-run --input data\clip.mp4 --output data\clip_out.mp4 --kind video --fps 30
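For context on what these commands compute per frame: MoveNet MultiPose Lightning outputs a tensor of shape [1, 6, 56] — for each of up to 6 people, 17 keypoints as (y, x, score) triplets (51 values), followed by a bounding box (ymin, xmin, ymax, xmax) and an instance confidence score. A minimal NumPy parsing sketch (the function name and structure are illustrative, not this project's actual API in infer_core.py):

```python
import numpy as np

def parse_multipose_output(raw, score_threshold=0.11):
    """Split a MoveNet MultiPose output of shape (1, 6, 56) into
    per-person keypoints and instance scores.

    Returns a list of (keypoints, instance_score) tuples, where
    keypoints is a (17, 3) array of (y, x, score) in normalized
    [0, 1] coordinates.
    """
    people = []
    for person in raw[0]:                       # iterate over the 6 person slots
        keypoints = person[:51].reshape(17, 3)  # 17 x (y, x, score)
        instance_score = float(person[55])      # last value: overall confidence
        if instance_score >= score_threshold:
            people.append((keypoints, instance_score))
    return people
```

Per-keypoint scores can then be compared against `--threshold` when deciding which joints to draw.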

📦 COCO Export (with optional overlay output)

JSON only (no overlay media):

python -m movenet_runner.export_coco ^
  --input data\input.gif ^
  --kind gif ^
  --output data\export.json

JSON + annotated overlay in the same run:

python -m movenet_runner.export_coco ^
  --input data\input.gif ^
  --kind gif ^
  --output data\export.json ^
  --overlay_out data\overlay.gif ^
  --fps 15

The same commands work for --kind image (writes a PNG) and --kind video (writes an MP4).
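For reference, a COCO-style keypoint annotation flattens each person into a [x1, y1, v1, x2, y2, v2, ...] list with a visibility flag per joint. A simplified sketch of how such a record could be assembled (field choices are illustrative; the project's actual exporter lives in export_coco.py):

```python
def make_coco_annotation(ann_id, image_id, keypoints_xys, threshold=0.11):
    """Build one COCO-style keypoint annotation.

    keypoints_xys: list of 17 (x, y, score) tuples in pixel coordinates.
    Joints below the threshold get visibility 0 and zeroed coordinates;
    the rest get visibility 2 ("labeled and visible").
    """
    flat = []
    num_visible = 0
    for x, y, score in keypoints_xys:
        if score >= threshold:
            flat.extend([round(x, 2), round(y, 2), 2])
            num_visible += 1
        else:
            flat.extend([0, 0, 0])
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": 1,          # "person" in COCO
        "keypoints": flat,
        "num_keypoints": num_visible,
    }
```

A full export would wrap such annotations in the standard COCO top-level dict ("images", "annotations", "categories").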


⚙️ CLI Reference

pose-run (inference & overlay)

--input <path>          # image/video/gif
--output <path>         # output media path
--kind [image|video|gif]
--size 256              # square model input (192/256/320)
--threshold 0.11        # keypoint visibility threshold
--fps 15                # for GIF/video writing

python -m movenet_runner.export_coco (export COCO JSON)

--input <path>          # image/video/gif
--kind [image|video|gif]
--output <path.json>    # COCO JSON
--overlay_out <path>    # (optional) also write annotated media (PNG/MP4/GIF)
--size 256
--threshold 0.11
--fps 15

📝 Notes & Tips

  • The first run downloads the model to the TF-Hub cache (by default under your user cache directory). You can customize the location with:

    TFHUB_CACHE_DIR=<custom_folder>
    
  • Performance: --size 192 is faster (slight accuracy drop). Larger sizes cost more time.

  • GIF handling: Done via imageio to avoid OpenCV GIF quirks. Output FPS is controlled by --fps.

  • Multi-person: MoveNet MultiPose Lightning returns up to 6 persons, each with 17 keypoints.

  • Coordinate mapping: Frames are letterboxed to a square for the model, then keypoints are mapped back to original resolution for overlay and COCO export.
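
The letterboxing round-trip mentioned above can be sketched as follows: scale the frame to fit a size × size square, pad the short side, and later invert that transform to map the model's normalized coordinates back to original pixels (helper names are illustrative, not the project's actual API):

```python
def letterbox_params(width, height, size):
    """Scale factor and (x, y) padding used to fit a frame into a
    size x size square while preserving aspect ratio."""
    scale = size / max(width, height)
    pad_x = (size - width * scale) / 2.0
    pad_y = (size - height * scale) / 2.0
    return scale, pad_x, pad_y

def model_to_original(norm_x, norm_y, width, height, size):
    """Map a normalized (x, y) point in the padded square back to
    pixel coordinates in the original frame."""
    scale, pad_x, pad_y = letterbox_params(width, height, size)
    x = (norm_x * size - pad_x) / scale
    y = (norm_y * size - pad_y) / scale
    return x, y
```

For a 640×360 frame at size 256, the frame is scaled by 0.4 and padded 56 px top and bottom, so the normalized center (0.5, 0.5) maps back to (320, 180).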


🧪 Testing (optional but recommended)

pytest -q

🐛 Troubleshooting

  • "ModuleNotFoundError: movenet_runner": ensure you ran pip install -e . in the activated venv, or run with PYTHONPATH=./src.

  • Red squiggles in VS Code (unresolved imports): select the interpreter (Python: Select Interpreter → choose your project .venv). Optionally add a .env file with PYTHONPATH=${workspaceFolder}/src.

  • Slow on Windows: expected with CPU-only TensorFlow. Use WSL2 for GPU acceleration.


📄 License

MIT © 2025 Siddharth Varshney


🙏 Acknowledgments

  • Google MoveNet MultiPose Lightning (TensorFlow Hub)
  • OpenCV, ImageIO, NumPy
