GridWorld

GridWorld is a small grid-environment project for experimenting with pathfinding and partially observable policy learning. The repository is now intentionally focused on two workflows only:

deterministic A* search
the file-backed PyTorch PPO example

The core gridworld package still provides the environment, rendering, transition model construction, and the gym-like reset / step interface used by those two paths.

Installation

This repo targets Python 3.13+ on CPython and uses uv to install the interpreter, create the virtual environment, and install packages.

git clone https://github.com/StengerJ/GridFinder
cd GridFinder
uv python install 3.13
uv venv .venv --python 3.13

Activate the virtual environment:

# Windows PowerShell
.venv\Scripts\Activate.ps1

# macOS / Linux
source .venv/bin/activate

Install dependencies:

uv pip install -r Requirements.txt
uv pip install -e .

Core Grid Usage

from gridworld import GridWorld

world = """
wwwww
wa gw
w o w
wwwww
"""

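# Build the environment from the ASCII map above (one character per cell).
env = GridWorld(world, slip=0.0, random_state=7)

# reset() returns the initial state; step() returns (next_state, reward, done, info).
state = env.reset()
next_state, reward, done, info = env.step(0, testing=True)

print(state, next_state, reward, done, info)
# Shapes of the transition model (P_sas) and reward (R_sa) arrays.
print(env.P_sas.shape, env.R_sa.shape)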
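
The reset / step interface above is typically driven by an episode loop. The sketch below shows one way to write such a loop against any object that exposes reset() and a step(action) returning (next_state, reward, done, info). CorridorEnv is a hypothetical stand-in used only to keep the example self-contained; it is not the GridWorld class, and the sketch does not pass the testing argument shown above.

```python
class CorridorEnv:
    """Hypothetical stand-in (NOT gridworld.GridWorld): a 1-D corridor
    with a gym-like reset / step interface."""

    def __init__(self, length=4):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # Action 1 moves right; any other action moves left (clamped at 0).
        move = 1 if action == 1 else -1
        self.position = max(0, self.position + move)
        done = self.position >= self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done, {}


def rollout(env, policy, max_steps=100):
    """Run one episode and return (visited states, total reward)."""
    state = env.reset()
    trajectory = [state]
    total_reward = 0.0
    for _ in range(max_steps):
        state, reward, done, _info = env.step(policy(state))
        trajectory.append(state)
        total_reward += reward
        if done:
            break
    return trajectory, total_reward


# An "always move right" policy reaches the end of the corridor in 3 steps.
trajectory, total_reward = rollout(CorridorEnv(length=4), policy=lambda s: 1)
print(trajectory, total_reward)
```

The same rollout function should work with a GridWorld instance as long as its step method accepts a single action argument, which is an assumption here.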

A* Search

The deterministic A* example lives in examples/search/Astar/ and reads committed maps from examples/maps/search/.

25 small maps are stored in examples/maps/search/small/
25 large maps are stored in examples/maps/search/large/

Example commands:

python examples/search/Astar/main.py --world small
python examples/search/Astar/main.py --world big
python examples/search/Astar/main.py --world small --variant 7
python examples/search/Astar/main.py --world big --variant 12 --no-render
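
For readers new to the algorithm, here is a minimal, self-contained A* sketch over an ASCII map in the style of the Core Grid Usage example. It is not the code in examples/search/Astar/. It assumes that 'w' marks a wall, 'a' the start, and 'g' the goal (inferred from the sample map), treats every other cell as passable at unit cost, allows 4-connected moves, and uses the Manhattan distance as the heuristic. The function names are illustrative.

```python
import heapq


def parse_world(world):
    """Split an ASCII map into rows and locate the start ('a') and goal ('g')."""
    rows = [line for line in world.splitlines() if line.strip()]
    start = goal = None
    for r, row in enumerate(rows):
        for c, ch in enumerate(row):
            if ch == "a":
                start = (r, c)
            elif ch == "g":
                goal = (r, c)
    return rows, start, goal


def astar(world):
    """Return the shortest path as a list of (row, col) cells, or None."""
    rows, start, goal = parse_world(world)
    if start is None or goal is None:
        return None

    def heuristic(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]  # entries are (f, g, cell)
    came_from = {start: None}
    g_cost = {start: 0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > g_cost[cell]:
            continue  # stale queue entry; a cheaper route was found already
        r, c = cell
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < len(rows) and 0 <= nc < len(rows[nr])):
                continue
            if rows[nr][nc] == "w":  # assumption: only 'w' blocks movement
                continue
            new_g = g + 1
            if new_g < g_cost.get((nr, nc), float("inf")):
                g_cost[(nr, nc)] = new_g
                came_from[(nr, nc)] = cell
                heapq.heappush(frontier, (new_g + heuristic((nr, nc)), new_g, (nr, nc)))
    return None


WORLD = """
wwwww
wa gw
w o w
wwwww
"""
print(astar(WORLD))
```

On the sample map the start and goal share a row with one free cell between them, so the returned path has three cells.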

File-Backed PPO

The PPO example lives in examples/Policy-Optimization/ and trains on committed map corpora stored under examples/maps/policy_optimization/.

train/stage1, train/stage2, train/stage3 each contain 64 maps
eval/stage1, eval/stage2, eval/stage3 each contain 16 held-out maps
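
A quick way to check a local corpus against the counts above is to count the files in each split/stage folder. This is a sketch, not part of the repository: count_maps is a hypothetical helper, and because the map file extension is not stated here it counts every regular file in each folder.

```python
from pathlib import Path


def count_maps(corpus_root):
    """Return {"train/stage1": n, ...} with the number of regular files
    found in each split/stage folder (0 if the folder is missing)."""
    root = Path(corpus_root)
    counts = {}
    for split in ("train", "eval"):
        for stage in ("stage1", "stage2", "stage3"):
            folder = root / split / stage
            if folder.is_dir():
                counts[f"{split}/{stage}"] = sum(1 for p in folder.iterdir() if p.is_file())
            else:
                counts[f"{split}/{stage}"] = 0
    return counts
```

Calling count_maps("examples/maps/policy_optimization") from the repository root should report 64 for each train stage and 16 for each eval stage if the corpus matches the description above.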

Useful commands:

python examples/Policy-Optimization/generate_map_corpus.py --force
python examples/Policy-Optimization/train_ppo.py
python examples/Policy-Optimization/evaluate_policy.py --checkpoint logs/policy_optimization/checkpoints/best.pt --stage 3
python examples/Policy-Optimization/render_episode.py --checkpoint logs/policy_optimization/checkpoints/best.pt --stage 3

More detail is in examples/Policy-Optimization/README.md.
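
As background on what a PPO trainer optimizes, the sketch below computes the standard clipped surrogate objective in plain Python. It assumes the example uses this common PPO variant, which this README does not state; the function name and the clip_eps default of 0.2 are illustrative, and the actual trainer is written in PyTorch.

```python
def clipped_surrogate(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    ratios:     new-policy / old-policy probability of each taken action
    advantages: advantage estimate for each taken action
    """
    terms = []
    for ratio, advantage in zip(ratios, advantages):
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(ratio * advantage, clipped_ratio * advantage))
    return sum(terms) / len(terms)


# A ratio of 1.5 with positive advantage is capped at 1.2;
# a ratio of 0.5 with negative advantage is evaluated at 0.8.
print(clipped_surrogate([1.5, 0.5], [1.0, -1.0]))
```

Taking the minimum removes any incentive to move the policy further than clip_eps away from the old policy within a single update.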

Tests

Run the current test suite with:

python -m unittest discover -s tests
