kedar49/Snake-Apple

Snake's & The Golden Apple - Advanced RL System

A competitive multi-agent reinforcement learning environment implementing state-of-the-art Deep Q-Learning algorithms.


Game Environment

*Screenshot: the Bluessy and Redish snakes compete for the Golden Apple in the competitive snake environment.*

Game in Action

*Real-time training visualization showing competitive gameplay between AI agents.*

Technical Architecture

Deep Q-Network (DQN) Implementation

  • Input: 25-dimensional state vector
  • Architecture: 256 → 128 → 4 neurons
  • Activation: ReLU (hidden), Linear (output)
  • Optimizer: Adam (lr=0.001)
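The network above is defined in `model.py`; a minimal PyTorch sketch matching the listed dimensions (25 → 256 → 128 → 4; the class and attribute names here are assumptions, not the repository's exact code):

```python
import torch
import torch.nn as nn

class DQN(nn.Module):
    """25-dim state in, Q-values for 4 actions out (256 -> 128 -> 4)."""
    def __init__(self, state_dim=25, hidden1=256, hidden2=128, n_actions=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden1),
            nn.ReLU(),
            nn.Linear(hidden1, hidden2),
            nn.ReLU(),
            nn.Linear(hidden2, n_actions),  # linear output: raw Q-values
        )

    def forward(self, x):
        return self.net(x)
```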

Double DQN Algorithm

# Double DQN: the online network selects the next action, the target
# network evaluates it — this reduces overestimation bias
next_actions = policy_network(next_states).argmax(dim=1, keepdim=True)
target_q_values = target_network(next_states).gather(1, next_actions).squeeze(1)
targets = rewards + (gamma * target_q_values * (1 - dones))

Dueling DQN Architecture

# Separate value and advantage streams
value_stream = self.value_stream(features)
advantage_stream = self.advantage_stream(features)
q_values = value_stream + (advantage_stream - advantage_stream.mean(dim=1, keepdim=True))

Prioritized Experience Replay

  • Buffer Size: 50,000 experiences
  • Alpha: 0.6 (prioritization strength)
  • Beta: 0.4 (importance sampling)
  • Sampling: TD-error based priority
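The listed settings can be sketched as a proportional prioritized replay buffer. This is a simplified flat-array version (production implementations typically use a sum-tree for O(log n) sampling); the class and method names are assumptions:

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Proportional prioritized replay: alpha=0.6, beta=0.4, 50k capacity."""
    def __init__(self, capacity=50_000, alpha=0.6, beta=0.4, eps=1e-6):
        self.capacity, self.alpha, self.beta, self.eps = capacity, alpha, beta, eps
        self.buffer, self.priorities, self.pos = [], np.zeros(capacity), 0

    def push(self, transition):
        # New experiences get max priority so they are sampled at least once
        max_p = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            self.buffer[self.pos] = transition
        self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # Sampling probability proportional to priority^alpha
        p = self.priorities[:len(self.buffer)] ** self.alpha
        p /= p.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=p)
        # Importance-sampling weights correct for the non-uniform sampling
        weights = (len(self.buffer) * p[idx]) ** (-self.beta)
        weights /= weights.max()
        return [self.buffer[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors):
        # TD-error-based priority update after each training step
        self.priorities[idx] = np.abs(td_errors) + self.eps
```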

State Representation (25D)

| Index | Feature                | Description      | Range  |
|-------|------------------------|------------------|--------|
| 0-3   | Snake 1 direction      | One-hot encoding | [0, 1] |
| 4-7   | Snake 2 direction      | One-hot encoding | [0, 1] |
| 8-11  | Food direction         | One-hot encoding | [0, 1] |
| 12-15 | Wall proximity         | One-hot encoding | [0, 1] |
| 16-19 | Snake 1 body proximity | One-hot encoding | [0, 1] |
| 20-23 | Snake 2 body proximity | One-hot encoding | [0, 1] |
| 24    | Snake 1 length         | Normalized       | [0, 1] |
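Assembling this state amounts to concatenating the six 4-dim blocks plus the normalized length. A hypothetical helper illustrating the layout (the real construction lives in `snake_env.py`; the function and argument names are assumed):

```python
import numpy as np

def build_state(s1_dir, s2_dir, food_dir, wall_prox, s1_body, s2_body, s1_len_norm):
    """Concatenate the six 4-dim blocks and the scalar length into 25 dims."""
    state = np.concatenate([
        s1_dir, s2_dir, food_dir, wall_prox, s1_body, s2_body, [s1_len_norm]
    ]).astype(np.float32)
    assert state.shape == (25,)
    return state
```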

Reward Engineering

Primary Rewards

  • Food Consumption: 10 + length_bonus + speed_bonus + competitive_bonus
  • Survival: 0.1 * (1 - frame_iteration/1000)
  • Collision: -10

Advanced Rewards

  • Proximity to Food: exp(-distance_to_food/10) * 2
  • Competitive Advantage: 2 if closer_to_food_than_opponent else 0
  • Efficiency: length * 0.1 / max(frame_iteration, 1)
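The terms above can be combined roughly as follows. This is a hedged sketch: the exact composition in `snake_env.py` may differ, and the length/speed bonus values shown are illustrative placeholders:

```python
import math

def compute_reward(ate_food, collided, length, frame_iteration,
                   distance_to_food, closer_than_opponent):
    """Sketch of the reward terms listed above (bonus magnitudes assumed)."""
    if collided:
        return -10.0
    reward = 0.0
    if ate_food:
        # Base food reward plus an illustrative length bonus
        reward += 10.0 + length * 0.5
    # Survival shaping decays over the episode
    reward += 0.1 * (1 - frame_iteration / 1000)
    # Proximity shaping pulls the agent toward the food
    reward += math.exp(-distance_to_food / 10) * 2
    # Competitive advantage bonus
    if closer_than_opponent:
        reward += 2.0
    # Efficiency term rewards length gained per frame
    reward += length * 0.1 / max(frame_iteration, 1)
    return reward
```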

Hyperparameters

LEARNING_RATE = 0.001
GAMMA = 0.99
EPSILON_START = 1.0
EPSILON_END = 0.01
EPSILON_DECAY = 0.995
BATCH_SIZE = 64
TARGET_UPDATE_FREQUENCY = 1000
MEMORY_CAPACITY = 50000
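The epsilon values imply an epsilon-greedy policy with multiplicative decay per game, floored at `EPSILON_END`. A sketch under those assumptions (function names are illustrative):

```python
import random

EPSILON_START, EPSILON_END, EPSILON_DECAY = 1.0, 0.01, 0.995

def decay_epsilon(epsilon):
    # Multiplicative decay, clipped at the floor
    return max(EPSILON_END, epsilon * EPSILON_DECAY)

def select_action(q_values, epsilon, n_actions=4):
    # Epsilon-greedy: random action with probability epsilon, else greedy
    if random.random() < epsilon:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q_values[a])
```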

Installation

# Clone repository
git clone <repository-url>
cd Snake-Apple

# Install dependencies
pip install torch pygame numpy matplotlib

# Run training
python train.py

Usage

Basic Training

python train.py

Testing

python test_enhanced_snake.py

Controls

  • SPACE: Pause/Resume
  • H: Toggle help
  • +/-: Speed control
  • R: Reset speed
  • ESC: Exit

File Structure

Snake-Apple/
├── model.py              # DQN architecture
├── snake_env.py          # Game environment
├── train.py              # Training loop
├── logger.py             # Metrics tracking
├── config.py             # Configuration
├── game_assets.py        # Asset loading
├── test_enhanced_snake.py # Test suite
├── requirements.txt      # Dependencies
├── models/               # Checkpoints
└── logs/                 # Training logs

Performance Metrics

Training Dashboard

  • Loss tracking with gradient clipping
  • Q-value monitoring per agent
  • Epsilon decay visualization
  • Win rate analysis
  • Game length distribution
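The loss-with-gradient-clipping step can be sketched as below. A plain (non-double, non-prioritized) DQN update is shown for brevity, and the max-norm clipping value of 1.0 is an assumption:

```python
import torch
import torch.nn.functional as F

def train_step(policy_net, target_net, optimizer, batch, gamma=0.99, clip=1.0):
    """One optimization step with gradient clipping (clip value assumed)."""
    states, actions, rewards, next_states, dones = batch
    # Q-values of the actions actually taken
    q = policy_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        target = rewards + gamma * next_q * (1 - dones)
    loss = F.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    # Clip gradients before the parameter update to stabilize training
    torch.nn.utils.clip_grad_norm_(policy_net.parameters(), clip)
    optimizer.step()
    return loss.item()
```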

Expected Performance

  • Convergence: 2000-5000 games
  • Peak Score: 15-25 average
  • Win Rate: 60-70%
  • Training Time: 2-4 hours (CPU), 30-60 min (GPU)

Advanced Features

Multi-Agent Competition

  • Simultaneous learning
  • Competitive dynamics
  • Adaptive opponents

Model Persistence

  • Checkpoint system
  • Resume training
  • Version control

Real-time Monitoring

  • Live metrics
  • Performance plots
  • Interactive controls

Troubleshooting

Common Issues

  1. CUDA OOM: Reduce batch size
  2. Slow Training: Enable GPU
  3. Poor Performance: Tune hyperparameters
  4. Memory Leaks: Check buffer size

Optimization

  • GPU acceleration (3-5x speedup)
  • Batch processing
  • Memory management
  • Parallel training

Research Applications

  • Multi-agent RL dynamics
  • Algorithm comparison
  • Strategic gameplay analysis
  • Competitive learning

License

MIT License

Citation

@software{snake_rl_2024,
  title={Snake's & The Golden Apple: Advanced Multi-Agent RL},
  author={[Your Name]},
  year={2024}
}

About

"Snake's Food Hunt" is a competitive AI-driven game where two snakes learn to navigate, collect food, and avoid collisions using Deep Q-Learning. The project demonstrates reinforcement learning in a dynamic environment, showcasing how agents evolve their strategies over time.
