RL-cogsup 💊👏🏼👩🏼‍🎓

(dopamine encouraging learning)

Notes on concepts explored in a reinforcement learning (RL) course, in cognitive science and beyond. The repository also includes practical projects from the course sessions; the code builds on what was provided in class. The QR-DQN section is the project I chose to explore further.

The class focused on RL in cognitive science, as well as exploring computer science algorithms and social learning theories.

Main professors: Mehdi Khamassi and Benoit Girard

Guest lecturers included: Olivier Sigaud and Ismael T. Freire



🎯 Overview

In short, RL is important and cool, whether in the brain or in AI systems.
... (more to come)

📖 Topics explored

A general list of concepts explored throughout the course, whether theoretical or practical:

  • Model-Based vs Model-Free approaches, successor representation
  • Markov Decision Processes
  • Goal-Tracking vs Sign-Tracking behavior
  • Q-learning and its extensions: DQN, DDQN, QR-DQN, and the Dyna family
  • The exploration/exploitation tradeoff
  • Experience replay (prioritized?) and replay buffer (biologically and algorithmically)
  • Memory recall and replay (sleep 🛌🏼), place cells
  • Exploration and learning in humans (curiosity based or otherwise), world models
  • Uncertainty (epistemic or aleatory; can show up as surprise, novelty, ...)
  • Social reinforcement learning (low/high fidelity, multi-agent environments), Sequential episodic control
  • RL on LLMs (e.g. the VIPER system), LLMs for RL, and combinations of the two
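
As a concrete reference point for the Q-learning and exploration/exploitation items above, here is a minimal tabular sketch in plain Python. It is an illustration only, not code from the practical sessions: the function names, the 2-state table, and the values of `alpha`, `gamma` and `epsilon` are all assumptions chosen for the example.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))  # explore: uniform random action
    return q_row.index(max(q_row))        # exploit: first action with the highest value


def q_learning_update(q, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """One model-free Q-learning step: move Q(s, a) toward the TD target."""
    bootstrap = 0.0 if done else max(q[next_state])  # no bootstrapping at episode end
    td_error = reward + gamma * bootstrap - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


# Hypothetical 2-state, 2-action Q-table (values are made up for illustration).
q = [[0.0, 0.0],
     [0.0, 1.0]]
td_error = q_learning_update(q, state=0, action=1, reward=1.0,
                             next_state=1, done=False, alpha=0.5, gamma=0.9)
greedy_action = epsilon_greedy(q[0], epsilon=0.0, rng=random.Random(0))
```

With `epsilon=0.0` the policy is purely greedy; raising it trades exploitation for exploration. DQN and DDQN keep the same kind of temporal-difference target but replace the table with a neural network.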

Project

I chose to explore the QR-DQN algorithm further and compare it with DQN and DDQN, across multiple quantiles and with different parameters.

QR-DQN

Graph showing preliminary results from comparison
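
For readers new to QR-DQN, the quantile Huber loss it minimises can be sketched in a few lines of plain Python. This is a simplified illustration written for this README, not the project code: it handles a single state-action pair without batching or a neural network, and the function names and the default `kappa=1.0` are assumptions.

```python
def quantile_midpoints(n):
    """Quantile fractions tau_i = (2i + 1) / (2n) for i = 0..n-1."""
    return [(2 * i + 1) / (2 * n) for i in range(n)]


def huber(u, kappa=1.0):
    """Huber loss: quadratic for |u| <= kappa, linear beyond."""
    a = abs(u)
    return 0.5 * u * u if a <= kappa else kappa * (a - 0.5 * kappa)


def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
    """Sum over predicted quantiles, mean over target samples."""
    taus = quantile_midpoints(len(pred_quantiles))
    total = 0.0
    for tau, theta in zip(taus, pred_quantiles):
        for target in target_samples:
            u = target - theta  # error between target sample and predicted quantile
            # Asymmetric weight: over- and under-estimates are penalised
            # differently depending on which quantile is being fitted.
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            total += weight * huber(u, kappa) / kappa
    return total / len(target_samples)


# Two predicted quantiles, both at 0.0, against a single target sample of 1.0.
loss = quantile_huber_loss([0.0, 0.0], [1.0])
```

The number of quantiles is simply the length of `pred_quantiles`, so changing it changes how finely the return distribution is approximated.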


Future work

  • Exploring other paradigms in RL and simulating in different environments

Note: I do not take full credit for all the code in this repository; most of it builds on code from the practical sessions. Credits and links to others' work will be added in upcoming updates.