Asap7772

Anikait Singh Asap7772

I am a PhD student in Computer Science at Stanford University. My research interests are in scaling up decision-making methods such as reinforcement learning.

84 followers · 22 following

Pinned

fewshot-preference-optimization Public

Few-Shot Preference Optimization (FSPO) personalizes LLMs by reframing reward modeling as a meta-learning problem, enabling rapid adaptation to user preferences with minimal labeled data, leveragin…

Python 13 5
understanding-rlhf Public

Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Our new work finds that approaches employing on-policy samplin…

Python 32 4
PTR Public

This repository contains the implementation of the PTR algorithm described in the paper: Pre-Training for Robots: Leveraging Diverse Multitask Data via Offline Reinforcement Learning.

Python 30 3
OfflineRlWorkflow Public

This repository accompanies the following paper: A Workflow for Offline Model-Free Robotic RL

Python 12 2
Personalized-Text-To-Image-Diffusion Public

Public implementation of PPD

Python 11
DeepCriminalize Public

A project that uses GANs to generate a sketch-artist-style representation of a criminal. Winners of the Cal Hack Fellowship 2019.

Python 2 1
