- Stanford University
- California
- asap7772.github.io
- https://orcid.org/0000-0001-5286-6082
- @Anikait_Singh_
- in/asap7772
- https://huggingface.co/Asap7772
Highlights
- Pro
Pinned
- fewshot-preference-optimization (Public): Few-Shot Preference Optimization (FSPO) personalizes LLMs by reframing reward modeling as a meta-learning problem, enabling rapid adaptation to user preferences with minimal labeled data, leveragin…
- understanding-rlhf (Public): Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Our new work finds that approaches employing on-policy samplin…
- OfflineRlWorkflow (Public): This repository accompanies the following paper: A Workflow for Offline Model-Free Robotic RL
- Personalized-Text-To-Image-Diffusion (Public): Public Implementation of PPD
Python 11
- DeepCriminalize (Public): Project that uses GANs to develop a sketch-artist-like representation of a criminal. Winners of the Cal Hack Fellowship 2019.




