Sana is a hybrid DiT architecture with <3B params that turns one image and a camera trajectory into 720p, minute-long, controllable video on a single GPU.
A few open questions
Which dataset should be used for fine-tuning?
What skill could we teach the model (for example, training it to generate specific scenes better)?
What features could be added offline to improve GPU utilization during training?