📚 ScratchSeq

ScratchSeq is a from-scratch learning project for sequence modeling and language understanding, implemented with a PyTorch-first mindset.
The goal is not benchmark performance but mechanistic understanding: how sequence models evolved, why each innovation mattered, and how to implement them cleanly.

🎯 Philosophy

- Implement core models before using abstractions
- Read original papers alongside code
- Prefer minimal, inspectable implementations
- Focus on learning signals, gradients, and inductive biases

This repository is designed as a learning timeline, not a model zoo.
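As a rough illustration of what a "minimal, inspectable implementation" could look like at the start of such a timeline, here is a count-based bigram model in plain Python. This is a sketch, not code from the repository: the function names `train_bigram` and `next_token_probs` and the toy sentence are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Maximum-likelihood estimate of P(next | prev).

    Returns an empty dict when `prev` was never seen as a context.
    """
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {tok: c / total for tok, c in followers.items()}


# Toy corpus (hypothetical), whitespace-tokenized.
tokens = "the cat sat on the mat".split()
model = train_bigram(tokens)
probs = next_token_probs(model, "the")
print(probs)
```

Every quantity here is a plain count or a ratio of counts, so the whole model can be printed and checked by hand. That is the kind of transparency the philosophy above asks for before moving on to learned models.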

The roadmap is available at TIMELINE.

🔍 Non-Goals

❌ No large-scale pretraining
❌ No SOTA chasing
❌ No heavy frameworks or wrappers
❌ No “black box” usage

📌 Outcome

By completing ScratchSeq, one should be able to:

- Derive sequence models from first principles
- Understand why transformers replaced recurrence
- Reason about attention, memory, and scaling limits
- Read modern LLM papers without hand-waving gaps

🧱 Status

🚧 Work in progress — built incrementally alongside paper reading and experimentation.

ScratchSeq is about earning intuition, not importing it.

Folders and files

configs
data
docs
experiments/ngrams
notebooks
scripts
src
.gitignore
.python-version
README.md
pyproject.toml
requirements.txt
uv.lock
