Trustformer

Trustformer is a Transformer-based language model implementation in Rust. This project was built to learn how transformers work at a low level.

Project Structure

src/: Core implementation.
- embeddings/: Token and positional embeddings.
- model/: High-level model architecture (Decoder).
- sampling/: Text generation and sampling strategies.
- tensor/: Custom tensor operations and math backend.
- tokenizer/: BPE tokenizer training and inference.
- transformer/: Transformer blocks (Attention, FeedForward, LayerNorm).
- utils/: Helper functions.
data/: Directory for training data and other resources.
tests/: Integration tests.
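
To give a feel for what the attention part of transformer/ is responsible for, here is a minimal sketch of single-head causal self-attention on plain `Vec<Vec<f32>>` matrices. It is not taken from this repository: the function name `causal_attention` and the row-per-token layout are assumptions made for illustration, and the project's own tensor/ backend may organize this differently.

```rust
/// Single-head causal self-attention (illustrative sketch, hypothetical name).
/// `q`, `k`, `v` each hold one row per token position.
fn causal_attention(q: &[Vec<f32>], k: &[Vec<f32>], v: &[Vec<f32>]) -> Vec<Vec<f32>> {
    if q.is_empty() {
        return Vec::new();
    }
    // Scale dot products by 1 / sqrt(d), where d is the query dimension.
    let scale = 1.0 / (q[0].len() as f32).sqrt();
    let mut out = Vec::with_capacity(q.len());
    for (i, qi) in q.iter().enumerate() {
        // Causal mask: position i only attends to positions 0..=i.
        let scores: Vec<f32> = k[..=i]
            .iter()
            .map(|kj| qi.iter().zip(kj).map(|(a, b)| a * b).sum::<f32>() * scale)
            .collect();
        // Softmax over the visible scores, shifted by the max for stability.
        let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
        let sum: f32 = exps.iter().sum();
        // Output row is the attention-weighted sum of the visible value rows.
        let mut row = vec![0.0f32; v[0].len()];
        for (w, vj) in exps.iter().zip(&v[..=i]) {
            for (r, x) in row.iter_mut().zip(vj) {
                *r += (w / sum) * x;
            }
        }
        out.push(row);
    }
    out
}
```

Because of the causal mask, the first output row depends only on the first value row, which is the property that lets a decoder generate text one token at a time.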

Getting Started

Ensure you have Rust installed.
Place your training data in data/training_data.txt.
Run the project:

cargo run

Features

Tokenizer: Byte Pair Encoding (BPE) tokenizer.
Model: Decoder-only Transformer architecture.
Sampling: Text generation with temperature sampling.
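
As a rough illustration of temperature sampling, the sketch below divides the logits by a temperature, applies softmax, and then picks a token by walking the cumulative distribution. The names `softmax_with_temperature` and `sample_token` are hypothetical and not necessarily what sampling/ uses. To keep the example free of external crates, the uniform random number `u` in [0, 1) is passed in by the caller.

```rust
/// Turn logits into probabilities after dividing by `temperature`.
/// Lower temperatures sharpen the distribution; higher ones flatten it.
fn softmax_with_temperature(logits: &[f32], temperature: f32) -> Vec<f32> {
    assert!(temperature > 0.0, "temperature must be positive");
    let scaled: Vec<f32> = logits.iter().map(|&l| l / temperature).collect();
    // Subtract the maximum before exponentiating for numerical stability.
    let max = scaled.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scaled.iter().map(|&s| (s - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Pick a token index by inverse-CDF sampling.
/// `u` is assumed to be a uniform random number in [0, 1) supplied by the caller.
fn sample_token(logits: &[f32], temperature: f32, u: f32) -> usize {
    assert!(!logits.is_empty(), "logits must not be empty");
    let probs = softmax_with_temperature(logits, temperature);
    let mut cumulative = 0.0f32;
    for (i, p) in probs.iter().enumerate() {
        cumulative += *p;
        if u < cumulative {
            return i;
        }
    }
    // Fall back to the last index if rounding kept the total just below `u`.
    probs.len() - 1
}
```

In a real generation loop, `u` would come from a random number generator, and the chosen index would be appended to the context before the next forward pass.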

Usage

The main.rs file demonstrates how to:

Load training data.
Train the tokenizer.
Initialize the model.
Generate text based on a prompt.
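
The tokenizer training step can be pictured with the following minimal byte-level BPE sketch: count adjacent token pairs, merge the most frequent pair into a new token id, and repeat. This is an illustration written under assumptions, not the code in tokenizer/; the function names (`count_pairs`, `most_frequent_pair`, `merge_pair`, `train_bpe`), the use of `u32` ids starting at 256, and the tie-breaking rule are all hypothetical choices.

```rust
use std::collections::HashMap;

/// Count how often each adjacent pair of tokens occurs.
fn count_pairs(tokens: &[u32]) -> HashMap<(u32, u32), usize> {
    let mut counts = HashMap::new();
    for w in tokens.windows(2) {
        *counts.entry((w[0], w[1])).or_insert(0) += 1;
    }
    counts
}

/// Return the most frequent pair; ties go to the smallest pair so results are deterministic.
fn most_frequent_pair(counts: &HashMap<(u32, u32), usize>) -> Option<(u32, u32)> {
    counts
        .iter()
        .max_by(|a, b| a.1.cmp(b.1).then_with(|| b.0.cmp(a.0)))
        .map(|(&pair, _)| pair)
}

/// Replace each non-overlapping occurrence of `pair` (scanning left to right) with `new_id`.
fn merge_pair(tokens: &[u32], pair: (u32, u32), new_id: u32) -> Vec<u32> {
    let mut out = Vec::with_capacity(tokens.len());
    let mut i = 0;
    while i < tokens.len() {
        if i + 1 < tokens.len() && (tokens[i], tokens[i + 1]) == pair {
            out.push(new_id);
            i += 2;
        } else {
            out.push(tokens[i]);
            i += 1;
        }
    }
    out
}

/// Learn up to `num_merges` merges, starting from the raw bytes of `text`.
/// Returns the encoded text and the learned merges in the order they were created.
fn train_bpe(text: &str, num_merges: usize) -> (Vec<u32>, Vec<((u32, u32), u32)>) {
    let mut tokens: Vec<u32> = text.bytes().map(u32::from).collect();
    let mut merges = Vec::new();
    let mut next_id = 256u32; // ids 0..=255 are reserved for single bytes
    for _ in 0..num_merges {
        let pair = match most_frequent_pair(&count_pairs(&tokens)) {
            Some(pair) => pair,
            None => break, // fewer than two tokens left, nothing to merge
        };
        tokens = merge_pair(&tokens, pair, next_id);
        merges.push((pair, next_id));
        next_id += 1;
    }
    (tokens, merges)
}
```

Encoding new text at inference time would replay the recorded merges in the same order, which is why the sketch returns them as an ordered list.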
