Emlet

An embedding engine built for the sovereign web.

Emlet is a fast, fully self-contained semantic embedding engine designed to run anywhere JavaScript runs—browser, Node, edge, offline. No dependencies, no GPU, no network calls. Just load and embed.

The entire engine fits in about 1 MB and produces deterministic vector embeddings suitable for similarity search, clustering, retrieval, tagging, or downstream ML workflows.

Features

100M parameters, ~1MB total size
7K tokens/sec throughput (in the browser)
Deterministic output (same input → same vector)
Out-of-vocabulary synthesis (no missing tokens)
Unicode-aware (text, emoji, symbols, ZWJ)
Configurable vector size (1-1568D)
Offline-first, zero dependencies
Vanilla JavaScript, edge-ready
No GPU. No cloud. No API.
Self-extracting runtime
Neuro-symbolic core
A digital familiar

Installation

npm install emlet

Or load directly via CDN:

<script src="https://unpkg.com/emlet"></script>

This exposes both emlet (a preloaded instance) and Emlet (the class) globally.

Importing

// CommonJS (use one form or the other, not both)
const emlet = require('emlet')
// const { emlet, Emlet } = require('emlet')

// ESM (use one form or the other, not both)
import emlet from 'emlet'
// import { emlet, Emlet } from 'emlet'

Basic Usage

const vec = emlet.embed('Hello, world!')
console.log(vec)
// → [0.08, -0.01, ...] (96-dimensional vector by default)

The default export is a ready-to-use model instance.

Custom Models

You can create your own instance with a different output size:

const modelA = new Emlet()           // 96D default
const modelB = new Emlet(128)        // 128D output
const modelC = new Emlet(256, true)  // 256D head + 32D tail = 288D

Constructor

new Emlet(dim = 96, useTail = false)

dim: Number of dimensions to emit from the primary embedding space.
useTail: When true, appends a 32-dimensional “glimpse” of the full 1536D semantic space to every vector.

This allows output sizes from 1 up to 1536 dimensions, or 1568 when the tail is enabled.
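
The sizing rule above is simple arithmetic, and it can help to check it before allocating storage for vectors. The sketch below uses a hypothetical helper, expectedLength, which is not part of the Emlet API. The range check is an assumption: this README does not say how Emlet itself treats out-of-range values.

```javascript
// Hypothetical helper, not part of Emlet: predicts the length of the vectors
// an instance would emit, following the rule described in this section.
const TAIL_DIMS = 32  // size of the optional tail
const MAX_DIM = 1536  // size of the full semantic space

function expectedLength(dim = 96, useTail = false) {
  // Assumption: dim must be a whole number between 1 and 1536.
  if (!Number.isInteger(dim) || dim < 1 || dim > MAX_DIM) {
    throw new RangeError(`dim must be an integer from 1 to ${MAX_DIM}`)
  }
  return dim + (useTail ? TAIL_DIMS : 0)
}

console.log(expectedLength())           // 96
console.log(expectedLength(256, true))  // 288
console.log(expectedLength(1536, true)) // 1568
```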

Out-of-Vocabulary Synthesis

Tokens not present in the internal vocabulary are synthesized deterministically:

emlet.embed('quantaflux')

There are no unknown tokens and no fallbacks to zero vectors.
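
Both claims (determinism and no zero vectors) are easy to verify from the outside. The sketch below is a hypothetical check, not part of Emlet; it works with any function of the shape (text) => number[]. A stand-in embedder is used so the snippet runs on its own; in a real project you would pass (text) => emlet.embed(text) instead.

```javascript
// Hypothetical helper, not part of Emlet: embeds the same text twice and
// reports whether the results match and whether the vector is non-zero.
function checkSynthesis(embed, text) {
  const a = embed(text)
  const b = embed(text)
  const deterministic = a.length === b.length && a.every((x, i) => x === b[i])
  const nonZero = a.some((x) => x !== 0)
  return { deterministic, nonZero }
}

// Stand-in embedder for illustration only; it maps each code point to a number.
const stubEmbed = (text) => Array.from(text, (ch) => ch.codePointAt(0) / 0x10ffff)

console.log(checkSynthesis(stubEmbed, 'quantaflux'))
// { deterministic: true, nonZero: true }
```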

Unicode and Emoji Support

Emlet natively handles Unicode symbols, emoji, modifiers, and ZWJ sequences:

emlet.embed('🦄')
emlet.embed('👩🏽‍🚀')

These are embedded consistently and can be compared using standard vector similarity.
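
Emlet does not ship a similarity function, so "standard vector similarity" is something you supply. A minimal sketch of cosine similarity follows; the function name is an assumption, and returning 0 for a zero-length vector is a design choice made here, not Emlet behavior.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Returns a value in [-1, 1]; returns 0 if either vector has zero magnitude.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB)
  return denom === 0 ? 0 : dot / denom
}

console.log(cosineSimilarity([1, 0], [0, 1])) // 0 (orthogonal)
console.log(cosineSimilarity([1, 0], [1, 0])) // 1 (identical direction)
```

With Emlet loaded, the same function can be applied to real embeddings, for example cosineSimilarity(emlet.embed('🦄'), emlet.embed('unicorn')). Both vectors must come from instances configured with the same output size.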

Punctuation Handling

Punctuation is normally stripped during tokenization. However, if the entire input is a single character, it is embedded as-is:

emlet.embed('.')
emlet.embed('[')

This allows punctuation-level modeling when needed without polluting normal text embeddings.
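
To make the rule concrete, here is an illustrative sketch of the behavior described above. It is not Emlet's tokenizer: the function name, the use of the Unicode punctuation class, and the reading of "single character" as one Unicode code point are all assumptions.

```javascript
// Illustrative only: mimics the punctuation rule described in this section.
function sketchTokens(text) {
  // Assumption: "single character" means exactly one Unicode code point.
  if ([...text].length === 1) return [text]
  return text
    .replace(/\p{P}+/gu, ' ') // replace runs of punctuation with a space
    .split(/\s+/)             // split on whitespace
    .filter(Boolean)          // drop empty strings left at the edges
}

console.log(sketchTokens('Hello, world!')) // ['Hello', 'world']
console.log(sketchTokens('.'))             // ['.']
console.log(sketchTokens('['))             // ['[']
```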

API Surface

Emlet intentionally exposes a minimal API:

embed(text: string): number[]
new Emlet(dim?: number, useTail?: boolean)

Everything else—chunking, similarity, indexing, clustering—is left to userland.
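
As one example of what userland code might look like, the sketch below builds a brute-force nearest-neighbor index over precomputed vectors. All names here (normalize, buildIndex, search) are hypothetical, and the toy 2D vectors stand in for real embeddings so the snippet runs without Emlet installed.

```javascript
// Scale a vector to unit length so that a dot product equals cosine similarity.
function normalize(vec) {
  const norm = Math.hypot(...vec)
  return norm === 0 ? vec.slice() : vec.map((x) => x / norm)
}

// Build an in-memory index from { id, vector } records.
function buildIndex(records) {
  return records.map(({ id, vector }) => ({ id, unit: normalize(vector) }))
}

// Return the k entries most similar to the query vector, best match first.
function search(index, queryVector, k = 3) {
  const q = normalize(queryVector)
  return index
    .map(({ id, unit }) => ({
      id,
      score: unit.reduce((sum, x, i) => sum + x * q[i], 0),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
}

// Toy vectors; in practice each vector would come from emlet.embed(text).
const toyIndex = buildIndex([
  { id: 'east', vector: [1, 0] },
  { id: 'north', vector: [0, 1] },
  { id: 'northeast', vector: [1, 1] },
])

console.log(search(toyIndex, [2, 0.1], 2).map((hit) => hit.id))
// ['east', 'northeast']
```

A linear scan like this costs one dot product per stored vector, which is usually fine for a few thousand entries. Larger collections would call for an approximate index, which is outside the scope of this sketch.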

Examples

See test.js for example usage including batch encoding, similarity math, and vector inspection.
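
For a quick idea of what batch encoding and vector inspection can look like, here is a small sketch. It is not taken from test.js; embedAll and inspectVector are hypothetical helpers, and the stand-in embedder exists only so the snippet runs without Emlet.

```javascript
// Hypothetical helper: embed a list of texts with any (text) => number[] function.
function embedAll(embed, texts) {
  return texts.map((text) => embed(text))
}

// Hypothetical helper: summarize a vector's size, value range, and magnitude.
function inspectVector(vec) {
  return {
    dims: vec.length,
    min: Math.min(...vec),
    max: Math.max(...vec),
    norm: Math.hypot(...vec),
  }
}

// Stand-in embedder for illustration; replace with (text) => emlet.embed(text).
const lengthEmbed = (text) => [text.length, 0]

const vectors = embedAll(lengthEmbed, ['abc', 'abcd'])
console.log(vectors.map(inspectVector))
// [ { dims: 2, min: 0, max: 3, norm: 3 }, { dims: 2, min: 0, max: 4, norm: 4 } ]
```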

Testing

Emlet includes a test suite built with testr.

To run the tests, first clone the repository:

git clone https://github.com/basedwon/emlet.git

Install the dependencies, then run npm test:

npm install
npm test

Donations

If Emlet sparks something useful in your work, consider sending some coin to support further development.

Bitcoin (BTC):

1JUb1yNFH6wjGekRUW6Dfgyg4J4h6wKKdF

Monero (XMR):

46uV2fMZT3EWkBrGUgszJCcbqFqEvqrB4bZBJwsbx7yA8e2WBakXzJSUK8aqT4GoqERzbg4oKT2SiPeCgjzVH6VpSQ5y7KQ

License

Emlet License v1.0 (based on Apache 2.0). Use is permitted with attribution. Redistribution, rebranding, resale, and reverse engineering are prohibited without written permission.

See LICENSE for full terms. Contact: basedwon@tuta.com for commercial or licensing inquiries.
