This document defines the persistent data model of Notable and the storage of stroke point lists as implemented. It describes the structures and fields that actually exist in the codebase. It is a specification of the current structure, not a change log. It was created by AI, and roughly checked for correctness. Refer to code for actual implementation.
Contents:
- Entities and relationships
- Logical schemas
- Stroke point list storage format
- Spatial indexing
- Read/write semantics (normative behavior)
-
Folder https://github.com/Ethran/notable/blob/main/app/src/main/java/com/ethran/notable/data/db/Folder.kt
- Hierarchical container. Fields in code:
id(String UUID PK),title,parentFolderId(nullable FK to Folder),createdAt,updatedAt.
- Hierarchical container. Fields in code:
-
Page https://github.com/Ethran/notable/blob/main/app/src/main/java/com/ethran/notable/data/db/Page.kt
- A document entry with optional notebook grouping. Fields:
id(String UUID PK),scroll,notebookId(nullable FK),background,backgroundType,parentFolderId(nullable FK to Folder),createdAt,updatedAt.
- A document entry with optional notebook grouping. Fields:
-
Stroke https://github.com/Ethran/notable/blob/main/app/src/main/java/com/ethran/notable/data/db/Stroke.kt
- Addressable record containing style and geometry inline. Fields:
id(String UUID PK),size(Float),pen(serializedPen),color(Int ARGB), bounding box floats (top,bottom,left,right),points(List),pageId(FK), timestamps.
- Addressable record containing style and geometry inline. Fields:
-
StrokePoint (geometry payload) https://github.com/Ethran/notable/blob/1c242de6a005abece5e2d246cdb9e90b34206611/app/src/main/java/com/ethran/notable/data/db/Stroke.kt#L18C12-L18C23
- Inlined per-point samples:
x: Float,y: Float, optionalpressure: Float?, optionaltiltX: Int?,tiltY: Int?, optionaldt: UShort?, plus legacy serialized fields (timestamp,size) retained for backward compatibility.
- Inlined per-point samples:
-
Notebook https://github.com/Ethran/notable/blob/main/app/src/main/java/com/ethran/notable/data/db/Notebook.kt
- Referenced by
Page.notebookId(grouping construct; details in code).
- Referenced by
-
Image https://github.com/Ethran/notable/blob/main/app/src/main/java/com/ethran/notable/data/db/Image.kt
- Raster asset placed on a page (file defines its own fields in code).
The current implementation uses Room with UUID string primary keys (not integer autoincrement). Strokes embed their point list directly (no separate blob table, no quantized integer bbox).
-- folders (Entity: Folder)
CREATE TABLE folder (
id TEXT PRIMARY KEY,
title TEXT NOT NULL,
parentFolderId TEXT REFERENCES folder(id) ON DELETE CASCADE,
createdAt INTEGER NOT NULL, -- epoch ms (Room Date)
updatedAt INTEGER NOT NULL
);
CREATE INDEX index_folder_parentFolderId ON folder(parentFolderId);
-- pages (Entity: Page)
CREATE TABLE page (
id TEXT PRIMARY KEY,
scroll INTEGER NOT NULL,
notebookId TEXT REFERENCES notebook(id) ON DELETE CASCADE,
background TEXT NOT NULL,
backgroundType TEXT NOT NULL,
parentFolderId TEXT REFERENCES folder(id) ON DELETE CASCADE,
createdAt INTEGER NOT NULL,
updatedAt INTEGER NOT NULL
);
CREATE INDEX index_page_notebookId ON page(notebookId);
CREATE INDEX index_page_parentFolderId ON page(parentFolderId);
-- strokes (Entity: Stroke)
CREATE TABLE stroke (
id TEXT PRIMARY KEY,
size REAL NOT NULL,
pen TEXT NOT NULL, -- serialized Pen
color INTEGER NOT NULL, -- ARGB
top REAL NOT NULL,
bottom REAL NOT NULL,
left REAL NOT NULL,
right REAL NOT NULL,
points BLOB NOT NULL, -- serialized List<StrokePoint>
pageId TEXT NOT NULL REFERENCES page(id) ON DELETE CASCADE,
createdAt INTEGER NOT NULL,
updatedAt INTEGER NOT NULL
);
CREATE INDEX index_stroke_pageId ON stroke(pageId);Notes:
- Column affinities shown reflect typical Room output;
pointsmay appear asTEXTorBLOBdepending on the converter but is treated as opaque serialized data.
https://github.com/Ethran/notable/blob/main/app/src/main/java/com/ethran/notable/editor/utils/pen.kt
The stored field is pen (serialized form of Pen).
https://github.com/Ethran/notable/blob/dev/app/src/main/java/com/ethran/notable/data/db/ (implementation in stroke_mask_helpers.kt / formerly stroke_encoding.kt)
We store List<StrokePoint> in a custom binary Structure-of-Arrays (SoA) format (NOT JSON).
Format name (informal): SB1 (Stroke Binary v1). All multi-byte values are little-endian.
Each StrokePoint logical fields (per code):
x: Floaty: Floatpressure: Float?tiltX: Int?tiltY: Int?dt: UShort?- legacy (not serialized in SB1):
timestamp: Long?,size: Float?
Optional fields are UNIFORM per stroke: either present for every point or absent for all (enforced).
Offset Size Field
0 1 MAGIC0 = 'S' (0x53)
1 1 MAGIC1 = 'B' (0x42)
2 1 VERSION = 1
3 1 MASK (bitfield, see 3.2)
4 4 COUNT (Int, number of points, encoder requires >= 1)
8 1 COMPRESSION (0 = none, 1 = LZ4)
-- header total = 9 bytes
Notes:
- Decoder rejects
version > 1. COUNTis stored even when fields compress well.- Encoder rejects empty lists.
Bit positions (set = field present for ALL points):
- bit 0 (0x01): pressure
- bit 1 (0x02): tiltX
- bit 2 (0x04): tiltY
- bit 3 (0x08): dt
Uniformity invariants:
- If a bit is set, every point has non-null value for that field.
- If a bit is clear, every point has null for that field. Future versions may relax this (would require version bump).
After the 9-byte header:
If COMPRESSION == 1 (LZ4):
[ RAW_BODY_SIZE (int32) ]
[ LZ4_COMPRESSED_BODY_BYTES ]
RAW_BODY_SIZE = exact size in bytes of the uncompressed raw body described below.
If COMPRESSION == 0:
[ RAW_BODY (uncompressed) ]
Raw (uncompressed) body structure:
X_SIZE (int32) // byte length of encoded X polyline
X_DATA [X_SIZE] (UTF-8 bytes)
Y_SIZE (int32)
Y_DATA [Y_SIZE] (UTF-8 bytes)
[ pressure[count] ] (if mask bit 0; each int16)
[ tiltX[count] ] (if bit 1; each int8)
[ tiltY[count] ] (if bit 2; each int8)
[ dt[count] ] (if bit 3; each uint16 stored in 2 bytes)
Order is fixed: X, Y, pressure, tiltX, tiltY, dt.
- Coordinates (
x,y):- Encoded independently via a polyline algorithm (precision = 2) → text string → UTF-8 bytes.
X_SIZE/Y_SIZEare the byte lengths of those UTF-8 sequences.
pressure: stored as signed 16-bit integer (short). The encoder castsFloat -> Int -> Short. Decoder convertsShort -> Float.tiltX,tiltY: stored as signed bytes (int8).dt: stored as unsigned 16-bit (uint16) in a Java/Kotlinshortslot; decoder usestoUShort().- Reserved sentinel:
0xFFFF(65535) for potential future per-point nulldt. Current decoder does NOT remap it to null; applications should avoid using 65535 if future null semantics are desired.
Raw body size:
rawBodySize =
4 + X_DATA.size +
4 + Y_DATA.size +
(mask.pressure ? count * 2 : 0) +
(mask.tiltX ? count * 1 : 0) +
(mask.tiltY ? count * 1 : 0) +
(mask.dt ? count * 2 : 0)
Total serialized size (final byte array):
totalSize =
9 /* header */ +
(compression == LZ4 ? 4 /* RAW_BODY_SIZE */ : 0) +
(compression == LZ4 ? compressedSize : rawBodySize)
- Algorithm: LZ4 (jpountz high compressor).
- Candidate only if
rawBodySize >= 512bytes. - Only accepted if
compressedSize <= rawBodySize * 0.75(i.e., ≥ 25% saving). - If accepted:
COMPRESSION = 1- A 4-byte
RAW_BODY_SIZEprecedes compressed payload.
- Otherwise:
COMPRESSION = 0- Raw body follows immediately (no size prefix).
Steps (decodeStrokePoints):
- Validate minimum length (
>= HEADER_SIZE + 8as a sanity check). - Read magic
'S' 'B'. - Read
VERSION; reject if> 1. - Read
MASKandCOUNT; reject ifCOUNT < 0. - Read
COMPRESSIONflag. - If compressed: read
RAW_BODY_SIZE, decompress remainder via LZ4 into a buffer of that size. - From the (decompressed or raw) body buffer:
- Read
X_SIZE,X_DATA, decode to list of Float. - Read
Y_SIZE,Y_DATA, decode. - Verify
xs.size == count && ys.size == count. - Conditionally read optional arrays by mask; ensure sufficient remaining bytes; error on truncation.
- Construct points by parallel index.
- If uncompressed path and trailing bytes remain → error (compressed path size is exact by construction).
- Return list.
encodeStrokePoints:
- Reject empty list.
- Compute
MASKfrom first point (presence flags). - Validate uniform presence across all points (throws on mismatch).
- Encode X & Y via polyline (precision=2) → UTF-8 bytes.
- Compute
rawBodySize; write raw body into a thread-local ByteBuffer. - Optionally LZ4-compress per compression policy.
- Write header (9 bytes): magic, version, mask, count, compression flag.
- If compressed: write
RAW_BODY_SIZE. - Append body (compressed or raw).
- Return final byte array.
- Decoder rejects higher versions.
dtsentinel (0xFFFF) reserved for future nullable per-point representation.- Potential future extensions: per-field metadata, sparse fields, alternative quantization.
Throws IllegalArgumentException for:
- Empty encode list.
- Non-uniform optional fields.
- Page size heuristic trigger (
y > 10_000_000fin first point). - Bad magic bytes.
- Unsupported version.
- Negative count.
- Truncated coordinate or optional sections.
- Invalid raw size when compressed (
<= 0). - Trailing bytes (uncompressed path).
- Body size mismatch sanity check.
- Missing compressed data.
- Additional polyline / simplification pipelines:
- Ignore unused attributes depending on tool mode:
- Ballpoint: x, y only
- Fountain pen: + pressure
- Pencil: + pressure + tilt