
File Structure Analysis

M. T. Kimmins edited this page Jan 18, 2026 · 30 revisions

Phenakist's Exploration of File Structure

u/Phenakist has delineated the general regions of the flash chip as follows (see below for Segment 4-Region 13 correction):

So the structure of the firmware seems to be:
A) Chapter Map
B) Chapter
C) Transition
D) Chapter
E) Transition
... (repeat for number of chapters in cassette) ...
F) [8552] bytes of code, common across all chip dumps
G) Empty space (FF FF FF FF...)
H) 16 byte chip ID and/or key (or possibly the chapter marks)
I) FF's to end of file

They also mentioned that there seem to be 12 "chapter" segments, which correlates with the carousel's 12 windows.

File Structure Diagram

Endianness

Cartridge data has been collected as raw binary. Values are stored as 4-byte Little Endian words, meaning that the least significant byte comes first: a 4-byte sequence (eg: 04 5c 78 00) is read backwards as the address 0x00785c04 (0x785c04). Conversely, Big Endian notation would read the bytes forwards, like natural reading (0x045c7800); this is not the case for cartridge data!
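As a quick check of the byte order described above, this short Python sketch decodes the example bytes both ways. It uses only the standard library, and the variable names are purely illustrative.

```python
import struct

# The four example bytes, in the order they appear in the dump.
raw = bytes.fromhex("045c7800")

little = int.from_bytes(raw, "little")  # how cartridge data is read
big = int.from_bytes(raw, "big")        # natural reading order, NOT used here

print(hex(little))  # 0x785c04
print(hex(big))     # 0x45c7800

# struct gives the same result; "<I" means little-endian unsigned 32-bit.
assert struct.unpack("<I", raw)[0] == little
```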

Segments

Table 1. General stratification of cartridge data by segment. Addresses are reported in hexadecimal.

| Arbitrary Designation | Address Start | Address End (inclusive) | Size (bytes) | Description |
|---|---|---|---|---|
| Segment 1 | 0x0 | 0x6B | 108 | Highly variable region with a few conserved markers; every cartridge begins with this pattern. The pattern suggests a series of 4-byte addresses. A conversation with Gemini suggested that these may make up a pointer table. The first pointer always seems to be 0x4EE8, and the other 23 pointers are ascending and unique. In total, this yields 12 unique regions and 12 conserved regions. |
| Segment 2 | 0x6C | 0x4EE7 | 20092 | Completely conserved (validated). It is further split into Segments 2.1 and 2.2 as per the pointer table in Segment 1. |
| Segment 2.1 | 0x6C | 0x2439 | 9166 | Completely conserved. |
| Segment 2.2 | 0x243A | 0x4EE7 | 10926 | Completely conserved. |
| Segment 3 | 0x4EE8 | variable | variable | Unique data segment. |
| Segment 4 | variable | variable | 8552 | The last significant data chunk. Completely conserved (validated). |
| Segment 5 | variable | 0xFFF7F | variable | Purely empty space (FF bytes). It takes up a sizable portion of the chip. |
| Segment 6 | 0xFFF80 | 0xFFF8F | 16 | The last line of data, unique to each book "series" (ie: Puppy, Elephant, and Lion are one trio "series"). |
| Segment 7 | 0xFFF90 | 0xFFFFF | 112 | The last segment. It is simply empty space. |
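To explore the suspected pointer table in Segment 1, a rough Python sketch like the one below could read the segment as 4-byte little-endian words and pull out the 24 pointers. The helper names are hypothetical, and the sketch assumes (this is not confirmed) that the pointers are stored back to back, beginning at the word equal to 0x4EE8. The demo header is synthetic, not real cartridge data.

```python
import struct

SEGMENT_1_END = 0x6C     # first byte after Segment 1 (0x0-0x6B in Table 1)
FIRST_POINTER = 0x4EE8   # value the first pointer always seems to have
POINTER_COUNT = 24

def read_words(data, start, end):
    """Return the 4-byte little-endian words stored in data[start:end]."""
    count = (end - start) // 4
    return list(struct.unpack_from(f"<{count}I", data, start))

def find_pointer_table(data):
    """Hypothetical helper: return the 24 pointers found in Segment 1.

    ASSUMPTION: the pointers are contiguous and start at the first word
    equal to 0x4EE8. Adjust if the real layout turns out to differ.
    """
    words = read_words(data, 0, SEGMENT_1_END)
    first = words.index(FIRST_POINTER)  # raises ValueError if absent
    pointers = words[first:first + POINTER_COUNT]
    if len(pointers) != POINTER_COUNT:
        raise ValueError("fewer than 24 words after the first pointer")
    if any(a >= b for a, b in zip(pointers, pointers[1:])):
        raise ValueError("pointers are not strictly ascending")
    return pointers

# Synthetic header for illustration only (NOT real cartridge data):
# three zero words followed by 24 made-up ascending pointers.
fake_pointers = [FIRST_POINTER + i * 0x1000 for i in range(POINTER_COUNT)]
fake_header = struct.pack("<27I", 0, 0, 0, *fake_pointers)
print([hex(p) for p in find_pointer_table(fake_header)[:3]])
# ['0x4ee8', '0x5ee8', '0x6ee8']
```

On a real dump, `find_pointer_table(open(path, "rb").read())` would be the starting point, where `path` is whatever file holds the cartridge image.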

Cartridge Segment Specifications

A list of cartridge-specific segment specifications

Regions

Inter-Cartridge Region Comparisons
Table of Region References by Cartridge

Regions 1-12 & 13-24 within Segments 3 & 4, Respectively

There are 24 Regions across Segments 3 and 4, split evenly between them: Regions 1-12 lie in Segment 3 and Regions 13-24 lie in Segment 4. Every Region, 1 through 24, starts with a 4-byte "size," followed by "80 3E" as the next two bytes.
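The Region header described above can be checked with a small Python sketch. The function name is hypothetical and the demo bytes are synthetic. As a side observation, "80 3E" read as a little-endian 16-bit value is 0x3E80 = 16000; what that value represents is not established on this page.

```python
import struct

REGION_MARKER = bytes.fromhex("803e")  # the two bytes following the size

def parse_region_header(data, region_start):
    """Return the 4-byte little-endian "size" at the start of a Region.

    Raises ValueError if the size is not followed by the 80 3E marker.
    """
    (size,) = struct.unpack_from("<I", data, region_start)
    marker = data[region_start + 4:region_start + 6]
    if marker != REGION_MARKER:
        raise ValueError(
            f"expected 80 3E at {region_start + 4:#x}, found {marker.hex()}"
        )
    return size

# Synthetic example (not taken from a real cartridge).
demo = bytes.fromhex("42990000" "803e" "0000")
print(parse_region_header(demo, 0))  # 39234
```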

Delta-Table Theory

As previously stated, all Regions start with a 4-byte "size." In Regions 1-12, this size does not reach the end of the Region; it stops at the start of a terminal table-like data structure (see "Possible PCM Areas" below for an explanation). In Regions 13-24, by contrast, the size does reach the end of the Region. This supports the idea of a "delta-table" and suggests that the sound is located in Regions 1-12 rather than Regions 13-24, as there are 12 film slides and Regions 1-12 are the longest. The shorter Regions 13-24 are conserved between cartridges, which indicates that the audio is not stored there.

The lengths of Regions 13-24 are also unchanging between cartridges, which is evidence against their being decryption keys tied to their corresponding Region 1-12 pair (ie: 1 & 13, 2 & 14, ... 12 & 24). If they were some sort of decryption/decompression key for their earlier pair, I would anticipate that their lengths would be proportional to the lengths of their Region 1-12 counterparts. They are not: Regions 1-12 vary in length between cartridges, but Regions 13-24 do not vary at all. Potentially, Regions 13-24 may instead be constant keys used to decrypt each corresponding Region 1-12 through some algorithm...
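The length comparison behind this argument can be sketched in Python as follows. Both helpers are hypothetical. The sketch assumes each Region runs up to the next pointer-table entry, with the last Region ending where the empty space begins. Equal lengths alone do not prove equal contents, so this only covers the "lengths do not vary" part of the reasoning.

```python
def region_lengths(pointers, end_of_last_region):
    """Length of each Region, taken from consecutive pointer-table entries.

    ASSUMPTION: a Region ends where the next one starts, and the final
    Region ends at end_of_last_region (exclusive).
    """
    bounds = list(pointers) + [end_of_last_region]
    return [b - a for a, b in zip(bounds, bounds[1:])]

def constant_length_regions(lengths_by_cartridge):
    """1-based Region numbers whose length is the same in every cartridge."""
    per_region = zip(*lengths_by_cartridge.values())
    return [
        number
        for number, lengths in enumerate(per_region, start=1)
        if len(set(lengths)) == 1
    ]

# Made-up numbers for two imaginary cartridges with three Regions each.
demo_lengths = {
    "cartridge_a": region_lengths([0, 10, 30], 35),  # [10, 20, 5]
    "cartridge_b": region_lengths([0, 12, 32], 37),  # [12, 20, 5]
}
print(constant_length_regions(demo_lengths))  # [2, 3]
```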

Segment 4 and Region 13 Reconciliation

Phenakist originally noted that the 8556 bytes prior to the start of Segment 5 are conserved among all books ("Segment 4"). However, the pointer table in Segment 1 suggests that Region 13 starts 8552 bytes prior to Segment 5. The 4 bytes that make up this difference appear to differ between books. See Segment 4 Region 13 Differentiation for an analysis of this 4-byte area, which led to the correction of Segment 4's starting address.
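One way to locate these boundaries in a dump is sketched below in Python. The function names are hypothetical. The sketch finds the start of Segment 5 by scanning backwards over FF bytes from 0xFFF7F, which assumes Segment 4 does not itself end in FF bytes; if it does, the scan will overshoot and the result needs checking against the pointer table.

```python
SEGMENT_5_END = 0xFFF7F   # last byte of the empty space (Table 1)
SEGMENT_4_SIZE = 8552     # size implied by the pointer table

def find_segment_5_start(data):
    """Scan backwards from 0xFFF7F over the run of FF bytes."""
    if len(data) <= SEGMENT_5_END:
        raise ValueError("dump is too short to contain Segment 5")
    pos = SEGMENT_5_END
    while pos >= 0 and data[pos] == 0xFF:
        pos -= 1
    return pos + 1

def segment_4_bounds(data):
    """Return (start, end) of Segment 4, both inclusive."""
    seg5_start = find_segment_5_start(data)
    return seg5_start - SEGMENT_4_SIZE, seg5_start - 1

def disputed_bytes(data):
    """The 4 bytes just before Segment 4 (the 8556 vs 8552 difference)."""
    seg4_start, _ = segment_4_bounds(data)
    return data[seg4_start - 4:seg4_start]
```

Running `disputed_bytes` over several dumps and comparing the results would show directly whether those 4 bytes differ between books.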

Possible PCM Areas

PCM Breakdown for Regions 1-12 by Cartridge
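Pulse-Code Modulation (PCM) is a digital representation of sampled audio. Plain PCM is uncompressed; differential variants such as DPCM and ADPCM compress the audio by storing changes between samples, which appears to be the kind of scheme the delta-table theory has in mind. Each Region can be further broken down into 2-3 distinct parts. Regions 1-12 can be divided into Address, Body, and Map. Regions 13-24 can be divided into just Address and Body.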

PCM Address

This area is a 4-byte segment, the very first 4 bytes of each Region. These 4 bytes actually denote the size of the Region's PCM Body, in bytes, when converted to decimal. Remember to reverse the byte order before converting, as the value is Little Endian (42 99 = 0x9942 = 39234 bytes).
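Putting the Address to work, a Region could be split into its parts with a Python sketch like this one. The function name is hypothetical, and the sketch assumes the size counts the bytes that immediately follow the 4-byte Address (so the "80 3E" bytes fall inside the Body). Whether the real size includes the Address itself has not been confirmed here.

```python
def split_region(data, region_start, region_end):
    """Split one Region into (address, body, map).

    region_end is exclusive, e.g. the next pointer-table entry.
    ASSUMPTION: the size counts the bytes right after the 4-byte Address.
    For Regions 13-24 the returned map should come back empty.
    """
    address = data[region_start:region_start + 4]
    body_size = int.from_bytes(address, "little")
    body_start = region_start + 4
    body_end = body_start + body_size
    if body_end > region_end:
        raise ValueError("size field points past the end of the Region")
    return address, data[body_start:body_end], data[body_end:region_end]

# The worked example from the text: 42 99 (00 00) read little-endian.
print(int.from_bytes(bytes.fromhex("42990000"), "little"))  # 39234

# Synthetic Region: 6-byte Body followed by a 2-byte Map.
demo_region = bytes.fromhex("06000000" "803e01020304" "aabb")
address, body, tail_map = split_region(demo_region, 0, len(demo_region))
print(len(body), tail_map.hex())  # 6 aabb
```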

PCM Body

This is suspected to be the area where the compressed audio is stored (in a PCM-based format). Under the delta-table theory, decompressing the audio relies on a map of predicted changes, each applied to the previous value, to reconstruct the audio coherently.

PCM Map (Delta table theory)

As far as I understand, this is a map of values used to predict the gaps in the Body's sampled audio.
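For readers unfamiliar with delta coding, the general idea can be illustrated with a generic table-driven decoder in Python. This is only a textbook-style sketch of the technique, not the cartridge's actual scheme, which remains unknown. The function name, the code values, and the table contents are all invented for illustration.

```python
def decode_with_table(codes, table, start=0):
    """Generic table-driven delta decoding (illustrative only).

    Each code selects a change from the table, and that change is added
    to the previous sample to produce the next one.
    """
    samples = []
    value = start
    for code in codes:
        value += table[code]
        samples.append(value)
    return samples

# Invented table: code 0 = no change, 1 = step up by 4, 2 = step down by 4.
demo_table = [0, 4, -4]
print(decode_with_table([0, 1, 1, 2], demo_table))  # [0, 4, 8, 4]
```

If the Map really is a table of this kind, the Body would hold the codes and the Map the changes they select, but that pairing is a guess that would need testing against real dumps.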

Anatomy of a Blank Cartridge

Blank Cartridge Data Template
