Research: Dart internal serialization/deserialization mechanism

## Summary

An analysis of Dart's internal serialization/deserialization mechanism, performed at compile time when generating standalone executables or modular _snapshots_. These snapshots contain the information that an embedded or separate Dart runtime requires to recreate the memory heap from which it will run, and to extract runtime information about compile-time constant objects. This serves as the continuation of my [first issue](https://github.com/caverav/flutterdec/issues/23), where I describe the bootstrapping flow for a Flutter application on the Android platform.

## Context

It isn't mandatory to read the first part in order to understand the topic covered here, but you might want to do it to get a grasp of the bigger picture: the architecture and inter-operation of the multiple processes involved in the runtime of a Flutter application. To clarify the scope of both parts: the first part mostly covered the Android-side embedder code. Here, I'll cover the native (C++) side of the framework, delving into the Dart virtual machine internals.

* The `// [...]` comments mean that some parts of the function or code block were omitted, so as not to lose focus on the more important lines.
* I'll try to link the source file containing a class definition the first time I mention the class, although I might forget to do so.
* I'll be re-reading these notes, adding more details and going in-depth about certain aspects of the flow that are potentially important, so the version you are reading now is probably not final (2026-02-28).
* If I miss anything or you have proposed corrections or improvements to the clarity of the writing, please feel free to let me know.
* Code blocks annotated with `sdk` are from the Dart SDK itself. Assume that it is from the Flutter codebase otherwise.

## Part 1 - Initialization of the Dart virtual machine

In the previous part, I described Flutter's bootstrapping code from the perspective of the Android platform code, i.e. mostly the Java side of the implementation: the embedder. I also mentioned an interesting native function whose explanation I omitted, because it deserved a separate description and I wanted to save it for this part: the `AttachJNI` function. If you read the first part, you might recall that this function is first invoked from Java, from the `FlutterEngine` constructor, right after the `startInitialization` and `ensureInitializationComplete` functions. Take a moment to read the description of the `FlutterEngine` constructor's behavior towards the end of the first issue, so you have a clear notion of where inside the bootstrapping flow this occurs.

It is no coincidence that this function is called only after the aforementioned initialization functions: those two functions are in charge of loading and initializing the native-side code (`libflutter.so`), the binary where the native side of the Flutter engine and the `AttachJNI` function are defined.

`FlutterEngine.java`

```java
    if (!flutterJNI.isAttached()) {
      attachToJni();
    }
```

```java
  private void attachToJni() {
    // [...]
    flutterJNI.attachToNative();
    // [...]
  }
```

`FlutterJNI.java`

```java
  @UiThread
  public void attachToNative() {
    // [...]
      nativeShellHolderId = performNativeAttach(this);
    // [...]
  }
  @VisibleForTesting
  public long performNativeAttach(@NonNull FlutterJNI flutterJNI) {
    return nativeAttach(flutterJNI);
  }

  private native long nativeAttach(@NonNull FlutterJNI flutterJNI);
```

The `nativeAttach` function is implemented in C++ as the `AttachJNI` function:

`platform_view_android_jni_impl.cc`

```c++
static jlong AttachJNI(JNIEnv* env, jclass clazz, jobject flutterJNI) {
  fml::jni::JavaObjectWeakGlobalRef java_object(env, flutterJNI);
  std::shared_ptr<PlatformViewAndroidJNI> jni_facade =
      std::make_shared<PlatformViewAndroidJNIImpl>(java_object);
  auto shell_holder = std::make_unique<AndroidShellHolder>(
      FlutterMain::Get().GetSettings(), jni_facade,
      FlutterMain::Get().GetAndroidRenderingAPI()); // 1
  if (shell_holder->IsValid()) {
    return reinterpret_cast<jlong>(shell_holder.release());
  } else {
    return 0;
  }
}
```

An `AndroidShellHolder` instance is created and owned by a `std::unique_ptr` (through the `make_unique` call). If the holder is valid, ownership is released and the raw pointer is returned to Java as a `jlong`; otherwise, a value of `0` is returned. Notice the call to `FlutterMain::Get().GetSettings()`. You might remember from the first part that a `FlutterMain` singleton was created by the `Init` function. The `settings` argument to the `AndroidShellHolder` constructor is thus nothing more than the option flags passed by `ensureInitializationComplete` to `Init`, which were used to create the `FlutterMain` singleton.

Let us now see the constructor call for the `AndroidShellHolder` class, which is where the rest of the initialization flow continues:

`android_shell_holder.cc`

```c++
AndroidShellHolder::AndroidShellHolder(
    const flutter::Settings& settings,
    std::shared_ptr<PlatformViewAndroidJNI> jni_facade,
    AndroidRenderingAPI android_rendering_api)
    : settings_(settings),
      jni_facade_(jni_facade),
      android_rendering_api_(android_rendering_api) {

  // [...]

  host_config.io_config = fml::Thread::ThreadConfig(
      flutter::ThreadHost::ThreadHostConfig::MakeThreadName(
          flutter::ThreadHost::Type::kIo, thread_label),
      fml::Thread::ThreadPriority::kNormal); // 1

  thread_host_ = std::make_shared<ThreadHost>(host_config); // 2

  // [...]

  flutter::TaskRunners task_runners(thread_label,     // label
                                    platform_runner,  // platform
                                    raster_runner,    // raster
                                    ui_runner,        // ui
                                    io_runner         // io
  ); // 3

  shell_ =
      Shell::Create(GetDefaultPlatformData(),  // window data
                    task_runners,              // task runners
                    settings_,                 // settings
                    on_create_platform_view,   // platform view create callback
                    on_create_rasterizer       // rasterizer create callback
      ); // 4

  // [...]
}
```

This constructor does a couple of things, but the most important of them is the creation of the `shell_` instance, which provides the interface through which Flutter interacts with the Dart runtime. With the call to `Shell::Create`, we leave Flutter's platform-specific (Android, in our case) code and enter the native, platform-independent implementation. Before addressing the `Shell::Create` function, let us briefly break down what this constructor does:

1. Three thread names are created. These names identify the threads used by the `AndroidShellHolder` and the native side to perform different tasks and separate responsibilities: the `raster`, `ui` and `io` threads (only the `io` configuration is shown in the snippet above). There is one more thread, the `platform` thread, which is in charge of receiving messages from and sending messages to the platform code. Only three names are created because the very thread where this constructor is being called IS the `platform` thread. As you might have guessed, it is also the very same thread our Android platform spawned to run our Java code, and it corresponds to an Android `Looper` object.

2. A `ThreadHost` object is created. This object is used to manage and represent the group of four threads used by the Flutter engine.

3. A `TaskRunners` object is created with references to four `TaskRunner` objects, one for each of the threads. These runners will be in charge of executing asynchronous tasks on the thread they are "attached" to.

4. Finally, a `Shell` object is created. Each `AndroidShellHolder` owns exactly one, and it is, as mentioned, the interface the native code will use to interact with the Dart runtime. We will analyze the function that returns this object, `Shell::Create`, next.

Something worth mentioning is that there might not be a separate `ui` thread at all. Depending on the platform where the Flutter engine is running, the `platform` and `ui` threads may be one and the same, with a single "merged" thread dealing with both responsibilities.

The call to `Shell::Create` is very significant: this is where the Dart virtual machine is launched and told which binaries it should load, which *isolates* it will have to run and where they are located (the ELF symbols used to identify them inside `libapp.so`). It also prepares the runtime for the execution of the Dart entry-point, which is the exact moment where our Flutter application finally receives control of execution and can be said to be running (keep in mind that, thus far, all code is bootstrapping code). Let us take a look at the function:

`shell.cc`

```c++
std::unique_ptr<Shell> Shell::Create(
    ...
    Settings settings,
    ...) {
  
  // [...]
  
  auto [vm, isolate_snapshot] = InferVmInitDataFromSettings(settings);

  return CreateWithSnapshot(platform_data,                     //
                            task_runners,                      //
                            /*parent_thread_merger=*/nullptr,  //
                            /*parent_io_manager=*/nullptr,     //
                            resource_cache_limit_calculator,   //
                            settings,                          //
                            std::move(vm),                     //
                            std::move(isolate_snapshot),       //
                            on_create_platform_view,           //
                            on_create_rasterizer,              //
                            CreateEngine, is_gpu_disabled); 
}
```

The body of this function is small: it makes two function calls, one to `InferVmInitDataFromSettings` and one to `CreateWithSnapshot`, with the latter being the one that returns the `Shell` object. We'll start with the first:

`shell.cc`

```c++
std::pair<DartVMRef, fml::RefPtr<const DartSnapshot>>
Shell::InferVmInitDataFromSettings(Settings& settings) {

  auto vm_snapshot = DartSnapshot::VMSnapshotFromSettings(settings); // 1
  auto isolate_snapshot = DartSnapshot::IsolateSnapshotFromSettings(settings);// 1
  
  auto vm = DartVMRef::Create(settings, vm_snapshot, isolate_snapshot); // 2

  if (!isolate_snapshot) {
    isolate_snapshot = vm->GetVMData()->GetIsolateSnapshot(); // 3
  }
  return {std::move(vm), isolate_snapshot}; // 4
}
```

The first call is to `InferVmInitDataFromSettings`, which returns a `DartVMRef` and a `DartSnapshot` instance wrapped by a `RefPtr`. The `DartVMRef` object contains a reference to a `DartVM` instance, which is not a Dart SDK type just yet, but rather a class that represents a running instance of a virtual machine. Its `DartVM::Create` method is a wrapper for Dart's `Dart_Initialize`, the actual function from the Dart embedding library that performs the initialization of the virtual machine. There should only be a single instance of `DartVM` throughout the entire execution of the application.

You might notice by reading the definition of `DartVM` that there does not seem to be a reference to an internal Dart type representing the running virtual machine instance, and you would be correct: communication between platform code and Dart goes through specialized Dart API functions such as `Dart_Invoke`, `Dart_PostCObject` or `Dart_PostInteger`.

In short, `InferVmInitDataFromSettings` does the following:

1. Fetches the VM snapshot and the isolate snapshot. The VM snapshot backs the virtual machine isolate, which is always present in a Dart execution environment and contains information that is mostly used by the runtime, such as fundamental objects and types. The *isolate snapshot* is more or less a misnomer: it refers to the application-specific snapshot, which contains both data and instructions from the user Dart code. This will be our target when reversing a Flutter application. I call it a misnomer because the name suggests that the `vm_snapshot` is not an isolate. Technically it is just a "container for VM-global objects", but it is called the **vm isolate** inside Dart's source. Keep this in mind if you see further references to "VM isolate" in these notes: it exclusively means the "isolate" spawned from the contents of the VM data snapshot.

2. A `DartVMRef` object is constructed and initialized, and stored in the `vm` local. If there was already a running virtual machine instance prior to this point, a reference to it is returned instead. The `DartVMRef` object contains a strong reference to a `DartVM` singleton, the native-side representation of a currently running Dart virtual machine, of which there should only be one in the context of a Flutter application execution.

3. If no isolate snapshot could be obtained from the settings, the isolate snapshot held by the virtual machine's data is used instead.

4. Both the `DartVMRef` and the isolate snapshot (a `DartSnapshot`) are returned. The isolates that are later created from these snapshots are of type `DartIsolate`.
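The "create it, or hand back the one that is already running" behavior from step 2 can be sketched in isolation. This is a minimal sketch under my own assumptions: `SketchVm` and `AcquireVm` are hypothetical names, and the weak-reference bookkeeping is just one plausible way to get this behavior, not necessarily how `DartVMRef` is written.

```c++
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the engine-side VM wrapper; not the real DartVM.
struct SketchVm {
  explicit SketchVm(int id) : id(id) {}
  int id;  // lets a caller tell distinct VM instances apart
};

// Returns a strong reference to the running VM, creating one only if none is
// currently alive. Assumption: a weak reference tracks the running instance.
inline std::shared_ptr<SketchVm> AcquireVm() {
  static std::mutex mutex;
  static std::weak_ptr<SketchVm> running;  // observes the VM without owning it
  static int next_id = 0;

  std::lock_guard<std::mutex> lock(mutex);
  if (auto existing = running.lock()) {
    return existing;  // a VM is already running: share it
  }
  auto created = std::make_shared<SketchVm>(++next_id);
  running = created;
  return created;
}
```

Two calls to `AcquireVm()` made while the first result is still alive yield the same object; once every strong reference is dropped, the next call creates a fresh instance.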

Ultimately, this function will end up calling the `Dart_Initialize` function, which, according to Dart's documentation: "*Initializes the VM.*" Yeah, that's it. Well, the function itself is not well documented, but this is due to the fact that the documentation is "elsewhere". One might not get a detailed description of what the function does and what each of the steps do, but it is possible to get a general idea of what it does from the `params` parameter that this function receives:

`sdk/dart_api.h`

```c++
DART_EXPORT DART_API_WARN_UNUSED_RESULT char* Dart_Initialize(
    Dart_InitializeParams* params);
```

And here's the definition for the `Dart_InitializeParams` struct (omitting fields that are not relevant in our scope):

`sdk/dart_api.h`

```c++
typedef struct {
  // [...]
  const uint8_t* vm_snapshot_data; // 1
  const uint8_t* vm_snapshot_instructions; // 1
  
  Dart_IsolateGroupCreateCallback create_group; // 2
  Dart_InitializeIsolateCallback initialize_isolate; // 2
  Dart_IsolateShutdownCallback shutdown_isolate; // 2
  Dart_IsolateCleanupCallback cleanup_isolate; // 2
  Dart_IsolateGroupCleanupCallback cleanup_group; // 2

  Dart_ThreadStartCallback thread_start; // 3
  Dart_ThreadExitCallback thread_exit; // 3
  Dart_FileOpenCallback file_open; // 3
  Dart_FileReadCallback file_read; // 3
  Dart_FileWriteCallback file_write; // 3
  Dart_FileCloseCallback file_close; // 3
  // [...]
} Dart_InitializeParams;
```

1. The virtual machine snapshot data and instructions.
2. Isolate-related callbacks, invoked when an isolate group is created, when an isolate is initialized, shut down or cleaned up, and when an isolate group is cleaned up, respectively.
3. Callbacks for thread start/exit and file input/output operations.
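To make the calling convention concrete, here is a small mock of the "fill a params struct, call the initializer, check the returned `char*`" pattern. As far as I understand the header comments, `Dart_Initialize` returns `NULL` on success and an error message that the caller must free otherwise. Everything below (`SketchInitializeParams`, `SketchInitialize`, `InitializeOrDescribeError`, and the validation rules) is hypothetical and only mirrors that convention; it does not call the Dart SDK.

```c++
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical, trimmed-down mirror of a few of the fields discussed above.
struct SketchInitializeParams {
  const std::uint8_t* vm_snapshot_data = nullptr;
  const std::uint8_t* vm_snapshot_instructions = nullptr;
  void* (*create_group)(const char* name) = nullptr;
};

// Returns a heap-allocated copy of `message` that the caller must free().
inline char* CopyMessage(const char* message) {
  const std::size_t size = std::strlen(message) + 1;
  char* copy = static_cast<char*>(std::malloc(size));
  if (copy != nullptr) std::memcpy(copy, message, size);
  return copy;
}

// Mock initializer: nullptr on success, otherwise an error message owned by
// the caller. The checks are invented for illustration only.
inline char* SketchInitialize(const SketchInitializeParams* params) {
  if (params == nullptr) return CopyMessage("params is null");
  if (params->vm_snapshot_data == nullptr)
    return CopyMessage("missing VM snapshot data");
  if (params->create_group == nullptr)
    return CopyMessage("missing create_group callback");
  return nullptr;
}

// Embedder-side wrapper: turns the C-style result into a std::string and
// frees the message. An empty string means success.
inline std::string InitializeOrDescribeError(
    const SketchInitializeParams& params) {
  char* error = SketchInitialize(&params);
  if (error == nullptr) return std::string();
  std::string description(error);
  std::free(error);
  return description;
}
```

The wrapper keeps the manual `free` in one place, so callers only have to check whether the returned string is empty.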

## Part 2 - Finding the entrypoint and transferring control to it

All of that initializes the text and data segments of the VM isolate, creates an isolate group for it, and finally creates and initializes the VM isolate itself. But this is just the VM isolate, which contains neither user code nor user data: at this point, none of it has been loaded yet, and Dart has yet to receive control of execution. This leads us to our next milestone: how does Flutter load the "main" or "user" isolate, find the entry-point in it and transfer control to it? Let us see:

`FlutterEngineGroup.java`

```java
engine.getDartExecutor().executeDartEntrypoint(dartEntrypoint, dartEntrypointArgs);
```

Right after the `FlutterEngine` constructor returns, that function is called. Towards the end of the previous issue, I was imprecise: I mentioned that this function was called at some point after entering `onStart`, but this isn't the whole story. It is true that `executeDartEntrypoint` is called there, but only as a **fallback**, in case the call right above wasn't able to transfer control to Dart. This might happen if the entry-point isn't the default `main` but one defined by a custom intent, in which case the invocation might happen after `onStart`. Now, what does `executeDartEntrypoint` do? Its native implementation is defined as follows:

`platform_view_android_jni_impl.cc`

```c++
static void RunBundleAndSnapshotFromLibrary(...)
{
  // [...]
  ANDROID_SHELL_HOLDER->Launch(std::move(apk_asset_provider), entrypoint,
                               libraryUrl, entrypoint_args, engineId);
}
```

The interesting part here is `ANDROID_SHELL_HOLDER`. What is it? Recall `FlutterJNI`'s `attachToNative` function, covered earlier. That function creates an `AndroidShellHolder` instance and returns a pointer to it, as a `jlong`, to the Java code that invokes it. The `shell_holder` parameter (omitted from the signature above) is nothing more than that very same pointer, and the `ANDROID_SHELL_HOLDER` macro is:

```c++
#define ANDROID_SHELL_HOLDER (reinterpret_cast<AndroidShellHolder*>(shell_holder))
```

The macro casts that `jlong` back into a pointer, whose `Launch` method gets invoked. This method receives the entry-point name, the `libraryUrl` parameter, the arguments for the entry-point and an `engineId` (there can be multiple `FlutterEngine` instances).
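The pointer round trip (native object to integer handle, and back) can be sketched without JNI. This is only an illustration: `SketchShellHolder`, `Attach`, `FromHandle` and `Destroy` are hypothetical names, and `std::int64_t` plays the role of `jlong`.

```c++
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical stand-in for AndroidShellHolder.
class SketchShellHolder {
 public:
  bool IsValid() const { return true; }
  void Launch(const std::string& entrypoint) { launched_ = entrypoint; }
  const std::string& launched() const { return launched_; }

 private:
  std::string launched_;
};

using Handle = std::int64_t;  // plays the role of jlong

// Creates the holder and hands ownership to the caller as an opaque integer.
// A value of 0 signals failure, as in AttachJNI.
inline Handle Attach() {
  auto holder = std::make_unique<SketchShellHolder>();
  if (!holder->IsValid()) return 0;
  return reinterpret_cast<Handle>(holder.release());
}

// Equivalent of the ANDROID_SHELL_HOLDER macro: integer back to pointer.
inline SketchShellHolder* FromHandle(Handle handle) {
  return reinterpret_cast<SketchShellHolder*>(handle);
}

// Since ownership was released in Attach(), someone must delete the object
// explicitly once the managed side is done with the handle.
inline void Destroy(Handle handle) { delete FromHandle(handle); }
```

The managed side only ever stores the integer and passes it back on every native call, which is why the native functions begin by casting it to the holder type.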

`android_shell_holder.cc`

```c++
void AndroidShellHolder::Launch(...)
{
  // [...]
  
  auto config = BuildRunConfiguration(entrypoint, libraryUrl, entrypoint_args);
  // [...]
  shell_->RunEngine(std::move(config.value()));
}
```

Now, the `RunEngine` method of the shell instance (the `shell_` field set in the `AndroidShellHolder` constructor) will be called:

`shell.cc`

```c++
void Shell::RunEngine(
    RunConfiguration run_configuration,
    const std::function<void(Engine::RunStatus)>& result_callback) {
  // [...]
  fml::TaskRunner::RunNowOrPostTask(
      task_runners_.GetUITaskRunner(),
      fml::MakeCopyable(
          [run_configuration = std::move(run_configuration),
           weak_engine = weak_engine_, result]() mutable {
            // [...]
            auto run_result = weak_engine->Run(std::move(run_configuration));
            // [...]
          }));
}
```

Now, `weak_engine`, a copy of `weak_engine_` captured by the lambda, has its `Run` method called. This task is not performed on the main (`platform`) thread where we are right now; it is posted to the `ui` thread instead, which is the one in charge of continuing and completing this flow.
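The "run now if we are already on the target thread, otherwise post" idea can be sketched with standard threads. This is a simplified model under my own assumptions: `SketchTaskRunner` is a hypothetical class and is far less capable than `fml::TaskRunner`; only the decision made by `RunNowOrPostTask` is meant to match the description above.

```c++
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-thread task runner; not fml::TaskRunner.
class SketchTaskRunner {
 public:
  SketchTaskRunner() : worker_([this] { Loop(); }) {}

  ~SketchTaskRunner() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    wake_.notify_one();
    worker_.join();  // pending tasks are drained before the thread exits
  }

  bool RunsTasksOnCurrentThread() const {
    return std::this_thread::get_id() == worker_.get_id();
  }

  void PostTask(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    wake_.notify_one();
  }

 private:
  void Loop() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // only reached when stopping
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();  // run outside the lock so tasks can post more tasks
    }
  }

  std::mutex mutex_;
  std::condition_variable wake_;
  std::queue<std::function<void()>> tasks_;
  bool stopping_ = false;
  std::thread worker_;  // declared last: the members above exist before Loop()
};

// Runs the task inline when already on the runner's thread, posts otherwise.
inline void RunNowOrPostTask(SketchTaskRunner& runner,
                             std::function<void()> task) {
  if (runner.RunsTasksOnCurrentThread()) {
    task();
  } else {
    runner.PostTask(std::move(task));
  }
}
```

Called from the `platform` thread, the task ends up on the runner's own thread. Called from inside a task that is already running there, it executes immediately instead of being queued.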

You might be wondering what `weak_engine_` is. It's a weak pointer to an `Engine` instance. You might have noticed that I didn't cover the `CreateWithSnapshot` function. I didn't deem it necessary, as it doesn't perform any task crucial enough to be described within the scope of this research. The important thing you need to know is that the `Engine` instance I mentioned is created inside that function. Now, let us move into `Engine::Run`, which gets passed a `RunConfiguration` object, a wrapper for the arguments, the entry-point name and the library URI. This function will run the so-called "root" isolate, which is the isolate containing both the user data and the user instructions. Let us see:

`engine.cc`

```c++
Engine::RunStatus Engine::Run(RunConfiguration configuration) {
  // [...]

  if (runtime_controller_->IsRootIsolateRunning()) {
    return RunStatus::FailureAlreadyRunning;
  }

  if (!runtime_controller_->LaunchRootIsolate(
          settings_,                                 //
          root_isolate_create_callback,              //
          configuration.GetEntrypoint(),             //
          configuration.GetEntrypointLibrary(),      //
          configuration.GetEntrypointArgs(),         //
          configuration.TakeIsolateConfiguration(),  //
          native_assets_manager_,                    //
          configuration.GetEngineId()))              //
  {
    return RunStatus::Failure;
  }
  
  // [...]
  return Engine::RunStatus::Success;
}
```

This flow is deep, as you can see, but we are nearing its end on the Flutter side, after which I'll start describing the underlying Dart SDK that ultimately gets called, so bear with me. This function first checks whether the root isolate is already running and, if it isn't, launches it through `LaunchRootIsolate`. We will skip that function, given that it is pretty much a wrapper for the method `DartIsolate::CreateRunningRootIsolate`:

`dart_isolate.cc`

```c++
std::weak_ptr<DartIsolate> DartIsolate::CreateRunningRootIsolate(...) {
  
  // [...]

  auto isolate = CreateRootIsolate(settings,                           //
                                   isolate_snapshot,                   //
                                   std::move(platform_configuration),  //
                                   isolate_flags,                      //
                                   isolate_create_callback,            //
                                   isolate_shutdown_callback,          //
                                   context,                            //
                                   spawning_isolate,                   //
                                   std::move(native_assets_manager)    //
                                   )
                     .lock(); // 1
 
  // [...]

  if (!isolate->RunFromLibrary(std::move(dart_entrypoint_library),  //
                               std::move(dart_entrypoint),          //
                               dart_entrypoint_args)) { // 2
    FML_LOG(ERROR) << "Could not run the run main Dart entrypoint.";
    return {};
  }
  
  // [...]
 
  return isolate;
}
```

Now, this function performs multiple tasks that I've snipped. The most important ones are:

1. `CreateRootIsolate` is called. This function declares a lambda, `isolate_maker`, whose definition varies depending on whether the `spawning_isolate` argument is set. This argument is only set when another existing isolate is creating the current isolate; in that case, the isolate is created using the SDK's `Dart_CreateIsolateInGroup`. This is not our case, as this is the first isolate created by the engine (not counting the VM isolate created during the Dart virtual machine startup flow). Consequently, `isolate_maker` will invoke `Dart_CreateIsolateGroup` instead, which creates a new isolate group with the user isolate as the root and first isolate of said group. Every isolate must be in a group; in fact, there's no "`Dart_CreateIsolate`" function or similar in the Dart API. A `DartIsolate` reference is returned by `CreateRootIsolate`.

   `Dart_CreateIsolateGroup` does **not** transfer control to the Dart entry-point just yet: its sole responsibilities are to create the isolate's heap, deserialize the data snapshot into it, and return the corresponding `Dart_Isolate` object. Notice that the function does not have to do anything special to the instruction snapshot, as it is a symbol living inside `libapp.so`'s `.text` section, which is mapped to an executable region in memory; the mapping of the precompiled AOT code was thus already performed by the operating system when the ELF file was loaded. We will further explore this function when studying the snapshot deserialization process.

2. Finally, the `RunFromLibrary` call follows. This function does exactly what you think: it ends up invoking the entry-point of the now mapped and ready isolate, transferring control to it.
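The two ideas in these steps, choosing a creation strategy through a lambda and keeping "create" separate from "run", can be sketched as follows. All names (`SketchIsolate`, `CreateRootIsolateSketch`, the two-value `Phase` enum, the `"new-group"` label) are hypothetical simplifications, not the engine's actual types.

```c++
#include <cassert>
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-in for DartIsolate; the phases are reduced to two.
struct SketchIsolate {
  enum class Phase { Ready, Running };
  std::string group;           // which isolate group the isolate belongs to
  Phase phase = Phase::Ready;  // created, but no Dart code has run yet

  // Transfers "control" to the isolate; fails if it is already running.
  bool Run() {
    if (phase != Phase::Ready) return false;
    phase = Phase::Running;
    return true;
  }
};

// Mirrors the described role of `isolate_maker`: join the spawner's group if
// there is a spawning isolate, otherwise start a brand new group.
inline std::shared_ptr<SketchIsolate> CreateRootIsolateSketch(
    const SketchIsolate* spawning_isolate) {
  std::function<std::shared_ptr<SketchIsolate>()> isolate_maker;
  if (spawning_isolate != nullptr) {
    const std::string group = spawning_isolate->group;
    isolate_maker = [group] {
      auto isolate = std::make_shared<SketchIsolate>();
      isolate->group = group;  // analogous to creating an isolate in a group
      return isolate;
    };
  } else {
    isolate_maker = [] {
      auto isolate = std::make_shared<SketchIsolate>();
      isolate->group = "new-group";  // analogous to creating a new group
      return isolate;
    };
  }
  return isolate_maker();  // creation only: the isolate is not running yet
}
```

Note that the returned isolate is still in the `Ready` phase; only the explicit `Run()` call flips it to `Running`, which is the same split as `CreateRootIsolate` followed by `RunFromLibrary`.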

This function sits right on the boundary between Flutter and Dart. In fact, you can see that it calls a few functions prefixed with `Dart_`; those are all functions from the Dart SDK. Let us take a look:

`dart_isolate.cc`

```c++
bool DartIsolate::RunFromLibrary(std::optional<std::string> library_name,
                                 std::optional<std::string> entrypoint,
                                 const std::vector<std::string>& args) {

  auto library_handle =
      library_name.has_value() && !library_name.value().empty()
          ? ::Dart_LookupLibrary(tonic::ToDart(library_name.value().c_str()))
          : ::Dart_RootLibrary(); // 1
  auto entrypoint_handle = entrypoint.has_value() && !entrypoint.value().empty()
                               ? tonic::ToDart(entrypoint.value().c_str())
                               : tonic::ToDart("main"); // 2
  

  auto user_entrypoint_function =
      ::Dart_GetField(library_handle, entrypoint_handle); // 3

  auto entrypoint_args = tonic::ToDart(args); // 4

  if (!InvokeMainEntrypoint(user_entrypoint_function, entrypoint_args)) { // 5
    return false;
  }

  phase_ = Phase::Running;

  return true; // 5
}
```

I didn't snip this function at all, so we will review it in its entirety, with special focus on steps 1, 3 and 5. But first, let's introduce an important definition: the `Dart_Handle`. A `Dart_Handle` is a Dart API type that essentially acts as a generic pointer (think of it as `void*`) to different kinds of objects inside the Dart API. It is an opaque type, defined as `typedef struct _Dart_Handle* Dart_Handle;`. The `_Dart_Handle` type is never defined, so the type definition is just a pointer to an incomplete type. As such, it must be cast to the appropriate internal pointer type before being operated on. This is the job of the Dart function that receives the handle as an argument; you are never meant to operate on it directly. Now:

1. A `Dart_Handle` to a library object is obtained. This object will be either the root library or a custom library (for when the engine is initialized with a custom intent that defines the library URI containing the entry point). In a normal scenario, we won't define any custom library, so the root library will be used. The root library is nothing but the code corresponding to the source file used at compile time, when calling the `dart compile` command. In Flutter's case, the `flutter` command line tool will fall back to `lib/main.dart` if nothing is provided using the `--target` option. This handle is thus a pointer to an internal Dart object, a `Library` object, to be more specific.

2. The entry-point handle is obtained. It is just a pointer to a string object stored in the isolate's heap, containing the string "main" in our case. This allocation and heap write is done by `tonic::ToDart("main");`. The functions defined in the `tonic` namespace are a series of specialized template functions that internally invoke Dart-specific allocation routines; in this case, `ToDart` will resolve to a call to `Dart_NewStringFromCString`, as it is being passed a string literal.

3. A handle to the field itself is obtained. `Dart_GetField` receives a pair `(scope, field)`, where the scope can be a library object, a type object or an instance object, in which case a handle to a top-level symbol, a static member or an instance member is returned, respectively. In our case, we get the handle of the `main` top-level symbol which, now that we know what a handle is, is essentially a pointer to the main function.

4. The arguments are converted to a list type that Dart can understand.

5. `InvokeMainEntrypoint` is called, passing the handles we obtained in the previous steps as arguments.
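The resolution order in the five steps above (library with a root fallback, entry-point name with a `main` fallback, lookup, argument passing, invocation) can be modelled with plain maps. This is a toy model under my own assumptions: `SketchProgram`, `RunFromLibrarySketch` and the example library URI are hypothetical, and real handles point into the isolate heap rather than into `std::map` entries.

```c++
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model: a library is a table from top-level names to functions.
using EntryPoint = std::function<int(const std::vector<std::string>&)>;
using Library = std::map<std::string, EntryPoint>;

struct SketchProgram {
  std::string root_library;                  // URI of the root library
  std::map<std::string, Library> libraries;  // every loaded library, by URI
};

// Returns the entry point's result, or std::nullopt if the library or the
// entry point could not be found.
inline std::optional<int> RunFromLibrarySketch(
    const SketchProgram& program,
    const std::optional<std::string>& library_name,
    const std::optional<std::string>& entrypoint,
    const std::vector<std::string>& args) {
  // Step 1: custom library if given and non-empty, else the root library.
  const std::string library_key =
      (library_name.has_value() && !library_name->empty())
          ? *library_name
          : program.root_library;
  // Step 2: custom entry point if given and non-empty, else "main".
  const std::string entry_key =
      (entrypoint.has_value() && !entrypoint->empty()) ? *entrypoint
                                                       : std::string("main");
  // Step 3: look the symbol up inside the chosen library.
  const auto library = program.libraries.find(library_key);
  if (library == program.libraries.end()) return std::nullopt;
  const auto function = library->second.find(entry_key);
  if (function == library->second.end()) return std::nullopt;
  // Steps 4 and 5: pass the arguments along and invoke the entry point.
  return function->second(args);
}
```

For a reversing tool, the takeaway is that the defaults matter: with no custom intent, the lookup key is always the pair (root library, `main`).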

Naturally, if we want to deterministically recover entry point information such as its name, the library where it lives, its offset into the instruction snapshot and its arguments (if any), we need to replicate the behavior of this function, performing the exact same steps. Given this, let us finish this second part by taking a deep look into `InvokeMainEntrypoint`, how root library information is recovered from the snapshot (and how Dart gets a handle to it), and finally how Dart uses this information to precisely identify and obtain the handle that uniquely identifies the entry-point:

[...] COMING SOON [...]

## Part 3 - Deserialization and serialization mechanism
