diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index 3f49cb90e..29777b111 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -2,7 +2,9 @@ The architecture of Nova engine is built around data-oriented design. This means that most "data" like things are found in the heap in a vector with like-minded -individuals. +individuals. Data-oriented design is all the rage on the Internet because of its +cache-friendliness. This engine is one more attempt at seeing what sort of +real-world benefits one might gain with this sort of architecture. ## ECMAScript implementation @@ -24,7 +26,15 @@ For details on the engine, see the Nova's heap is made up of a mix of normal Rust `Vec`s and custom `SoAVec` structs that implement a "Struct of Arrays" data structure with an API -equivalent to normal `Vec`s, all referenced by-index. - -For details on the heap architecture, see the -[heap/README.md](./nova_vm/src/heap/README.md). +equivalent to normal `Vec`s, all referenced by index using the public API's +handle types. The eventual aim is to store everything in Structs of Arrays, with +a smattering of keyed side-tables (hash maps or b-trees) on the side to hold +optional data. + +The intention here is to make it fast for the computer to access frequently used +things while allowing infrequently used things to stay out of the hot path, and +enabling rarely used optional parts of structures to take little or no memory at +all to store at the cost of access performance. For details on the heap +architecture, see the [heap/README.md]. 
+ +[heap/README.md]: ./nova_vm/src/heap/README.md diff --git a/Cargo.toml b/Cargo.toml index 376aa6ad5..e3f48fc97 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -5,7 +5,7 @@ exclude = ["nova_lint"] [workspace.package] edition = "2024" -version = "0.3.1" +version = "1.0.0" license = "MPL-2.0" homepage = "https://trynova.dev/" repository = "https://github.com/trynova/nova/" diff --git a/README.md b/README.md index 21b5e80fd..817014d3c 100644 --- a/README.md +++ b/README.md @@ -1,35 +1,94 @@ -# Nova - Your favorite javascript and wasm engine +# Nova JavaScript engine -## :warning: This project is a Work In Progress, and is very far from being suitable for use :warning: - -Nova is a [JavaScript](https://tc39.es/ecma262) (and eventually -[WebAssembly](https://webassembly.org)) engine written in Rust. +Nova is a [JavaScript] engine focused on being lightweight, modular, and easy to +embed. The engine's architecture is built close to the ECMAScript specification +in structure with the implementation relying on idiomatic Rust and data-oriented +design over traditional JavaScript engine building strategies. Interpreter +performance is also a goal, but not yet a high priority. The engine is exposed as a library with an API for implementation in Rust projects which themselves must serve as a runtime for JavaScript code. The -execution model is currently greatly inspired by -[Kiesel](https://codeberg.org/kiesel-js/kiesel) and -[SerenityOS's LibJS](https://github.com/SerenityOS/serenity). See the code for -more details. - -The project's website can be found at [trynova.dev](https://trynova.dev/), where -we blog about the project's progress, and where we track our Test262 pass rate. -The core of our team is on our [Discord server](https://discord.gg/bwY4TRB8J7). +execution model is greatly inspired by [Kiesel] and [LibJS]. + +The project's website can be found at [trynova.dev], where we blog about the +project's progress and where we track our Test262 pass rate. 
The development +discussion is in the process of moving to [Zulip] but our old [Discord server] +is also still available. + +## Lightweight + +The engine's heap is set up to keep heap allocations minimal, sacrificing speed of +uncommon structures for a smaller memory footprint in the common cases. The +intention is for modern JavaScript written with strict TypeScript types to run +light and fast, while any TypeScript lines requiring `as any` or `as unknown` +around objects are likely to be much slower and take more memory than expected. + +## Easy to embed + +The engine has very few bells or whistles and is very easy to set up for +one-off script runs or simple call-and-return instances. The engine uses the +[WTF-8] encoding internally for [`String`] storage, making interfacing with +JavaScript look and act similar to normal Rust code. + +```rust +use nova_vm::{ecmascript::{DefaultHostHooks, GcAgent}, engine::GcScope}; +let mut agent = GcAgent::new(Default::default(), &DefaultHostHooks); +let realm = agent.create_default_realm(); +let _ = agent.run_in_realm(&realm, |_agent, _gc| { + // do work here +}); +agent.gc(); +``` + +## [Architecture] + +The engine's public API relies on idiomatic Rust over traditional JavaScript +engine building wisdom. This is most apparent in the [`Value`] type and its +subvariants such as [`Object`]: instead of using NaN-boxing, NuN-boxing, or +other traditional and known efficient strategies for building a dynamically +typed language, Nova uses normal Rust enums carrying either on-stack data or a +32-bit handle to heap-allocated data. The only pointer that gets consistently +passed through call stacks is the [`Agent`] reference, and handles are merely +ways to access heap-allocated JavaScript data held inside the `Agent`. + +Internally, the architecture and structure of the engine follow the ECMAScript +specification but use data-oriented design for the actual implementation. 
Data +on the heap is allocated in homogenous (containing data of only one type) arenas +with hot data split apart from cold data, and optional data stored behind keyed +indirections using the arena's associated 32-bit handle as the key, thus using +no memory to store the default null case. The arenas are additionally compacted +during garbage collection, trading some extra collection time for better runtime +cache locality for hot data. + +## Shortcomings and unexpected edge cases + +Nova JavaScript engine is not perfect and has many shortcomings. + +1. The engine performance is acceptable, but it is not fast by any means. +1. The [`Array`] implementation does not support sparse storage internally. + Calling `new Array(10 ** 9)` will request an allocation for 1 billion + JavaScript [`Value`]s. +1. The [`RegExp`] implementation does not support lookaheads, lookbehinds, or + backreferences. It is always in UTF-8 / Unicode sets mode, does not support + RegExp patterns containing unpaired surrogates, and its groups are slightly + different from what the ECMAScript specification defines. In short: it is not + compliant. +1. [`Promise`] subclassing is currently not supported. +1. The engine does not support [WebAssembly] execution. ## Talks -### [Out the cave, off the cliff — data-oriented design in Nova JavaScript engine](https://www.youtube.com/watch?v=QuJRKhySp-0) +### [Out the cave, off the cliff — data-oriented design in Nova JavaScript engine] Slides: [Google Drive](https://docs.google.com/presentation/d/1_N5uLxkR0G4HSYtGuI68eXaj51c7FVCngDg7lxiRytM/edit?usp=sharing) Presented originally at Turku University JavaScript Day, then at Sydney Rust -Meetup, and finally at [JSConf.jp](https://jsconf.jp/2025/en) in slightly -differing and evolving forms, the talk presents the "today" of major JavaScript -engines and the "future" of what Nova is doing, and why it is both a good and a -bad idea. 
+Meetup, and finally at [JSConf.jp] in slightly differing and evolving forms, the +talk presents the "today" of major JavaScript engines and the "future" of what +Nova is doing, and why it is both a good and a bad idea. -### [Abusing reborrowing for fun, profit, and a safepoint garbage collector @ FOSDEM 2025](https://fosdem.org/2025/schedule/event/fosdem-2025-4394-abusing-reborrowing-for-fun-profit-and-a-safepoint-garbage-collector/) +### [Abusing reborrowing for fun, profit, and a safepoint garbage collector @ FOSDEM 2025] Slides: [PDF](https://fosdem.org/2025/events/attachments/fosdem-2025-4394-abusing-reborrowing-for-fun-profit-and-a-safepoint-garbage-collector/slides/237982/Abusing_r_4Y4h70i.pdf) @@ -42,7 +101,7 @@ abuses Rust's "reborrowing" functionality to make the borrow checker not only understand Nova's garbage collector but cooperate with making sure it is used in the correct way. -### [Nova Engine - Building a DOD JS Engine in Rust @ Finland Rust-lang meetup 1/2024](https://www.youtube.com/watch?v=WKGo1k47eYQ) +### [Nova Engine - Building a DOD JS Engine in Rust @ Finland Rust-lang meetup 1/2024] Slides: [Google Drive](https://docs.google.com/presentation/d/1PRinuW2Zbw9c-FGArON3YHiCUP22qIeTpYvDRNbP5vc/edit?usp=drive_link) @@ -51,7 +110,7 @@ Presented at the Finland Rust-lang group's January meetup, 2024. Focus on how JavaScript engines work in general, and what sort of design choices Nova makes in this context. -### [Nova JavaScript Engine - Exploring a Data-Oriented Engine Design @ Web Engines Hackfest 2024](https://www.youtube.com/watch?v=5olgPdqKZ84) +### [Nova JavaScript Engine - Exploring a Data-Oriented Engine Design @ Web Engines Hackfest 2024] Slides: [Google Drive](https://docs.google.com/presentation/d/1YlHr67ZYCyMp_6uMMvCWOJNOUhleUtxOPlC0Gz8Bg7o/edit?usp=drive_link) @@ -66,45 +125,30 @@ but the slightly modified slides are. 
TC39 slides: [Google Drive](https://docs.google.com/presentation/d/1Pv6Yn2sUWFIvlLwX9ViCjuyflsVdpEPQBbVlLJnFubM/edit?usp=drive_link) -## [Architecture](./ARCHITECTURE.md) - -The architecture and structure of the engine follows the ECMAScript -specification in spirit, but uses data-oriented design for the actual -implementation. Types that are present in the specification, and are often -called "Something Records", are generally found as a `struct` in Nova in an -"equivalent" file / folder path as the specification defines them in. But -instead of referring to these records by pointer or reference, the engine -usually calls these structs the "SomethingRecord" or "SomethingHeapData", and -defines a separate "handle" type which takes the plain "Something" type name and -only contains a 32-bit unsigned integer. The record struct is stored inside the -engine heap in a vector, and the handle type stores the correct vector index for -the value. Polymorphic index types, such as the main JavaScript Value, are -represented as tagged enums over the index types. - -In general, all specification abstract operations are then written to operate on -the index types instead of operating on references to the heap structs -themselves. This avoids issues with re-entrancy, pointer aliasing, and others. - -### Heap structure - Data-oriented design - -Reading the above, you might be wondering why the split into handle and heap -data structs is done. The ultimate reason is two-fold: - -1. It is an interesting design. - -1. It helps the computer make frequently used things fast while allowing the - infrequently used things to take less (or no) memory at the cost of access - performance. - -Data-oriented design is all the rage on the Internet because of its -cache-friendliness. This engine is one more attempt at seeing what sort of -real-world benefits one might gain with this sort of architecture. 
- -If you find yourself interested in where the idea spawns from and why, take a -look at [the Heap README.md](./nova_vm/src/heap/README.md). It gives a more -thorough walkthrough of the Heap structure and what the idea there is. - -## [Contributing](./CONTRIBUTING.md) +## [Contributing] So you wish to contribute, eh? You're very welcome to do so! Please take a look -at [the CONTRIBUTING.md](./CONTRIBUTING.md). +at [the CONTRIBUTING.md][Contributing]. + +[`Agent`]: crate::ecmascript::Agent +[`Array`]: crate::ecmascript::Array +[`RegExp`]: crate::ecmascript::RegExp +[`Promise`]: crate::ecmascript::Promise +[`Object`]: crate::ecmascript::Object +[`String`]: crate::ecmascript::String +[`Value`]: crate::ecmascript::Value +[WebAssembly]: https://webassembly.org +[WTF-8]: https://wtf-8.codeberg.page/ +[JavaScript]: https://tc39.es/ecma262 +[Kiesel]: https://codeberg.org/kiesel-js/kiesel +[LibJS]: https://github.com/LadybirdBrowser/ladybird/tree/master/Libraries/LibJS +[Architecture]: https://github.com/trynova/nova/blob/main/ARCHITECTURE.md +[Contributing]: https://github.com/trynova/nova/blob/main/CONTRIBUTING.md +[trynova.dev]: https://trynova.dev/ +[Out the cave, off the cliff — data-oriented design in Nova JavaScript engine]: https://www.youtube.com/watch?v=QuJRKhySp-0 +[Nova JavaScript Engine - Exploring a Data-Oriented Engine Design @ Web Engines Hackfest 2024]: https://www.youtube.com/watch?v=5olgPdqKZ84 +[Nova Engine - Building a DOD JS Engine in Rust @ Finland Rust-lang meetup 1/2024]: https://www.youtube.com/watch?v=WKGo1k47eYQ +[Abusing reborrowing for fun, profit, and a safepoint garbage collector @ FOSDEM 2025]: https://fosdem.org/2025/schedule/event/fosdem-2025-4394-abusing-reborrowing-for-fun-profit-and-a-safepoint-garbage-collector/ +[Discord server]: https://discord.gg/bwY4TRB8J7 +[Zulip]: https://trynova.zulipchat.com/ +[JSConf.jp]: https://jsconf.jp/2025/en diff --git a/nova_cli/Cargo.toml b/nova_cli/Cargo.toml index ebf5c1c6c..c8cccdce5 100644 --- 
a/nova_cli/Cargo.toml +++ b/nova_cli/Cargo.toml @@ -11,6 +11,7 @@ homepage.workspace = true readme.workspace = true keywords.workspace = true categories = ["development-tools", "command-line-utilities"] +publish = false [lib] name = "nova_cli" @@ -25,7 +26,7 @@ clap = { workspace = true } cliclack = { workspace = true } ctrlc = { workspace = true } console = { workspace = true } -nova_vm = { path = "../nova_vm", version = "0.3.0" } +nova_vm = { path = "../nova_vm" } oxc_ast = { workspace = true } oxc-miette = { workspace = true } oxc_parser = { workspace = true } diff --git a/nova_vm/Cargo.toml b/nova_vm/Cargo.toml index 9945aad9e..8c2d12f00 100644 --- a/nova_vm/Cargo.toml +++ b/nova_vm/Cargo.toml @@ -31,7 +31,7 @@ oxc_syntax = { workspace = true } rand = { workspace = true } regex = { workspace = true, optional = true } ryu-js = { workspace = true } -small_string = { path = "../small_string", version = "0.2.0" } +small_string = { path = "../small_string", version = "1.0.0" } soavec = { workspace = true } soavec_derive = { workspace = true } sonic-rs = { workspace = true, optional = true } @@ -105,5 +105,5 @@ proposal-atomics-microwait = ["atomics"] proposal-temporal = ["temporal"] [build-dependencies] -small_string = { path = "../small_string", version = "0.2.0" } +small_string = { path = "../small_string", version = "1.0.0" } usdt = { workspace = true } diff --git a/nova_vm/src/lib.rs b/nova_vm/src/lib.rs index a4f91d3ce..6ffa564b2 100644 --- a/nova_vm/src/lib.rs +++ b/nova_vm/src/lib.rs @@ -4,66 +4,7 @@ #![cfg_attr(feature = "proposal-float16array", feature(f16))] #![warn(missing_docs)] - -//! # Nova JavaScript engine -//! -//! Nova is a JavaScript engine aiming to be lightweight, easy to embed, and -//! close to the ECMAScript specification in form with the implementation -//! relying on idiomatic Rust rather than traditional JavaScript engine building -//! wisdom. Great performance is also an aspirational goal of the engine, but -//! 
not something that can be said to really be a reality today. -//! -//! ## API architecture -//! -//! The API of the engine relies on idiomatic Rust rather than traditional -//! JavaScript engine building wisdom. This is most apparent in the [`Value`] -//! type and its subtypes: instead of using NaN-boxing, NuN-boxing, or other -//! traditional and known efficient strategies for building a dynamically typed -//! language, Nova uses normal Rust enums carrying either on-stack data or a -//! handle to heap-allocated data. The only pointer that gets consistently -//! passed through call stacks is the [`Agent`] reference, and handles are -//! merely ways to access heap-allocated JavaScript data held inside the -//! `Agent`. -//! -//! ## Lightweight engine -//! -//! The engine's heap is set up to keep heap allocations small, trading speed -//! for a smaller memory footprint in the general case. This should make working -//! with large, regular datasets fairly low-impact on the memory usage of the -//! engine. -//! -//! ## Ease of embedding -//! -//! The engine has very little bells or whistles and is very easy to set up for -//! one-off script runs. The engine uses the [WTF-8] encoding internally for -//! [`String`] storage, making interfacing between the engine and normal Rust -//! code much nicer than one might expect. -//! -//! ## Shortcomings and unexpected edge cases -//! -//! Nova JavaScript engine has not been born perfect, and has many shortcomings. -//! -//! 1. The engine performance is acceptable, but it is not fast by any means. -//! -//! 1. The [`Array`] implementation does not support sparse storage internally. -//! Calling `new Array(10 ** 9)` will request an allocation for 8 billion -//! bytes. -//! -//! 1. The [`RegExp`] implementation does not support lookaheads, lookbehinds, -//! or backreferences. It is always in UTF-8 / Unicode sets mode, does not -//! support RegExp patterns containing unpaired surrogates, and its groups -//! 
are slightly different from what the ECMAScript specification defines. In -//! short: it is not compliant. -//! -//! 1. [`Promise`] subclassing is currently not supported. -//! -//! [`Agent`]: crate::ecmascript::Agent -//! [`Array`]: crate::ecmascript::Array -//! [`RegExp`]: crate::ecmascript::RegExp -//! [`Promise`]: crate::ecmascript::Promise -//! [`String`]: crate::ecmascript::String -//! [`Value`]: crate::ecmascript::Value -//! [WTF-8]: https://wtf-8.codeberg.page/ +#![doc = include_str!("../../README.md")] pub mod ecmascript; pub mod engine; diff --git a/small_string/Cargo.toml b/small_string/Cargo.toml index abee3dc50..d378639fb 100644 --- a/small_string/Cargo.toml +++ b/small_string/Cargo.toml @@ -1,6 +1,6 @@ [package] name = "small_string" -version = "0.2.0" +version = "1.0.0" repository = "https://github.com/trynova/nova/tree/main/small_string" description = "7-byte small string optimisation for use in Nova JavaScript engine" authors.workspace = true