refactor/docs 2 by leo-gan · Pull Request #25 · leo-gan/GLD.SerializerBenchmark

leo-gan · 2026-05-01T05:28:23Z

PR Description:

Documentation Restructure & Theory Content Expansion

Summary

Restructures documentation site architecture and adds comprehensive theory perspectives on serialization.

Changes

Documentation Architecture:

- Moved theory content to dedicated docs/theory/ directory with new index
- Renamed test_data_design.md → test_data_configuration.md for clarity
- Renamed languages-overview.md → serialization_categories.md
- Updated all internal cross-references

New Theory Content:

- data_science_perspective.md - Statistical analysis approach to serializer evaluation
- engineer_perspective.md - Text vs binary vs schema-driven format trade-offs with historical context
- historical_perspective.md - Evolution of serialization formats and key contributors

Updates:

- Regenerated benchmark logs with latest run data
- Updated python_tested_serializers.md with additional serializer details
- Revised MkDocs navigation to reflect new structure

Testing

- Full benchmark suite executed (100 reps, all serializers) - results verified
- Documentation builds successfully with MkDocs
- All internal links validated

gemini-code-assist

Code Review

This pull request restructures the documentation and adds comprehensive historical, engineering, and data science perspectives on serialization, while updating serializer references for C# and Python. The review feedback identifies several issues in the new content, such as accessibility concerns from using images for text, factual errors, and broken citations. Suggestions were also made to fix code block formatting, resolve a missing import in a Python example, and consolidate the new perspective documents to reduce redundancy and maintain consistent spelling and citation styles.

gemini-code-assist · 2026-05-01T05:34:28Z

+The following serializers are partially tested in the benchmark via the `Supports()` method.
+That means they are not tested with all data types, but some of them are tested with specific data types.


The explanation for 'Partially Tested Serializers' could be more precise. Saying that some serializers are tested only with specific data types is a good start, but it would be clearer to state that they may fail on, or be explicitly excluded from, tests with certain data types (such as those with circular references or those requiring specific schema attributes), as detailed in the table below.
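
To make the distinction concrete, here is a minimal Python sketch of how a `Supports()`-style gate could exclude a serializer from specific data types. The class names, the `supports` method signature, and the data-type labels are all hypothetical and chosen for illustration; the benchmark's actual implementation may differ.

```python
import json
import pickle


class JsonSerializer:
    """Hypothetical wrapper: JSON cannot encode circular references."""
    name = "json"

    def supports(self, data_type: str) -> bool:
        # Explicitly exclude the data type this serializer cannot handle.
        return data_type != "circular_reference"

    def serialize(self, obj) -> bytes:
        return json.dumps(obj).encode("utf-8")


class PickleSerializer:
    """Hypothetical wrapper: pickle handles recursive object graphs."""
    name = "pickle"

    def supports(self, data_type: str) -> bool:
        return True

    def serialize(self, obj) -> bytes:
        return pickle.dumps(obj)


def plan_runs(serializers, data_types):
    """Split (serializer, data_type) pairs into those to run and those skipped."""
    planned, skipped = [], []
    for serializer in serializers:
        for data_type in data_types:
            target = planned if serializer.supports(data_type) else skipped
            target.append((serializer.name, data_type))
    return planned, skipped


planned, skipped = plan_runs(
    [JsonSerializer(), PickleSerializer()],
    ["simple_dict", "circular_reference"],
)
```

Under these assumptions, `skipped` lists exactly the combinations that make a serializer "partially tested", which is the information the documentation could state up front.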

gemini-code-assist · 2026-05-01T05:34:28Z

+- **Tim Bray and others (XML, 1996+):** As W3C editors, they defined XML, which became a foundation for data interchange before JSON’s rise (Citing directly: [26] notes Crockford’s JSON emerging as an alternative).
+
+- **Brian Behlendorf, etc. (YAML):** The YAML editors listed on the spec include Evans, Ben-Kiki, Ingerson.


This section contains a couple of issues:

The citation [26] on line 120 is broken, as there is no corresponding entry in the references section.

On line 122, Brian Behlendorf is incorrectly listed as a YAML contributor. He is primarily known for his work on the Apache Web Server. The previous line correctly identifies the main YAML contributors.
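
Broken citations like this one can be caught mechanically. The following is a rough sketch, assuming citations appear as `[n]` in the body and that each reference entry starts a line with `[n]`; the function name and the sample reference entries are placeholders, not content from the actual document.

```python
import re


def find_broken_citations(body: str, references: str) -> list[int]:
    """Return cited numbers that have no matching entry in the references."""
    # Every bracketed number in the body counts as a citation.
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", body)}
    # A reference entry is assumed to begin a line with "[n]".
    defined = {
        int(n) for n in re.findall(r"^\[(\d+)\]", references, flags=re.MULTILINE)
    }
    return sorted(cited - defined)


body = "(Citing directly: [26] notes Crockford's JSON emerging as an alternative)."
references = "[1] Placeholder entry\n[2] Another placeholder entry"
broken = find_broken_citations(body, references)
```

This heuristic would also flag Markdown constructs such as `[26](url)`, so its output is a list of candidates to review rather than a definitive error report.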

gemini-code-assist · 2026-05-01T05:34:28Z

@@ -0,0 +1,227 @@
+# **The Architecture of Information: A Comprehensive History and Evolution of Data Serialization**
+
+The fundamental challenge of distributed computing has always been the translation of abstract, multi-dimensional in-memory data structures into a linear, one-dimensional stream of bytes suitable for transmission across physical media or storage on non-volatile devices.1 This process, known as data serialization or marshalling, serves as the primary bridge between disparate hardware architectures, operating systems, and addressing mechanisms that characterize the global computing landscape.1 To understand the trajectory of serialization is to understand the history of software engineering itself—a persistent struggle to balance human readability with machine efficiency, flexibility with performance, and the organic growth of complex systems with the rigid requirements of network protocols.4


The citation style in this document is inconsistent with the other new documents and is hard to read. Citations are appended directly to words (e.g., devices.1), which can be confusing.

For better readability and consistency, please:

- Use the [1] format for citations, as seen in data_science_perspective.md.
- Add a space before the citation (e.g., devices [1]).

This formatting issue appears throughout the document.
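
Because the issue recurs throughout the file, a scripted pass may be easier than manual edits. Below is a heuristic sketch that rewrites a citation number fused to a sentence-ending period (`devices.1`) into the bracketed form; placing the period after the bracket is an assumption about the desired style, and the pattern and function name are illustrative only.

```python
import re

# A period directly after a letter or ")" and followed by 1-3 digits,
# then whitespace or end of text, is treated as a fused citation.
_FUSED_CITATION = re.compile(r"(?<=[A-Za-z)])\.(\d{1,3})(?=\s|$)")


def bracket_citations(text: str) -> str:
    """Rewrite 'word.1' as 'word [1].' wherever the heuristic matches."""
    return _FUSED_CITATION.sub(r" [\1].", text)


sample = "storage on non-volatile devices.1 This process"
converted = bracket_citations(sample)
# converted == "storage on non-volatile devices [1]. This process"
```

Decimal numbers such as `3.1` are left alone because the period must follow a letter, but abbreviations like `fig.1` would still match, so the result should be reviewed before committing.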

gemini-code-assist · 2026-05-01T05:34:28Z

+      - Historical Perspective: theory/historical_perspective.md
+      - Data Science Perspective: theory/data_science_perspective.md
+      - Engineer Perspective: theory/engineer_perspective.md


There appears to be significant content overlap between the three new 'perspective' documents (historical_perspective.md, data_science_perspective.md, engineer_perspective.md). The quality also varies, with data_science_perspective.md being the most comprehensive and polished, while the others contain formatting issues and errors.

Consider consolidating these into a single, high-quality document to avoid redundancy and provide a more focused reading experience for users. If different perspectives are desired, it might be better to structure them as sections within one file.

leo-gan added 3 commits April 30, 2026 21:03

before editign refs in the 'engineer_perspective.md'

c219a77

fixes

f64cee8

logs

caceb1e

gemini-code-assist Bot reviewed May 1, 2026

leo-gan added 2 commits April 30, 2026 22:55

fixes

2a31c94

fixes

360c435

leo-gan merged commit 1ef5f82 into master May 1, 2026
4 checks passed

leo-gan deleted the refactor/docs-2 branch May 1, 2026 06:03

leo-gan merged 5 commits into master from refactor/docs-2
