Strip all arc, SDF, blob encoding, and shader API from glyphy.h. Add glyphy_curve_t (quadratic Bezier with p1/p2/p3) and glyphy_curve_accumulator_t that directly stores quadratic curves from glyph outlines without arc approximation. This is the first step toward Slug-style winding number rendering. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Encodes quadratic Bezier curves into a RGBA16UI blob with:
- Horizontal and vertical bands for spatial indexing
- Curves sorted by descending max coordinate for early exit
- Em-space quantized int16 coordinates (configurable scale)
- Horizontal/vertical line filtering (can't intersect parallel rays)

Blob layout: [band headers] [curve index lists] [curve data]
All offsets are 1D from blob start; shader converts to 2D via atlas width.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
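The band construction described in this commit can be sketched as follows. This is a minimal illustration, not the encoder's actual code: the `Curve` type, `build_hbands`, and `example_curves` are hypothetical names, the control-point bounding box is used as a conservative extent, and quantization and the blob layout are left out. It assumes `num_bands > 0` and `y_max > y_min`.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical curve type: control points p1, p2, p3 as (x, y) pairs.
struct Curve { float x1, y1, x2, y2, x3, y3; };

static float max_x(const Curve& c) { return std::max({c.x1, c.x2, c.x3}); }

// For each horizontal band, list the indices of the curves whose
// control-point bounding box overlaps the band, sorted by descending max x.
std::vector<std::vector<int>> build_hbands(const std::vector<Curve>& curves,
                                           float y_min, float y_max,
                                           int num_bands) {
    std::vector<std::vector<int>> bands(num_bands);
    const float band_height = (y_max - y_min) / num_bands;
    for (int i = 0; i < static_cast<int>(curves.size()); ++i) {
        const Curve& c = curves[i];
        // A horizontal line can never cross a horizontal ray, so skip it.
        if (c.y1 == c.y2 && c.y2 == c.y3) continue;
        const float lo = std::min({c.y1, c.y2, c.y3});
        const float hi = std::max({c.y1, c.y2, c.y3});
        int first = static_cast<int>(std::floor((lo - y_min) / band_height));
        int last = static_cast<int>(std::floor((hi - y_min) / band_height));
        first = std::max(0, std::min(first, num_bands - 1));
        last = std::max(0, std::min(last, num_bands - 1));
        for (int b = first; b <= last; ++b) bands[b].push_back(i);
    }
    // Descending max x lets a rightward ray stop at the first curve that
    // lies entirely to the left of the pixel.
    for (auto& band : bands)
        std::sort(band.begin(), band.end(), [&](int a, int b) {
            return max_x(curves[a]) > max_x(curves[b]);
        });
    return bands;
}

// Sample data: glyph extents y in [0, 100]; curve 0 is a horizontal line.
std::vector<Curve> example_curves() {
    return {
        {0, 10, 25, 10, 50, 10},
        {0, 0, 10, 30, 20, 60},
        {80, 90, 90, 95, 70, 100},
        {30, 0, 40, 10, 60, 20},
    };
}
```

With four bands over `[0, 100]`, curve 0 is filtered out, band 0 holds curves 3 and 1 in that order, and band 3 holds only curve 2.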
Port SlugPixelShader.hlsl to GLSL 1.30 for our single isampler2D atlas:
- CalcRootCode: equivalence class via sign-bit extraction (floatBitsToUint)
- SolveHorizPoly/SolveVertPoly: quadratic root finding with discriminant clamping
- CalcCoverage: weighted combination of horizontal and vertical ray coverage
- glyphy_slug_render: main entry point, iterates H-bands and V-bands

Adapted for single-texture blob layout with 1D offsets via CalcBandLoc. Curve data decoded from int16 quantized em-space coordinates.

Also: cap bands at 16, use signed int16 texels, update meson.build to remove old arc/sdf/blob sources and add new files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
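For readers unfamiliar with the root-finding step, here is a minimal C++ sketch of solving a quadratic Bezier against a ray with a clamped discriminant. It uses the standard polynomial form of the curve and is not a transcription of the shader: the function name, the `Roots` type, and the near-zero threshold are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>

struct Roots { float t1, t2; };

// Solve y(t) = a*t^2 - 2*b*t + c = 0 for a quadratic Bezier whose
// y-coordinates y1, y2, y3 are already translated so the ray lies on y = 0.
Roots solve_quadratic_bezier(float y1, float y2, float y3) {
    const float a = y1 - 2.0f * y2 + y3;
    const float b = y1 - y2;
    const float c = y1;
    if (std::fabs(a) < 1e-6f) {  // hypothetical threshold
        // Degenerates to -2*b*t + c = 0. b == 0 here would mean a horizontal
        // line, which this sketch assumes was filtered out beforehand.
        const float t = c / (2.0f * b);
        return {t, t};
    }
    // Clamping keeps sqrt() defined when rounding (or a genuine miss)
    // makes the discriminant slightly negative.
    const float d = std::sqrt(std::max(b * b - a * c, 0.0f));
    return {(b - d) / a, (b + d) / a};
}
```

For example, `solve_quadratic_bezier(1.0f, 0.5f, 1.0f)` has a negative discriminant; the clamp turns it into a double root at `t = 0.5` instead of a NaN.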
New vertex shader passes em-space texcoords and flat per-glyph data (band transform, atlas location, band counts) to fragment shader. Fragment shader calls glyphy_slug_render() for coverage. Updated glyph_info_t: replaced nominal_w/h with num_hbands/num_vbands. Updated glyph_vertex_t: new layout with position, texcoord, band transform, and glyph data fields for Slug rendering. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Atlas: RGBA16I texture with isampler2D, texelFetch upload
- Font: use curve accumulator + blob encoder instead of arc pipeline
- Shader: assemble slug GLSL + demo fragment shader, new vertex layout with position, texcoord, bandTransform (flat), glyphData (flat)
- Buffer: set up 4 vertex attributes including glVertexAttribIPointer for integer glyph data
- GLState: simplified, removed old SDF uniforms
- HarfBuzz/FreeType headers: updated for curve accumulator, use hb_font_draw_glyph, stub cubic_to for now
- Removed demo-atlas.glsl (no longer needed)

Builds and links successfully.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use 16384 texels (vs 4096) to handle complex glyphs with many curves. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Bump to #version 330 (needed for floatBitsToUint)
- Remove #version from demo-fshader.glsl (prepended by demo-shader.cc)
- Remove demo-atlas.glsl from Makefile.am (no longer used)
- Update src/Makefile.am for new source files and shader
- Slug shader by Eric Lengyel, ported to GLSL by Behdad Esfahbod

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Delete all arc geometry, bezier-to-arc approximation, SDF calculation, grid-based blob encoding, outline manipulation, and glyphy-validate. Fix glyphy-extents.cc to not depend on deleted glyphy-common.hh. Update both autotools and meson build files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add close_path callback to HarfBuzz draw funcs. Without it, the last contour's closing segment was never emitted, causing winding number errors (horizontal streak artifacts).
- Use floor() instead of (int) cast for band assignment to avoid off-by-one at band boundaries due to floating-point truncation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
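The difference between the two conversions is a property of C++ itself: an `(int)` cast truncates toward zero, while `floor()` rounds toward negative infinity, so they disagree only for negative non-integers. The commit does not say which inputs hit this case; the helper names below are hypothetical and only illustrate the language behaviour.

```cpp
#include <cmath>

// Truncates toward zero: -0.25 becomes 0.
int band_by_cast(float v) { return static_cast<int>(v); }

// Rounds toward negative infinity: -0.25 becomes -1.
int band_by_floor(float v) { return static_cast<int>(std::floor(v)); }
```

A coordinate that lands slightly below a band origin is therefore assigned to the wrong band by the cast but not by `floor()`.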
Lines encoded as degenerate quadratics with p2=midpoint(p1,p3) had a=p1-2*p2+p3=0 mathematically, but float32 rounding of the intermediate p12.zw*2.0 in the shader made a nonzero (~1e-7). This triggered the quadratic path with 1/a producing Inf, corrupting winding numbers across entire scanlines. Fix: encode lines as (p1,p1,p3) instead. This gives a=p3-p1 (full line span, well-conditioned) and b=0. The standard quadratic solver handles this cleanly with no near-zero divisions and no shader changes needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
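To make the arithmetic concrete, the sketch below computes the coefficients `a = p1 - 2*p2 + p3` and `b = p1 - p2` quoted in the commit for one coordinate of a line from -1 to 3, under both encodings. The helper names are hypothetical, and the example does not reproduce the shader's rounding error; it only shows why the `(p1,p1,p3)` form is well-conditioned.

```cpp
struct Coeffs { float a, b; };

// Quadratic coefficients of one Bezier coordinate, as named in the commit.
Coeffs bezier_coeffs(float p1, float p2, float p3) {
    return {p1 - 2.0f * p2 + p3, p1 - p2};
}

// Direct evaluation of the Bezier coordinate at parameter t.
float bezier_eval(float p1, float p2, float p3, float t) {
    const float u = 1.0f - t;
    return u * u * p1 + 2.0f * u * t * p2 + t * t * p3;
}
```

With the midpoint encoding `(-1, 1, 3)`, `a` is 0 in exact arithmetic, so any rounding noise decides which solver path runs. With `(-1, -1, 3)`, `a = 4` and `b = 0`, and the curve reduces to `p1 + t*t*(p3 - p1)`: it still traces the same segment, just with a different parametrization, and crosses zero at `t = 0.5`.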
These features were part of the old SDF pipeline and are not implemented in the Slug renderer. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the 2D RGBA16I texture atlas with a 1D buffer texture (GL_TEXTURE_BUFFER). This eliminates:
- CalcBandLoc wrapping logic in the shader
- Atlas width constant (GLYPHY_ATLAS_WIDTH, GLYPHY_LOG_ATLAS_WIDTH)
- 2D coordinate math for atlas lookups
- item_w / item_h_quantum parameters

The shader now uses isamplerBuffer with plain integer offsets:
    texelFetch(u_atlas, glyphLoc + offset)
instead of
    texelFetch(u_atlas, glyphy_calc_blob_loc(glyphLoc, offset), 0).
glyphLoc is a single int (1D offset) instead of ivec2.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
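As a rough picture of the address math that goes away, here is a C++ sketch contrasting a wrapped 2D lookup with the plain 1D one. The function names and the `atlas_width` parameter are hypothetical; the removed shader code may have used shifts and masks rather than division.

```cpp
struct TexelCoord { int x, y; };

// 2D atlas: a blob starting at glyph_loc runs along a row and wraps onto
// the following rows once it passes the atlas width.
TexelCoord blob_loc_2d(TexelCoord glyph_loc, int offset, int atlas_width) {
    const int linear = glyph_loc.x + offset;
    return {linear % atlas_width, glyph_loc.y + linear / atlas_width};
}

// 1D buffer texture: the texel index is just base plus offset.
int blob_loc_1d(int glyph_loc, int offset) { return glyph_loc + offset; }
```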
Store extents and band counts in a 2-texel header at the start of each glyph's blob. The shader reads the header and computes the band transform itself. Client no longer needs to pass bandTransform (vec4) or band counts. Per-glyph vertex data is now just: position, texcoord, and one flat int (atlas_offset). The encoder API drops num_hbands/num_vbands output parameters. ~1% performance cost for much simpler client integration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
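A sketch of how a band transform could be derived from such a header, written in C++ rather than GLSL. The field order inside the two texels is an assumption (the commit only says they hold extents and band counts), as are all type and function names; it also assumes non-degenerate extents and that horizontal bands are indexed by y and vertical bands by x.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Texel { int16_t r, g, b, a; };

struct BandTransform {
    float scale_x, offset_x;  // maps x to a vertical-band index
    float scale_y, offset_y;  // maps y to a horizontal-band index
    int num_hbands, num_vbands;
};

// Assumed layout: h0 = (min_x, min_y, max_x, max_y),
//                 h1 = (num_hbands, num_vbands, unused, unused).
BandTransform band_transform_from_header(Texel h0, Texel h1) {
    BandTransform t;
    t.num_hbands = h1.r;
    t.num_vbands = h1.g;
    t.scale_x = t.num_vbands / static_cast<float>(h0.b - h0.r);
    t.scale_y = t.num_hbands / static_cast<float>(h0.a - h0.g);
    t.offset_x = -h0.r * t.scale_x;
    t.offset_y = -h0.g * t.scale_y;
    return t;
}

int hband_index(const BandTransform& t, float y) {
    const int i = static_cast<int>(std::floor(y * t.scale_y + t.offset_y));
    return std::max(0, std::min(i, t.num_hbands - 1));
}

int vband_index(const BandTransform& t, float x) {
    const int i = static_cast<int>(std::floor(x * t.scale_x + t.offset_x));
    return std::max(0, std::min(i, t.num_vbands - 1));
}
```

Because everything needed is in the blob, the client only has to pass the blob's offset.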
Per-glyph data is now just one uint (atlas offset) instead of an ivec4. Vertex size drops from 48 to 20 bytes. The shader API takes uint for the glyph location. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
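The quoted sizes are consistent with a vertex of two 2-component floats plus one 32-bit integer. The structs below are a hypothetical reconstruction for illustration (field names are invented, and the old layout is inferred from the earlier commit messages), with `static_assert`s checking the byte counts.

```cpp
#include <cstddef>
#include <cstdint>

// New layout: position (vec2) + texcoord (vec2) + atlas offset (uint).
struct GlyphVertex {
    float x, y;
    float u, v;
    uint32_t atlas_offset;
};
static_assert(sizeof(GlyphVertex) == 20, "8 + 8 + 4 bytes");

// Old layout (inferred): position, texcoord, bandTransform (vec4),
// glyphData (ivec4).
struct OldGlyphVertex {
    float x, y;
    float u, v;
    float band_transform[4];
    int32_t glyph_data[4];
};
static_assert(sizeof(OldGlyphVertex) == 48, "8 + 8 + 16 + 16 bytes");
```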
Meson is the only build system now. Remove all autotools files (configure.ac, Makefile.am, autogen.sh, git.mk) and stale generated headers that caused shader compilation issues when both build systems were used. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The encoder outputs extents in font design units. Previously the demo divided by upem then the shader multiplied back. Now everything stays in font units. Screen positions use font_size/upem as the scale factor. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adjacent curves in a contour share p3/p1. Instead of storing 2 texels per curve (2N total), store N+1 texels per contour:

    (p1,p2) (p3/p1,p2_next) ... (p3,0)

The shader reads curveLoc and curveLoc+1, unchanged. Band indices point to each curve's first texel. Nearly halves curve data storage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
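The packing can be sketched in C++ as below. The types and function names are hypothetical, and the sketch assumes each curve's `p3` equals the next curve's `p1`, as the commit describes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Point { int16_t x, y; };
struct QuadCurve { Point p1, p2, p3; };
struct Texel { int16_t r, g, b, a; };

// Pack N chained curves into N + 1 texels: one (p1, p2) texel per curve,
// then a final (p3, 0) texel for the last endpoint.
std::vector<Texel> pack_contour(const std::vector<QuadCurve>& curves) {
    std::vector<Texel> out;
    if (curves.empty()) return out;
    for (const QuadCurve& c : curves)
        out.push_back({c.p1.x, c.p1.y, c.p2.x, c.p2.y});
    const Point end = curves.back().p3;
    out.push_back({end.x, end.y, 0, 0});
    return out;
}

// Decode the curve whose first texel is at curve_loc; p3 comes from the
// first half of the next texel.
QuadCurve read_curve(const std::vector<Texel>& texels, std::size_t curve_loc) {
    const Texel& t0 = texels[curve_loc];
    const Texel& t1 = texels[curve_loc + 1];
    return {{t0.r, t0.g}, {t0.b, t0.a}, {t1.r, t1.g}};
}

// Sample closed contour of three curves.
std::vector<QuadCurve> example_contour() {
    return {
        {{0, 0}, {5, 1}, {10, 0}},
        {{10, 0}, {9, 5}, {5, 10}},
        {{5, 10}, {1, 5}, {0, 0}},
    };
}
```

For the three-curve sample this yields 4 texels instead of 6, and reading at index 1 recovers the second curve.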
Ubuntu 24.04 has harfbuzz >= 4.0.0, no need to build from source. Remove autotools steps, autotools deps, and harfbuzz cache. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
For each band, store curves sorted in both directions: descending max (for rightward/upward ray) and ascending min (for leftward/downward ray). A split value per band determines which direction to use based on pixel position.

Pixels on the left side of the glyph use leftward ray with ascending sort; pixels on the right use rightward ray with descending sort. Each side exits early after processing only the nearby curves.

Coverage formula adjusted for leftward ray: saturate(0.5 - r) instead of saturate(r + 0.5).

~15% speedup (1725 -> 1980 fps).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
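The early exit can be illustrated with a small C++ model that only counts how many curves a ray has to visit. Everything here is a simplification under stated assumptions: curves are reduced to their extent along the ray axis (`Span`), the half-pixel margin a real shader would need is ignored, and all names are hypothetical.

```cpp
#include <algorithm>
#include <vector>

struct Span { float lo, hi; };  // a curve's extent along the ray axis

struct Band {
    std::vector<Span> by_desc_max;  // for the rightward ray
    std::vector<Span> by_asc_min;   // for the leftward ray
    float split;
};

Band make_band(const std::vector<Span>& spans, float split) {
    Band band;
    band.split = split;
    band.by_desc_max = spans;
    std::sort(band.by_desc_max.begin(), band.by_desc_max.end(),
              [](const Span& a, const Span& b) { return a.hi > b.hi; });
    band.by_asc_min = spans;
    std::sort(band.by_asc_min.begin(), band.by_asc_min.end(),
              [](const Span& a, const Span& b) { return a.lo < b.lo; });
    return band;
}

// Number of curves visited before the sorted order allows an early exit.
int curves_visited(const Band& band, float px) {
    int n = 0;
    if (px < band.split) {
        // Leftward ray: stop at the first curve entirely right of the pixel.
        for (const Span& c : band.by_asc_min) { if (c.lo > px) break; ++n; }
    } else {
        // Rightward ray: stop at the first curve entirely left of the pixel.
        for (const Span& c : band.by_desc_max) { if (c.hi < px) break; ++n; }
    }
    return n;
}

std::vector<Span> example_spans() {
    return {{0.0f, 1.0f}, {0.0f, 2.0f}, {8.0f, 9.0f}, {7.0f, 10.0f}};
}
```

With the split at 5, a pixel at 0.5 visits 2 curves; forcing a rightward ray for the same pixel (split far to the left) visits all 4.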
Instead of using the glyph midpoint for all bands, compute the optimal split per band by finding the x/y value that minimizes max(left_count, right_count). This balances the work between leftward and rightward rays more evenly, especially for asymmetric glyphs. ~2% speedup (1980 -> 2020 fps). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
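One way to read "minimizes max(left_count, right_count)" is sketched below in C++. This is an interpretation, not the encoder's code: it assumes `left_count` is the number of curves a leftward ray starting at the split would visit (extent minimum at or below the split) and `right_count` the number a rightward ray would visit (extent maximum at or above it), and it tries curve endpoints and the midpoints between them as candidate splits. All names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Span { float lo, hi; };  // a curve's extent along the ray axis
struct Split { float value; int cost; };

// Worst-case number of curves visited by either ray for a split at s.
int split_cost(const std::vector<Span>& spans, float s) {
    int left = 0, right = 0;
    for (const Span& c : spans) {
        if (c.lo <= s) ++left;
        if (c.hi >= s) ++right;
    }
    return std::max(left, right);
}

// Try every endpoint and every midpoint between consecutive endpoints;
// the lowest-cost candidate wins, ties going to the smallest value.
Split best_split(const std::vector<Span>& spans) {
    std::vector<float> ends;
    for (const Span& c : spans) { ends.push_back(c.lo); ends.push_back(c.hi); }
    std::sort(ends.begin(), ends.end());
    std::vector<float> candidates = ends;
    for (std::size_t i = 1; i < ends.size(); ++i)
        candidates.push_back(0.5f * (ends[i - 1] + ends[i]));
    std::sort(candidates.begin(), candidates.end());
    Split best = {0.0f, static_cast<int>(spans.size()) + 1};
    for (float s : candidates) {
        const int cost = split_cost(spans, s);
        if (cost < best.cost) best = {s, cost};
    }
    return best;
}

// Asymmetric sample: three curves clustered on the left, two further right.
std::vector<Span> example_spans() {
    return {{0.0f, 1.0f}, {0.0f, 1.0f}, {0.0f, 1.0f}, {2.0f, 3.0f}, {9.0f, 10.0f}};
}
```

For the sample, splitting at the midpoint of the overall extent (5) costs 4 curves in the worst case, while the search settles on 1.5 with a cost of 3.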
Dilate glyph quads by exactly half a pixel on screen, regardless of zoom level. The vertex shader computes the dilation in clip space and corrects the texcoord using the inverse 2x2 MVP Jacobian. Per-vertex data adds normal direction and inverse Jacobian for the object-to-em-space mapping. Viewport size passed as uniform. Based on SlugDilate from the Slug reference vertex shader, simplified for z=0 geometry. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add glyphy-vertex.glsl with glyphy_dilate() that computes exact half-pixel dilation using the MVP matrix and viewport size. Client vertex shader just calls one function.

Rename shader files and API:
    glyphy-slug.glsl -> glyphy-fragment.glsl
    glyphy_slug_shader_source -> glyphy_fragment_shader_source
    glyphy_slug_render -> glyphy_render
New: glyphy_vertex_shader_source, glyphy_dilate

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Drop "(MIT license)" from Slug attribution -- the algorithm is referenced, not the code. Remove glyphy_*_shader_source_path() functions and PKGDATADIR/SHADER_PATH since shaders are embedded. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace GLUT/FreeGLUT windowing with GLFW3:
- Explicit main loop instead of glutMainLoop
- GLFW native vsync via glfwSwapInterval (removes platform-specific CGLSetParameter/wglSwapIntervalEXT/glXSwapIntervalSGI code)
- Proper core profile context hints (no GLUT_3_2_CORE_PROFILE ifdef)
- Proper scroll callback instead of GLUT_WHEEL_UP/DOWN hack
- Split keyboard handling into key callback (special keys) and char callback (printable characters)
- Dirty-flag based redraw with glfwWaitEvents when idle

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
glewInit() calls glGetString(GL_EXTENSIONS) which generates GL_INVALID_ENUM in a core profile context. Ignore the return value and clear the GL error; the GL_VERSION_3_3 check that follows is the real gate. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Set initial viewport from framebuffer size (not window size)
- Use window size for mouse coordinate math since GLFW cursor positions are in screen coordinates, not framebuffer pixels
- Poll events before first render

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Render an initial frame, poll events to let the Wayland compositor configure the surface at the correct content scale, then render a second frame at the right resolution. Also set viewport from framebuffer size every frame to avoid stale viewport state on HiDPI displays. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Track cursor position continuously from the cursor callback instead of querying glfwGetCursorPos in the mouse button handler. On Wayland, the two can disagree, causing a position jump on the first drag. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Integer millisecond timing caused jitter at high frame rates (vsync off) because frame deltas alternated between 0 and 1ms. Switch to double-precision seconds throughout the animation and FPS reporting code. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
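The alternation is easy to reproduce. Assuming a hypothetical microsecond clock and a steady 0.5 ms frame time (2000 fps), the sketch below compares deltas computed from truncated millisecond timestamps with deltas kept in double-precision seconds; the function names are invented for illustration.

```cpp
#include <cstdint>

// Delta between two timestamps after each is truncated to whole milliseconds.
int64_t delta_ms(int64_t prev_us, int64_t now_us) {
    return now_us / 1000 - prev_us / 1000;
}

// The same delta kept as double-precision seconds.
double delta_seconds(int64_t prev_us, int64_t now_us) {
    return static_cast<double>(now_us - prev_us) / 1e6;
}
```

For frames at 0, 500, 1000, and 1500 microseconds, `delta_ms` yields 0, 1, 0, while `delta_seconds` yields 0.0005 every time.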
As for encoding speed, the new benchmark shows that this encodes NotoSansCJK in 750ms on my machine. Not bad... For reference, this is 8x slower than HarfBuzz extracting the outlines. Shader memory is bad though: over 400MB for this font. Roboto-Regular:
This PR also modernizes the codebase, ports to GLFW, and makes many other improvements. I suggest we merge it after the license is sorted out.
Okay, I received written (email) permission from Eric to use the shaders under HarfBuzz's Old MIT license: I'll go ahead and cherry-pick and update the license change from master, then this can be merged.
Great! Don't forget to update the repo description:
Thanks. Done.
See: https://terathon.com/blog/decade-slug.html
On my setup, this runs >2.5x faster than the GLyphy algorithm and seems to consume slightly more shader memory. All rendering artifacts are gone!
This was done in two hours with Claude, from SLUG PDFs & shader code, to working demo. Insane.