Skip to content

graphex hpack mode#74

Draft
parsonsmatt wants to merge 4 commits into
dustin:mainfrom
parsonsmatt:mattp/graphex-hpack-mode
Draft

graphex hpack mode#74
parsonsmatt wants to merge 4 commits into
dustin:mainfrom
parsonsmatt:mattp/graphex-hpack-mode

Conversation

@parsonsmatt

Copy link
Copy Markdown

This PR introduces a subcommand hpack which allows us to skip slow Cabal file parsing.

With #73, the bottleneck is now parsing the Cabal file - accounts for about 70% of the runtime of graphex cabal --discover-all on our codebase. Globbing files directly ends up being significantly faster, shaving a significant amount of time off of generating the graph.

parsonsmatt and others added 4 commits February 27, 2026 10:24
Reads package.yaml directly (with !include support via Data.Yaml.Include)
instead of going through the Cabal library. Globs source directories when
explicit module lists aren't present, with proper handling of overlapping
source dirs and main file exclusion.

Extracts buildModuleGraph from Graphex.Cabal so both discovery paths
share the same graph-building pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ations

Keep parsed YAML data as Text and Vector from aeson all the way through
module discovery. Only unpack to String/FilePath at external API boundaries
(ModuleFile constructor, CabalUnit constructors, filesystem calls).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ule name computation

pathToModuleName was using System.FilePath.makeRelative and dropExtension which
operate on [Char]. The new pathToModuleNameText does prefix stripping, suffix
removal, and separator replacement entirely in Text, eliminating the bottleneck
that was 36.8% of profiled runtime.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… style improvements

- Make findHsFilesExcluding concurrent with pooledMapConcurrentlyN
- Use partitionEithers instead of unsafe fromLeft/fromRight helpers
- Use Set for exclude dirs/files instead of Vector elem/notElem
- Replace V.head with safe V.!? 0 fallback
- Use <> and mempty instead of ++ and V.empty
- Remove list comprehensions in favor of map/filter
- Keep all lines under 80 columns
- Fix import ordering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@parsonsmatt parsonsmatt left a comment

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this and #73 , the hot path goes through Megaparsec, and a direct text parser ends up getting graph generation on our codebase under a second.

Comment thread src/Graphex/Hpack.hs
Comment on lines +44 to +48
data ComponentConfig = ComponentConfig
{ ccSourceDirs :: V.Vector Text
, ccExposedModules :: Maybe (V.Vector Text)
, ccOtherModules :: Maybe (V.Vector Text)
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of decisions here are very perf sensitive - avoiding allocations as much as possible when parsing the big JSON.

Comment thread src/Graphex/Hpack.hs
Comment on lines +142 to +145
-- | Join a directory and filename as Text using "/"
(</+>) :: Text -> Text -> Text
dir </+> file = dir <> "/" <> file
infixr 5 </+>

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is.. not terribly portable... but I've got another PR I'd like to draft soon that uses OsPath here which avoids a lot of the string/text shenanigans. But that's maybe a bit more invasive.

Comment thread src/Graphex/Hpack.hs
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE StrictData #-}

module Graphex.Hpack

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graphex.Hpack and Main.Hpack and HpackSpec rhyme soooo much with the .Cabal equivalents, but they're so slightly different that I can't immediately think of a way to abstract over it. And also, I don't think it's too worth it. These features don't change too much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants