graphex hpack mode#74
Conversation
Reads package.yaml directly (with !include support via Data.Yaml.Include) instead of going through the Cabal library. Globs source directories when explicit module lists aren't present, with proper handling of overlapping source dirs and main file exclusion. Extracts buildModuleGraph from Graphex.Cabal so both discovery paths share the same graph-building pipeline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ations Keep parsed YAML data as Text and Vector from aeson all the way through module discovery. Only unpack to String/FilePath at external API boundaries (ModuleFile constructor, CabalUnit constructors, filesystem calls). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ule name computation pathToModuleName was using System.FilePath.makeRelative and dropExtension which operate on [Char]. The new pathToModuleNameText does prefix stripping, suffix removal, and separator replacement entirely in Text, eliminating the bottleneck that was 36.8% of profiled runtime. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… style improvements - Make findHsFilesExcluding concurrent with pooledMapConcurrentlyN - Use partitionEithers instead of unsafe fromLeft/fromRight helpers - Use Set for exclude dirs/files instead of Vector elem/notElem - Replace V.head with safe V.!? 0 fallback - Use <> and mempty instead of ++ and V.empty - Remove list comprehensions in favor of map/filter - Keep all lines under 80 columns - Fix import ordering Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parsonsmatt
left a comment
There was a problem hiding this comment.
With this and #73 , the hot path goes through Megaparsec, and a direct text parser ends up getting graph generation on our codebase under a second.
| data ComponentConfig = ComponentConfig | ||
| { ccSourceDirs :: V.Vector Text | ||
| , ccExposedModules :: Maybe (V.Vector Text) | ||
| , ccOtherModules :: Maybe (V.Vector Text) | ||
| } |
There was a problem hiding this comment.
Lots of decisions here are very perf sensitive - avoiding allocations as much as possible when parsing the big JSON.
| -- | Join a directory and filename as Text using "/" | ||
| (</+>) :: Text -> Text -> Text | ||
| dir </+> file = dir <> "/" <> file | ||
| infixr 5 </+> |
There was a problem hiding this comment.
This is.. not terribly portable... but I've got another PR I'd like to draft soon that uses OsPath here which avoids a lot of the string/text shenanigans. But that's maybe a bit more invasive.
| {-# LANGUAGE OverloadedStrings #-} | ||
| {-# LANGUAGE StrictData #-} | ||
|
|
||
| module Graphex.Hpack |
There was a problem hiding this comment.
Graphex.Hpack and Main.Hpack and HpackSpec rhyme soooo much with the .Cabal equivalents, but they're so slightly different that I can't immediately think of a way to abstract over it. And also, I don't think it's too worth it. These features don't change too much.
This PR introduces a subcommand
hpackwhich allows us to skip slow Cabal file parsing.With #73, the bottleneck is now parsing the Cabal file - accounts for about 70% of the runtime of
graphex cabal --discover-allon our codebase. Globbing files directly ends up being significantly faster, shaving a significant amount of time off of generating the graph.