Skip to content

Optimize artifact lookup in DefaultProjectDependencyAnalyzer#270

Open
sanjana2505006 wants to merge 2 commits intoapache:masterfrom
sanjana2505006:perf/optimize-artifact-lookup
Open

Optimize artifact lookup in DefaultProjectDependencyAnalyzer#270
sanjana2505006 wants to merge 2 commits intoapache:masterfrom
sanjana2505006:perf/optimize-artifact-lookup

Conversation

@sanjana2505006
Copy link
Copy Markdown
Contributor

@sanjana2505006 sanjana2505006 commented Feb 2, 2026

While reviewing the core analysis logic, I noticed a performance bottleneck in DefaultProjectDependencyAnalyzer. \n\nPreviously, it used a linear search to find which artifact a class belongs to ($O(N)$ for every class usage), which could be a bottleneck in large projects with many dependencies. These changes switch to a Map-based lookup ($O(1)$) and modernize the code using computeIfAbsent and more robust scope checks.\n\nThis significantly improves the efficiency of the analysis process for projects with large dependency trees.

@sanjana2505006
Copy link
Copy Markdown
Contributor Author

Hello @slawekjaranowski, I’ve opened a PR with a small improvement I noticed.
Whenever you have time, I’d appreciate your thoughts on whether this approach makes sense.
Thank you!

@sanjana2505006
Copy link
Copy Markdown
Contributor Author

Hello @slawekjaranowski, just a gentle follow-up on this PR. Whenever you have time, could you please share your feedback or let me know if any further changes are needed from my side? Happy to update anything required. Thank you

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes core dependency analysis by replacing per-class linear artifact lookup with a precomputed class-to-artifact map, improving performance for projects with large dependency graphs.

Changes:

  • Build a Map<String, Artifact> index (buildClassToArtifactMap) for O(1) artifact lookup by class name.
  • Update used-artifact computation to use the new map and modernize collection population via computeIfAbsent.
  • Minor robustness/style updates (scope constant, formatting).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

return entry.getKey();
Artifact artifact = entry.getKey();
for (String className : entry.getValue()) {
classToArtifactMap.put(className, artifact);
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildClassToArtifactMap overwrites any existing mapping when the same class name appears in multiple artifacts. Previously the lookup returned the first matching artifact encountered in artifactClassMap iteration order; with put(...) the last artifact wins, which can change results (e.g., shaded/duplicate classes) and cause incorrect used/unused dependency reporting. Preserve the previous behavior (e.g., putIfAbsent to keep the first mapping) or explicitly detect duplicates and handle/report them deterministically.

Suggested change
classToArtifactMap.put(className, artifact);
classToArtifactMap.putIfAbsent(className, artifact);

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanjana2505006 please look for this one

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated it to putIfAbsent in the latest commit to preserve the original first-match-wins semantics. Thank you for flagging this, @slawekjaranowski!

Comment on lines 232 to 242
private static Map<Artifact, Set<DependencyUsage>> buildUsedArtifacts(
Map<Artifact, Set<String>> artifactClassMap, Set<DependencyUsage> dependencyClasses) {
Map<String, Artifact> classToArtifactMap, Set<DependencyUsage> dependencyClasses) {
Map<Artifact, Set<DependencyUsage>> usedArtifacts = new HashMap<>();

for (DependencyUsage classUsage : dependencyClasses) {
Artifact artifact = findArtifactForClassName(artifactClassMap, classUsage.getDependencyClass());
Artifact artifact = classToArtifactMap.get(classUsage.getDependencyClass());

if (artifact != null && !includedInJDK(artifact)) {
Set<DependencyUsage> classesFromArtifact = usedArtifacts.get(artifact);
if (classesFromArtifact == null) {
classesFromArtifact = new HashSet<>();
usedArtifacts.put(artifact, classesFromArtifact);
}
classesFromArtifact.add(classUsage);
usedArtifacts.computeIfAbsent(artifact, k -> new HashSet<>()).add(classUsage);
}
}
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no unit tests covering the new class-to-artifact resolution path (including duplicate-class scenarios and ensuring the chosen artifact is deterministic). Adding focused tests around buildUsedArtifacts/buildClassToArtifactMap would help prevent regressions in dependency classification as this is core analysis logic.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair point. I'll add focused unit tests covering the new class-to-artifact resolution path, including duplicate-class scenarios, to ensure the chosen artifact is deterministic. Will update it.

…mantics

The original findArtifactForClassName returned the first matching artifact
during iteration. Using put() overwrites previous entries, making the last
artifact win instead. putIfAbsent preserves the first-match-wins behavior,
preventing incorrect used/unused dependency reporting for shaded/duplicate
classes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants