Skip to content

Conversation

@barsh404error
Copy link

@barsh404error barsh404error commented Dec 30, 2025

Thanks alot to @christolis for helping me out on making this pull request.
Added two utulity methods isLinkBroken and replaceDeadLinks

-isLinkBroken(String url) checks the link availability using a HEAD request
I used HEAD request instead of GET request to check link availability without downloading the response body, reducing bandwidth and improving the performance.

-replaceDeadLinks (String text, String replacement) replaces unreachable/broken links asynchronously.

This change does not have any behavior changes to the existing code.

Part of #1276, implements the mentioned utility but doesnt apply it.

@barsh404error barsh404error requested a review from a team as a code owner December 30, 2025 10:52
@CLAassistant
Copy link

CLAassistant commented Dec 30, 2025

CLA assistant check
All committers have signed the CLA.

@barsh404error
Copy link
Author

@tj-wazei I've added your request and fixed them
I’ve updated the implementation to avoid mutating shared state in async callbacks and now collect the results before doing replacements sequentially.
I also reuse a single HttpClient and added a GET fallback when HEAD is not supported.
Let me know if anything else should be adjusted ^^

@barsh404error barsh404error requested a review from tj-wazei January 2, 2026 10:22
@barsh404error barsh404error changed the title Add utilities to detect and replace broken links. Add utilities to detect and replace broken links. Fixed #1276 Jan 3, 2026
@christolis christolis changed the title Add utilities to detect and replace broken links. Fixed #1276 Add utilities to detect and replace broken links. Jan 4, 2026
@Zabuzard Zabuzard self-requested a review January 4, 2026 15:37
Copy link
Member

@Zabuzard Zabuzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added methods are utility methods and for those it is crucial that they have a proper Javadoc explaining what they do and give examples.

The isLinkBroken method needs to explain what it means for a link to be broken and how it behaves in edge case, for example if the given text isnt a url and so on.

The other method needs to explain in more detail what the two parameters are and how they work, maybe giving a concrete example in the javadoc.

@barsh404error
Copy link
Author

The added methods are utility methods and for those it is crucial that they have a proper Javadoc explaining what they do and give examples.

The isLinkBroken method needs to explain what it means for a link to be broken and how it behaves in edge case, for example if the given text isnt a url and so on.

The other method needs to explain in more detail what the two parameters are and how they work, maybe giving a concrete example in the javadoc.

Yes, i got that solved made javadoc for isLinkBroken and replaceDeadLinks.
i explained what does it do step by step altho idk if i wrote to long

@barsh404error barsh404error requested a review from Zabuzard January 6, 2026 16:30
tj-wazei
tj-wazei previously approved these changes Jan 6, 2026
Copy link
Contributor

@tj-wazei tj-wazei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

int status = response.statusCode();
return status < 200 || status >= 400;
})
.exceptionally(ignored -> true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idiomatic name for something you ignore is _

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed ignored lambda parameters to _ where applicable (e.g. exceptionally(_ -> ...), thenApply(_ -> ...)) to clearly indicate intentional non-usage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you forgot one:

.exceptionally(ignored -> true); // still never null

Comment on lines 150 to 153
List<CompletableFuture<String>> deadLinkFutures = links.stream()
.distinct()
.map(link -> isLinkBroken(link)
.thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better: instead of map(foo -> ...thenApply(...)), use .map(foo -> ...).filter(...) then you also dont have all these null items in ur list, polluting it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not done yet

return CompletableFuture.allOf(deadLinkFutures.toArray(new CompletableFuture[0]))
.thenApply(ignored -> deadLinkFutures.stream()
.map(CompletableFuture::join)
.filter(Objects::nonNull)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.filter(Objects::nonNull) that one isnt needed anymore with the above fix

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not done yet

@barsh404error
Copy link
Author

I focused primarily on improving the Javadocs as requested @Zabuzard
While doing so, I slightly refactored the internal stream to use Optional instead of null to better match the documented behavior and avoid null handling
No external behavior was changed.

.map(link -> isLinkBroken(link)
.thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null))

.thenApply(isBroken -> isBroken ? Optional.of(link) : Optional.<String>empty()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

u misunderstood me. its better if u simply filter out these, skip them here instead of later.
.filter(...) so they are not even part of the list

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not done yet

@barsh404error
Copy link
Author

Thanks for the detailed review @Zabuzard I went through the comments one by one and addressed them explicitly:

Javadoc & behavior clarification

Expanded Javadoc for isLinkBroken to clearly define what “broken” means:

HTTP request failure

HTTP status codes outside the 200–399 range

Clarified that:

Status 200 is considered valid even with an empty body

Response body content is not inspected

Invalid URL formats result in an IllegalArgumentException

HEAD / GET request logic

HEAD is used as an initial, cheaper check

If HEAD indicates failure, a GET request is used as a fallback to handle servers that don’t implement HEAD correctly

Any exception during either request is treated as a broken link

Naming improvements

Renamed request variables to describe intent rather than HTTP mechanics (e.g. link status validation rather than fallback flow)

Asynchronous behavior

replaceDeadLinks performs all link checks asynchronously

CompletableFuture.allOf(...) is used only as a synchronization point; no blocking is introduced

thenApply(_ -> …) is intentionally used to ignore the unused result and avoid misleading variable names

replaceDeadLinks behavior

Added a concrete usage example to the Javadoc showing input → output transformation

Only links detected as broken are replaced

Working links and non-URL content remain unchanged

Duplicate links are checked once and replaced consistently

Edge cases

Empty input or no detected links returns the original text

Empty response bodies are not treated as broken

Exceptions during HTTP checks are handled defensively and result in “broken”

sorry for stopping to answer those comments one by one it was annoying and frustrating lol


private static final Set<LinkFilter> DEFAULT_FILTERS =
/**
* Default filters used when extracting links: skip suppressed URLs and non-http schemes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are repeating what the code already says, this javadoc isnt helpful. Instead, make an example so people understand the why 👍

Comment on lines 175 to 181
StringBuilder sb = new StringBuilder(text);
for (String deadLink : deadLinks) {
result = result.replace(deadLink, replacement);
int idx = sb.indexOf(deadLink);
while (idx != -1) {
sb.replace(idx, idx + deadLink.length(), replacement);
idx = sb.indexOf(deadLink, idx + replacement.length());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i find that code a bit too complex to just put it between everything else. move it into a quick private static helper method. sth like private static void replace(StringBuilder haystack, String needle, String replacement)

Comment on lines +167 to +168


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double empty line

* @return a future containing the modified text
*/

public static CompletableFuture<String> replaceDeadLinks(String text, String replacement) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not done yet

Comment on lines 150 to 153
List<CompletableFuture<String>> deadLinkFutures = links.stream()
.distinct()
.map(link -> isLinkBroken(link)
.thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not done yet

return CompletableFuture.allOf(deadLinkFutures.toArray(new CompletableFuture[0]))
.thenApply(ignored -> deadLinkFutures.stream()
.map(CompletableFuture::join)
.filter(Objects::nonNull)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not done yet

.map(link -> isLinkBroken(link)
.thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null))

.thenApply(isBroken -> isBroken ? Optional.of(link) : Optional.<String>empty()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not done yet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants