-
-
Notifications
You must be signed in to change notification settings - Fork 105
Add utilities to detect and replace broken links. #1366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
|
@tj-wazei I've added your request and fixed them |
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
Zabuzard
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The added methods are utility methods and for those it is crucial that they have a proper Javadoc explaining what they do and give examples.
The isLinkBroken method needs to explain what it means for a link to be broken and how it behaves in edge case, for example if the given text isnt a url and so on.
The other method needs to explain in more detail what the two parameters are and how they work, maybe giving a concrete example in the javadoc.
Yes, i got that solved made javadoc for |
tj-wazei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| int status = response.statusCode(); | ||
| return status < 200 || status >= 400; | ||
| }) | ||
| .exceptionally(ignored -> true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the idiomatic name for something you ignore is _
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed ignored lambda parameters to _ where applicable (e.g. exceptionally(_ -> ...), thenApply(_ -> ...)) to clearly indicate intentional non-usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you forgot one:
.exceptionally(ignored -> true); // still never null
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
| List<CompletableFuture<String>> deadLinkFutures = links.stream() | ||
| .distinct() | ||
| .map(link -> isLinkBroken(link) | ||
| .thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better: instead of map(foo -> ...thenApply(...)), use .map(foo -> ...).filter(...) then you also dont have all these null items in ur list, polluting it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not done yet
| return CompletableFuture.allOf(deadLinkFutures.toArray(new CompletableFuture[0])) | ||
| .thenApply(ignored -> deadLinkFutures.stream() | ||
| .map(CompletableFuture::join) | ||
| .filter(Objects::nonNull) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.filter(Objects::nonNull) that one isnt needed anymore with the above fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not done yet
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
|
I focused primarily on improving the Javadocs as requested @Zabuzard |
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
application/src/main/java/org/togetherjava/tjbot/features/utils/LinkDetection.java
Outdated
Show resolved
Hide resolved
| .map(link -> isLinkBroken(link) | ||
| .thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null)) | ||
|
|
||
| .thenApply(isBroken -> isBroken ? Optional.of(link) : Optional.<String>empty())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
u misunderstood me. its better if u simply filter out these, skip them here instead of later.
.filter(...) so they are not even part of the list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not done yet
|
Thanks for the detailed review @Zabuzard I went through the comments one by one and addressed them explicitly: Javadoc & behavior clarification Expanded Javadoc for isLinkBroken to clearly define what “broken” means: HTTP request failure HTTP status codes outside the 200–399 range Clarified that: Status 200 is considered valid even with an empty body Response body content is not inspected Invalid URL formats result in an IllegalArgumentException HEAD / GET request logic HEAD is used as an initial, cheaper check If HEAD indicates failure, a GET request is used as a fallback to handle servers that don’t implement HEAD correctly Any exception during either request is treated as a broken link Naming improvements Renamed request variables to describe intent rather than HTTP mechanics (e.g. link status validation rather than fallback flow) Asynchronous behavior replaceDeadLinks performs all link checks asynchronously CompletableFuture.allOf(...) is used only as a synchronization point; no blocking is introduced thenApply(_ -> …) is intentionally used to ignore the unused result and avoid misleading variable names replaceDeadLinks behavior Added a concrete usage example to the Javadoc showing input → output transformation Only links detected as broken are replaced Working links and non-URL content remain unchanged Duplicate links are checked once and replaced consistently Edge cases Empty input or no detected links returns the original text Empty response bodies are not treated as broken Exceptions during HTTP checks are handled defensively and result in “broken” sorry for stopping to answer those comments one by one it was annoying and frustrating lol |
|
|
||
| private static final Set<LinkFilter> DEFAULT_FILTERS = | ||
| /** | ||
| * Default filters used when extracting links: skip suppressed URLs and non-http schemes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are repeating what the code already says, this javadoc isnt helpful. Instead, make an example so people understand the why 👍
| StringBuilder sb = new StringBuilder(text); | ||
| for (String deadLink : deadLinks) { | ||
| result = result.replace(deadLink, replacement); | ||
| int idx = sb.indexOf(deadLink); | ||
| while (idx != -1) { | ||
| sb.replace(idx, idx + deadLink.length(), replacement); | ||
| idx = sb.indexOf(deadLink, idx + replacement.length()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i find that code a bit too complex to just put it between everything else. move it into a quick private static helper method. sth like private static void replace(StringBuilder haystack, String needle, String replacement)
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
double empty line
| * @return a future containing the modified text | ||
| */ | ||
|
|
||
| public static CompletableFuture<String> replaceDeadLinks(String text, String replacement) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not done yet
| List<CompletableFuture<String>> deadLinkFutures = links.stream() | ||
| .distinct() | ||
| .map(link -> isLinkBroken(link) | ||
| .thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not done yet
| return CompletableFuture.allOf(deadLinkFutures.toArray(new CompletableFuture[0])) | ||
| .thenApply(ignored -> deadLinkFutures.stream() | ||
| .map(CompletableFuture::join) | ||
| .filter(Objects::nonNull) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not done yet
| .map(link -> isLinkBroken(link) | ||
| .thenApply(isBroken -> Boolean.TRUE.equals(isBroken) ? link : null)) | ||
|
|
||
| .thenApply(isBroken -> isBroken ? Optional.of(link) : Optional.<String>empty())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not done yet
Thanks alot to @christolis for helping me out on making this pull request.
Added two utulity methods
isLinkBrokenandreplaceDeadLinks-
isLinkBroken(String url) checks the link availability using a HEAD requestI used HEAD request instead of GET request to check link availability without downloading the response body, reducing bandwidth and improving the performance.
-
replaceDeadLinks(String text, String replacement) replaces unreachable/broken links asynchronously.This change does not have any behavior changes to the existing code.
Part of #1276, implements the mentioned utility but doesnt apply it.