I recently used Copilot to do some link checking, and it did a great job of not only testing the links but also updating out-of-date links to their new locations (for example, old links to referenced documentation).
Can we build a link checker GitHub action that makes use of copilot or another AI agent to:
- check all web links across the whole project on a weekly schedule
- check only the web links in documents that have changed as part of a pull request
The link-checker could be saved as an action in .github/actions/link-checker
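To make the two modes concrete, here is a minimal sketch of how the action could decide which documents to check. The function name `select_documents` and the example paths are hypothetical; the only thing taken from GitHub Actions itself is that the triggering event name is `schedule` for scheduled runs and `pull_request` for pull requests.

```python
def select_documents(event_name, all_documents, changed_documents):
    """Return the documents whose links should be checked for this run.

    event_name        -- the triggering event, e.g. the value of the
                         GITHUB_EVENT_NAME environment variable
    all_documents     -- every document in the project
    changed_documents -- paths changed in the pull request (may include
                         files that are not documents)
    """
    if event_name == "pull_request":
        changed = set(changed_documents)
        # Keep the project's ordering and ignore changed non-document files.
        return [doc for doc in all_documents if doc in changed]
    # Scheduled (weekly) runs, and anything else, check the whole project.
    return list(all_documents)
```

How the list of changed paths is obtained (for example from `git diff` against the base branch) is left open here.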
What to parse:
Our project source files are MyST Markdown files, which include links to websites (HTML) as well as links to other documents in the same project and to documents in other QuantEcon Jupyter Book projects.
Here is an example project: https://github.com/QuantEcon/lecture-python-programming.myst
The source files for the published lectures are in the lectures folder.
It might be easier to check the links in the HTML output, since MyST Markdown links need to be parsed for context. The HTML output in our GitHub Actions workflows is saved in _build/html/
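As a rough illustration of the HTML route, the sketch below collects the `href` of every `<a>` tag under the build folder using only the Python standard library. The class and function names are hypothetical, and treating only `http`/`https` links as "external" is an assumption; links to other documents in the project would need separate handling.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return all <a href="..."> targets found in html_text, in order."""
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.links


def is_external(link):
    """Assumption: a link is external if it uses http or https."""
    return urlparse(link).scheme in ("http", "https")


def collect_external_links(build_dir="_build/html"):
    """Map each HTML file under build_dir to its external links."""
    result = {}
    for page in sorted(Path(build_dir).rglob("*.html")):
        text = page.read_text(encoding="utf-8", errors="replace")
        links = [link for link in extract_links(text) if is_external(link)]
        if links:
            result[str(page)] = links
    return result
```

One trade-off of this approach is that a broken link is found in the HTML, so it still has to be traced back to the MyST source file before it can be updated.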
HTTP response codes when testing website links:
It would be fine to silently report the following status codes (without failing the action workflow)
and it would be nice if these codes could be configurable for future flexibility.
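One possible shape for the configurable part is sketched below. The function names are hypothetical, and the codes used in the usage example (403 and 429) are placeholders chosen only for illustration, not the list intended in this issue.

```python
def parse_codes(raw):
    """Parse a comma-separated string such as "403, 429" into a set of ints."""
    return {int(part) for part in raw.split(",") if part.strip()}


def classify_status(status_code, silent_codes):
    """Classify one HTTP status code as "warn", "ok", or "fail".

    Codes listed in silent_codes are reported but never fail the workflow.
    """
    if status_code in silent_codes:
        return "warn"
    if 200 <= status_code < 400:
        return "ok"
    return "fail"


def exit_code(classifications):
    """Return 1 if any link failed, otherwise 0."""
    return 1 if "fail" in classifications else 0
```

For example, with the placeholder configuration `parse_codes("403, 429")`, a 429 response would be classified as `"warn"` and a 404 as `"fail"`. The raw string could come from an action input so that the list can change without touching the code.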
Current workflow
Currently we use a program called lychee to check links; however, it only checks the status of links and does not update old redirected links to newer, more relevant locations. If a link is redirected by an external server, it would be nice to update the link to the new location. The Copilot suggestions made in this example were great, so an AI-enabled workflow would be preferable.
An example of a current link checker workflow is https://github.com/QuantEcon/lecture-python-programming.myst/blob/main/.github/workflows/linkcheck.yml
but removing the dependency on lychee and peter-evans/create-issue-from-file would be preferable.
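The redirect handling described above could be approached roughly as follows, again using only the standard library. All function names here are hypothetical. The sketch assumes that following a link and comparing the final URL with the original is a good enough signal; in practice some redirects are temporary (login pages, regional mirrors), which is where review by a person or an AI agent would still be useful before rewriting anything.

```python
import urllib.request


def resolve_final_url(url, timeout=10.0):
    """Follow redirects and return the URL that was finally served.

    Needs network access; urllib follows HTTP redirects by default.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": "link-checker-sketch"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()


def suggest_update(original_url, final_url):
    """Return the new URL if the link was redirected, otherwise None."""
    return final_url if final_url != original_url else None


def update_markdown_links(text, replacements):
    """Rewrite inline Markdown link targets of the form ](old) to ](new).

    Matching the closing parenthesis avoids touching longer URLs that
    merely start with the old URL.
    """
    for old, new in replacements.items():
        text = text.replace(f"]({old})", f"]({new})")
    return text
```

Only inline links of the form `[text](url)` are rewritten in this sketch; other MyST link syntaxes would need their own handling.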