Extracts links and/or anchors from markup files.
Currently, markdown/md and html files are supported.
The main intended purpose of the Markup Link Extractor (mle)
is to extract links from a set of files,
and then check them for validity using a separate tool,
e.g. the Markdown Link Checker.
Together, two such tools could be integrated in your CI pipeline
to warn about broken links in your markup docs.
- Extracts links from markdown/md and html files
- Extracts anchors from markdown/md and html files.
  Anchors are parts of a file that can be linked to by appending the part's identifier/name to the file path/URL after a # (hash);
  e.g. https://www.example.com/some-dir/some-file.html#sub-section
- Supports HTML links and plain URLs in markdown files
- Command line interface following the first item of the UNIX philosophy: "Make each program do one thing well".
  -> Therefore, this tool does not scan for markup files, nor does it check the links itself.
- Easy CI pipeline integration
- Very fast execution using async
- Operates offline, accessing only files on the local file-system
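
To make the notion of an anchor concrete, here is a minimal shell sketch that splits the example URL from the list above into its file part and its anchor. It only illustrates the URL structure and says nothing about how mle parses links internally.

```shell
#!/bin/sh
# Illustration only: split a URL into the file part and the anchor (fragment).
url='https://www.example.com/some-dir/some-file.html#sub-section'

# Everything before the first '#' addresses the file itself.
file_part="${url%%#*}"

# Everything after the first '#' is the anchor; empty if there is no '#'.
case "$url" in
    *'#'*) anchor="${url#*#}" ;;
    *)     anchor='' ;;
esac

printf 'file:   %s\nanchor: %s\n' "$file_part" "$anchor"
```

A link checker has to verify both parts: that the file exists, and that the file actually defines the anchor.
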
There are different ways to install and use mle.
Use Rust's package manager cargo to install mle from crates.io:

    cargo install mle

To download a compiled binary version of mle,
go to GitHub releases
and download the binaries compiled for x86_64-unknown-linux-gnu
or x86_64-apple-darwin.
Use mle in GitHub workflows via the GitHub Action from the Marketplace:

    - name: Markup Link Extractor (mle)
      uses: hoijui/mle@v0.14.3

Pass command line arguments to mle using the with argument:

    - name: Markup Link Extractor (mle)
      uses: hoijui/mle@v0.14.3
      with:
        args: ./README.md

To integrate mle in a CI pipeline running in a Linux x86_64 environment, you can add the following commands to download the tool:

    curl -L https://github.com/hoijui/mle/releases/download/v0.14.3/mle -o mle
    chmod +x mle

For example, take a look at the ntest repo, which uses mle in its CI pipeline.
Use the mle Docker image from Docker Hub.
Once you have mle installed, it can be called from the command line. The following call will extract all links in markup files found under the current folder (including sub-directories):

    mle ./**.{html,md}

This extracts links from all git-tracked Markdown and HTML files,
except those matching README or LICENSE,
and writes the result to stdout in CSV format.

    # explicit version
    git ls-files **.{html,md} -z \
        | grep --null-data --invert-match --ignore-case --regexp README --regexp LICENSE \
        | xargs -0 mle --result-format csv
    # same in short form
    git ls-files **.{html,md} -z | grep -z -v -i -e README -e LICENSE | xargs -0 mle --result-format csv

Here we write the list of files to a file first,
and then pass that to mle.
This is useful when the list of files is used multiple times,
or if it is very large,
potentially exceeding the shell's limit for arguments.
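
The file-list approach can also be tried without a git repository. The following is a sketch, not part of mle: it builds the list with find in a throw-away directory, and the sample file names are invented for illustration. It assumes GNU grep (for -z) and that mle accepts paths relative to the working directory. mle is only called if it is installed.

```shell
#!/bin/sh
# Sketch: build a newline-separated list of markup files, then pass it to mle.
set -eu

# Throw-away directory with made-up sample files (hypothetical names).
work_dir="$(mktemp -d)"
mkdir -p "$work_dir/docs"
touch "$work_dir/README.md" "$work_dir/docs/guide.md" \
      "$work_dir/docs/index.html" "$work_dir/docs/notes.txt"
list_file="$work_dir/markup-files.txt"

# Collect *.md and *.html files NUL-separated, drop README/LICENSE matches,
# then convert to one path per line for the list file.
(
    cd "$work_dir"
    find . -type f \( -name '*.md' -o -name '*.html' \) -print0 \
        | grep -z -v -i -e README -e LICENSE \
        | tr '\0' '\n' \
        | LC_ALL=C sort > "$list_file"
)

file_list="$(cat "$list_file")"
printf '%s\n' "$file_list"

# Only run mle if it is available on this machine.
if command -v mle >/dev/null 2>&1; then
    (cd "$work_dir" && mle --markup-files-list "$list_file")
fi

rm -rf "$work_dir"
```

Here the list contains only the two files under docs/, because README.md is filtered out and notes.txt is not a markup file. In a git checkout, the list is produced from the tracked files instead:
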

    git ls-files **.{html,md} -z | tr '\0' '\n' > /tmp/link-check_files.csv
    mle --markup-files-list /tmp/link-check_files.csv

Call mle with the --help flag to display all available CLI arguments:

    mle --help

This project was funded by the European Regional Development Fund (ERDF)
in the context of the INTERFACER Project,
from July 2022 (fork from mlc/project start)
until March 2023.
