Generalized the Codebraid support to MkDocs #154

Open
reenberg wants to merge 1 commit into microsoft:main from reenberg:feature/mkdocs
reenberg commented Dec 22, 2023

Support for Codebraid syntax was added in #57. Codebraid syntax is essentially just Pandoc attribute syntax with a specific class attribute added.

That support was implemented as an extra `identifier` in the identifier list of each language that Codebraid supports, for example `\\{\\.python.+?\\}` for Python.

With that approach, the example below gives the scope "text.html.markdown markup.fenced_code.block.markdown fenced_code.block.language.markdown" to the entire line:

```{.python .cb.nb jupyter_kernel=python3}
```

However, the "language scope" should only be given to the "python" part. The current support also doesn't allow spaces inside the curly braces before the language, and it doesn't cover all languages.
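The behavior described above can be reproduced with a small sketch. It is a simplified JavaScript stand-in, not the grammar itself: the real pattern is written for Oniguruma, the identifier list is shortened for illustration, and the helper name `oldLanguageCapture` is hypothetical.

```javascript
// Simplified JavaScript stand-in for the old begin pattern. The real grammar
// uses Oniguruma syntax (\G, (?i:...)), which JavaScript's RegExp lacks.
const FENCE = '`'.repeat(3);

// Shortened, illustrative identifier list; the last entry is the
// Codebraid identifier quoted in the description.
const oldIdentifiers = ['python', 'py', '\\{\\.python.+?\\}'];

const oldBegin = new RegExp(
  '^(\\s*)(`{3,}|~{3,})\\s*(' + oldIdentifiers.join('|') + ')((\\s+|:|,|\\{|\\?)[^`]*)?$',
  'i'
);

// Returns the text that would receive the language scope, or null if the
// line is not recognized as a Python fence at all.
function oldLanguageCapture(line) {
  const match = oldBegin.exec(line);
  return match ? match[3] : null;
}

// The whole brace expression is captured as the "language".
console.log(oldLanguageCapture(FENCE + '{.python .cb.nb jupyter_kernel=python3}'));
// A space after the opening brace prevents any match.
console.log(oldLanguageCapture(FENCE + ' { .python #id .class title="My Title"}'));
// The plain form captures only the language name.
console.log(oldLanguageCapture(FENCE + ' python'));
```

In this sketch the first call returns the full `{.python .cb.nb jupyter_kernel=python3}` text, the second returns `null`, and the third returns `python`.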

MkDocs allows a few ways to annotate fenced code blocks. If additional classes, an id, or key/value pairs are used, the curly braces are required and the language must be prefixed with a dot. In simple cases where only the language is specified, the curly braces and the dot may be omitted. Two quick examples:

``` { .python #id .class title="My Title"}
```

or

``` python
```

This change removes the Codebraid support that was attached to specific languages as an `identifier` entry and moves it into the RegEx, which now distinguishes two alternative cases: the line is surrounded by curly braces, or attributes are allowed after the language:

  1. The case where the entire line after the code fence is wrapped in curly braces. In this case the curly braces are not part of the language and attribute scopes.
  2. The case where the attributes follow the language specification in all sorts of ways (I'm specifically thinking of you, Gatsby: "Adding attributes to fenced markdown code blocks breaks syntax highlighting", #62). In this case the curly braces are included in the attribute scope, because handling all the various ways they may be used is not trivial and because this is the current behavior.
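The two cases can be illustrated with a rough JavaScript approximation of the new pattern. This is a sketch under assumptions: JavaScript has no `\G` or `\g<4>`, so the identifier alternation is simply repeated in the second branch, the identifier list is shortened, and `parseFenceInfo` is a hypothetical helper rather than anything in the grammar.

```javascript
const FENCE = '`'.repeat(3);
const ids = ['python', 'py'].join('|'); // shortened, illustrative list

const newBegin = new RegExp(
  '^(\\s*)([`~]{3,})\\s*(?:' +
    // Case 1: the whole info string is wrapped in curly braces.
    '\\{\\s*\\.?(' + ids + ')(?:\\}|\\s+([^`\\r\\n]*?)?\\s*\\})' +
    '|' +
    // Case 2: attributes (possibly in braces) follow the language.
    '\\.?(' + ids + ')((?:\\s+|:|,|\\{|\\?)[^`\\r\\n]*?)?' +
  ')$',
  'i'
);

// Splits a fence line into language and attribute text, or returns null.
function parseFenceInfo(line) {
  const m = newBegin.exec(line);
  if (!m) return null;
  if (m[3] !== undefined) {
    // Case 1: the braces are outside both captures.
    return { wrapped: true, language: m[3], attributes: m[4] || '' };
  }
  // Case 2: everything after the language, braces included, is attributes.
  return { wrapped: false, language: m[5], attributes: m[6] || '' };
}

console.log(parseFenceInfo(FENCE + ' { .python #id .class title="My Title"}'));
console.log(parseFenceInfo(FENCE + 'python {title="x"}'));
console.log(parseFenceInfo(FENCE + ' python'));
```

In the first call the language is `python` and the attributes exclude the braces. In the second, the attribute text keeps its braces, which mirrors the second case in the list.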

@microsoft-github-policy-service agree

Closes #153
Refs: https://github.com/Python-Markdown/markdown/blob/master/docs/extensions/fenced_code_blocks.md

@reenberg
Author

@microsoft-github-policy-service agree

@reenberg
Author

It would seem that this PR also has the side effect of fixing the broken Codebraid support for Rust, which was mistakenly matched as R code. That fix accounts for most of the changes in the file pr-57_md.json. The rest are basically updates for fenced blocks whose language and attributes were previously not scoped correctly.
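A plausible explanation for the Rust mismatch can be sketched as follows. The old R identifier shown here is an assumption: it mirrors the shape of the Python identifier quoted in the description and is not copied from the grammar. The second pattern is the curly-brace branch of the new expression, specialized to the single identifier `r`.

```javascript
// Assumed shape of the old Codebraid identifier for R, mirroring the
// Python identifier quoted in the PR description.
const oldRIdentifier = new RegExp('^\\{\\.r.+?\\}$', 'i');

// Curly-brace branch of the new pattern, specialized to the identifier "r".
// After the identifier it requires a closing brace or whitespace.
const newRWrapped = new RegExp(
  '^\\{\\s*\\.?(r)(?:\\}|\\s+([^`\\r\\n]*?)?\\s*\\})$',
  'i'
);

const rustInfo = '{.rust .cb.nb}';
const rInfo = '{.r .cb.nb}';

// The lazy ".+?" swallows "ust .cb.nb", so Rust looks like R.
console.log(oldRIdentifier.test(rustInfo));
// The new branch rejects Rust and still accepts R.
console.log(newRWrapped.test(rustInfo));
console.log(newRWrapped.test(rInfo));
```

Under that assumption, the old identifier accepts the Rust info string because nothing forces a boundary after the `r`, while the new branch does not.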

@reenberg
Author

@mjbvz, @alexdima, any chance this can receive a review and be moved along?

mjbvz added this to the March 2024 milestone on Feb 21, 2024
@rzhao271
Contributor

Moving the milestone
Also, there are merge conflicts

rzhao271 modified the milestones: March 2024, April 2024 on Mar 28, 2024
lramos15 modified the milestones: April 2024, Backlog on Apr 26, 2024
  return `fenced_code_block_${name}:
  begin:
- (^|\\G)(\\s*)(\`{3,}|~{3,})\\s*(?i:(${identifiers.join('|')})((\\s+|:|,|\\{|\\?)[^\`]*)?$)
+ (^|\\G)(\\s*)([\`~]{3,})\\s*(?i:(?:\\{\\s*\\.?(${identifiers.join('|')})(?:\\}|\\s+([^\`\\r\\n]*?)?\\s*\\}))|(?:\\.?(\\g<4>)((?:\\s+|:|,|\\{|\\?)[^\`\\r\\n]*?)?))$
Contributor

Can we break this up into a multiline regular expression? It was already a bit long, but now it's unreadable.
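One possible way to split the expression is to assemble it from labeled fragments and join them. This is only a sketch: `buildBegin` is a hypothetical helper, and it assumes the build script may construct the pattern string however it likes, as long as the result is identical to the single-line version.

```javascript
// Hypothetical helper: builds the same begin pattern from labeled parts.
function buildBegin(identifiers) {
  const ids = identifiers.join('|');
  const attrs = '[^`\\r\\n]*?'; // attribute text: no backticks or newlines

  const parts = [
    // Line start, indentation, and the opening fence.
    '(^|\\G)(\\s*)([`~]{3,})\\s*',
    '(?i:',
    // Case 1: the whole info string is wrapped in curly braces.
    '(?:\\{\\s*\\.?(' + ids + ')(?:\\}|\\s+(' + attrs + ')?\\s*\\}))',
    '|',
    // Case 2: attributes follow the language; \g<4> reuses the identifier group.
    '(?:\\.?(\\g<4>)((?:\\s+|:|,|\\{|\\?)' + attrs + ')?)',
    ')$',
  ];
  return parts.join('');
}

console.log(buildBegin(['python', 'py']));
```

Joining with the empty string keeps the generated pattern byte-for-byte the same, so the split would only affect how the source reads.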

Author

I'll see if I can find time to set up another dev environment to properly test the changes that this will involve.

@reenberg
Author

reenberg commented Jun 4, 2024

> Moving the milestone
> Also, there are merge conflicts

There would have been no conflicts if this PR had not been ignored until well after an MS employee merged other changes (#158), ignoring all other PRs!

I will try to see if I can find the time. But it's been about 5 months since I invested time into understanding this set of fairly complex expressions, not to mention setting up a dev environment to test this in order to implement the suggested changes.

Development

Successfully merging this pull request may close these issues.

Support MkDocs fenced codeblock attributes and dot prefixed language

4 participants