Add python support to quarto-authoring skill#44

Open
AlejandroGomezFrieiro wants to merge 1 commit into posit-dev:main from AlejandroGomezFrieiro:main

@AlejandroGomezFrieiro

The current skill explains well how to author Quarto documents with R, but it does not cover Python, which is another well-supported language.

Collaborator

@gadenbuie left a comment

Thanks @AlejandroGomezFrieiro, this looks great. I agree that this was a missing aspect of the skill and appreciate you bringing more Python into the examples!

I took a quick look and have a couple small comments. @mcanouil would you like to review, too?

Comment on lines +109 to +110
R:

Collaborator

Rather than showing two examples of the same thing, I'd rather we diversify the examples to use languages in addition to R.

In other words, in this case, we're trying to show the model the set of cell options that are relevant for figures. LLMs can generalize from one example showing the code cell options; we don't need two examples that contain identical options.

That said, I do think it's helpful for there to be diversity in the examples, which also helps the model generalize. So I'd prefer if, in cases like this example, rather than adding a new identical example we were to simply rewrite some examples in other languages. We should prefer R and Python, but it'd be nice to have Julia or another language often used by Quarto authors in the examples.

Comment on lines +203 to +213
Python (pandas — `output: asis` renders markdown table in all formats):

````markdown
```{python}
#| label: tbl-summary
#| tbl-cap: "Summary statistics by group."
#| output: asis

print(summary_df.to_markdown(index=False))
```
````
Collaborator

This is a good example of a case where it is quite helpful to have an extra example in Python, because the additional example shows something new.

Comment on lines +215 to +224
Python (pandas — plain `df` auto-displays as HTML table in HTML output):

````markdown
```{python}
#| label: tbl-summary
#| tbl-cap: "Summary statistics by group."

summary_df
```
````
Collaborator

I'd recommend removing this example, but keeping the comment about pandas data frames auto-displaying as HTML in HTML output (either above or below the asis-output example).

@mcanouil
Contributor

mcanouil commented Apr 10, 2026

@gadenbuie I'll take a look this weekend. One quick suggestion is to change the examples so they do not use actual code at all.
Otherwise, what about Julia, which is also natively supported?

Briefly, this means something like:

```{language}
#| option-one: value
CODE
```

where each option line starts with the language-specific inline comment symbol followed by a pipe (`|`).

Quarto has three native engines: `knitr`, `jupyter`, and `julia`.
The `jupyter` engine defaults to the `python` Jupyter kernel.

```yaml
engine: ...
kernel: ...
```

It is also worth noting that a library's behaviour inside Quarto depends on the library itself and on the engine and kernel being used. For example, output from Python code run via knitr does not behave the same as output from the Jupyter Python kernel, and the same is true of other Jupyter kernels that support Python. Stating library behaviour in the skill therefore seems out of scope, and it would be very hard to keep accurate.
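
To illustrate the option-line convention described in this comment, here is a minimal Python sketch that assembles a cell from a language name, a set of options, and a code body. The `render_cell` helper and the `COMMENT_SYMBOLS` table are hypothetical names introduced only for this example, and the table is a small illustrative subset rather than an authoritative list of what Quarto accepts.

```python
# Illustrative subset only: inline comment symbol per cell language.
# This table is an assumption made for the example, not Quarto's own list.
COMMENT_SYMBOLS = {"python": "#", "r": "#", "julia": "#", "ojs": "//"}

FENCE = "`" * 3  # built programmatically to avoid nesting literal fences


def render_cell(language: str, options: dict[str, str], code: str) -> str:
    """Return an executable cell whose option lines use '<comment symbol>|'."""
    prefix = COMMENT_SYMBOLS[language] + "|"
    option_lines = [f"{prefix} {key}: {value}" for key, value in options.items()]
    return "\n".join([f"{FENCE}{{{language}}}", *option_lines, code, FENCE])


print(render_cell("python", {"label": "fig-demo", "echo": "false"}, "CODE"))
```

Only the prefix changes between languages; the option names and values stay the same, which is the point of writing the reference examples against a placeholder language.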

YAML metadata configuration, and Quarto extensions. Also covers converting and
migrating R Markdown (.Rmd), bookdown, blogdown, xaringan, and distill projects
to Quarto, and creating Quarto websites, books, presentations, and reports.
Writing and authoring Quarto documents (.qmd) with R (knitr) and Python (Jupyter)
Contributor

Two issues with the description rewrite.

First, "R (knitr) and Python (Jupyter) engines" is not quite accurate. Knitr runs any language registered as a knitr language engine (Python, Julia, Bash, SQL, Stan, and more), and jupyter runs any registered kernel (R via IRkernel, Julia via IJulia, Bash, and more). Pinning the description to two language/engine pairs paints Quarto as narrower than it is. There are three native engines (knitr, jupyter, julia), and in 1.9 the julia engine was refactored to sit on top of the new engine-extension mechanism, so it is both a native engine and the reference implementation for third-party engine extensions. The skill is pinned to 1.9.36 (line 17), so leaving julia and engine extensions out is a dated inaccuracy. Quarto is language-agnostic anyway, so any enumeration of languages will bit-rot the next time a new kernel or engine ships. And since Quarto 1.9 ships engine extensions, the set of computing engines behind a code cell is no longer closed: in principle anything can end up there, so engines are the stable abstraction to describe in the skill.

Second, the trigger wording. "Migrating R Markdown, bookdown, ..." narrows a key trigger, because users are far more likely to say "convert my old .Rmd" than "migrate my .Rmd", and the description is the main signal the LLM uses to pick this skill. Please keep "converting and migrating".

One more thing: the rewrite shortens the keyword surface ("code cell options, figure and table captions, cross-references, callout blocks (notes, warnings, tips), citations and bibliography" becomes "code cells, figures, tables, cross-references, callouts, citations"). Shorter descriptions match less reliably on keyword-rich queries, so the longer form was closer to what I would want.

Suggested rewording that keeps the enumerative surface, restores "converting", adds the missing .ipynb trigger, and refers to engines rather than languages:

description: >
  Writing and authoring Quarto documents (.qmd) with the knitr, jupyter, and
  julia engines (and any Quarto 1.9+ engine extensions). Covers code cell
  options, figure and table captions, cross-references, callout blocks
  (notes, warnings, tips), citations and bibliography, page layout and
  columns, Mermaid diagrams, YAML metadata configuration, and Quarto
  extensions. Also covers converting and migrating R Markdown (.Rmd),
  bookdown, blogdown, xaringan, distill projects, and Jupyter notebooks
  (.ipynb) to Quarto, and creating Quarto websites, books, presentations,
  and reports.

```{python}
#| label: tbl-summary
#| tbl-cap: "Summary statistics by group."
#| output: asis
Contributor

output: asis shows up seven times across this PR (here, in tables.md, and in layout.md) without a single line explaining what it does. The skill will end up learning "add output: asis when doing Python tables" as a ritual rather than understanding the underlying model.

Briefly, asis tells Quarto to pass the cell output through as raw markdown that should not be processed further. That matters because:

  • The content must already be valid markdown (pipe table, headings, prose).
  • Non-markdown content needs a raw block: ```{=html}, ```{=latex}, and so on.
  • It interacts with tbl-cap in format-specific ways. Captions on output: asis markdown tables do not behave identically to knitr's table path.

Could we add one explainer subsection in this file, under "Execution Options", that defines asis and its requirements, then reference it from the table and layout examples instead of repeating the option verbatim in every Python snippet?
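
As a rough illustration of the first two requirements listed above, the sketch below shows what a cell might print under `output: asis`: markdown goes out unchanged, while non-markdown content is wrapped in a raw block. `as_raw_block` is a hypothetical helper written for this example, not a Quarto or library function.

```python
FENCE = "`" * 3  # built programmatically to avoid nesting literal fences


def as_raw_block(content: str, fmt: str) -> str:
    """Wrap non-markdown output in a raw block for the given format."""
    return f"{FENCE}{{={fmt}}}\n{content}\n{FENCE}"


# Markdown is already valid as-is, so it can be printed directly.
print("| group | n |\n|:------|--:|\n| a     | 3 |")

# HTML is not markdown, so it is emitted inside a raw block instead.
print(as_raw_block("<p><strong>HTML only</strong></p>", "html"))
```

The caption caveat in the third bullet is not covered here, since it depends on the output format.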

````markdown
```{python}
#| label: slow-computation
#| cache: true
Contributor

Cell-level #| cache: true is a knitr feature. With the jupyter engine, caching is handled by jupyter-cache and is enabled at document level via execute: cache: true in YAML (or freeze: auto in _quarto.yml for project-level freezing). Individual Python cells can only opt out with #| cache: false. They cannot opt in.

As written this example is silently inert: it will not cache, and it will teach the skill a pattern that does nothing, which is worse than omitting Python caching entirely.

Suggest dropping the code example and replacing it with a short prose note pointing at https://quarto.org/docs/projects/code-execution.html#cache (use the .llms.md URL instead of the .html one).

#| code-annotations: hover

import pandas as pd # <1>
df = pd.read_csv("data.csv") # <2>
Contributor

Minor, but the example is self-inconsistent: it reads data.csv and then uses mtcars column names (mpg, cyl). Either load the dataset under a filename that matches (mtcars.csv) or switch to a source that actually has those columns, like seaborn.load_dataset("mpg"). As-is, the annotation "Load the dataset from a CSV file" leaves the skill guessing at what is in data.csv.

@@ -0,0 +1,154 @@
# Converting Jupyter Notebooks to Quarto
Contributor

Could we scope this reference down quite a bit? Several sections work against the skill's purpose.

  1. quarto convert is bidirectional. .qmd to .ipynb is just as valid as .ipynb to .qmd. The current doc presents it as one-way, which is factually wrong. See inline below.
  2. Scope drift into notebook JSON editing. This skill is about authoring, not hand-editing .ipynb JSON. The "Cell Option Migration" section with JSON examples (jupyter.source_hidden, and so on) and the "Common Metadata Mappings" table push the skill towards editing notebook JSON, when quarto convert handles that automatically. The mapping table is not even a 1:1 mapping (those are nbconvert tags, not Quarto options), so training on it would teach the wrong behaviour.
  3. Edit-experience and version-control comparison table. Partially accurate, but not about authoring. If we want that framing, it belongs as a one-line note rather than a feature table.
  4. Duplication with existing references. "Controlling Re-execution", "freeze for Collaborative Projects", and "Specifying the Jupyter Kernel" all repeat content already in code-cells.md and yaml-front-matter.md. Since this file is only loaded when the user asks to convert an ipynb (per SKILL.md:76), the duplicated content will never be read on the normal authoring path and only increases token cost when converting.
  5. Attached to the wrong trigger. Everything except the conversion instructions is irrelevant to "convert ipynb to qmd" but would be useful for authoring, so the possibly useful content (not already mentioned in other files) is sitting behind a trigger that will not fire for authoring tasks.

Suggest cutting this file to roughly 30 lines: direct rendering of .ipynb, bidirectional quarto convert, and one paragraph on why .qmd is usually preferred for version control. Anything kernel or execution related would move into code-cells.md or yaml-front-matter.md so it is reachable during authoring.

Use the Quarto CLI to convert:

```bash
quarto convert notebook.ipynb
Contributor

quarto convert is bidirectional: quarto convert notebook.qmd also works and produces a .ipynb. Worth showing alongside the .ipynb to .qmd direction so the skill does not invent a more awkward path when the user asks for .qmd to .ipynb.

```
````

### Python Examples
Contributor

This section now walks through four Python plotting libraries (matplotlib, seaborn, plotnine, plotly) with usage examples. Per my earlier note on the PR thread, library-specific behaviour is out of scope for a Quarto authoring skill: behaviour depends on the library, engine, and format matrix (for example plotly's fig.show() differs between jupyter and knitr+reticulate, and plotnine's figure-size inheritance is non-obvious), and that matrix is hard to keep accurate over time.

Could we collapse this to one minimal matplotlib example showing the Quarto cell options (which is what this file is actually about), and let the library docs cover library behaviour? The upstream Quarto docs take the same approach: https://quarto.org/docs/computations/python.html#overview.

```
````

### Python with polars
Contributor

Two concerns with the polars and great-tables sections.

  1. df.to_pandas().to_markdown(index=False) is not a pattern to teach. Round-tripping polars through pandas just to render markdown is wasteful and it will disappear the moment polars' own output improves. If we really need to show polars to markdown, tabulate(df.rows(named=True), headers="keys", tablefmt="pipe") is closer to idiomatic.
  2. Scope. Same concern as figures.md: library-specific instructions (polars, great-tables) expand the surface area the skill has to maintain. This file should document Quarto's table options and cross-referencing mechanics and leave library choice to the author.
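
For reference, a pipe table of the kind Pandoc accepts can be sketched with the standard library alone, which makes clear that no pandas round trip is needed to get markdown out of row data. `rows_to_pipe_table` is a hypothetical helper for illustration, and the `rows` list stands in for what `df.rows(named=True)` would return; the values are made up.

```python
def rows_to_pipe_table(rows: list[dict[str, object]]) -> str:
    """Render a list of row dicts as a markdown pipe table.

    Column order comes from the first row. Cell values are not escaped,
    so a value containing '|' would break the table in this sketch.
    """
    if not rows:
        return ""
    headers = list(rows[0])
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)


# Stand-in for `df.rows(named=True)`; values are invented for the example.
rows = [{"group": "a", "n": 3}, {"group": "b", "n": 5}]
print(rows_to_pipe_table(rows))
```

In a document, the printed string would be emitted from a cell that sets `output: asis`, and the choice of table library stays with the author.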

#| tbl-cap: "Long table."
#| output: asis

print(long_df.to_markdown(index=False))
Contributor

The caption "use output: asis so the markdown table is recognized across pages" is not correct. longtable is a LaTeX package, and in the R example above it is knitr::kable(..., longtable = TRUE) that triggers the LaTeX longtable environment. A pandas or polars markdown table emitted via output: asis is processed as a standard Pandoc pipe-table and will not automatically break across pages in PDF.

The honest Python equivalents for PDF long tables are:

  • use tabulate with tablefmt="latex_longtable" and emit the raw LaTeX in a ```{=latex} block, or
  • render with great_tables (GT.as_latex(self, use_longtable=False, tbl_pos=None)) and let the format engine handle pagination, or
  • simply document that long-table pagination is not handled automatically on the jupyter engine.

Suggest either dropping the "recognized across pages" claim or deleting the Python long-table example entirely.

Finally, longtable is a very specific LaTeX case with several rough edges in the LaTeX ecosystem. I don't think it's a good idea to take the skill into those deep, dark waters, as maintaining this part would require validation both in LaTeX and against the libraries.
Authors who want this should look to their library of choice if and when they choose to output to LaTeX, which I don't expect to be the default now that there is Typst.

progress: true
```

## Execution Engine
Contributor

Two issues.

  1. Engine is not a kernel. engine: knitr # R (knitr) and engine: jupyter # Python, ... conflates engines with languages. Knitr handles Python, Julia, Bash, SQL, Stan via its language engines; jupyter handles any registered kernel. Quarto ships three native engines (knitr, jupyter, julia), and in 1.9 the julia engine was refactored to sit on top of the new engine-extension mechanism, so the set of engines is no longer closed, and third-party extensions can register more. Both the third native engine and the extension mechanism are missing here.
  2. Location. Execution options (cache, freeze, echo, and so on) affect code cells and are already authored in code-cells.md. Duplicating them here risks drift. Either move them over to code-cells.md and leave a See: pointer, or drop the new "Execution Engine" subsection and fold engine selection into the existing "Execution Options" block.

Suggested rewording of the engine block, if we keep it here:

engine: knitr   # native; runs R plus any knitr language engine (Python, Julia, Bash, SQL, Stan, ...)
engine: jupyter # native; runs any registered Jupyter kernel
engine: julia   # native; implemented on top of the engine-extension mechanism since Quarto 1.9

Plus a note that Quarto 1.9+ supports engine extensions, so additional engines can be registered the same way as any other Quarto extension.
(The engine-extension mechanism will let the julia engine move out of the Quarto development cycle, so its maintainers can ship new versions more easily.)

@mcanouil
Contributor

mcanouil commented Apr 11, 2026

(Context for my review above, which I accidentally submitted with an empty body.)

Thanks @AlejandroGomezFrieiro for picking this up, and a meta suggestion before the inline comments.

I would like to propose widening the scope of this PR from "add Python support" to "make the skill language-agnostic, as it should have been from the start". The original R-centric framing of this skill was an oversight in my own initial proposal: Quarto was already language-agnostic at the time, and I should have written it that way rather than anchoring everything on R and knitr. Adding Python alongside R fixes part of that oversight but also bakes the same bias in a new shape (now two privileged languages instead of one), which is why so much of this PR ends up as R/Python pairs.

Quarto 1.9 ships engine extensions, so the set of computing engines behind a code cell is no longer closed: knitr, jupyter, julia, and anything a third-party extension registers, all sit behind the same ```{LANGUAGE} surface with the same cell options. The authoring layer (cell-option syntax, fig-cap, cross-references, layouts, YAML metadata) does not change when a new engine ships. If we write the reference examples with a ```{LANGUAGE} placeholder rather than R/Python pairs, the skill teaches the Quarto option rather than the particular language, it generalises naturally to julia and to future engines, and it stops being a maintenance liability the next time a new kernel or engine lands. A concrete language is still worth spelling out where the behaviour genuinely differs, like the two Python table examples at code-cells.md:203-223 that @gadenbuie approved (they show output: asis vs HTML auto-display, which is a real divergence). Whether HTML table processing should be mentioned is another question, because you can emit a raw HTML table for any output format: Quarto parses it into the AST, and Pandoc then writes the correct output for the target format.

Concretely this touches most of the reference files (SKILL.md, code-cells.md, figures.md, layout.md, tables.md, yaml-front-matter.md) and the description itself. It is a bigger change than the PR currently proposes, but it aligns the skill with what Quarto actually is, closes the gap my original proposal left behind, and saves us from doing the same exercise every time a new engine lands. Happy to help if you want to take it on in this PR, or to split it into a follow-up with this PR trimmed down to the factually-blocking fixes.

Individual issues follow inline.

I hope this helps explain my initial oversight when writing the skill and how to correct the bias instead of building on it.

@AlejandroGomezFrieiro
Author

Thanks for the review. I will look into the changes in a few days. Making it language-agnostic sounds great.

@mcanouil
Contributor

mcanouil commented Apr 13, 2026

@AlejandroGomezFrieiro Let me know how you want to proceed as it's a consequence of my initial oversight. I can take over and make another PR to fix all of the above if needed and if you want.

@AlejandroGomezFrieiro
Author

@mcanouil I don't have much time on my hands and did this so I could use it myself. If you want to take over, I would gladly accept that and can test it once it's ready.

@mcanouil
Contributor

mcanouil commented Apr 13, 2026

OK, I'll do this and ping you on the new PR that makes the skill more agnostic and more engine-aware.

I'll keep this PR open until the new one supersedes it.

3 participants