feat: use variable name as download filename in dataframe viewer#8811
feat: use variable name as download filename in dataframe viewer#8811Sushit-prog wants to merge 9 commits intomarimo-team:mainfrom
Conversation
Use bound_names from UI element registry to get the actual variable name at runtime instead of infer_variable_name which fails on wrapped/transformed dataframes. Fixes marimo-team#8095
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
@Light2Dark @dmadisetti ready for review! Here's a screenshot showing the fix working — downloaded file
Could you add the appropriate label so all CI checks can run? |
|
@Light2Dark @dmadisetti gentle ping, all checks are passing |
There was a problem hiding this comment.
Pull request overview
This PR aims to fix a regression where dataframe/table downloads are named df.csv instead of using the bound variable name, by looking up the UI element’s bound names at runtime and attempting to use that name during download.
Changes:
- Added a
filenameparameter todownload_as()and introduced a helper to create a virtual file with a custom filename. - Updated dataframe viewer and table download handlers to fetch the bound variable name via
ctx.ui_element_registry.bound_names(self._id)and pass it intodownload_as(). - Added a unit test intended to verify that the bound variable name is used for downloads.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
marimo/_plugins/ui/_impl/utils/dataframe.py |
Adds custom-filename virtual file creation and extends download_as() with a filename option. |
marimo/_plugins/ui/_impl/dataframes/dataframe.py |
Attempts to derive a bound name at download time and pass it to download_as(). |
marimo/_plugins/ui/_impl/table.py |
Same approach as dataframe: derive bound name at download time and pass it to download_as(). |
tests/_plugins/ui/_impl/dataframes/test_dataframe.py |
Adds a test intended to validate filename behavior for dataframe downloads. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| from marimo._runtime.context import get_context | ||
|
|
||
| ctx = get_context() | ||
| bound = list(ctx.ui_element_registry.bound_names(self._id)) |
There was a problem hiding this comment.
ctx.ui_element_registry.bound_names(self._id) returns a set[str] (see UIElementRegistry._bindings), so converting to list(...) and taking [0] makes the chosen filename nondeterministic when multiple names are bound. Consider choosing a stable name (e.g., sorted(bound)[0]) or applying a clear preference rule.
| bound = list(ctx.ui_element_registry.bound_names(self._id)) | |
| bound = sorted(ctx.ui_element_registry.bound_names(self._id)) |
| bound_filename: str | None = None | ||
| try: | ||
| from marimo._runtime.context import get_context | ||
|
|
||
| ctx = get_context() | ||
| bound = list(ctx.ui_element_registry.bound_names(self._id)) | ||
| if bound: | ||
| bound_filename = bound[0] | ||
| except Exception: | ||
| LOGGER.debug("Error getting bound names for download filename") | ||
|
|
||
| return download_as( | ||
| manager, | ||
| args.format, | ||
| csv_encoding=self._download_csv_encoding, | ||
| csv_separator=self._download_csv_separator, | ||
| json_ensure_ascii=self._download_json_ensure_ascii, | ||
| filename=bound_filename, | ||
| ) |
There was a problem hiding this comment.
This change passes filename=bound_filename into the backend download_as(), but the UI’s actual downloaded filename is set on the frontend via the downloadFileName prop (see frontend/src/components/data-table/download-actions.tsx, which calls downloadByURL(url, ${baseName}.${ext})). Since this file still sets the dataframe-name arg from infer_variable_name(df, "df"), users may still get df.csv even if the bound name is found here. To fully address the regression, the dataframe-name/downloadFileName value likely needs to come from ui_element_registry.bound_names(self._id) as well (or otherwise be updated at runtime).
marimo/_plugins/ui/_impl/table.py
Outdated
| from marimo._runtime.context import get_context | ||
|
|
||
| ctx = get_context() | ||
| bound = list(ctx.ui_element_registry.bound_names(self._id)) |
There was a problem hiding this comment.
ctx.ui_element_registry.bound_names(self._id) returns a set[str], so list(...)[0] can pick a different name across runs when multiple bindings exist. Prefer a deterministic choice (e.g., sorted(...)) or a defined preference rule.
| bound = list(ctx.ui_element_registry.bound_names(self._id)) | |
| bound = sorted(ctx.ui_element_registry.bound_names(self._id)) |
| if isinstance(manager_candidate, TableManager): | ||
| bound_filename: str | None = None | ||
| try: | ||
| from marimo._runtime.context import get_context | ||
|
|
||
| ctx = get_context() | ||
| bound = list(ctx.ui_element_registry.bound_names(self._id)) | ||
| if bound: | ||
| bound_filename = bound[0] | ||
| except Exception: | ||
| LOGGER.debug("Error getting bound names for download filename") | ||
|
|
||
| return download_as( | ||
| manager_candidate, args.format, drop_marimo_index=True | ||
| manager_candidate, | ||
| args.format, | ||
| drop_marimo_index=True, | ||
| filename=bound_filename, | ||
| ) |
There was a problem hiding this comment.
Similar to the dataframe viewer, the downloaded filename in the UI is determined on the frontend from the download-file-name arg (see downloadByURL(..., ${baseName}.${ext}) in frontend/src/components/data-table/download-actions.tsx). This function now passes filename=bound_filename into download_as(), but the download-file-name arg in this class is still computed via infer_variable_name(data, "download") during __init__, so the user-visible filename may remain incorrect. Consider updating the download-file-name arg to use ui_element_registry.bound_names(self._id) (or otherwise refreshing it) rather than relying on the URL filename.
| mock_registry = Mock() | ||
| mock_registry.bound_names.return_value = ["my_dataframe"] | ||
|
|
||
| mock_ctx = Mock() | ||
| mock_ctx.ui_element_registry = mock_registry | ||
|
|
||
| with patch( | ||
| "marimo._runtime.context.get_context", | ||
| return_value=mock_ctx, | ||
| ): | ||
| download_url = subject._download_as(DownloadAsArgs(format="csv")) | ||
| assert "my_dataframe.csv" in download_url or ( | ||
| download_url.startswith("data:") | ||
| ) | ||
|
|
There was a problem hiding this comment.
This test can pass even if the bound-variable filename behavior is broken: it accepts any data: URL, and it only patches marimo._runtime.context.get_context, while the new virtual-file code path uses the get_context reference imported into marimo._plugins.ui._impl.utils.dataframe at import time. To validate the regression fix, consider configuring the patched context with virtual_files_supported=True and a virtual_file_registry, patching the get_context used by the download helper, and then asserting that the resulting URL/virtual file metadata reflects my_dataframe (or that downloadFileName is set appropriately).
| mock_registry = Mock() | |
| mock_registry.bound_names.return_value = ["my_dataframe"] | |
| mock_ctx = Mock() | |
| mock_ctx.ui_element_registry = mock_registry | |
| with patch( | |
| "marimo._runtime.context.get_context", | |
| return_value=mock_ctx, | |
| ): | |
| download_url = subject._download_as(DownloadAsArgs(format="csv")) | |
| assert "my_dataframe.csv" in download_url or ( | |
| download_url.startswith("data:") | |
| ) | |
| mock_ui_registry = Mock() | |
| mock_ui_registry.bound_names.return_value = ["my_dataframe"] | |
| mock_virtual_file_registry = Mock() | |
| mock_ctx = Mock() | |
| mock_ctx.ui_element_registry = mock_ui_registry | |
| # Ensure the virtual file–based download path is taken | |
| mock_ctx.virtual_files_supported = True | |
| mock_ctx.virtual_file_registry = mock_virtual_file_registry | |
| with patch( | |
| "marimo._runtime.context.get_context", | |
| return_value=mock_ctx, | |
| ), patch( | |
| "marimo._plugins.ui._impl.utils.dataframe.get_context", | |
| return_value=mock_ctx, | |
| ): | |
| download_url = subject._download_as(DownloadAsArgs(format="csv")) | |
| # The download should register a virtual file whose metadata or | |
| # filename reflects the bound variable name "my_dataframe". | |
| assert mock_virtual_file_registry.register.called | |
| args, kwargs = mock_virtual_file_registry.register.call_args | |
| # Check all positional and keyword argument values for the name | |
| all_values = list(args) + list(kwargs.values()) | |
| assert any( | |
| "my_dataframe" in str(value) for value in all_values | |
| ), f"Bound variable name not used in virtual file registration: {all_values}" | |
| # Also ensure we still get a URL-like string back from the helper | |
| assert isinstance(download_url, str) and download_url |
| try: | ||
| ctx = get_context() | ||
| if ctx.virtual_files_supported: | ||
| vfile = VirtualFile(filename=f"{filename}.{ext}", buffer=buffer) | ||
| ctx.virtual_file_registry.add(vfile, ctx) | ||
| return vfile.url | ||
| except ContextNotInitializedError: | ||
| pass | ||
|
|
||
| vfile = VirtualFile( | ||
| filename=f"{filename}.{ext}", buffer=buffer, as_data_url=True | ||
| ) | ||
| return vfile.url |
There was a problem hiding this comment.
_create_download_virtual_file uses a fixed registry key ({filename}.{ext}) and then returns the new URL even when VirtualFileRegistry.add() is a no-op (e.g., because the key already exists, or because the buffer is empty). This can make subsequent downloads serve stale/incorrect bytes and can even create Content-Length mismatches when the byte length in the URL differs from what’s stored.
Additionally, this path bypasses VirtualFileLifecycleItem/mo_data.*, so the file isn’t tied into cell_lifecycle_registry for cleanup, which risks leaking virtual files across cell reruns/sessions. Consider keeping registry filenames unique (random key) and letting the frontend control the user-facing filename, or implement an overwrite+cleanup strategy that safely replaces prior entries and disposes them via the cell lifecycle.
| json_ensure_ascii: bool = True, | ||
| ) -> str: | ||
| filename: str | None = None, | ||
| ) -> tuple[str, str]: |
There was a problem hiding this comment.
Opposed to a tuple can we share with DownloadAsResponse? Maybe new name and location.
dmadisetti
left a comment
There was a problem hiding this comment.
Looks alright to me but @Light2Dark is a better authority for tables
| try: | ||
| from marimo._runtime.context import get_context | ||
|
|
||
| ctx = get_context() | ||
| bound = sorted(ctx.ui_element_registry.bound_names(self._id)) | ||
| if bound: | ||
| bound_filename = bound[0] | ||
| except Exception: | ||
| LOGGER.debug("Error getting bound names for download filename") |
There was a problem hiding this comment.
should this be a shared util function?
| if filename is not None: | ||
| if ext == "csv": | ||
| encoding = ( | ||
| csv_encoding | ||
| if csv_encoding is not None | ||
| else get_default_csv_encoding() | ||
| ) | ||
| data = manager.to_csv(encoding=encoding, separator=csv_separator) | ||
| elif ext == "json": | ||
| data = manager.to_json( | ||
| encoding=None, ensure_ascii=json_ensure_ascii, strict_json=True | ||
| ) | ||
| elif ext == "parquet": | ||
| data = manager.to_parquet() | ||
| else: | ||
| raise ValueError( | ||
| "format must be one of 'csv', 'json', or 'parquet'." | ||
| ) | ||
| return _create_download_virtual_file(data, ext, filename) |
There was a problem hiding this comment.
Sorry, just to check my understanding, why do we need this? And the custom function?
- Extract bound_names lookup into shared get_bound_name() utility - Use get_bound_name() in both dataframe.py and table.py - Restore comments removed from utils/dataframe.py - Address Copilot and Light2Dark review comments
|
@Light2Dark addressed all review comments:
The _create_download_virtual_file function is needed because |
|
@Light2Dark the Test BE / macos-latest failure (test_terminal_ws) All dataframe and table tests pass locally. Happy to investigate Also, the Test BE / ubuntu-latest / Py 3.14 failure is |
|
Yeah, I don't think those are related. Sorry about that, sometimes tests are flakey 😅 |
| tuple: (url, user-facing filename) for the downloaded file. | ||
| """ | ||
| if drop_marimo_index: | ||
| # Remove the selection column if exists |
marimo/_plugins/ui/_impl/table.py
Outdated
| data = table_manager.data | ||
| is_simple_list = ( | ||
| isinstance(data, (list, tuple)) | ||
| and len(data) > 0 | ||
| and not isinstance(data[0], (list, tuple, dict)) | ||
| ) | ||
|
|
||
| if is_simple_list: | ||
| try: | ||
| data_url = mo_data.json( | ||
| table_manager.to_json({}, ensure_ascii=True) | ||
| ).url | ||
| return data_url, "json" | ||
| except Exception as e: | ||
| LOGGER.error("Failed to export table data as JSON: %s", e) | ||
| raise |
| download_csv_encoding="utf-8-sig", | ||
| ) | ||
|
|
||
| # CSV should include BOM |
There was a problem hiding this comment.
don't remove comments like these
| table._search(SearchTableArgs(query="2", page_size=10, page_number=0)) | ||
| # Make a cell selection; download should still include the filtered view | ||
| table._convert_value([{"rowId": "0", "columnName": "a"}]) | ||
| # Use JSON format to avoid optional dependencies |
| # dataframe-name is set at init time using infer_variable_name. | ||
| # For downloads, the actual filename comes from bound_names at | ||
| # download time via _download_as returning DownloadAsResponse. |
There was a problem hiding this comment.
doesn't this mean we can use the dataframe-name for download?
|
@Light2Dark @dmadisetti addressed all remaining review comments:
All backend tests pass locally. CI frontend build confirmed passing. |
|
@Light2Dark @dmadisetti addressed all remaining review comments:
All 15 dataframe tests and 21 table tests pass locally. |

Summary
Fixes the download filename regression where files were being
named
df.csvinstead of the actual variable name.Root Cause
The previous fix used
infer_variable_name()which relies onidentity checks (
is) that fail when dataframes arewrapped/transformed, causing it to fall back to
df.Fix
Use
ctx.ui_element_registry.bound_names(self._id)to look upthe actual variable name at runtime via a shared
get_bound_name()utility function.
Changes
utils/dataframe.py— addedget_bound_name()shared utility,_create_download_virtual_file()helper andfilenameparameterto
download_as()dataframes/dataframe.py— usesget_bound_name()to get variablename at download time
table.py— same as abovetest_dataframe.py— added test verifying correct filenameFixes #8095