Skip to content

feat: use variable name as download filename in dataframe viewer#8811

Open
Sushit-prog wants to merge 9 commits intomarimo-team:mainfrom
Sushit-prog:fix/dataframe-download-filename
Open

feat: use variable name as download filename in dataframe viewer#8811
Sushit-prog wants to merge 9 commits intomarimo-team:mainfrom
Sushit-prog:fix/dataframe-download-filename

Conversation

@Sushit-prog
Copy link
Copy Markdown
Contributor

@Sushit-prog Sushit-prog commented Mar 21, 2026

Summary

Fixes the download filename regression where files were being
named df.csv instead of the actual variable name.

Root Cause

The previous fix used infer_variable_name() which relies on
identity checks (is) that fail when dataframes are
wrapped/transformed, causing it to fall back to df.

Fix

Use ctx.ui_element_registry.bound_names(self._id) to look up
the actual variable name at runtime via a shared get_bound_name()
utility function.

Changes

  • utils/dataframe.py — added get_bound_name() shared utility,
    _create_download_virtual_file() helper and filename parameter
    to download_as()
  • dataframes/dataframe.py — uses get_bound_name() to get variable
    name at download time
  • table.py — same as above
  • test_dataframe.py — added test verifying correct filename

Fixes #8095

Use bound_names from UI element registry to get the actual
variable name at runtime instead of infer_variable_name which
fails on wrapped/transformed dataframes.

Fixes marimo-team#8095
@vercel
Copy link
Copy Markdown

vercel bot commented Mar 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
marimo-docs Ready Ready Preview, Comment Mar 28, 2026 10:15pm

Request Review

@Sushit-prog Sushit-prog marked this pull request as draft March 22, 2026 06:32
@Sushit-prog Sushit-prog marked this pull request as ready for review March 22, 2026 06:38
@Sushit-prog
Copy link
Copy Markdown
Contributor Author

Sushit-prog commented Mar 22, 2026

@Light2Dark @dmadisetti ready for review!

Here's a screenshot showing the fix working — downloaded file
is now named after the variable (some_frame.csv) instead of df.csv.

Screenshot 2026-03-22 022235

Could you add the appropriate label so all CI checks can run?

@Light2Dark Light2Dark added the enhancement New feature or request label Mar 22, 2026
@Sushit-prog
Copy link
Copy Markdown
Contributor Author

@Light2Dark @dmadisetti gentle ping, all checks are passing
and the fix is working as shown in the screenshot.
Happy to make any changes if needed!

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to fix a regression where dataframe/table downloads are named df.csv instead of using the bound variable name, by looking up the UI element’s bound names at runtime and attempting to use that name during download.

Changes:

  • Added a filename parameter to download_as() and introduced a helper to create a virtual file with a custom filename.
  • Updated dataframe viewer and table download handlers to fetch the bound variable name via ctx.ui_element_registry.bound_names(self._id) and pass it into download_as().
  • Added a unit test intended to verify that the bound variable name is used for downloads.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
marimo/_plugins/ui/_impl/utils/dataframe.py Adds custom-filename virtual file creation and extends download_as() with a filename option.
marimo/_plugins/ui/_impl/dataframes/dataframe.py Attempts to derive a bound name at download time and pass it to download_as().
marimo/_plugins/ui/_impl/table.py Same approach as dataframe: derive bound name at download time and pass it to download_as().
tests/_plugins/ui/_impl/dataframes/test_dataframe.py Adds a test intended to validate filename behavior for dataframe downloads.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

from marimo._runtime.context import get_context

ctx = get_context()
bound = list(ctx.ui_element_registry.bound_names(self._id))
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctx.ui_element_registry.bound_names(self._id) returns a set[str] (see UIElementRegistry._bindings), so converting to list(...) and taking [0] makes the chosen filename nondeterministic when multiple names are bound. Consider choosing a stable name (e.g., sorted(bound)[0]) or applying a clear preference rule.

Suggested change
bound = list(ctx.ui_element_registry.bound_names(self._id))
bound = sorted(ctx.ui_element_registry.bound_names(self._id))

Copilot uses AI. Check for mistakes.
Comment on lines 354 to 372
bound_filename: str | None = None
try:
from marimo._runtime.context import get_context

ctx = get_context()
bound = list(ctx.ui_element_registry.bound_names(self._id))
if bound:
bound_filename = bound[0]
except Exception:
LOGGER.debug("Error getting bound names for download filename")

return download_as(
manager,
args.format,
csv_encoding=self._download_csv_encoding,
csv_separator=self._download_csv_separator,
json_ensure_ascii=self._download_json_ensure_ascii,
filename=bound_filename,
)
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change passes filename=bound_filename into the backend download_as(), but the UI’s actual downloaded filename is set on the frontend via the downloadFileName prop (see frontend/src/components/data-table/download-actions.tsx, which calls downloadByURL(url, ${baseName}.${ext})). Since this file still sets the dataframe-name arg from infer_variable_name(df, "df"), users may still get df.csv even if the bound name is found here. To fully address the regression, the dataframe-name/downloadFileName value likely needs to come from ui_element_registry.bound_names(self._id) as well (or otherwise be updated at runtime).

Copilot uses AI. Check for mistakes.
from marimo._runtime.context import get_context

ctx = get_context()
bound = list(ctx.ui_element_registry.bound_names(self._id))
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctx.ui_element_registry.bound_names(self._id) returns a set[str], so list(...)[0] can pick a different name across runs when multiple bindings exist. Prefer a deterministic choice (e.g., sorted(...)) or a defined preference rule.

Suggested change
bound = list(ctx.ui_element_registry.bound_names(self._id))
bound = sorted(ctx.ui_element_registry.bound_names(self._id))

Copilot uses AI. Check for mistakes.
Comment on lines 869 to 886
if isinstance(manager_candidate, TableManager):
bound_filename: str | None = None
try:
from marimo._runtime.context import get_context

ctx = get_context()
bound = list(ctx.ui_element_registry.bound_names(self._id))
if bound:
bound_filename = bound[0]
except Exception:
LOGGER.debug("Error getting bound names for download filename")

return download_as(
manager_candidate, args.format, drop_marimo_index=True
manager_candidate,
args.format,
drop_marimo_index=True,
filename=bound_filename,
)
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the dataframe viewer, the downloaded filename in the UI is determined on the frontend from the download-file-name arg (see downloadByURL(..., ${baseName}.${ext}) in frontend/src/components/data-table/download-actions.tsx). This function now passes filename=bound_filename into download_as(), but the download-file-name arg in this class is still computed via infer_variable_name(data, "download") during __init__, so the user-visible filename may remain incorrect. Consider updating the download-file-name arg to use ui_element_registry.bound_names(self._id) (or otherwise refreshing it) rather than relying on the URL filename.

Copilot uses AI. Check for mistakes.
Comment on lines +508 to +522
mock_registry = Mock()
mock_registry.bound_names.return_value = ["my_dataframe"]

mock_ctx = Mock()
mock_ctx.ui_element_registry = mock_registry

with patch(
"marimo._runtime.context.get_context",
return_value=mock_ctx,
):
download_url = subject._download_as(DownloadAsArgs(format="csv"))
assert "my_dataframe.csv" in download_url or (
download_url.startswith("data:")
)

Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can pass even if the bound-variable filename behavior is broken: it accepts any data: URL, and it only patches marimo._runtime.context.get_context, while the new virtual-file code path uses the get_context reference imported into marimo._plugins.ui._impl.utils.dataframe at import time. To validate the regression fix, consider configuring the patched context with virtual_files_supported=True and a virtual_file_registry, patching the get_context used by the download helper, and then asserting that the resulting URL/virtual file metadata reflects my_dataframe (or that downloadFileName is set appropriately).

Suggested change
mock_registry = Mock()
mock_registry.bound_names.return_value = ["my_dataframe"]
mock_ctx = Mock()
mock_ctx.ui_element_registry = mock_registry
with patch(
"marimo._runtime.context.get_context",
return_value=mock_ctx,
):
download_url = subject._download_as(DownloadAsArgs(format="csv"))
assert "my_dataframe.csv" in download_url or (
download_url.startswith("data:")
)
mock_ui_registry = Mock()
mock_ui_registry.bound_names.return_value = ["my_dataframe"]
mock_virtual_file_registry = Mock()
mock_ctx = Mock()
mock_ctx.ui_element_registry = mock_ui_registry
# Ensure the virtual file–based download path is taken
mock_ctx.virtual_files_supported = True
mock_ctx.virtual_file_registry = mock_virtual_file_registry
with patch(
"marimo._runtime.context.get_context",
return_value=mock_ctx,
), patch(
"marimo._plugins.ui._impl.utils.dataframe.get_context",
return_value=mock_ctx,
):
download_url = subject._download_as(DownloadAsArgs(format="csv"))
# The download should register a virtual file whose metadata or
# filename reflects the bound variable name "my_dataframe".
assert mock_virtual_file_registry.register.called
args, kwargs = mock_virtual_file_registry.register.call_args
# Check all positional and keyword argument values for the name
all_values = list(args) + list(kwargs.values())
assert any(
"my_dataframe" in str(value) for value in all_values
), f"Bound variable name not used in virtual file registration: {all_values}"
# Also ensure we still get a URL-like string back from the helper
assert isinstance(download_url, str) and download_url

Copilot uses AI. Check for mistakes.
Comment on lines +40 to +52
try:
ctx = get_context()
if ctx.virtual_files_supported:
vfile = VirtualFile(filename=f"{filename}.{ext}", buffer=buffer)
ctx.virtual_file_registry.add(vfile, ctx)
return vfile.url
except ContextNotInitializedError:
pass

vfile = VirtualFile(
filename=f"{filename}.{ext}", buffer=buffer, as_data_url=True
)
return vfile.url
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_create_download_virtual_file uses a fixed registry key ({filename}.{ext}) and then returns the new URL even when VirtualFileRegistry.add() is a no-op (e.g., because the key already exists, or because the buffer is empty). This can make subsequent downloads serve stale/incorrect bytes and can even create Content-Length mismatches when the byte length in the URL differs from what’s stored.

Additionally, this path bypasses VirtualFileLifecycleItem/mo_data.*, so the file isn’t tied into cell_lifecycle_registry for cleanup, which risks leaking virtual files across cell reruns/sessions. Consider keeping registry filenames unique (random key) and letting the frontend control the user-facing filename, or implement an overwrite+cleanup strategy that safely replaces prior entries and disposes them via the cell lifecycle.

Copilot uses AI. Check for mistakes.
json_ensure_ascii: bool = True,
) -> str:
filename: str | None = None,
) -> tuple[str, str]:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opposed to a tuple can we share with DownloadAsResponse? Maybe new name and location.

Copy link
Copy Markdown
Collaborator

@dmadisetti dmadisetti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks alright to me but @Light2Dark is a better authority for tables

Comment on lines +356 to +364
try:
from marimo._runtime.context import get_context

ctx = get_context()
bound = sorted(ctx.ui_element_registry.bound_names(self._id))
if bound:
bound_filename = bound[0]
except Exception:
LOGGER.debug("Error getting bound names for download filename")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be a shared util function?

Comment on lines +120 to +138
if filename is not None:
if ext == "csv":
encoding = (
csv_encoding
if csv_encoding is not None
else get_default_csv_encoding()
)
data = manager.to_csv(encoding=encoding, separator=csv_separator)
elif ext == "json":
data = manager.to_json(
encoding=None, ensure_ascii=json_ensure_ascii, strict_json=True
)
elif ext == "parquet":
data = manager.to_parquet()
else:
raise ValueError(
"format must be one of 'csv', 'json', or 'parquet'."
)
return _create_download_virtual_file(data, ext, filename)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, just to check my understanding, why do we need this? And the custom function?

- Extract bound_names lookup into shared get_bound_name() utility
- Use get_bound_name() in both dataframe.py and table.py
- Restore comments removed from utils/dataframe.py
- Address Copilot and Light2Dark review comments
@Sushit-prog
Copy link
Copy Markdown
Contributor Author

@Light2Dark addressed all review comments:

  1. Restored removed comments
    2.Extracted bound_names lookup into shared get_bound_name() utility
  2. Used in both dataframe.py and table.py

The _create_download_virtual_file function is needed because
mo_data.csv/json/parquet() generate random filenames internally
via VirtualFileLifecycleItem with no way to pass a custom name.
Happy to simplify further if you have a preferred approach!

@Sushit-prog
Copy link
Copy Markdown
Contributor Author

Sushit-prog commented Mar 24, 2026

@Light2Dark the Test BE / macos-latest failure (test_terminal_ws)
appears to be a pre-existing flaky test unrelated to this PR —
it fails due to a WebSocket file descriptor issue in terminal.py
which is unrelated to the dataframe download changes.

All dataframe and table tests pass locally. Happy to investigate
further if needed!

Also, the Test BE / ubuntu-latest / Py 3.14 failure is
the test_terminal_ws flaky test — unrelated to this PR's
dataframe download changes. Is this a known issue on Python 3.14?

@Light2Dark
Copy link
Copy Markdown
Collaborator

Light2Dark commented Mar 24, 2026

Yeah, I don't think those are related. Sorry about that, sometimes tests are flakey 😅

tuple: (url, user-facing filename) for the downloaded file.
"""
if drop_marimo_index:
# Remove the selection column if exists
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't remove this

Comment on lines +1129 to +1144
data = table_manager.data
is_simple_list = (
isinstance(data, (list, tuple))
and len(data) > 0
and not isinstance(data[0], (list, tuple, dict))
)

if is_simple_list:
try:
data_url = mo_data.json(
table_manager.to_json({}, ensure_ascii=True)
).url
return data_url, "json"
except Exception as e:
LOGGER.error("Failed to export table data as JSON: %s", e)
raise
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this for?

download_csv_encoding="utf-8-sig",
)

# CSV should include BOM
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't remove comments like these

table._search(SearchTableArgs(query="2", page_size=10, page_number=0))
# Make a cell selection; download should still include the filtered view
table._convert_value([{"rowId": "0", "columnName": "a"}])
# Use JSON format to avoid optional dependencies
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here too

Comment on lines +199 to +201
# dataframe-name is set at init time using infer_variable_name.
# For downloads, the actual filename comes from bound_names at
# download time via _download_as returning DownloadAsResponse.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't this mean we can use the dataframe-name for download?

@Sushit-prog
Copy link
Copy Markdown
Contributor Author

Sushit-prog commented Mar 28, 2026

@Light2Dark @dmadisetti addressed all remaining review comments:

  1. Simplified download_as() :removed the complex _create_download_virtual_file()
    helper and the duplicate if filename is not None branch entirely. Now always uses
    mo_data.csv/json/parquet() for the URL and simply returns the bound variable name
    as the filename. The frontend's downloadByURL(url, filename) already handles them
    as separate arguments so no custom virtual file needed.

  2. Answers Light2Dark's question : "doesn't this mean we can use dataframe-name
    for download?" — dataframe-name is a static init-time arg that can't be updated
    after render, but we don't need to. DownloadAsResponse returns the correct filename
    directly from get_bound_name() at download time, and the frontend already uses it.

  3. Restored all deleted comments : # CSV should include BOM,
    # Use JSON format to avoid optional dependencies,
    # Remove the selection column if exists

  4. Strengthened test : now properly mocks mo_data.csv and explicitly asserts
    result.filename == "my_dataframe" and result.url comes from mo_data.

  5. is_simple_list block : this is in _to_chart_data_url (chart rendering),
    not in _download_as. It was not added by this PR.

All backend tests pass locally. CI frontend build confirmed passing.

@Sushit-prog
Copy link
Copy Markdown
Contributor Author

Sushit-prog commented Mar 28, 2026

@Light2Dark @dmadisetti addressed all remaining review comments:

  1. Simplified download_as() : removed _create_download_virtual_file()
    and the duplicate if filename is not None branch. Now always uses
    mo_data.csv/json/parquet() for the URL and returns the bound variable name
    as filename directly. No custom virtual file needed since downloadByURL(url, filename)
    handles them as separate arguments.

  2. Removed is_simple_list block from _to_chart_data_url — this was
    incorrectly added and was breaking existing tests.

  3. Restored all deleted comments : # CSV should include BOM,
    # Use JSON format to avoid optional dependencies,
    # Remove the selection column if exists

  4. Clarified comment in dataframe.py explaining why dataframe-name is
    init-time only and how get_bound_name() correctly resolves the filename at
    download time.

  5. Strengthened test: properly mocks mo_data.csv and explicitly asserts
    result.filename == "my_dataframe".

All 15 dataframe tests and 21 table tests pass locally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

dataframe download from rich dataframe viewer automatically names download file by variable name

4 participants