The FrequencyDetector class detects embedding-space shortcut signatures by identifying classes whose signal concentrates in a small set of dimensions.
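
To make "signal concentrates in a small set of dimensions" concrete, here is a minimal sketch of one way to measure concentration: the share of total absolute weight held by the top fraction of dimensions. It illustrates the idea only and is not taken from the library's source; the helper name `top_dimension_share` and the use of probe weights as the "signal" are assumptions.

```python
import math


def top_dimension_share(weights, top_percent=0.05):
    """Share of total |weight| held by the top `top_percent` of dimensions.

    Hypothetical helper for illustration; not part of shortcut_detect.
    """
    if not 0 < top_percent <= 1:
        raise ValueError("top_percent must be in (0, 1]")
    magnitudes = sorted((abs(w) for w in weights), reverse=True)
    if not magnitudes:
        raise ValueError("weights must not be empty")
    # Always examine at least one dimension.
    k = max(1, math.ceil(top_percent * len(magnitudes)))
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(magnitudes[:k]) / total


# One dominant dimension out of 20: most of the weight sits in the top 5%.
concentrated = top_dimension_share([10.0] + [0.1] * 19)
# Evenly spread weights: the top 5% holds only a small share.
spread = top_dimension_share([1.0] * 20)
```

A share close to 1 means a few dimensions carry almost all of the signal, which is the pattern this page calls a shortcut signature.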
::: shortcut_detect.frequency.FrequencyDetector
    options:
      show_root_heading: true
      show_source: true

```python
FrequencyDetector(
    top_percent: float = 0.05,
    tpr_threshold: float = 0.5,
    fpr_threshold: float = 0.15,
    probe_estimator: Optional[BaseEstimator] = None,
    probe_evaluation: str = "train",
    probe_holdout_frac: float = 0.2,
    random_state: int = 42,
)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `top_percent` | `float` | `0.05` | Fraction of top dimensions to examine |
| `tpr_threshold` | `float` | `0.5` | Per-class TPR threshold for flagging |
| `fpr_threshold` | `float` | `0.15` | Per-class FPR threshold for flagging |
| `probe_estimator` | `BaseEstimator` | `None` | sklearn classifier (default: `LogisticRegression`) |
| `probe_evaluation` | `str` | `"train"` | `"train"` or `"holdout"` |
| `probe_holdout_frac` | `float` | `0.2` | Holdout fraction for evaluation |
| `random_state` | `int` | `42` | Random seed |
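
The two thresholds are easier to interpret with a worked example. The sketch below computes one-vs-rest TPR and FPR per class from true and predicted labels, then applies the default thresholds. The flagging rule shown (TPR at or above `tpr_threshold` and FPR at or below `fpr_threshold`) is an assumption about how the thresholds combine, and `per_class_rates` and `flag_classes` are hypothetical helpers, not library functions.

```python
def per_class_rates(y_true, y_pred):
    """Return {class: (tpr, fpr)} from one-vs-rest counts."""
    rates = {}
    pairs = list(zip(y_true, y_pred))
    for cls in sorted(set(y_true)):
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        tn = sum(1 for t, p in pairs if t != cls and p != cls)
        tpr = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        rates[cls] = (tpr, fpr)
    return rates


def flag_classes(rates, tpr_threshold=0.5, fpr_threshold=0.15):
    """Assumed rule: flag a class with high TPR and low FPR."""
    return [
        cls
        for cls, (tpr, fpr) in rates.items()
        if tpr >= tpr_threshold and fpr <= fpr_threshold
    ]


y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 0, 0]
rates = per_class_rates(y_true, y_pred)
flagged = flag_classes(rates)
```

In this example class 0 has a TPR of 1.0 but an FPR of 0.5, so it is not flagged under the assumed rule, while class 1 (TPR 0.5, FPR 0.0) is.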

```python
def fit(
    embeddings: np.ndarray,
    labels: np.ndarray,
) -> FrequencyDetector
```

Fit the frequency detector on embeddings and labels.

Parameters:

| Parameter | Type | Description |
|---|---|---|
| `embeddings` | `ndarray` | Shape `(n_samples, n_features)`, 2D array |
| `labels` | `ndarray` | Shape `(n_samples,)`, 1D class labels |

Returns: `self`

Raises:

- `ValueError` if `embeddings` is not 2D or `labels` is not 1D
- `ValueError` if there are fewer than 10 samples or fewer than 2 unique classes
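
The input requirements above can be checked up front before calling `fit`. The following is a minimal sketch that mirrors the documented `ValueError` conditions; `validate_inputs` is a hypothetical helper, the error messages are invented, and the check that both arrays have the same number of samples is an extra assumption not stated in this reference.

```python
import numpy as np


def validate_inputs(embeddings, labels):
    """Mirror the documented fit() preconditions (illustrative only)."""
    embeddings = np.asarray(embeddings)
    labels = np.asarray(labels)
    if embeddings.ndim != 2:
        raise ValueError("embeddings must be 2D (n_samples, n_features)")
    if labels.ndim != 1:
        raise ValueError("labels must be 1D (n_samples,)")
    # Assumption: not listed in the documented errors, but needed in practice.
    if embeddings.shape[0] != labels.shape[0]:
        raise ValueError("embeddings and labels must have the same length")
    if embeddings.shape[0] < 10:
        raise ValueError("at least 10 samples are required")
    if np.unique(labels).size < 2:
        raise ValueError("at least 2 unique classes are required")
    return embeddings, labels


# 12 samples, 4 features, two alternating classes: passes every check.
X_ok, y_ok = validate_inputs(np.zeros((12, 4)), np.array([0, 1] * 6))
```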

```python
def get_report() -> dict
```

Get the detection report after fitting.

Returns: Dictionary with `method`, `shortcut_detected`, `risk_level`, `metrics`, `report`, `notes`, and `metadata`.
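
As a sketch of how the returned dictionary might be consumed, the helper below checks for the documented keys and builds a one-line summary. Only the key names come from this reference; `summarize_report` is a hypothetical helper, and the value types and contents in `example_report` are placeholders chosen for illustration.

```python
REPORT_KEYS = (
    "method",
    "shortcut_detected",
    "risk_level",
    "metrics",
    "report",
    "notes",
    "metadata",
)


def summarize_report(report):
    """Build a one-line summary from a report dict (illustrative only)."""
    missing = [key for key in REPORT_KEYS if key not in report]
    if missing:
        raise KeyError(f"report is missing keys: {missing}")
    status = "shortcut detected" if report["shortcut_detected"] else "no shortcut detected"
    return f"{report['method']}: {status} (risk: {report['risk_level']})"


# Placeholder values; the real structure of each entry may differ.
example_report = {
    "method": "frequency",
    "shortcut_detected": True,
    "risk_level": "high",
    "metrics": {},
    "report": {},
    "notes": "",
    "metadata": {},
}
summary = summarize_report(example_report)
```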

| Attribute | Type | Description |
|---|---|---|
| `config` | `FrequencyConfig` | Frozen configuration dataclass |
| `probe_` | `BaseEstimator` | Fitted probe classifier |
| `_is_fitted` | `bool` | Whether the detector has been fitted |
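
```python
from shortcut_detect import FrequencyDetector

# Basic usage with default settings.
# embeddings: (n_samples, n_features) array; labels: (n_samples,) array.
detector = FrequencyDetector()
detector.fit(embeddings, labels)
report = detector.get_report()
print(report["shortcut_detected"])
```

```python
from shortcut_detect import FrequencyDetector

# Evaluate the probe on a 20% holdout split.
detector = FrequencyDetector(
    probe_evaluation="holdout",
    probe_holdout_frac=0.2,
    random_state=42,
)
detector.fit(embeddings, labels)
```

```python
from sklearn.svm import LinearSVC

from shortcut_detect import FrequencyDetector

# Use a custom sklearn probe and examine the top 10% of dimensions.
detector = FrequencyDetector(
    probe_estimator=LinearSVC(max_iter=5000),
    top_percent=0.1,
)
detector.fit(embeddings, labels)
```

```python
from shortcut_detect import ShortcutDetector

# Run only the frequency method through ShortcutDetector.
detector = ShortcutDetector(
    methods=["frequency"],
    freq_top_percent=0.05,
    freq_probe_evaluation="holdout",
)
detector.fit(embeddings, labels)
print(detector.summary())
```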
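
The examples above assume that `embeddings` and `labels` already exist. For a quick experiment, the sketch below generates synthetic two-class data in which the class signal is planted in a few leading dimensions. `make_shortcut_embeddings` and its parameters are hypothetical and not part of `shortcut_detect`; whether the detector flags this particular data is not verified here.

```python
import numpy as np


def make_shortcut_embeddings(
    n_samples=200, n_features=64, n_shortcut_dims=3, shift=4.0, seed=42
):
    """Gaussian embeddings where class 1 is shifted in a few dimensions."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n_samples)
    embeddings = rng.normal(size=(n_samples, n_features))
    # Plant the class signal in the first `n_shortcut_dims` dimensions only.
    embeddings[labels == 1, :n_shortcut_dims] += shift
    return embeddings, labels


embeddings, labels = make_shortcut_embeddings()
# Per-dimension gap between class means; large only in the planted dimensions.
mean_gap = embeddings[labels == 1].mean(axis=0) - embeddings[labels == 0].mean(axis=0)
```

The resulting arrays have the shapes `fit` expects, `(n_samples, n_features)` and `(n_samples,)`, so they can be passed directly to `detector.fit(embeddings, labels)`.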