Add Resemble AI Detect guardrail plugin #1610

Open

devshahofficial wants to merge 1 commit into Portkey-AI:main from devshahofficial:resemble-detect-plugin

Conversation

@devshahofficial

Summary

Adds a new guardrail plugin that scans audio, video, and image URLs referenced in LLM requests for deepfake / synthetic content via the Resemble AI Detect API.

The plugin runs in beforeRequestHook and afterRequestHook. It extracts a media URL from the request, submits it to POST /detect, polls GET /detect/{uuid} until completion, and returns verdict: false when Resemble labels the media as `fake` or the aggregated score exceeds the configured threshold.
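The verdict step described above can be sketched roughly as follows. The `DetectResult` shape and `evaluate` name are illustrative, not the actual handler code or the exact Resemble response schema:

```typescript
// Hypothetical shape of a completed Detect result; field names are
// illustrative approximations of the Resemble Detect API response.
interface DetectResult {
  label: 'real' | 'fake';
  score: number; // aggregated 0..1 score; higher means more likely synthetic
}

// verdict: false means the guardrail flags the media.
function evaluate(result: DetectResult, threshold: number): boolean {
  if (result.label === 'fake') return false;
  return result.score <= threshold;
}
```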

Why

Deepfakes are increasingly being used to manipulate AI workflows — voice-cloned audio pasted into a chat-completion prompt, synthetic images submitted to a multimodal model, video content slipped through a RAG pipeline. Portkey users asking "is this input real?" currently have no option in the plugins catalog. Resemble Detect covers audio, video, and image from a single endpoint, and ships with useful side signals like audio source tracing (identifies which TTS vendor — ElevenLabs, Resemble, PlayHT, OpenAI — generated flagged audio) and reverse image search for images.

What this PR adds

  • `plugins/resemble/manifest.json` — plugin declaration with credentials schema, parameters, and supported hooks
  • `plugins/resemble/detect.ts` — handler, URL extraction, polling, and verdict logic
  • `plugins/resemble/detect.test.ts` — 23 jest tests
  • Registers `resemble: { detect: ... }` in `plugins/index.ts`

Configuration

| Param | Default | Description |
| --- | --- | --- |
| `credentials.apiKey` | — (required) | Resemble API token (encrypted) |
| `credentials.apiBase` | `https://app.resemble.ai/api/v2` | Override for self-hosted / staging |
| `threshold` | `0.5` | Aggregated score above which media is treated as fake |
| `mediaType` | `auto` | Force `audio` / `video` / `image`, or let Resemble auto-detect |
| `audioSourceTracing` | `false` | Identify TTS vendor for flagged audio |
| `useReverseSearch` | `false` | Run reverse image search for images |
| `zeroRetentionMode` | `false` | Delete media after detection completes |
| `urlSource` | `auto` | Where to look for the URL: `auto`, `metadata`, or `content` |
| `metadataKey` | `mediaUrl` | Metadata key when `urlSource` is `metadata` or `auto` |
| `pollIntervalMs` | `2000` | How often to poll Resemble |
| `pollTimeoutMs` | `60000` | Max wait before failing open |
| `failClosed` | `false` | If `true`, API errors block the request |
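For reference, a guardrail check using these parameters might look roughly like this. The `checks` / `id` wrapper shape is an assumption based on other Portkey guardrails; consult the plugin schema for the exact format:

```json
{
  "checks": [
    {
      "id": "resemble.detect",
      "parameters": {
        "credentials": { "apiKey": "<RESEMBLE_API_KEY>" },
        "threshold": 0.5,
        "urlSource": "auto",
        "metadataKey": "mediaUrl",
        "failClosed": false
      }
    }
  ]
}
```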

URL extraction

The handler searches for a media URL in three places, in order:

  1. Multimodal content parts — OpenAI-style `input_audio`, `image_url`, and Anthropic-style `source.url` (image / document)
  2. Regex over joined text — matches https URLs ending in common audio/video/image extensions, inside message content, `prompt`, or `input`
  3. `context.metadata[metadataKey]` — fallback for the `auto` mode, primary source when `urlSource: 'metadata'`
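The lookup order above can be sketched as follows. This is a simplified illustration (the `findMediaUrl` signature is hypothetical); the real extraction also walks OpenAI `input_audio` parts and Anthropic `source.url` blocks rather than taking a pre-collected URL list:

```typescript
// Matches https URLs ending in common audio/video/image extensions.
const MEDIA_URL_RE = /https:\/\/\S+\.(?:mp3|wav|mp4|webm|png|jpe?g|gif)\b/i;

function findMediaUrl(
  text: string,
  contentUrls: string[],
  metadata: Record<string, string>,
  metadataKey: string
): string | undefined {
  // 1. Multimodal content parts take priority.
  if (contentUrls.length > 0) return contentUrls[0];
  // 2. Regex over the joined message text.
  const match = text.match(MEDIA_URL_RE);
  if (match) return match[0];
  // 3. Metadata fallback.
  return metadata[metadataKey];
}
```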

Fail modes

By default the plugin fails open — if the Resemble API errors out (auth, network, timeout), the request still passes through and the error is recorded in `data`. Set `failClosed: true` to block on API errors. This matches the pattern in other guardrail plugins.
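The fail-open / fail-closed decision amounts to something like the sketch below (names are illustrative, not the handler's actual types):

```typescript
interface GuardrailResult {
  verdict: boolean; // false blocks the request
  error?: string;
}

// On an API error: fail open (pass the request) unless failClosed is set,
// and always record the error so it surfaces in the hook's `data`.
function onApiError(err: Error, failClosed: boolean): GuardrailResult {
  return {
    verdict: !failClosed,
    error: err.message,
  };
}
```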

Test plan

  • 23 jest tests pass (`npm run test:plugins -- plugins/resemble`)
  • Prettier check passes
  • Existing plugins tests unaffected
  • Manifest conforms to the plugin schema (loads via plugin manager)

```
Test Suites: 1 passed, 1 total
Tests: 23 passed, 23 total
```

Notes

  • This is the first new guardrail under the `resemble` namespace — happy to adjust the ID / naming to fit your conventions before merge.
  • Resemble AI will co-announce the plugin on launch and add Portkey to our integrations documentation.

Adds a new guardrail plugin that scans audio, video, and image URLs
referenced in LLM requests for deepfake / synthetic content via the
Resemble AI Detect API (https://app.resemble.ai/api/v2/detect).

The handler supports:
- beforeRequestHook and afterRequestHook
- URL extraction from multimodal content parts (OpenAI input_audio /
  image_url, Anthropic source.url), regex over plain text, or
  context.metadata fallback (selectable via urlSource)
- Optional audio source tracing (identifies TTS vendor), reverse image
  search, and Zero Retention Mode
- Configurable threshold, polling interval / timeout, and fail-open vs
  fail-closed behaviour on API errors

23 jest tests cover URL extraction, evaluation, the handler happy path,
polling, timeouts, fail modes, and base-URL override.
@0xbrainkid

Audio deepfake detection is an important guardrail layer as voice-enabled agents become more common — a voice agent that cannot verify whether the audio it received is authentic is vulnerable to real-time voice cloning attacks.

Resemble AI Detect operates at the content layer, which complements rather than replaces an agent identity layer: a detected deepfake tells you the audio is synthetic, but not which agent sent it. For multi-agent voice pipelines (e.g., one agent calling another via voice channel), a deepfake detection positive identifies the attack but not the attacker — attribution requires agent identity at the call layer alongside content-level detection.

A combined trust check for voice-enabled agent endpoints:

```
const result = await resembleDetect.scan(audioBuffer);
const callerAgentId = request.headers["X-Agent-ID"] || "unknown";
if (result.deepfake_score > THRESHOLD) {
    emit_security_event("audio_deepfake_detected", {
        deepfake_score: result.deepfake_score,
        caller_agent_id: callerAgentId,
        caller_trust_score: await satp.getScore(callerAgentId), // was this caller already suspect?
    });
    return deny();
}
```

The caller_trust_score at detection time matters: a deepfake attempt from an agent with a history of trust violations warrants immediate revocation, not just the current request denial. An agent with a clean history might be compromised rather than malicious, warranting a different response.

Happy to contribute an implementation that includes caller identity context alongside the deepfake score.

