11/07/2025

Silent Gap: Exposing a Hidden Weakness in Perth’s Zero-Bit Audio Watermarking

by Elena & Petar · 4 min read

Our exploration into the latest watermarking models led us to Perth, a nearly open-sourced model from Resemble AI's lab. We want to begin by commending Resemble AI for making this model publicly accessible. This kind of transparency is rare and extremely important in today’s AI landscape, where watermarking and attribution technologies are becoming critical for trust and accountability. Motivated by this openness, we dove into testing Perth’s zero-bit watermarking implementation to evaluate its robustness and real-world resilience.

The Invisible Weakness

Initially, the model performed promisingly against various attacks. However, as is often the case in security testing, a chain is only as strong as its weakest link, and we eventually detected a significant vulnerability.

Here's where things got interesting: we discovered that a simple notch (band-stop) filter, tuned to remove the 350–500 Hz frequency range, completely erased the watermark from our tested speech signals! The truly remarkable part? The filtering is virtually imperceptible to the human ear, so the watermark can be removed without any noticeable degradation in audio quality.
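The filtering step described above can be sketched in a few lines. This is a minimal illustration, assuming a zero-phase Butterworth band-stop built with SciPy; the filter order, the helper names `remove_band` and `tone_level`, and the two-tone stand-in signal are illustrative assumptions, not necessarily the exact setup used in our tests.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def remove_band(audio, sample_rate, low_hz=350.0, high_hz=500.0, order=4):
    """Suppress the low_hz..high_hz band with a zero-phase Butterworth band-stop."""
    sos = butter(order, [low_hz, high_hz], btype="bandstop",
                 fs=sample_rate, output="sos")
    return sosfiltfilt(sos, audio)


def tone_level(audio, sample_rate, freq_hz):
    """Magnitude of the FFT bin closest to freq_hz (quick before/after check)."""
    spectrum = np.abs(np.fft.rfft(audio))
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sample_rate)
    return spectrum[np.argmin(np.abs(freqs - freq_hz))]


sample_rate = 32000
t = np.arange(sample_rate) / sample_rate  # one second of samples
# Stand-in for speech: one tone inside the removed band, one outside it.
audio = np.sin(2 * np.pi * 425 * t) + np.sin(2 * np.pi * 1000 * t)
filtered = remove_band(audio, sample_rate)

print("425 Hz kept:", tone_level(filtered, sample_rate, 425) / tone_level(audio, sample_rate, 425))
print("1 kHz kept: ", tone_level(filtered, sample_rate, 1000) / tone_level(audio, sample_rate, 1000))
```

On real recordings, `audio` would be the watermarked waveform loaded as a float array, and the filtered output would then be passed to the detector.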

Interestingly, our spectrogram analysis of the original, watermarked, and difference signals did not reveal any clear pattern in the watermark's embedding. This suggests the watermark isn't confined to a single narrow frequency band, as illustrated in the images below.

Figure 1: Sample spectrogram comparison showing the original (left), watermarked (middle), and difference (right) signals.

Figure 2: Zoomed-in spectrogram of the difference signal, focused on the 0–2000 Hz range.

As displayed in Figure 1 (right), the difference spectrogram reveals two main regions of change: above 16 kHz and below 2 kHz. The high-frequency differences arise because the Perth model operates at a 32 kHz sampling rate, which can only represent frequencies up to 16 kHz, whereas our source audio was sampled at 44.1 kHz. Everything above 16 kHz is therefore lost during preprocessing. This effect is inaudible and unrelated to watermark embedding. The low-frequency region, up to 2000 Hz, corresponds to the operational range in which the Perth model is constrained to embed the watermark. To examine this more closely, Figure 2 shows a zoomed-in view of the difference spectrogram within the 0–2000 Hz band. Even in this focused view, no distinct or consistent pattern is visually discernible.
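The high-frequency part of the difference signal can be reproduced without any watermarking at all. The sketch below is an illustration under assumptions: it uses SciPy's `resample_poly` as a stand-in for whatever resampler the model's preprocessing applies, and a synthetic two-tone signal instead of real speech. Converting 44.1 kHz audio to 32 kHz and back removes content above 16 kHz, which then shows up in the difference.

```python
import numpy as np
from scipy.signal import resample_poly


def band_energy(audio, sample_rate, low_hz, high_hz):
    """Sum of squared FFT magnitudes between low_hz and high_hz."""
    spectrum = np.abs(np.fft.rfft(audio)) ** 2
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sample_rate)
    return spectrum[(freqs >= low_hz) & (freqs < high_hz)].sum()


source_rate, model_rate = 44100, 32000
t = np.arange(source_rate) / source_rate  # one second of audio
# Stand-in signal: a 1 kHz tone plus a 20 kHz tone (above the 16 kHz limit).
source = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 20000 * t)

# 32000 / 44100 reduces to 320 / 441; resample_poly low-pass filters
# before decimating, so content above 16 kHz is discarded.
at_model_rate = resample_poly(source, 320, 441)
round_trip = resample_poly(at_model_rate, 441, 320)[: len(source)]

difference = source - round_trip
high = band_energy(difference, source_rate, 16000, 22050)
low = band_energy(difference, source_rate, 500, 1500)
print(f"difference energy above 16 kHz: {high:.1f}, around 1 kHz: {low:.3f}")
```

Nearly all of the difference energy sits above 16 kHz, which is why that region of Figure 1 should not be read as watermark content.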

This observation led us to a key hypothesis regarding the problem's root cause: we believe the detection neural network employed by this model is overfitted to a narrow frequency range. Consequently, it completely loses its ability to detect the hidden information once this specific range is removed.

Turning the Tables: Fooling the Detector with a Fake Watermark

Building on these insights, we explored whether this frequency dependency could not only be exploited to remove the watermark, but also to falsely create the appearance of one. False positives are especially dangerous in copyright protection systems, as they can lead to the wrongful attribution of ownership, potential legal disputes, or takedown actions against content that was never watermarked in the first place.

In this second type of attack, we generated a synthetic watermark by injecting a pure sine wave into the target audio, centered within the vulnerable 350–500 Hz band (e.g., around 425 Hz). This was implemented using the following formula:

fake_watermark = A * sin(2 * pi * freq * t)

attacked_signal = original_signal + fake_watermark

When the amplitude A is chosen carefully, this added signal remains inaudible to human listeners, yet is strong enough to fool the detector into classifying the audio as watermarked, despite no legitimate embedding taking place.
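The formula above translates directly into NumPy. In this minimal sketch the amplitude value `0.005`, the noise stand-in for real audio, and the helper names `inject_tone` and `tone_to_signal_db` are placeholders chosen for illustration; the amplitude that works in practice depends on the recording and has to be tuned.

```python
import numpy as np


def inject_tone(original_signal, sample_rate, freq=425.0, amplitude=0.005):
    """Add a pure sine tone, A * sin(2 * pi * freq * t), to the signal."""
    t = np.arange(len(original_signal)) / sample_rate
    fake_watermark = amplitude * np.sin(2 * np.pi * freq * t)
    return original_signal + fake_watermark


def tone_to_signal_db(original_signal, amplitude):
    """Tone level relative to the signal's RMS in dB (a sine's RMS is A / sqrt(2))."""
    signal_rms = np.sqrt(np.mean(np.square(original_signal)))
    return 20 * np.log10((amplitude / np.sqrt(2)) / signal_rms)


sample_rate = 32000
rng = np.random.default_rng(0)
# Stand-in for real audio: one second of low-level noise.
original_signal = 0.1 * rng.standard_normal(sample_rate)

attacked_signal = inject_tone(original_signal, sample_rate, freq=425.0, amplitude=0.005)
print(f"tone level relative to signal: {tone_to_signal_db(original_signal, 0.005):.1f} dB")
```

Reporting the tone level relative to the signal's RMS gives a rough handle on audibility when sweeping `amplitude`, although a listening test remains the real check.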

This form of attack is particularly effective in cases where the detector’s output on clean, unwatermarked audio is already close to the decision threshold. By introducing this narrow-band signal, the detection confidence can be artificially boosted past the watermarking threshold, resulting in a false positive.

Hard Truths, Stronger Tech: Why We Test

In conclusion, this simple yet effective attack exposed the Achilles' heel of the Perth model, showing that its apparent robustness depends heavily on the presence of a specific frequency band.

At DeepMark, we believe that rigorous testing and transparent analysis, even of third-party systems, are essential for advancing watermarking technology. Discovering and understanding such vulnerabilities drives our commitment to building watermarking solutions that are not just robust in theory, but resilient in real-world, adversarial environments.

We applaud the openness of Resemble AI’s team in making Perth publicly accessible. It’s this kind of collaborative transparency that allows meaningful progress to flourish.
