Models
Available Rapidly models, organised by family with picking guidance.
The Rapidly SDK ships with two model families: speech denoise and speech denoise + dereverb. Each family is available at four latencies (11, 21, 32, 96 ms), plus a micro size variant of the 32 ms model for CPU-constrained scenarios.
All models are trained on audio sampled at 48 kHz. The engine accepts other sample rates and resamples automatically.
For real-time factors and other performance characteristics, see Performance.
Speech denoise + dereverb
Removes both background noise and room reverb. Each model outputs cleaned dialogue, reverb, and noise as three separate busses.
| Variant | File | Size |
|---|---|---|
| 96 ms | speech-denoise-dereverb-96ms.v1.0.rapidly | 926 KB |
| 32 ms | speech-denoise-dereverb-32ms.v1.0.rapidly | 854 KB |
| 32 ms micro | speech-denoise-dereverb-micro-32ms.v1.0.rapidly | 241 KB |
| 21 ms | speech-denoise-dereverb-21ms.v1.0.rapidly | 851 KB |
| 11 ms | speech-denoise-dereverb-11ms.v1.0.rapidly | 615 KB |
Pick by latency:
- 96 ms for high-quality dialogue restoration. Excellent noise and reverb reduction when low latency isn't critical.
- 32 ms for a balanced trade-off. Strong reverb and noise reduction with moderate latency. Suited to real-time speech enhancement where clarity and responsiveness both matter.
- 32 ms micro for scale. Strong reduction at a much higher real-time factor. Process more simultaneous streams per CPU with a small trade-off in suppression strength.
- 21 ms for responsiveness. Maintains strong suppression. Some transient noise may appear in highly dynamic environments.
- 11 ms for ultra-low latency live communication. Effective, with mild artifacts possible in very noisy or reverberant conditions.
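The guidance above can be read as a simple rule: take the highest-latency model that still fits your latency budget, and switch to the micro variant when CPU is the constraint. The sketch below illustrates that rule in Python. The file names come from the table above; `pick_model`, `latency_budget_ms`, and `cpu_constrained` are hypothetical names for illustration and are not part of the Rapidly SDK. It also assumes that a longer latency generally means stronger restoration, which the guidance suggests but does not state as a rule.

```python
# Hypothetical helper for illustration only; not part of the Rapidly SDK.
# Latency (ms) -> file name, copied from the denoise + dereverb table.
DENOISE_DEREVERB_MODELS = {
    96: "speech-denoise-dereverb-96ms.v1.0.rapidly",
    32: "speech-denoise-dereverb-32ms.v1.0.rapidly",
    21: "speech-denoise-dereverb-21ms.v1.0.rapidly",
    11: "speech-denoise-dereverb-11ms.v1.0.rapidly",
}
MICRO_LATENCY_MS = 32
MICRO_MODEL = "speech-denoise-dereverb-micro-32ms.v1.0.rapidly"


def pick_model(latency_budget_ms, cpu_constrained=False):
    """Return the file name of the model that best fits the budget.

    Assumption: among models that fit the budget, the one with the
    longest latency gives the strongest restoration.
    """
    fitting = [ms for ms in DENOISE_DEREVERB_MODELS if ms <= latency_budget_ms]
    if not fitting:
        raise ValueError(
            f"no model fits a {latency_budget_ms} ms budget; "
            f"the lowest available latency is {min(DENOISE_DEREVERB_MODELS)} ms"
        )
    # The micro variant only exists at 32 ms, so it is an option only
    # when the budget allows at least that much latency.
    if cpu_constrained and latency_budget_ms >= MICRO_LATENCY_MS:
        return MICRO_MODEL
    return DENOISE_DEREVERB_MODELS[max(fitting)]
```

For example, `pick_model(50)` selects the 32 ms model, while `pick_model(50, cpu_constrained=True)` selects the 32 ms micro model. With a budget under 32 ms there is no micro variant to fall back on, so the sketch returns the standard model at that latency.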
Speech denoise
Removes background noise from speech. Each model outputs cleaned dialogue and isolated noise as two separate busses.
| Variant | File | Size |
|---|---|---|
| 96 ms | speech-denoise-96ms.v1.0.rapidly | 925 KB |
| 32 ms | speech-denoise-32ms.v1.0.rapidly | 854 KB |
| 32 ms micro | speech-denoise-micro-32ms.v1.0.rapidly | 241 KB |
| 21 ms | speech-denoise-21ms.v1.0.rapidly | 851 KB |
| 11 ms | speech-denoise-11ms.v1.0.rapidly | 615 KB |
Pick by latency:
- 96 ms for high-fidelity noise reduction that preserves natural speech tone. Ideal for recordings or post-processing.
- 32 ms for an excellent balance between speed and quality. Real-time use and production workflows.
- 32 ms micro for scale. Nearly identical quality at a much higher real-time factor. Run more simultaneous streams without losing clarity.
- 21 ms for compact, efficient real-time use. Clean noise reduction with minimal delay. Moderate noise environments.
- 11 ms for ultra-low latency live speech and conferencing. Retains the room's natural reverb, with less aggressive noise reduction.
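The file names in both tables follow one visible pattern: family, an optional `-micro` marker, the latency, then the version and the `.rapidly` extension. The sketch below builds a file name from those parts and rejects combinations that the tables do not list. It is inferred from the listed file names only; `model_filename` and its parameters are hypothetical and are not part of the Rapidly SDK, and the `v1.0` version string is simply the one shown in the tables.

```python
# Hypothetical helper for illustration only; not part of the Rapidly SDK.
# The naming pattern is inferred from the file names in the tables.
FAMILIES = ("speech-denoise", "speech-denoise-dereverb")
LATENCIES_MS = (11, 21, 32, 96)
MICRO_LATENCY_MS = 32   # the micro variant is only listed at 32 ms
VERSION = "v1.0"        # version string as it appears in the tables


def model_filename(family, latency_ms, micro=False):
    """Build a model file name such as 'speech-denoise-32ms.v1.0.rapidly'."""
    if family not in FAMILIES:
        raise ValueError(f"unknown family {family!r}; expected one of {FAMILIES}")
    if latency_ms not in LATENCIES_MS:
        raise ValueError(f"unknown latency {latency_ms} ms; expected one of {LATENCIES_MS}")
    if micro and latency_ms != MICRO_LATENCY_MS:
        raise ValueError("the micro variant is only listed for the 32 ms model")
    size = "-micro" if micro else ""
    return f"{family}{size}-{latency_ms}ms.{VERSION}.rapidly"
```

For example, `model_filename("speech-denoise", 32, micro=True)` produces `speech-denoise-micro-32ms.v1.0.rapidly`, matching the table entry.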