Overview

Rapidly is a real-time audio separation SDK. The engine runs entirely on the end-user's device across Linux, Windows, macOS, iOS, and Android, with idiomatic bindings for C / C++, Python, Swift, and Kotlin. Inference latency starts at 11 ms.

Vocabulary

Three terms used across these docs.

Rapidly SDK

The package you install.

Inference

What the SDK does at runtime, on the end-user's CPU.

Engine

One running instance processing a single audio stream on a single CPU thread.

If you need more than one engine, see One vs multiple engines and Parallel processing.
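
The one-engine-per-stream model described above can be sketched as follows. `PlaceholderEngine` is a hypothetical stand-in written for this example, not the Rapidly Python API, and it only passes audio through. The point is the ownership pattern: each stream gets its own engine, and each engine is driven by exactly one thread.

```python
from concurrent.futures import ThreadPoolExecutor


class PlaceholderEngine:
    """Hypothetical stand-in for an SDK engine; NOT the real Rapidly API."""

    def __init__(self, stream_id: str) -> None:
        self.stream_id = stream_id
        self.frames_processed = 0  # per-engine state, never shared across streams

    def process(self, frame: list[float]) -> list[float]:
        # A real engine would run inference here; this placeholder passes audio through.
        self.frames_processed += 1
        return list(frame)


def run_stream(stream_id: str, frames: list[list[float]]) -> tuple[str, int]:
    # One engine per stream, created and used on the same worker thread.
    engine = PlaceholderEngine(stream_id)
    for frame in frames:
        engine.process(frame)
    return stream_id, engine.frames_processed


def run_streams(streams: dict[str, list[list[float]]]) -> dict[str, int]:
    # One worker per stream, so no engine is ever touched by two threads.
    with ThreadPoolExecutor(max_workers=max(1, len(streams))) as pool:
        return dict(pool.map(lambda item: run_stream(*item), streams.items()))
```

Calling `run_streams` with two named streams returns the number of frames each stream's engine handled, with no state shared between them.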

Key strengths

On-device inference

Audio never leaves the device. The full engine runs locally, with no cloud round-trip for inference.

Real-time, low latency

Latency ranges from 11 ms to 96 ms across the model catalog. Processing runs up to 125x faster than real time on a single CPU core.

Five platforms

Linux, Windows, macOS, iOS, and Android. The same engine runs everywhere.

Four language bindings

C / C++, Python, Swift, and Kotlin. Distributed via the standard package manager for each language.

Hardware-accelerated math

Apple Accelerate (vDSP), Intel IPP, and ARM NEON paths selected automatically per platform.

Small footprint

Compact models from 241 KB. Suits CPU-constrained embedded and mobile targets.
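
To make the latency and throughput figures above concrete, here is a small arithmetic sketch. It reads the 125x figure as audio being processed 125 times faster than it plays back, and it uses 48 kHz purely as an illustrative sample rate, which these docs do not specify. The function names are made up for this example.

```python
def processing_time_ms(audio_seconds: float, speedup: float) -> float:
    """Wall-clock time in ms to process `audio_seconds` of audio at a given speedup."""
    return audio_seconds * 1000.0 / speedup


def latency_in_samples(latency_ms: float, sample_rate_hz: int) -> int:
    """Convert a latency in milliseconds to a sample count at `sample_rate_hz`."""
    return round(latency_ms * sample_rate_hz / 1000.0)


if __name__ == "__main__":
    # 48 kHz is an assumed sample rate, chosen only for illustration.
    print(processing_time_ms(1.0, 125))     # one second of audio at 125x
    print(latency_in_samples(11, 48_000))   # lowest latency in the catalog
    print(latency_in_samples(96, 48_000))   # highest latency in the catalog
```

Under these assumptions, one second of audio takes 8 ms to process, and the 11 ms and 96 ms latencies correspond to 528 and 4608 samples respectively.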

Pick your binding

C / C++

The most direct path. Desktop, server, embedded, or any native integration.

Python

Servers, ML pipelines, and notebooks.

Swift

iOS and macOS apps via Swift Package Manager.

Kotlin

Android apps via Maven Central.

Supported platforms

Platform	Architectures	Distribution
Linux	x64, arm64	Shared library in `bin/linux-x64/` and `bin/linux-arm64/`
Windows	x64, x86	DLL plus import library in `bin/windows-x64/` and `bin/windows-x86/`
macOS	Universal (arm64 + x86_64)	`dylib` in `bin/macos/`, or the signed `RapidlyEngine.xcframework`, or Swift Package Manager
iOS	arm64 device and Simulator	`RapidlyEngine.xcframework`, or Swift Package Manager
Android	arm64-v8a	Maven Central (`io.rapidly:rapidly-sdk:1.0`), or `.aar` from the GitHub Release
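
As a rough illustration of the desktop rows in the table above, the sketch below maps the host reported by Python's `platform` module to the matching directory in the release. The directory names come from the table. The function name and the list of machine-string aliases are assumptions made for this example, and the library file names inside each directory are not covered here.

```python
import platform
from typing import Optional

# Machine strings reported by platform.machine(), normalised to the
# architecture names used in the table. This alias list is an assumption.
_ARCH_ALIASES = {
    "x86_64": "x64", "amd64": "x64",
    "aarch64": "arm64", "arm64": "arm64",
    "x86": "x86", "i386": "x86", "i686": "x86",
}

# Directories listed in the Supported platforms table.
_BINARY_DIRS = {
    ("Linux", "x64"): "bin/linux-x64/",
    ("Linux", "arm64"): "bin/linux-arm64/",
    ("Windows", "x64"): "bin/windows-x64/",
    ("Windows", "x86"): "bin/windows-x86/",
}


def binary_dir(system: str, machine: str) -> Optional[str]:
    """Return the release directory for a host, or None if the table has no row for it."""
    if system == "Darwin":
        return "bin/macos/"  # universal binary covers arm64 and x86_64
    arch = _ARCH_ALIASES.get(machine.lower())
    return _BINARY_DIRS.get((system, arch))


if __name__ == "__main__":
    print(binary_dir(platform.system(), platform.machine()))
```

Returning `None` for unlisted hosts keeps the caller responsible for deciding what to do on a platform the table does not mention.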

Minimum requirements

iOS 14 or later, iPadOS 14 or later
macOS 11 or later
Android minSdk 26 (Android 8.0)

Linux and Windows have no hard minimum version; any reasonably modern distribution or release should work.

What the SDK includes

Component	Purpose
Native engine binary	The cross-platform core that loads models and runs inference. One binary per platform.
Public C header (`RapidlyEngine.h`)	The stable API surface that every binding wraps.
Language bindings	Idiomatic wrappers for C / C++, Python, Swift, and Kotlin.
Pre-trained models	`.rapidly` files for speech denoising and dereverberation, in multiple latency variants. See Models.
Examples	Working integrations for file processing and embedded targets, shipped in the GitHub Release.

Distribution

The SDK ships as a GitHub Release. You can also pull the bindings directly from each language's package manager:

Channel	What's there
GitHub Release	Pre-built binaries, the `xcframework`, the `.aar`, the public header, and the bindings source.
PyPI	`pip install rapidly` for the Python binding.
Swift Package Manager	`https://github.com/rapidly-labs/rapidly-sdk` for the Swift binding.
Maven Central	`io.rapidly:rapidly-sdk:1.0` for the Kotlin binding.
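
After installing from PyPI, one way to confirm which version is present is to query the package metadata with the standard library. This sketch assumes only that the distribution name matches the `pip install rapidly` command in the table; it does not import or call the SDK itself, and `installed_version` is a helper name invented for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "rapidly" is the distribution name taken from the pip command above.
    found = installed_version("rapidly")
    print(found if found is not None else "rapidly is not installed")
```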

Hardware acceleration

The engine selects the fastest available math path per platform at runtime:

Platform	Acceleration
Apple (macOS, iOS)	Accelerate framework (vDSP)
Intel desktop and server	Intel Performance Primitives (IPP)
ARM (Linux arm64, Android, Apple Silicon)	NEON intrinsics
Other targets	Optimised C++ fallback

No configuration is required.

Coming soon

WebAssembly. WASM is on the roadmap to enable in-browser audio processing without a server round-trip. Target use cases: web apps, SaaS products, browser-based conferencing, live streaming, and smart TVs running web-based platforms like Samsung Tizen or LG webOS.

NPU offload. Hardware-accelerated inference on chips with dedicated neural acceleration units.

Licensing model

The SDK enforces license entitlements locally. No network access is required. Without a covering license, the engine still runs and loads models, but its output is watermarked. See Pricing for licensing options.