Parallel processing

Running many Rapidly engines at once to process audio at scale, in the cloud or on multi-core hardware.

Parallel processing runs multiple Rapidly engines at the same time, one per audio stream or chunk. The pattern is essential for cloud workloads where many streams or files are processed concurrently, and for multi-core hardware that has more inputs than a single engine can serve.

Where parallel processing fits

Cloud backends

Processing audio uploads, transcription pipelines, podcast post-production, or live streams from many users at once.

Multi-guest recording servers

Processing every guest's stream in real time before mixing.

Batch file processing

Splitting a long file into chunks and processing the chunks across worker threads.

Multi-mic hardware

Running one engine per microphone input on a recorder or array device.

A worked example: long-file chunking

A customer uploads a long audio file. The backend splits it into N chunks (typically 20 to 30 seconds each), processes each chunk on its own worker with its own engine, and concatenates the cleaned chunks in their original order. Because the chunks are processed independently, throughput scales close to linearly with the number of workers compared with a single-engine sequential pass.
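The chunking flow above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Rapidly API: `PlaceholderEngine`, its `process` method, and the helper names are hypothetical stand-ins, and audio is modeled as a plain list of samples. Substitute the real engine calls from the Python binding.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

CHUNK_SECONDS = 30  # upper end of the 20 to 30 second range


class PlaceholderEngine:
    """Hypothetical stand-in for an engine; NOT the real Rapidly API."""

    def process(self, samples):
        # A real engine would return cleaned audio; this returns a copy.
        return list(samples)


_local = threading.local()


def _engine():
    # One engine per worker thread, created lazily on first use.
    if not hasattr(_local, "engine"):
        _local.engine = PlaceholderEngine()
    return _local.engine


def split_into_chunks(samples, sample_rate, chunk_seconds=CHUNK_SECONDS):
    # Fixed-length chunks; the last chunk may be shorter.
    size = sample_rate * chunk_seconds
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def process_file(samples, sample_rate, workers=4):
    chunks = split_into_chunks(samples, sample_rate)
    out = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the cleaned chunks
        # can be concatenated directly.
        for cleaned in pool.map(lambda c: _engine().process(c), chunks):
            out.extend(cleaned)
    return out
```

Keeping the engine in thread-local storage means each worker thread creates one engine and reuses it for every chunk it handles. Whether threads give real parallelism in Python depends on the binding releasing the interpreter lock while it processes audio, which is assumed here; if it does not, a process pool is the usual alternative.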

Memory and threading

Each engine is independent and loads its own copy of the model, so N engines hold N copies of the model in memory. Each engine runs on its own thread. See Performance for sizing and One vs multiple engines for the threading rules.
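Since memory grows with the engine count, a rough upper bound on the number of engines is simple arithmetic. The sketch below is a back-of-envelope helper, not part of any binding: `max_engines` is a hypothetical name, the model size is a value you would measure yourself, and capping at one engine per core is an assumption rather than a documented rule. The Performance page remains the reference for real figures.

```python
def max_engines(available_bytes, model_bytes, cpu_cores):
    """Estimate how many engines fit, given one model copy per engine.

    Caps the count at one engine per core (an assumption, since each
    engine runs on its own thread).
    """
    if model_bytes <= 0:
        raise ValueError("model_bytes must be positive")
    by_memory = available_bytes // model_bytes
    return max(0, min(by_memory, cpu_cores))


# Illustrative numbers only: 1 GiB free, a 512 MiB model, 4 cores.
GIB = 1024 ** 3
MIB = 1024 ** 2
print(max_engines(1 * GIB, 512 * MIB, 4))  # memory-bound: prints 2
```

The estimate ignores per-engine working buffers and the memory used by the audio itself, so treat the result as a ceiling rather than a target.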

For the API calls, see Python (the canonical batch-pipeline binding) or C / C++.