Batch Transcription API Guide


Our Batch Transcription API lets you transcribe pre-recorded audio files. The process is asynchronous: you start a transcription job, and when it completes we either notify you through a webhook or you retrieve the result by polling for its status.

This guide covers the entire workflow, from uploading your audio to retrieving the final transcription.

How It Works: The Workflow

The process consists of three main steps:

  1. Request an Upload URL: You tell our API about the file you want to transcribe (e.g., its format) and get a secure, temporary URL to upload it.
  2. Upload Your Audio: You upload your audio file directly to the provided URL.
  3. Retrieve Your Transcription: Once we've processed the audio, you can get the result in one of two ways:
    • Webhooks: We send the result directly to a callbackUrl you provide. (Recommended)
    • Polling: You periodically check an endpoint for the status of the transcription job.

Step 1: Request an Upload URL

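First, request a pre-signed URL for uploading your audio file. In this request you specify the file's mimeType and, optionally, a callbackUrl for webhook delivery; the response includes the presignedUrl used in Step 2. For details, see the Pre-signed URL reference.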

Step 2: Upload Your Audio File

Next, upload your audio file using an HTTP PUT request to the presignedUrl you received.

Important: The Content-Type header in this PUT request must match the mimeType you specified in Step 1.

Upload example:

TypeScript
async function uploadAudio(
  presignedUrl: string,
  audioFile: Blob,
  mimeType: string,
) {
  try {
    const response = await fetch(presignedUrl, {
      method: "PUT",
      headers: {
        "Content-Type": mimeType,
      },
      body: audioFile,
    });

    if (response.ok) {
      console.log("Upload successful! Transcription is now processing.");
    } else {
      console.error("Upload failed:", response.statusText);
    }
  } catch (error) {
    console.error("An error occurred during upload:", error);
  }
}
C#
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static async Task UploadAudio(string presignedUrl, byte[] audioFile, string mimeType)
{
    using var client = new HttpClient();
    var content = new ByteArrayContent(audioFile);
    content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue(mimeType);

    var response = await client.PutAsync(presignedUrl, content);

    if (response.IsSuccessStatusCode)
    {
        Console.WriteLine("Upload successful! Transcription is now processing.");
    }
    else
    {
        Console.WriteLine($"Upload failed: {response.ReasonPhrase}");
    }
}
Java
import java.net.http.*;
import java.net.URI;

public static void uploadAudio(String presignedUrl, byte[] audioFile, String mimeType) throws Exception {
    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create(presignedUrl))
        .header("Content-Type", mimeType)
        .PUT(HttpRequest.BodyPublishers.ofByteArray(audioFile))
        .build();

    HttpResponse<String> response = client.send(
        request, HttpResponse.BodyHandlers.ofString());

    // Treat any 2xx status as success, matching the other examples.
    if (response.statusCode() >= 200 && response.statusCode() < 300) {
        System.out.println("Upload successful! Transcription is now processing.");
    } else {
        System.out.println("Upload failed: " + response.statusCode());
    }
}
Python
import requests

def upload_audio(presigned_url: str, audio_file: bytes, mime_type: str):
    response = requests.put(
        presigned_url,
        data=audio_file,
        headers={"Content-Type": mime_type},
    )

    if response.ok:
        print("Upload successful! Transcription is now processing.")
    else:
        print(f"Upload failed: {response.reason}")
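
Because the Content-Type header must match the mimeType sent in Step 1, it can help to determine the MIME type once and reuse the same value in both requests. The following is a minimal sketch using Python's standard mimetypes module. The helper name guess_audio_mime_type is hypothetical, and which MIME types the API accepts is not stated in this guide, so treat the guessed value as a starting point and check it against the Pre-signed URL reference.

```python
import mimetypes


def guess_audio_mime_type(file_name: str) -> str:
    """Guess an audio MIME type from a file name (hypothetical helper).

    Raises ValueError when the extension is unknown or does not map to
    an audio/* type, so a wrong Content-Type is caught before uploading.
    """
    mime_type, _encoding = mimetypes.guess_type(file_name)
    if mime_type is None or not mime_type.startswith("audio/"):
        raise ValueError(f"Cannot determine an audio MIME type for {file_name!r}")
    return mime_type
```

The returned string would be passed both as the mimeType in Step 1 and as the mime_type argument of upload_audio above.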

Step 3: Retrieve Your Transcription

Once the upload is complete, our servers will begin processing the audio. You can get the result using one of the following methods.

Option A: Webhooks (Recommended)

If you provided a callbackUrl in Step 1, our server will send an HTTP POST request to your URL once the transcription is complete.

Your endpoint should be prepared to receive the following body:

{
  "requestId": "job-abc-123-xyz",
  "transcription": [
    {
      "speakerId": "Speaker_00",
      "text": "Hello, Doctor.",
      "start": 0.5,
      "end": 2.2
    }
  ],
  "status": "COMPLETED",
  "signature": "invox-medical-generated-signature",
  "createdAt": 1721060645
}
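
As an illustration of consuming this body, the sketch below parses it into typed objects and renders a readable transcript. The class and function names are hypothetical. It also assumes that start and end are offsets in seconds and that createdAt is a Unix timestamp in seconds, neither of which is stated explicitly in this guide.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    speaker_id: str
    text: str
    start: float  # assumed to be seconds from the start of the audio
    end: float    # assumed to be seconds from the start of the audio


@dataclass
class TranscriptionResult:
    request_id: str
    status: str
    signature: str
    created_at: int  # assumed to be a Unix timestamp in seconds
    segments: List[Segment]


def parse_webhook_body(raw_body: str) -> TranscriptionResult:
    """Parse the JSON body of the webhook POST into typed objects."""
    data = json.loads(raw_body)
    segments = [
        Segment(
            speaker_id=item["speakerId"],
            text=item["text"],
            start=float(item["start"]),
            end=float(item["end"]),
        )
        # Tolerate a missing or null transcription list.
        for item in (data.get("transcription") or [])
    ]
    return TranscriptionResult(
        request_id=data["requestId"],
        status=data["status"],
        signature=data["signature"],
        created_at=int(data["createdAt"]),
        segments=segments,
    )


def format_transcript(result: TranscriptionResult) -> str:
    """Render one line per segment, e.g. '[0.5s-2.2s] Speaker_00: ...'."""
    return "\n".join(
        f"[{seg.start:.1f}s-{seg.end:.1f}s] {seg.speaker_id}: {seg.text}"
        for seg in result.segments
    )
```

A handler would typically call parse_webhook_body on the raw request body and check the signature field before trusting the rest of the payload.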

Security Warning: To verify that the request genuinely comes from Invox Medical, you must validate the signature.

The examples below decrypt the signature with your APP_SECRET and check that the resulting plaintext equals your APP_ID and API_KEY joined by a tilde (~):

TypeScript
import * as CryptoJS from "crypto-js";

export const validateSignature = (signature: string): boolean => {
  try {
    const secret = process.env.APP_SECRET!;
    const appId = process.env.APP_ID!;
    const apiKey = process.env.API_KEY!;
    const expectedValue = `${appId}~${apiKey}`;

    const bytes = CryptoJS.AES.decrypt(signature, secret);
    const originalText = bytes.toString(CryptoJS.enc.Utf8);

    if (!originalText) {
      return false;
    }
    return originalText === expectedValue;
  } catch (error) {
    return false;
  }
};
C#
using System;
using System.Security.Cryptography;
using System.Text;

public static bool ValidateSignature(string signature)
{
    try
    {
        var secret = Environment.GetEnvironmentVariable("APP_SECRET")!;
        var appId = Environment.GetEnvironmentVariable("APP_ID")!;
        var apiKey = Environment.GetEnvironmentVariable("API_KEY")!;
        var expectedValue = $"{appId}~{apiKey}";

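        // Decrypt AES-CBC, assuming the signature is Base64(IV + ciphertext) and
        // APP_SECRET is a raw 16-, 24-, or 32-byte key. Note that CryptoJS in
        // passphrase mode (as in the TypeScript example) emits the OpenSSL
        // "Salted__" format instead, which requires deriving the key and IV first.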
        var keyBytes = Encoding.UTF8.GetBytes(secret);
        using var aes = Aes.Create();
        aes.Key = keyBytes;
        aes.Mode = CipherMode.CBC;
        var cipherBytes = Convert.FromBase64String(signature);
        var iv = cipherBytes[..16];
        var encrypted = cipherBytes[16..];
        aes.IV = iv;
        using var decryptor = aes.CreateDecryptor();
        var decrypted = decryptor.TransformFinalBlock(encrypted, 0, encrypted.Length);
        var originalText = Encoding.UTF8.GetString(decrypted);

        return originalText == expectedValue;
    }
    catch
    {
        return false;
    }
}
Java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import javax.crypto.spec.IvParameterSpec;
import java.util.Base64;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public static boolean validateSignature(String signature) {
    try {
        String secret = System.getenv("APP_SECRET");
        String appId = System.getenv("APP_ID");
        String apiKey = System.getenv("API_KEY");
        String expectedValue = appId + "~" + apiKey;

        byte[] cipherBytes = Base64.getDecoder().decode(signature);
        byte[] iv = Arrays.copyOfRange(cipherBytes, 0, 16);
        byte[] encrypted = Arrays.copyOfRange(cipherBytes, 16, cipherBytes.length);

        SecretKeySpec keySpec = new SecretKeySpec(
            secret.getBytes(StandardCharsets.UTF_8), "AES");
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, keySpec, new IvParameterSpec(iv));
        byte[] decrypted = cipher.doFinal(encrypted);
        String originalText = new String(decrypted, StandardCharsets.UTF_8);

        return originalText.equals(expectedValue);
    } catch (Exception e) {
        return false;
    }
}
Python
import os
import base64
from Crypto.Cipher import AES  # provided by the pycryptodome package
from Crypto.Util.Padding import unpad

def validate_signature(signature: str) -> bool:
    try:
        secret = os.environ["APP_SECRET"]
        app_id = os.environ["APP_ID"]
        api_key = os.environ["API_KEY"]
        expected_value = f"{app_id}~{api_key}"

        cipher_bytes = base64.b64decode(signature)
        iv = cipher_bytes[:16]
        encrypted = cipher_bytes[16:]

        cipher = AES.new(
            secret.encode("utf-8"), AES.MODE_CBC, iv)
        decrypted = unpad(
            cipher.decrypt(encrypted), AES.block_size)
        original_text = decrypted.decode("utf-8")

        return original_text == expected_value
    except Exception:
        return False
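
One caveat when porting the TypeScript example: it passes APP_SECRET to CryptoJS as a passphrase string. In that mode CryptoJS uses the OpenSSL salted format, where the Base64 text decodes to the 8-byte marker "Salted__", an 8-byte salt, and then the ciphertext, with the AES-256 key and IV derived from the passphrase and salt by the MD5-based EVP_BytesToKey scheme. The C#, Java, and Python examples instead assume a raw key and a 16-byte IV prefix. This guide does not state which format the service emits, so the following is only a sketch for the case where a decoded signature starts with "Salted__". The function names are hypothetical, and only the standard library is used.

```python
import base64
import hashlib
from typing import Tuple

SALTED_MARKER = b"Salted__"  # 8-byte header of the OpenSSL salted format


def split_salted_signature(signature: str) -> Tuple[bytes, bytes]:
    """Return (salt, ciphertext) from a Base64 OpenSSL-salted payload."""
    raw = base64.b64decode(signature)
    if raw[:8] != SALTED_MARKER:
        raise ValueError("signature is not in the OpenSSL salted format")
    return raw[8:16], raw[16:]


def evp_bytes_to_key(
    passphrase: bytes, salt: bytes, key_len: int = 32, iv_len: int = 16
) -> Tuple[bytes, bytes]:
    """Derive (key, iv) with single-iteration MD5 EVP_BytesToKey."""
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        # Each block hashes the previous block, the passphrase, and the salt.
        block = hashlib.md5(block + passphrase + salt).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]
```

Under this assumption, the derived key and IV would replace the raw secret and the 16-byte prefix in the Python example above: pass them to AES.new(key, AES.MODE_CBC, iv), decrypt the ciphertext, and unpad as before.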

Option B: Polling for Status

If you did not provide a callbackUrl, you can periodically check the job status by polling the Get transcription status endpoint.
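
A polling loop can be sketched as follows. Because the status endpoint's URL and response shape are documented elsewhere, the HTTP call is left to a caller-supplied fetch_status function. The sketch assumes the response is a JSON object with a status field that becomes "COMPLETED" as in the webhook body. Any failure status names are passed in by the caller rather than guessed here, and the interval and timeout defaults are arbitrary illustrative values.

```python
import time
from typing import Any, Callable, Dict, Iterable


def poll_until_complete(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_seconds: float = 5.0,
    max_interval_seconds: float = 60.0,
    timeout_seconds: float = 1800.0,
    failure_statuses: Iterable[str] = (),
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until the job reports COMPLETED.

    The delay between checks doubles after each attempt, capped at
    max_interval_seconds, to avoid hammering the API on long jobs.
    """
    failures = set(failure_statuses)
    deadline = time.monotonic() + timeout_seconds
    delay = interval_seconds
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "COMPLETED":
            return job
        if status in failures:
            raise RuntimeError(f"Transcription job ended with status {status!r}")
        if time.monotonic() + delay > deadline:
            raise TimeoutError("Transcription job did not complete in time")
        sleep(delay)
        delay = min(delay * 2, max_interval_seconds)
```

In practice, fetch_status would wrap an authenticated GET to the status endpoint and return the parsed JSON body.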