Our /audio/concat Robot

Concatenate audio

🤖/audio/concat concatenates several audio files together.

This Robot can concatenate a virtually unlimited number of audio files.

Usage example

If you have a form with 3 file input fields and want to concatenate the uploaded audio files in a specific order, tell Transloadit which file is which by using the name attribute of each input field. Use that attribute as the value of the fields key in the JSON, and set the as key to audio_[[index]]. Transloadit concatenates the files in ascending index order:

  {
    "steps": {
      "concatenated": {
        "robot": "/audio/concat",
        "use": {
          "steps": [
            { "name": ":original", "fields": "first_audio_file", "as": "audio_1" },
            { "name": ":original", "fields": "second_audio_file", "as": "audio_2" },
            { "name": ":original", "fields": "third_audio_file", "as": "audio_3" }
          ]
        },
        "ffmpeg_stack": "v6.0.0"
      }
    }
  }
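The usage example above can also be generated programmatically when the number of input fields varies. The following is a minimal Python sketch, not an official client: the helper name build_concat_step is hypothetical, and it only assembles the JSON shown above from a list of field names given in the desired playback order.

```python
import json


def build_concat_step(field_names, ffmpeg_stack="v6.0.0"):
    """Build a /audio/concat Step that joins uploads in the order given.

    field_names: the "name" attributes of the form's file inputs,
    listed in the order the audio should play back.
    (Hypothetical helper; it only mirrors the JSON from the usage example.)
    """
    use_steps = [
        # "as" carries the index that determines the concatenation order.
        {"name": ":original", "fields": field, "as": f"audio_{index}"}
        for index, field in enumerate(field_names, start=1)
    ]
    return {
        "robot": "/audio/concat",
        "use": {"steps": use_steps},
        "ffmpeg_stack": ffmpeg_stack,
    }


steps = {
    "concatenated": build_concat_step(
        ["first_audio_file", "second_audio_file", "third_audio_file"]
    )
}
print(json.dumps({"steps": steps}, indent=2))
```

Reordering the list passed to the helper changes the indexes, and therefore the order in which the files are joined.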


  • use

    String / Array of Strings / Object required

    Specifies which Step(s) to use as input.

    • You can pick any names for Steps except ":original", which is reserved for user uploads handled by Transloadit.

    • You can provide several Steps as input with arrays:

      "use": [

    💡 That’s likely all you need to know about use, but you can view Advanced use cases.

  • output_meta

    Object / Boolean ⋅ default: {}

    Specifies metadata that is CPU-intensive to calculate and is therefore disabled by default to keep your Assemblies processing fast.

    For images, you can add "has_transparency": true in this object to detect whether the image contains transparent parts, and "dominant_colors": true to extract an array of hexadecimal color codes from the image.

    For videos, you can add "colorspace": true to extract the colorspace of the output video.

    For audio, you can add "mean_volume": true to get a single value representing the mean volume of the audio file.

    You can also set this to false to skip metadata extraction and speed up transcoding.

  • preset

    String ⋅ default: "mp3"

    Performs conversion using pre-configured settings.

    If you specify your own FFmpeg parameters using the Robot's ffmpeg parameter and you have not specified a preset, then the default mp3 preset is not applied. This is to prevent you from having to override each of the MP3 preset's values manually.

    For a list of audio presets, see audio presets.

  • bitrate

    Integer ⋅ default: auto

    Bit rate of the resulting audio file, in bits per second. If not specified, it defaults to the bit rate of the input audio file.

  • sample_rate

    Integer ⋅ default: auto

    Sample rate of the resulting audio file, in Hertz. If not specified, it defaults to the sample rate of the input audio file.

  • audio_fade_seconds

    Float ⋅ default: 1.0

    When used, this adds a fade-in and fade-out effect between the sections of your concatenated audio file. The value is a float, in seconds, so for a fade of 500 milliseconds between sections you would use 0.5. Integer values are also accepted.

    This parameter does not add a fade effect at the beginning or end of the resulting audio file. If you want one, create an additional 🤖/audio/encode Step and use our ffmpeg parameter as shown in this demo.
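To show how the parameters above fit together, here is a small Python sketch that collects them into a dictionary that could be merged into a 🤖/audio/concat Step. The helper name concat_options and its fade_ms argument are hypothetical conveniences, not part of the API. Parameters left unset are omitted so that the documented defaults apply.

```python
def concat_options(preset=None, bitrate=None, sample_rate=None,
                   fade_ms=None, mean_volume=False):
    """Collect optional /audio/concat parameters, skipping unset ones.

    Hypothetical helper: omitted keys fall back to the documented defaults
    (for example, bitrate and sample_rate follow the input file).
    """
    options = {}
    if preset is not None:
        options["preset"] = preset
    if bitrate is not None:
        options["bitrate"] = int(bitrate)          # bits per second
    if sample_rate is not None:
        options["sample_rate"] = int(sample_rate)  # Hertz
    if fade_ms is not None:
        # audio_fade_seconds is expressed in seconds: 500 ms becomes 0.5
        options["audio_fade_seconds"] = fade_ms / 1000
    if mean_volume:
        # Opt in to the more expensive mean-volume metadata.
        options["output_meta"] = {"mean_volume": True}
    return options


options = concat_options(bitrate=128_000, sample_rate=44_100,
                         fade_ms=500, mean_volume=True)
print(options)
```

The example values (128000 bits per second, 44100 Hertz) are illustrative only, not recommendations.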

FFmpeg parameters

  • ffmpeg_stack

    String ⋅ default: "v5.0.0"

    Selects the FFmpeg stack version to use for encoding. These versions reflect real FFmpeg versions. We currently recommend using "v6.0.0".

    Supported values: "v5.0.0", "v6.0.0".

    A full comparison of video presets, per stack, can be found here.

  • ffmpeg

    Object ⋅ default: {}

    A parameter object to be passed to FFmpeg. If a preset is used, the options specified here are merged on top of the preset's options and take precedence over them. For available options, see the FFmpeg documentation.
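The "merged on top of the preset" rule can be pictured as a dictionary update in which the ffmpeg object wins on conflicting keys. The Python sketch below only illustrates that precedence. It assumes a simple key-by-key merge, and the preset values are made up for the example: they are not the real contents of any Transloadit preset. The keys b:a, ar, and ac are standard FFmpeg audio options (bit rate, sample rate, channel count).

```python
def merge_ffmpeg_options(preset_options, ffmpeg_overrides):
    """Return the preset's options with the Step's ffmpeg options on top.

    Illustration only: assumes a simple key-by-key merge.
    """
    merged = dict(preset_options)    # copy so the preset stays untouched
    merged.update(ffmpeg_overrides)  # Step-level options take precedence
    return merged


# Made-up stand-in for a preset; NOT the real values of any preset.
hypothetical_preset = {"b:a": 128000, "ar": 44100, "ac": 2}

# What a Step might pass in its "ffmpeg" object: force mono output.
step_ffmpeg = {"ac": 1}

effective = merge_ffmpeg_options(hypothetical_preset, step_ffmpeg)
print(effective)
```

Only the overridden key changes. Options that the ffmpeg object does not mention keep the values from the preset.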
