Our /s3/import Robot

Import files from Amazon S3

🤖/s3/import imports whole directories of files from your S3 bucket.

If you are new to Amazon S3, see our tutorial on using your own S3 bucket.

The URL to the result file in your S3 bucket will be returned in the Assembly Status JSON.

Use DNS-compliant bucket names. Your bucket name must be DNS-compliant and must not contain uppercase letters. Any non-alphanumeric characters in the file names will be replaced with an underscore, and spaces will be replaced with dashes. If your existing S3 bucket contains uppercase letters or is otherwise not DNS-compliant, rewrite the result URLs using the Robot’s url_prefix parameter.

Limit access

You will also need to add permissions to your bucket so that Transloadit can access it properly. Here is an example IAM policy that you can use. Following the principle of least privilege, it contains the minimum required permissions to import files from your S3 bucket using Transloadit. You may require more permissions (especially viewing permissions) depending on your application.

Replace {BUCKET_NAME} in the Sid and Resource values with the name of your bucket. Note that this policy grants the minimum required permissions to all your users. We advise you to create a separate Amazon IAM user and use its User ARN (which can be found in the "Summary" tab of a user here) for the Principal value. More information about this can be found here.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowTransloaditToImportFilesIn{BUCKET_NAME}Bucket",
      "Effect": "Allow",
      "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::{BUCKET_NAME}", "arn:aws:s3:::{BUCKET_NAME}/*"]
    }
  ]
}

The Sid value is just an identifier for you to recognize the rule later. You can name it anything you like.

The Resource value lists two ARNs because bucket-level actions such as ListBucket require permissions on the bucket itself, while object-level actions require permissions on the objects in the bucket. The ARN that targets the objects ends with a trailing slash and an asterisk, whereas the ARN that targets the bucket omits both.

To build proper result URLs, we need to know the region in which your S3 bucket resides, and for this we require the GetBucketLocation permission. Looking up your bucket's region this way also slows down your Assemblies. To make this faster and to remove the need for the GetBucketLocation permission, we have added the bucket_region parameter to the /s3/store and /s3/import Robots. We recommend setting it at all times.
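As a sketch of what that means for the policy: if bucket_region is always set (for example through your Template Credentials), the example policy above could presumably be reduced by dropping the GetBucketLocation action. The variant below is derived only from the example policy and the statement above, not from a tested setup, so treat it as an assumption and add permissions back if your imports fail.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowTransloaditToImportFilesIn{BUCKET_NAME}Bucket",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::{BUCKET_NAME}", "arn:aws:s3:::{BUCKET_NAME}/*"]
    }
  ]
}
```

As before, {BUCKET_NAME} is a placeholder for the name of your bucket.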

Please keep in mind that if you use bucket encryption you may also need to add "sts:*" and "kms:*" to the bucket policy. Please read here and here in case you run into trouble with our example bucket policy.

Keep your credentials safe. Since you need to provide credentials to this Robot, please always use this together with Templates and/or Template Credentials, so that you can never leak any secrets while transmitting your Assembly Instructions.

Note: Transloadit supports file sizes up to 200 GB. If you require a higher limit for your application, please get in touch.

Usage example

Import files from the path/to/files directory and its subdirectories:

{
  "steps": {
    "imported": {
      "robot": "/s3/import",
      "credentials": "YOUR_AWS_CREDENTIALS",
      "path": "path/to/files/",
      "recursive": true
    }
  }
}

Parameters

  • ignore_errors

    Array of Strings / Boolean ⋅ default: []

    Possible array members are "meta" and "import".

    You might see an error when trying to extract metadata from your imported files. This happens, for example, for files with a size of zero bytes. Including "meta" in the array will cause the Robot to not stop the import (and the entire Assembly) when that happens.

    Including "import" in the array will likewise keep the Robot (and the entire Assembly) from stopping when an import error occurs.

    To keep backwards compatibility, setting this parameter to true will set it to ["meta", "import"] internally.

  • credentials

    String ⋅ required

    Please create your associated Template Credentials in your Transloadit account and use the name of your Template Credentials as this parameter's value. They will contain the values for your S3 bucket, Key, Secret and Bucket region.

    While we recommend using Template Credentials at all times, some use cases demand dynamic credentials, for which Template Credentials are too unwieldy because of their static nature. If you have this requirement, feel free to use the following parameters instead: "bucket", "bucket_region" (for example: "us-east-1" or "eu-west-2"), "key", "secret".

  • path

    String / Array of Strings ⋅ required

    The path in your bucket to the specific file or directory. If the path points to a file, only this file will be imported. For example: images/avatar.jpg.

    If it points to a directory, indicated by a trailing slash (/), then all files that are direct descendants of this directory will be imported. For example: images/.

    Directories are not imported recursively. If you want to import files from subdirectories and sub-subdirectories, enable the recursive parameter.

    If you want to import all files from the root directory, please use / as the value here. In this case, make sure all your objects belong to a path. If you have objects in the root of your bucket that aren't prefixed with /, you'll receive an error: A client error (NoSuchKey) occurred when calling the GetObject operation: The specified key does not exist.

    You can also use an array of path strings here to import multiple paths in the same Robot's Step.

  • recursive

    Boolean ⋅ default: false

    Setting this to true will enable importing files from subdirectories and sub-subdirectories (etc.) of the given path.

    Recursive imports can match a large number of files, so please use the pagination parameters page_number and files_per_page to import them in manageable batches.

  • page_number

    Integer ⋅ default: 1

    The pagination page number. For now, in order to not break backwards compatibility in non-recursive imports, this only works when recursive is set to true.

    When doing big imports, make sure no files are added to or removed from your path by other scripts while the import runs, otherwise the pagination might return inconsistent results.

  • files_per_page

    Integer ⋅ default: 1000

    The pagination page size. This only works when recursive is true for now, in order to not break backwards compatibility in non-recursive imports.
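To illustrate how the parameters above combine, here is a minimal sketch with two Steps. The Step names second_page and selected_files and the path images/logos/ are hypothetical, and YOUR_AWS_CREDENTIALS stands for the name of your own Template Credentials. The first Step asks for page 2 of a recursive import with 500 files per page, which should presumably yield the second batch of up to 500 files. The second Step imports one file and one directory through an array of paths and keeps going when metadata extraction fails.

```json
{
  "steps": {
    "second_page": {
      "robot": "/s3/import",
      "credentials": "YOUR_AWS_CREDENTIALS",
      "path": "path/to/files/",
      "recursive": true,
      "page_number": 2,
      "files_per_page": 500
    },
    "selected_files": {
      "robot": "/s3/import",
      "credentials": "YOUR_AWS_CREDENTIALS",
      "path": ["images/avatar.jpg", "images/logos/"],
      "ignore_errors": ["meta"]
    }
  }
}
```

Since pagination only takes effect when recursive is true, the second Step leaves page_number and files_per_page out.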
