Switching to official S3 CLI for enhanced file exporting
Since we started in 2009, Amazon S3 has been our primary way of doing file exports. At that time, the best Amazon CLI tool we could find was Tim Kay's aws utility. It was a low-dependency, fully-featured toolkit to do much of our AWS heavy lifting.
His tool has served us well for 5 years, and we have exported close to a petabyte of files with it. But as Amazon evolves, requirements change. For instance, Amazon's Frankfurt and China datacenters do not support signature version 2. While Tim Kay has done a great job keeping up with changes like these, we can't expect him to release updates as soon as Amazon makes an announcement, seeing as this is a project he runs for free in his spare time. Since Amazon now offers an official CLI tool, we decided to make the switch last week.
We maintain many tests to guard against breaking changes when we perform this kind of heart surgery. Unfortunately, some use cases were not covered, and switching our S3 engine turned into somewhat of a bumpy ride.
If, for instance, you had non-US buckets and didn't explicitly specify a region, we would have to run a GetBucketLocation request on the bucket and retry the export (or import) ourselves. So far so good. However, if you had set up a dedicated Transloadit IAM user that could only perform Puts and Lists (as we recommend), we had neither this permission nor the regional reroutes, and therefore the entire export would fail.
We have an inefficient workaround in production to prevent these failures, but we have also updated our documentation to reflect that we need the GetBucketLocation permission. We recommend granting this permission to your Transloadit IAM user.
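As a sketch, an IAM policy for such a dedicated user might look like the following. The bucket name is a placeholder, and the exact action list should be adapted to your setup:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
```

Note that GetBucketLocation and ListBucket apply to the bucket ARN itself, while PutObject applies to the objects inside it, which is why both resource forms are listed.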
Our customers also reported a few other issues that were a result of subtle differences between the underlying tools, and of what we believe to be a UTF-8 bug in the new tool.
Update Feb 7, 2015: The bug has been confirmed. As a workaround, we are now escaping non-ASCII characters to Unicode escape sequences, so that Renan Gonçalves will read: Renan Gon\u00E7alves. You will probably want to unescape this on your end.
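For example, if you consume these values in Python, one way to undo the escaping is a sketch like this (the variable names are ours, not part of any API):

```python
# The value as it arrives: a plain ASCII string containing
# literal \uXXXX escape sequences.
escaped = "Renan Gon\\u00E7alves"

# Decode the \uXXXX sequences back into their Unicode characters.
unescaped = escaped.encode("ascii").decode("unicode_escape")
print(unescaped)  # Renan Gonçalves
```

If the value is embedded in JSON, a regular JSON parser will perform this unescaping for you automatically.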
While over 99% of our exports kept working throughout this period, we are very sorry that we had not written more tests covering these special cases.
Moving forward
We have fixes and workarounds for all these problems in production now. We have also written system tests covering regional reroutes and other special cases, to avoid regressions in future upgrades or changes to our stack.
Additionally, we have paused the 24h limit on our temporary storage, so that affected Assemblies can still be replayed beyond that window. To give affected customers some time to deal with this, we will turn auto-delete back on in 7 days.
Customers that were impacted in a big way have been given a serious discount on this month's invoice. We should have reached out to you already, but be sure to let us know if we didn't.
We apologize for the trouble caused, but are looking forward to running the official S3 tools in production. We hope to have a safe upgrade path to new S3 features and datacenters as we move forward.