Extract thumbnail images from documents
🤖/document/thumbs generates an image for each page of a PDF file, or a single animated GIF that loops through all pages.
Things to keep in mind
- If you convert a multi-page PDF file into several images, the result images are sorted in page order: the first image is the thumbnail of the first page, and so on.
- You can also check the meta.thumb_index key of each result image to find out which page it corresponds to. Keep in mind that these thumb indices start at 0, not at 1.
Usage example
Convert all pages of a PDF document into separate 200px-wide images:
{
  "steps": {
    "thumbnailed": {
      "use": ":original",
      "robot": "/document/thumbs",
      "width": 200,
      "resize_strategy": "fit",
      "trim_whitespace": false,
      "imagemagick_stack": "v3.0.1"
    }
  }
}
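The same Step can be adapted to produce a single animated GIF instead of separate images. The following is a minimal sketch based on the format and delay parameters described under Parameters; the Step name "animated" is an arbitrary choice, and the 300px width is only an illustrative value:

```json
{
  "steps": {
    "animated": {
      "use": ":original",
      "robot": "/document/thumbs",
      "format": "gif",
      "delay": 100,
      "width": 300,
      "resize_strategy": "fit",
      "imagemagick_stack": "v3.0.1"
    }
  }
}
```

With delay set to 100, each page should stay visible for about 1 second before the next frame is shown.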
Parameters
- use
  String / Array of Strings / Object ⋅ required
  Specifies which Step(s) to use as input.
  - You can pick any names for Steps except ":original" (reserved for user uploads handled by Transloadit).
  - You can provide several Steps as input with arrays: "use": [ ":original", "encoded", "resized" ]
  💡 That’s likely all you need to know about use, but you can view Advanced use cases.
- page
  Integer / Null ⋅ default: null
  The PDF page that you want to convert to an image. The default value of null converts all pages into images.
- format
  String ⋅ default: "png"
  The format of the extracted image(s). Supported values are "jpeg", "jpg", "gif" and "png".
  If you specify the value "gif", then an animated GIF cycling through all pages is created. Please check out this demo to learn more about this.
- delay
  Integer / Null ⋅ default: null
  If your output format is "gif", this parameter sets the number of hundredths of a second to pass before the next frame is shown in the animation. For example, set this to 100 to allow 1 second to pass between the frames of the animated GIF.
  If your output format is not "gif", this parameter has no effect.
- width
  Integer (1-5000) ⋅ default: auto
  Width of the new image, in pixels. If not specified, defaults to the width of the input image.
- height
  Integer (1-5000) ⋅ default: auto
  Height of the new image, in pixels. If not specified, defaults to the height of the input image.
- resize_strategy
  String ⋅ default: "pad"
  One of the available resize strategies.
- background
  String ⋅ default: "#FFFFFF"
- alpha
  String ⋅ default: ""
  Changes how the alpha channel of the resulting image should work. Valid values are "Set" to enable transparency and "Remove" to remove transparency.
  For a list of all valid values, please check the ImageMagick documentation.
- density
  String / Null ⋅ default: null
  While in-memory quality and file format depth specify the color resolution, the density of an image is its spatial resolution. Density is measured in pixels per inch and defines how far apart (or how big) the individual pixels are. It determines the size of the image in real-world terms when displayed on devices or printed.
  You can set this value to a specific width or in the format widthxheight.
  If your converted image has a low resolution, please try using the density parameter to resolve that.
- antialiasing
  Boolean ⋅ default: false
  Controls whether or not antialiasing is used to remove jagged edges from text or images in a document.
- colorspace
  String ⋅ default: ""
  Sets the image colorspace. For details about the available values, see the ImageMagick documentation.
  Please note that if you were using "RGB", we recommend using "sRGB". ImageMagick might try to find the most efficient colorspace based on the colors of an image, and default to e.g. "Gray". To force colors, you might then have to use this parameter.
- trim_whitespace
  Boolean ⋅ default: true
  Determines whether additional whitespace around the PDF should be trimmed away before it is converted to an image. If you set this to true, only the real PDF page contents will be shown in the image.
  If you need to reflect the PDF's dimensions in your image, it is generally a good idea to set this to false.
- pdf_use_cropbox
  Boolean ⋅ default: true
  Some PDF documents lie about their dimensions. For instance, they'll say they are landscape, but when opened in decent desktop readers, they're really in portrait mode. This can happen if the document has a cropbox defined. When this option is enabled (the default), the cropbox takes precedence in determining the dimensions of the resulting thumbnails.
- output_meta
  Object / Boolean ⋅ default: {}
  Generally, this parameter allows you to specify a set of metadata that is more CPU-intensive to calculate, and thus is disabled by default to keep your Assemblies processing fast.
  This Robot only supports the default value of {} (meaning all metadata will be extracted) and false. A value of false means that only width, height, size and thumb_index will be extracted for the result images, which also provides a significant performance boost for documents with many pages.
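To illustrate how several of the parameters above combine, here is a sketch that renders only one page as a JPEG at a higher density and skips the more expensive metadata extraction. It assumes that page numbers start at 1 (unlike meta.thumb_index, which starts at 0) and that "150" is a suitable density for your document; treat both as assumptions to verify against your own files. The Step name "first_page" and the 800px width are arbitrary choices:

```json
{
  "steps": {
    "first_page": {
      "use": ":original",
      "robot": "/document/thumbs",
      "page": 1,
      "format": "jpg",
      "width": 800,
      "resize_strategy": "fit",
      "density": "150",
      "trim_whitespace": false,
      "output_meta": false
    }
  }
}
```

Setting trim_whitespace to false keeps the page's own dimensions, and output_meta set to false limits the extracted metadata to width, height, size and thumb_index.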
ImageMagick parameters
- imagemagick_stack
  String ⋅ default: "v2.0.10"
  Selects the ImageMagick stack version to use for encoding. These versions do not reflect any real ImageMagick versions; they reflect our own internal (non-semantic) versioning for our custom ImageMagick builds. We currently recommend using "v3.0.1".
  Supported values: "v2.0.10", "v3.0.1".
  A full comparison of supported formats, per stack, can be found here.
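As a sketch of how the stack version can be pinned alongside the ImageMagick-backed options documented above, the following Step uses the recommended "v3.0.1" stack, forces the "sRGB" colorspace, and removes transparency. The Step name "flattened" and the 400px width are illustrative choices, not values prescribed by the Robot:

```json
{
  "steps": {
    "flattened": {
      "use": ":original",
      "robot": "/document/thumbs",
      "width": 400,
      "resize_strategy": "fit",
      "colorspace": "sRGB",
      "alpha": "Remove",
      "background": "#FFFFFF",
      "imagemagick_stack": "v3.0.1"
    }
  }
}
```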
Demos
- Convert all pages of a document into an animated GIF
- Convert all pages of a document into separate images
- Convert the first page of a document into an image
Related blog posts
- Introducing new document-to-image conversion Robot (November 1, 2012)
- Convert PDF files into animated GIFs - animation delays now supported (December 12, 2012)
- Transloadit now offers SVG support for images (March 23, 2013)
- Adding `density` parameter to our /document/thumbs Robot (April 17, 2013)
- Upgrade all the things! (June 12, 2013)
- Kicking Transloadit into gear for the new year (February 1, 2015)
- Enhancing digital access to Cambridge's academic content (January 16, 2018)
- New pricing model for future Transloadit customers (February 7, 2018)
- Tutorial: using /video/merge to develop video slideshows (June 14, 2019)
- Convert Markdown files to HTML or PDF in seconds (April 19, 2021)