Monday, August 25, 2014

PhantomJS for Image Capture

I don't know if you've ever noticed, but when you share one of my comics on Twitter, Facebook, or Pinterest (hint: you should be sharing my comics), the site grabs a static preview image of the comic. The meta tags on the page tell the social media sites which image to use:

<meta property="og:image" content="http://amphibian.com/cell/25.png">

Dynamically Created Image of a Comic
Since the comics are actually SVGs positioned inside <div> tags, it is necessary to create these images somehow. Early on, I created some manually but I really needed an automated method that was integrated with the online editor. Every time I create or update a comic it needs to automatically generate new images.

The solution I came up with was integrating PhantomJS with my web application. You can do a lot of cool stuff with PhantomJS, like automated testing of web applications, but I am just using it for its image capture capabilities right now.

To make an image out of a web page, you just need to make a JavaScript file to control PhantomJS. Here is an example similar to what I use.

var page = require('webpage').create();
page.viewportSize = { width: 1202, height: 5000 };
page.open('http://amphibian.com/basic/25/1', function() {
    page.clipRect = page.evaluate(function() {
        var areaRect = document.getElementById('comicArea').getBoundingClientRect();
        var cellRect = document.getElementById('cell-0').getBoundingClientRect();
        var r = JSON.parse(JSON.stringify(areaRect));
        r.height += (r.top * 2);
        r.top = 0;
        r.left = cellRect.left - 102;
        r.right = cellRect.right + 102;
        r.width = cellRect.width + 204;
        return r;
    });
    page.render('demo.png');
    phantom.exit();
});

Line 1 is pretty standard: it creates a page object. You can read more about the different modules in the PhantomJS documentation.

On line 2, I set the viewport size. This controls how wide my "virtual" web browser client will be. Remember that my comics resize themselves based on the client, so I use the maximum size to make the rendered image very large. The social media sites almost always shrink the images down, and starting with the largest size gives the cleanest picture in the end.

Line 3 is where I make the call to open the page. When the page is opened, the callback function is called. That's where the good stuff happens.

When you want to make an image out of a web page you can either do the whole thing or just part of the page. If you want to do a partial page, you need to set the clipRect property of the page object before calling render. That's what I'm doing starting at line 4, with the call to page.evaluate.

When calling evaluate on the page object, you pass in a function that should be executed as JavaScript in the context of the page. It would be the same as if the page had JavaScript in a script block.

To make the clip rectangle that I want, I need to get the dimensions of 2 parts of the page - the comicArea div and the div of the first cell. I get those on lines 5 and 6, by simply getting the elements by id and then calling their getBoundingClientRect methods. However, the rectangle I want to return is neither of these exactly. I need to make some adjustments.

On line 7 I do something that you might find a bit strange. I create a new variable r by parsing the JSON string produced by stringifying the areaRect object. Why am I doing this? Because the rectangles returned from the calls to getBoundingClientRect are immutable and I want to make changes. By dumping the object to a JSON string and then reading that string into a new object, I get a mutable copy.
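Here is a small sketch of that copy trick that you can run with plain Node, no PhantomJS required. The frozen object is only a stand-in for the read-only rectangle (an assumption for illustration; the real one comes from getBoundingClientRect), and the numbers are made up.

```javascript
// A frozen object stands in for the read-only rectangle (illustration only).
var areaRect = Object.freeze({ top: 20, left: 0, width: 1202, height: 400 });

// Round-trip through JSON to get a plain, writable copy.
var r = JSON.parse(JSON.stringify(areaRect));

// The copy can be changed freely; the original is left alone.
r.height += (r.top * 2);
r.top = 0;

console.log('copy:', r.top, r.height);
console.log('original:', areaRect.top, areaRect.height);
```

The copy ends up with a top of 0 and a height of 440, while the original still has its top of 20 and height of 400.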

Lines 8 through 12 are the adjustments to the clip rectangle I want to return. I change the top and height of the rectangle around the comicArea and then adjust the left, right, and width values using the corresponding values from the cell as a starting point.
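If the arithmetic is easier to follow with numbers, here is the same calculation pulled out into a plain function that runs in Node. The function name computeClipRect and the sample rectangles are made up for this example; only the 102-pixel margin comes from the script above.

```javascript
// Hypothetical helper: the same arithmetic as lines 7 through 12 of the
// script, as a plain function that can be tried with made-up numbers.
function computeClipRect(areaRect, cellRect, margin) {
    var r = JSON.parse(JSON.stringify(areaRect));
    r.height += (r.top * 2);   // leave as much space below the comic as above it
    r.top = 0;                 // start the clip at the top of the page
    r.left = cellRect.left - margin;
    r.right = cellRect.right + margin;
    r.width = cellRect.width + (margin * 2);
    return r;
}

// Made-up rectangles, not measurements from the real page.
var area = { top: 50, left: 0, right: 1202, width: 1202, height: 600 };
var cell = { top: 60, left: 102, right: 1100, width: 998, height: 580 };

var clip = computeClipRect(area, cell, 102);
console.log(clip);
```

With these sample numbers, the clip rectangle starts at the top-left corner of the page and comes out 1202 pixels wide and 700 pixels tall.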

Try it yourself and you should get an image of a single cell of one of my comics. Or change some stuff around and try getting images of other pages.

The complete solution in my web application is slightly more complicated, because I have to dynamically generate the JavaScript for PhantomJS, call PhantomJS from Node, read the resulting image into the database, and then clean up the temporary files. But we can talk about that some other time...
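In the meantime, here is a rough sketch of what those steps could look like in Node. This is not the code behind the site. The helper names buildScript and captureImage are made up, it assumes a phantomjs executable is on the PATH, it leaves out the clipRect logic, and it stops at handing back the image as a Buffer instead of touching a database.

```javascript
var execFile = require('child_process').execFile;
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: builds a small PhantomJS script as a string.
// JSON.stringify quotes the URL and file name safely for the generated code.
function buildScript(url, imageFile) {
    return [
        "var page = require('webpage').create();",
        "page.viewportSize = { width: 1202, height: 5000 };",
        "page.open(" + JSON.stringify(url) + ", function(status) {",
        "    if (status === 'success') {",
        "        page.render(" + JSON.stringify(imageFile) + ");",
        "    }",
        "    phantom.exit(status === 'success' ? 0 : 1);",
        "});"
    ].join('\n');
}

// Hypothetical helper: writes the script, runs PhantomJS, reads the image,
// and removes both temporary files. Assumes `phantomjs` is on the PATH.
function captureImage(url, callback) {
    var base = path.join(os.tmpdir(), 'capture-' + process.pid + '-' + Date.now());
    var scriptFile = base + '.js';
    var imageFile = base + '.png';

    fs.writeFileSync(scriptFile, buildScript(url, imageFile));
    execFile('phantomjs', [scriptFile], function(err) {
        var image = null;
        if (!err) {
            try { image = fs.readFileSync(imageFile); } catch (e) { err = e; }
        }
        // Clean up whether or not the capture worked.
        [scriptFile, imageFile].forEach(function(f) {
            try { fs.unlinkSync(f); } catch (e) { /* file may not exist */ }
        });
        callback(err, image);  // image is a Buffer when the capture succeeded
    });
}
```

Calling captureImage('http://amphibian.com/basic/25/1', function(err, image) { ... }) would pass the PNG bytes to the callback if PhantomJS ran successfully, and an error otherwise.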

Amphibian.com comic for August 25, 2014