Sunday, May 4, 2014

Late for the (Multi)party

Last month I started making a legitimate web application with Node and Express. It's been a generally positive experience, but because I'm using the newest version of Express (4.1) I'm finding a lot of outdated information on the web about how things are accomplished. In version 4.x, Express no longer depends on Connect and most of the previously included middleware is now in separate repos. Overall, I think this is a positive change that adds more flexibility to the system.

Today I wanted to add file uploads to my application. I needed a way to handle "multipart/form-data" POSTs, and most of the common examples point to the bodyParser middleware. That middleware is no longer recommended, nor has it been included by default since 4.x, and searching for it led me to a different solution that was more appropriate for my particular needs.

There is a little confusion out there about body-parsing middleware. Prior versions of Express included a middleware called "bodyParser" which could handle JSON, urlencoded, and multipart request bodies. It is deprecated now in favor of the similarly named "body-parser" middleware. So what's different about them, besides a hyphen and capitalization?

The new one (hyphenated) does not handle multipart POSTs, which is a good thing. The old one (non-hyphenated) did, but it did so by writing all the uploaded files to a temp directory and making you process them from there manually. It was easy to forget to clean up the files, and then you might fill up your disk.

Bottom line: don't use bodyParser. Use body-parser. Even if you aren't using Express 4.x, get the new one and use it instead.

But now back to my actual problem. I can use body-parser from now until the cows come home and I still can't process multipart file uploads. There are other middlewares that I could use to process the uploads, so I had to be sure to pick the most appropriate one. In particular, I was looking for one which would allow me to process the stream of the POST body to get Buffer objects which I could write to my database.

I like to party!

The solution that I liked best turned out to not be a middleware at all. After much research, I decided to use Multiparty directly. Multiparty is a module for parsing multipart/form-data requests which supports streams2 and doesn't require the data to be written to disk (it can, though, if you really want it to).

Here is a stripped-down version of the code I used, which keeps the uploaded files completely in memory so they can be written to a database or put on your file system somewhere. Whatever you want to do.


var express = require('express');
var multiparty = require('multiparty');

var app = express();

app.post('/images', function(req, res, next) {

  var uploadName = '';
  var uploadType = '';
  var chunks = [];
  var totalLength = 0;

  var form = new multiparty.Form();

  form.on('error', function(err) {
    // pass parse errors to Express so the client still gets a response
    next(err);
  });

  form.on('close', function() {
    // every part has been read; join the chunks into a single Buffer
    var b = Buffer.concat(chunks, totalLength);
    console.log('storing file %s (%d bytes)', uploadName, b.length);
    // store the image in your database or whatever
    res.status(200).end();
  });

  form.on('part', function(part) {
    // collect the file's data as it streams in
    part.on('data', function(chunk) {
      chunks.push(chunk);
      totalLength += chunk.length;
    });
    part.on('end', function() {
      uploadName = part.filename;
      uploadType = part.headers['content-type'];
    });
  });

  form.parse(req);

});

var server = app.listen(3000, function() {
 console.log('listening on port %d', server.address().port);
});

It is fairly simple to use in this case. You just create a Multiparty Form object and parse the request with it. It will emit events as it goes.

The "part" event will be emitted for each part of the post. My posts contain only a single form parameter, the file, which keeps this example really simple. The parameter passed to the handler for the "part" events is a Readable Stream, so it will also emit events as you process the data. I listen for the "data" events and append the chunks to an array of Buffers. When I see the "end" event, that part is over and I set the uploaded file's name and MIME type.

Typically (assuming no errors) the next event emitted by the form is the "close" event. This is where I concatenate all those Buffer chunks together to get the Buffer for the whole upload. I then store the file and return the response code. You have to make sure to listen for the "error" event just in case. If you don't and an error is encountered, your clients will hang forever waiting on a response.
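
The chunk-collecting pattern described above can be pulled out into a small helper. This is only a rough sketch: collectStream is a hypothetical name of my own (it is not part of Multiparty), and a plain PassThrough stream stands in for a real part so it can be run without an actual upload.

```javascript
var stream = require('stream');

// Hypothetical helper (not part of Multiparty): gather every chunk that a
// Readable stream emits and hand back one Buffer when the stream ends.
function collectStream(readable, callback) {
  var chunks = [];
  var totalLength = 0;

  readable.on('data', function(chunk) {
    chunks.push(chunk);
    totalLength += chunk.length;
  });
  readable.on('error', callback);
  readable.on('end', function() {
    // Passing the total length saves Buffer.concat from recomputing it.
    callback(null, Buffer.concat(chunks, totalLength));
  });
}

// Stand-in for a Multiparty part: a PassThrough stream fed by hand.
var fakePart = new stream.PassThrough();
collectStream(fakePart, function(err, buffer) {
  if (err) throw err;
  console.log('collected %d bytes', buffer.length); // prints "collected 11 bytes"
});
fakePart.write(Buffer.from('hello '));
fakePart.end(Buffer.from('world'));
```

Inside a real "part" handler, the same helper would presumably be called as collectStream(part, ...), with the callback doing whatever the "close" handler does with the finished Buffer.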

If you want to handle multiple files in a single post, move the chunks and totalLength variables into the form's "part" handler, and move the processing of the Buffer out of the form's "close" handler and into the part's "end" handler. You still have to respond to the client in the "close" handler - don't forget that! It would look something like this:

  form.on('close', function() {
    // all parts are done, so answer the client
    res.status(200).end();
  });

  form.on('part', function(part) {

    // each part gets its own chunk list, so files never mix
    var chunks = [];
    var totalLength = 0;

    part.on('data', function(chunk) {
      chunks.push(chunk);
      totalLength += chunk.length;
    });
    part.on('end', function() {
      var uploadName = part.filename;
      var uploadType = part.headers['content-type'];
      var b = Buffer.concat(chunks, totalLength);
      console.log('storing file %s (%d bytes)', uploadName, b.length);
      // store the image in your database or whatever
    });

  });
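
To try the multi-file pattern without a real upload, here is a rough sketch that buffers each part separately and hands the whole list over once the form closes. Everything named here is an assumption for illustration: collectUploads, fakePart and fakeForm are hypothetical, and an EventEmitter plus PassThrough streams stand in for Multiparty's form and parts.

```javascript
var EventEmitter = require('events').EventEmitter;
var stream = require('stream');

// Hypothetical helper: attach per-part buffering to a form-like emitter and
// call done(err, files) once the form emits "close".
function collectUploads(form, done) {
  var files = [];

  form.on('error', done);

  form.on('part', function(part) {
    // Each part gets its own chunk list, so files never mix.
    var chunks = [];
    var totalLength = 0;

    part.on('data', function(chunk) {
      chunks.push(chunk);
      totalLength += chunk.length;
    });
    part.on('end', function() {
      files.push({
        name: part.filename,
        type: part.headers['content-type'],
        data: Buffer.concat(chunks, totalLength)
      });
    });
  });

  form.on('close', function() {
    done(null, files);
  });
}

// Stand-in for a Multiparty part, so the sketch runs without a real upload.
function fakePart(filename, type, body) {
  var part = new stream.PassThrough();
  part.filename = filename;
  part.headers = { 'content-type': type };
  part.end(Buffer.from(body));
  return part;
}

var fakeForm = new EventEmitter();
collectUploads(fakeForm, function(err, files) {
  if (err) throw err;
  files.forEach(function(file) {
    console.log('storing file %s (%d bytes)', file.name, file.data.length);
  });
});

fakeForm.emit('part', fakePart('a.png', 'image/png', 'first'));
fakeForm.emit('part', fakePart('b.png', 'image/png', 'second!'));
// Emit "close" a little later, after both fake parts have finished streaming.
setTimeout(function() { fakeForm.emit('close'); }, 10);
```

In a route handler, the done callback would be the place to store the files and then send the response (or call next with the error).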
 
If you don't like my example, you can also use Multiparty much like the old bodyParser middleware and have it write the uploaded files to a temp directory. See the readme for the complete API.
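
A minimal sketch of that temp-directory style, assuming Multiparty's callback form of form.parse, where each uploaded file arrives as an object with originalFilename, path and size. The listUploadedFiles and handleUpload names are hypothetical, and the cleanup step is only one way to avoid the leftover-file problem mentioned earlier.

```javascript
var fs = require('fs');

// Hypothetical helper: flatten Multiparty's files object, which maps each
// field name to an array of file objects, into one simple list.
function listUploadedFiles(files) {
  var list = [];
  Object.keys(files).forEach(function(fieldName) {
    files[fieldName].forEach(function(file) {
      list.push({ name: file.originalFilename, path: file.path, size: file.size });
    });
  });
  return list;
}

// Express route handler; mount it with app.post('/images', handleUpload).
function handleUpload(req, res, next) {
  // Required here so the helper above can be tried without Multiparty installed.
  var multiparty = require('multiparty');
  var form = new multiparty.Form();

  // Passing a callback makes Multiparty write each file to a temp directory.
  form.parse(req, function(err, fields, files) {
    if (err) return next(err);

    listUploadedFiles(files).forEach(function(file) {
      console.log('received %s (%d bytes) at %s', file.name, file.size, file.path);
      // Process the file here, then delete it so temp files don't pile up.
      // Cleanup errors are ignored in this sketch.
      fs.unlink(file.path, function() {});
    });
    res.status(200).end();
  });
}
```

The trade-off is the one described earlier: the files land on disk, so you are responsible for removing them once you are done.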

I would definitely recommend taking a look at this module if you need similar file-upload functionality.