Friday, November 7, 2014

Call Now, Log Later

Sometimes, you really want to cut down trees. But most of the time, logging is just having your software write information to a location where you can examine it at a later time. If you're looking for advice on chainsaws or flannel shirts, I'm sorry but this isn't the blog for you.

I can't tell you much about forestry, but I can tell you about logging from your Node.js application using the Express framework.
Logging, pre-Internet.
AZ Lumber & Timber Co. - 1937

Often, using Express means using the Morgan middleware for logging. Morgan is the HTTP request logger for Express. It allows you to easily output all the typical information you might want to see about your requests and responses. It's easy to use and allows you to customize the format or select one of several standard types.

But what if you want to log differently? I wanted to write the typical HTTP request/response data to a database table so that I could run queries against it later. For testing purposes, I also wanted to be able to optionally output the logs to the screen instead of the database.

I ended up writing my own logger middleware which can use either Morgan (for output to the screen) or a custom function (for output to my database) depending on a flag in my configuration object.

Take a look...

module.exports = function(conf) {

    var myLogger = null;

    if (conf.logToDatabase) {

        // here is where you can set up your database
        // connection or whatever you need

        myLogger = function(req, res, next) {
   
            function storeData() {

                res.removeListener('finish', storeData);
                res.removeListener('close', storeData);

                var reqIp = req.ip;
                var reqVerb = req.method;
                var reqPath = req.originalUrl || req.url;
                var resStatus = res.statusCode;
                var resSize = res.get('content-length');
                var reqRef = req.get('referer');
                var reqAgent = req.get('user-agent');

                // here is where you would call the function to
                // insert your data into the database

            };

            // defer logging until the end so we know the response stuff
            res.on('finish', storeData);
            res.on('close', storeData);   

            next();

        };

    } else {

        var morgan = require("morgan");
        myLogger = morgan("combined");

    }

    return function(req, res, next) {
        myLogger(req, res, next);
    };

};

Based on a check of the logToDatabase field in the supplied conf object (line 5), I decide which logger function to wrap in my returned function. If logToDatabase is defined and true, I create my own. Otherwise, I create an instance of Morgan.

Keep in mind that logging middleware is usually specified first in your Express application. That means the logger function is going to be called before any of your other routes get their chance to process the request. How can the log include things like the response code and content-length if those things haven't been determined yet? The trick to getting the HTTP response data into the logs is found on lines 31 and 32.

Notice that my custom logger doesn't actually do much when it is invoked. It defines a function, storeData, but doesn't actually call it. Instead, I assign the storeData function to be a listener on the "finish" and "close" response events and then just call next(). After that, all the other routes can process the request. Eventually, everybody is done with it and the Express framework ensures that at least one of the "finish" and "close" events gets fired.

It's then that storeData gets invoked. Thanks to the closure, it still has access to the request and response objects - and the response object will be fully populated with data.

On lines 14 and 15, you'll see that I remove the listener from both events. This is because I don't know which of the two events got fired, and it is possible that both will be fired. I don't want to log things twice! Since I can be certain that the function has been called at this point (since it's inside the function!) it is safe to remove both listeners.

Call now, log later.

This is a very useful pattern when developing for Node, not just when logging HTTP request data with Express. Creating function closures and using them to defer execution until an event occurs is a common theme that you'll encounter again and again. Now you'll be prepared.

Amphibian.com comic for 7 November 2014