Tuesday, October 17, 2006

Logging vs String Construction

A classic logging problem in Java is this:

logger.finest(expensiveStringConstruction());

The problem with this is that even when the logger's level is above FINEST and the message is not logged, the message string must still be constructed. In cases where constructing the string involves a lot of work, your code can be slowed down significantly even when logging is off. The standard solution is like so:

if (logger.isLoggable(Level.FINEST))
{
  logger.finest(expensiveStringConstruction());
}

By adding the if statement around the logging, the expensive string construction is only done if the message is actually going to be used. The problem with this is that if you have a lot of logging statements, your code soon ends up littered with these ifs that just obscure what is really happening. Not to mention that you have to type it all in.

This problem could have been tackled in a different way. It was a mistake for the logging methods to take String objects as parameters. This mandates that the messages must be constructed before passing them to the logger. A better approach is to delay creating the message string until the last possible moment, when we know it will be needed. One option would be to have an interface called, say, LogMessageFactory with a method String createLogMessage() and have the log methods accept the interface instead of String. But a new interface isn't necessarily needed, as good old Object already has a toString method. If loggers took Object's instead of String's, then they could just call toString when logging actually takes place.

Well, the logging API actually does let you do this. LogRecord objects can store a message string and an arbitrary set of parameter objects. The standard formatters will use MessageFormat to construct the string that is finally logged. Here's an example:

public final class TestLog
{

    public static final Logger logger =
        Logger.getLogger("test");

    public static void main(final String[] args)
    {
        final TestLog testLog = new TestLog();
        logger.finest(testLog.toString());     // 1
        logger.log(Level.FINEST, "{0}", testLog);   // 2
    }

    public String toString()
    {
        try
        {
            // expensive string construction here
            Thread.sleep(5000L);
        }
        catch (InterruptedException e)
        {
            assert false;
        }
        return "Hello, world";
    }

}

If the logger is configured to log at FINEST or below, the lines marked 1 and 2 behave identically and the message is logged. However, if the logger's level is above FINEST, then line 1 will take 5 seconds to do nothing and line 2 will just check the level and exit, without calling toString at all.

The advantage of logging this way is that you don't have to put all those ifs statements in. But there are disadvantages. You can't use the convenience logging methods like severe and info, etc. You have to use one of the log methods that accept Object parameters. Unfortunate, but not too egregious.

The bigger disadvantage is that the string generation has to be inside a toString method. While this can be done inside an anonymous inner class, the syntax for that isn't exactly convenient. If you have a bunch of log messages that are generated completely differently, then using anonymous inner classes everywhere will be far worse than just using the if statements. Where it would make sense is if you have a bunch of log statements where the expensive portion of the message creation is the same.

For example, you could replace this:

if (logger.isLoggable(Level.FINEST))
{
   logger.finest("Stuff happened " + expensiveStringCreation());
}
if (logger.isLoggable(Level.FINEST))
{
   logger.finest("Foo happened " + expensiveStringCreation());
}
if (logger.isLoggable(Level.FINEST))
{
   logger.finest("Bar happened " + expensiveStringCreation());
}

with:

final Object msgFactory = new Object()
{
    public String toString()
    {
        return expensiveStringCreation();
    }
};
logger.log(Level.FINEST, "Stuff happened {0}", msgFactory);
logger.log(Level.FINEST, "Foo happened {0}", msgFactory);
logger.log(Level.FINEST, "Bar happened {0}", msgFactory);

I'm assuming here that there's code in between these log calls that changes the state and hence the string returned by expensiveStringCreation.

There's a lot of talk at the moment about adding closures to Java. Whatever ends up being adopted, this is definitely one of the scenarios it should assist with (although what I'm describing only requires the most basic capabilities of closures). I'm a fan of the CICE proposal, although some of the objections raised seem valid. The CICE proposal requires an interface with a single method, so if we did have the LogMessageFactory interface I proposed above, we could just do:

logger.finest(LogMessageFactory()
    { return "Stuff happened " + expensiveStringCreation(); });