Sunday 5 April 2015

Testing Log4Net.Async with statsd and Graphite

Example code: You can get the example code for this post on BitBucket.

Carrying on from my previous post, Running statsd on Windows, I thought it would be interesting to try to gather some application metrics and send them to statsd. I’ve also been meaning to have a look at the Log4Net.Async library for a while and thought I might be able to combine the two objectives: having a look at Log4Net.Async, and experimenting with statsd metrics to see how the Log4Net.Async appenders compare with standard log4Net ones.

The first question is, how to get metrics out of an application and in to statsd? Not surprisingly someone has beaten me to it – quite a few people have actually – and there are a number of libraries out there. I chose the Statsd C# Client library which seems to have had a good number of downloads on NuGet.

 

I have a simple logging library of my own that can use log4Net. So, I decided to use my library to create a simple application to write to a log file in a loop and to time how long it took to do a large number of writes.

Getting the NuGet packages

The first step was to create a console application and to add Log4Net.Async and the Statsd C# Client NuGet packages.

image

image

The code

I started by creating a simple interface for a logging service. This allowed me to created separate implementations that could be assigned separate log4Net loggers and subsequently separate appenders.

namespace Logging.StatsD
{
    /// <summary>
    /// Classes implementing this interface will write to the log.
    /// </summary>
    public interface ILoggingService
    {
        /// <summary>Does the logging.</summary>
        /// <param name="numberOfTimes">The number of times to log.</param>
        void DoLogging(int numberOfTimes);
    }
}

I created 3 implementations of the ILoggingService:

  • SynchronousLoggingService
    • Logged messages from this class would be handled by a standard log4Net RollingFileAppender
  • AsynchronousLoggingService
    • Logged messages from this class would be handled by one of the Log4Net.Async appenders.
  • AsyncAwaitLoggingService
    • Just out of interest I used async await – somewhat naively – to wrap calls to the logger in a task.

Here’s an example of one of the logging services, the SynchronousLoggingService:

namespace Logging.StatsD
{
    using Andy.French.Logging;

    using StatsdClient;

    /// <summary>
    /// This class will write to the log synchronously.
    /// </summary>
    public class SynchronousLoggingService : ILoggingService
    {
        /// <summary>The logger.</summary>
        private ILogger logger;

        /// <summary>
        /// Initialises a new instance of the <see cref="SynchronousLoggingService"/> class.
        /// </summary>
        /// <param name="loggerFactory">The logger factory.</param>
        public SynchronousLoggingService(ILoggerFactory loggerFactory)
        {
            this.logger = loggerFactory.CreateInstance(typeof(SynchronousLoggingService));
        }

        /// <summary>
        /// Does the logging.
        /// </summary>
        /// <param name="numberOfTimes">The number of times to log.</param>
        public void DoLogging(int numberOfTimes)
        {
            using (Metrics.StartTimer("time-to-log-sync"))
            {
                for (int i = 0; i < numberOfTimes; i++)
                {
                    this.logger.Debug("This is test message {0} from the SynchronousLoggingService.", i);
                }
            }
        }
    }
}

You can see that the DoLogging method simply writes a Debug message to the log in a ‘for’ loop. The most interesting line is line 30. This is a call to the Statsd C# Client library and creates a timer around the code enclosed by the using statement. Note the name of the metric (time-to-log-sync). Similar code was added to the other logging services but with each one using a unique metric name.

So far we only have half of the story. What remains is the need to configure the Statsd C# Client library before it can successfully send messages to statsd. That was done in the main program.

namespace Logging.StatsD
{
    using System;

    using Andy.French.Configuration.Service;
    
    using Ninject;

    using StatsdClient;

    /// <summary>
    /// The main program.
    /// </summary>
    public class Program
    {
        /// <summary>The configuration service.</summary>
        private static IConfigurationService configurationService;

        /// <summary>The synchronous logging service.</summary>
        private static ILoggingService synchronousLoggingService;

        /// <summary>The asynchronous logging service.</summary>
        private static ILoggingService asynchronousLoggingService;

        /// <summary>The async await logging service.</summary>
        private static ILoggingService asyncAwaitLoggingService;

        /// <summary>
        /// Defines the entry point of the application.
        /// </summary>
        /// <param name="args">The arguments.</param>
        public static void Main(string[] args)
        {
            try
            {
                ConfigureMetrics();
                ResolveDependencies();

                var numberOfTimes = configurationService.GetInt("Number.Of.Times");

                asynchronousLoggingService.DoLogging(numberOfTimes);
                synchronousLoggingService.DoLogging(numberOfTimes);
                asyncAwaitLoggingService.DoLogging(numberOfTimes);
            }
            catch (Exception ex)
            {
                Console.WriteLine("An error occurred: {0}", ex.Message);
            }
        }

        /// <summary>
        /// Configures the <c>statsd</c> client.
        /// </summary>
        private static void ConfigureMetrics()
        {
            var metricsConfig = new MetricsConfig
            {
                StatsdServerName = "localhost",
                StatsdServerPort = 8125
            };

            Metrics.Configure(metricsConfig);
        }

        /// <summary>
        /// Resolves the dependencies.
        /// </summary>
        private static void ResolveDependencies()
        {
            var kernel = new StandardKernel(new ProgramModule());
            configurationService = kernel.Get<IConfigurationService>();
            synchronousLoggingService = kernel.Get<ILoggingService>("Synchronous");
            asynchronousLoggingService = kernel.Get<ILoggingService>("Asynchronous");
            asyncAwaitLoggingService = kernel.Get<ILoggingService>("AsyncAwait");
        }
    }
}

The Statsd C# Client library is configured in the call to the ConfigureMetrics method. Here we specify the statsd server name and port by creating a new instance of MetricsConfig and passing it to the static Metrics.Configure method.

Note – To find the statsd port I went back into the Chef cookbooks created when I set up the virtual machine. In <statsd-graphite-vm repository directory>\cookbooks\statsd\attributes there’s a default.rb file. The default.rb file contains the settings used to setup statsd in the virtual machine, including the port number.

default["statsd"]["dir"]                          = "/usr/share/statsd"
default["statsd"]["conf_dir"]                     = "/etc/statsd"
default["statsd"]["repository"]                   = "https://github.com/etsy/statsd.git"
default["statsd"]["reference"]                    = "master"
default["statsd"]["flush_interval"]               = 10000
default["statsd"]["percent_threshold"]            = 90
default["statsd"]["address"]                      = "0.0.0.0"
default["statsd"]["port"]                         = 8125
default["statsd"]["mgmt_address"]                 = "0.0.0.0"
default["statsd"]["mgmt_port"]                    = 8126
default["statsd"]["graphite_host"]                = "127.0.0.1"
default["statsd"]["graphite_port"]                = 2003
default["statsd"]["graphite_role"]                = "graphite_server"
default["statsd"]["graphite_query"]               = "roles:#{node['statsd']['graphite_role']} AND chef_environment:#{node.chef_environment}"
default["statsd"]["delete_idle_stats"]            = false
default["statsd"]["delete_timers"]                = false
default["statsd"]["delete_gauges"]                = false
default["statsd"]["delete_sets"]                  = false
default["statsd"]["delete_counters"]              = false
default["statsd"]["username"]                     = "statsd"
default["statsd"]["debug"]                        = false
default["statsd"]["dump_messages"]                = false

# Graphite storage config
default["statsd"]["graphite"]["legacy_namespace"] = true
default["statsd"]["graphite"]["global_prefix"]    = "stats"
default["statsd"]["graphite"]["global_suffix"]    = ""
default["statsd"]["graphite"]["prefix_counter"]   = "counters"
default["statsd"]["graphite"]["prefix_timer"]     = "timers"
default["statsd"]["graphite"]["prefix_gauge"]     = "gauges"
default["statsd"]["graphite"]["prefix_set"]       = "sets"

# nodejs
default["statsd"]["nodejs_bin"] = "#{node["nodejs"]["dir"]}/bin/node"

Configuring log4Net

At this point I had the code to write to a log but I still needed to configure log4Net. The log4Net configuration was all in  App.config.

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <configSections>
    <section name="log4net" type="log4net.Config.Log4NetConfigurationSectionHandler, log4net" />
  </configSections>
  
  <log4net>
    <appender name="RollingFileAppender" type="log4net.Appender.RollingFileAppender">
      <file value="log.txt" />
      <immediateFlush value="true" />
      <appendToFile value="true" />
      <maximumFileSize value="100MB" />
      <maxSizeRollBackups value="2" />
      <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%date [%thread] %-5level %logger - %message%newline" />
      </layout>
      <lockingModel type="log4net.Appender.FileAppender+MinimalLock" />
    </appender>

    <appender name="AsyncForwarder" type="Log4Net.Async.AsyncForwardingAppender,Log4Net.Async">
      <appender-ref ref="RollingFileAppender" />
    </appender>

    <logger name="Logging.StatsD.AsynchronousLoggingService">
      <level value="DEBUG" />
      <appender-ref ref="AsyncForwarder" />
      <!--<appender-ref ref="AsyncRollingFileAppender" />-->
    </logger>

    <logger name="Logging.StatsD.SynchronousLoggingService">
      <level value="DEBUG" />
      <appender-ref ref="RollingFileAppender" />
    </logger>

    <logger name="Logging.StatsD.AsyncAwaitLoggingService">
      <level value="DEBUG" />
      <appender-ref ref="RollingFileAppender" />
    </logger>
    
  </log4net>

  <appSettings>
    <add key="Number.Of.Times" value="10000" />
  </appSettings>
  
  <startup> 
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5.1" />
  </startup>
</configuration>

Starting on line 24 there are 3 loggers, one for each of the logging service classes I created. Each logger was configured to use one of the appenders defined on lines 8 and 20:

  • RollingFileAppender – a standard log4Net rolling file appender
  • AsyncForwarder – A Log4Net.Async appender. This appender uses a ring buffer and it is possible for message loss to take place. More on that later.

 

A note about Log4Net.Async appenders

It pays to read documentation and that’s certainly true of the Log4Net.Async library. In my test application the logger created for the AsynchronousLoggingService is set to use the AsyncForwarder appender. This appender – one of the Log4Net.Async appenders - provides high performance at the expense of potential lost messages.

And lost messages is what I got. When running the example I saw occasional log entries like this:

2015-04-05 10:15:13,886 [AsyncForwarder Forwarding Appender Thread] ERROR AsyncForwardingAppender - Buffer overflow. 13 logging events have been lost in the last 30 seconds. [BufferSize: 1000]

This isn’t an error; it’s by design.

“This appender utilizes a ring buffer with a a default limit of 1000 messages. If the head of the buffer overtakes the tail, message loss occurs. This behavior prioritizes application performance over logging fidelity. The current implementation uses a 10ms polling (sleep) period on the background thread.” - https://github.com/cjbhaines/Log4Net.Async

If you need lossless messaging another option is to use the ParallelForwardingAppender.

“This appender utilizes System.Collections.Concurrent.BlockingCollection(T) and other facets of the Task Parallel Library (TPL) to implement a lossless message queue. This implementation does not use polling but rather waits until new messages are available to append. This results in less CPU overhead for queue polling and appending starts without having to wait on sleep expiration.” - https://github.com/cjbhaines/Log4Net.Async

Trying things out

I ran the application a number of times over the course of a few minutes and then viewed the Graphite UI in my browser (to get to the Graphite I went to http://localhost:8888 as described in my previous post). By expanding the Graphite > stats > timers folder on the Graphite page I was able to see the 3 metrics I added to the application. That confirmed that the metrics were in fact being forwarded to statsd. Cool!

image

By expanding each of the timers I could see various values which I could select and add to the graph. I chose the ‘upper’ value for each of the 3 timers.

image

Once the values had been chosen for each of the metrics the results appeared in the graph. I ran the test application approximately once a minute over a 10 minute period so to make things clearer I configured the graph to only show data for the last 10 minutes. I also changed the line type to be a connected line.

image

image

Now I have to say I was blown away by what I saw. The version of the logging service that was using the Log4Net.Async forwarding appender was blisteringly fast. Check out the graph for yourself.

image

I’m a bit suspicious so I did some additional tests using a StopWatch in the code but the results were confirmed. The down-side was lost messages (roughly between 8 and 20 per run).

My naive async await implementation certainly wasn’t any faster – it was actually slower - although I could see that logging work was being assigned to different threads.

As an experiment I switched to a different Log4Net.Async appender, the ParallelForwardingAppender, which guarantees message delivery and tried again.

image

Not as fast as the AsyncForwardingAppender but still much faster than the standard log4Net appender.

Wrapping up

So there we have it. We’ve managed to get metrics from an application into statsd using the Statsd C# Client library and we’ve seen that the Log4Net.Async appenders offer performance gains over the standard log4Net equivalents. We’ve also managed to view the metrics in Graphite.

Result! Glass of wine anybody?