I added the tests so you guys could run them and take a look. I have no problem with the changes being reverted.

As I think I said, I expected most of the async appenders to back up. I expected them to be a bit slower, but I didn’t expect them to be as slow as they are when the queue is full. I also don’t understand why AsyncLogger is slower as it should have a large ring buffer if I understand correctly.


On Feb 11, 2017, at 4:36 PM, Remko Popma <remko.popma@gmail.com> wrote:

I don't think it is a good idea to mix async tests with sync tests. 

We already have separate benchmarks for the various ways to log asynchronously, if you want to compare them just run those also. 

JMH makes it super easy to build benchmarks but we have to be very careful that the test really measures what we want to measure. 

Asynchronous logging has two "states", queue full and space available. Do we want to measure the queue full state or the space available state (or the transition between the two)?

With a normal FileAppender JMH will iterate so fast that the queue immediately fills up. This will likely happen during the warmup iterations (no guarantee of course). 

What actually happens in that state? We used to block until space becomes available in the queue. What we do now depends on the configured AsyncQueueFullPolicy. Because blocking caused deadlocks in some scenarios, our current default is to bypass the queue and log to the FileAppender directly. I expect that to be slower than the simple FileAppender because of the additional lock contention on the tail of the queue during the attempted thread handover. 

The "queue full" state is not the normal logging state for an application. If we want to measure this we should move these tests to a separate benchmark that is clearly marked "QueueFullAsyncBenchmark" or something like that. 
Otherwise people reading these benchmark results will misinterpret them and get confused. 

The existing Async benchmarks ensure they measure the "queue space available" state. 


Sent from my iPhone

On Feb 12, 2017, at 4:37, Apache <ralph.goers@dslextreme.com> wrote:

Just for fun I decided to add some more tests to the file appender benchmark. I’ve checked them in. Please review them to see if everything is configured so the tests make sense. 

Note that I would expect the async appenders to reduce to the speed of the file appender, since once the queue fills up that is what they are waiting on. But I didn’t set a buffer size for the disruptor or async logger tests so I would have expected those to be quite a bit faster than the regular file test. 

The one thing that is definitely worth noting is how truly terrible the JUL file appender is. I have to assume that it must be doing an immediate flush on every write.

This is on my MacBook Pro - what Ceki would call Fast CPU/Fast SSD

Benchmark                                                  Mode  Samples     Score     Error   Units
o.a.l.l.p.j.FileAppenderBenchmark.julFile                 thrpt       10    69.546 ±   2.635  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j1File              thrpt       10   783.006 ±  28.138  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2AsyncAppender     thrpt       10   939.605 ±  38.655  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2AsyncDisruptor    thrpt       10  1446.522 ±  45.485  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2AsyncLogger       thrpt       10  1269.014 ±  27.236  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2File              thrpt       10  1475.605 ±  74.675  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2MMF               thrpt       10  2131.240 ± 114.184  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2RAF               thrpt       10  1499.667 ±  39.668  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.logbackAsyncFile        thrpt       10   326.969 ±   2.690  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.logbackFile             thrpt       10   940.527 ±  34.090  ops/ms

And this is on my old MacBook Pro - it uses a hard drive so isn’t very fast.

Benchmark                                                  Mode  Samples     Score     Error   Units
o.a.l.l.p.j.FileAppenderBenchmark.julFile                 thrpt       10    15.722 ±  15.557  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j1File              thrpt       10   530.668 ±  54.193  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2AsyncAppender     thrpt       10   498.620 ± 178.693  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2AsyncDisruptor    thrpt       10   454.541 ± 145.505  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2AsyncLogger       thrpt       10   527.784 ± 150.269  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2File              thrpt       10   587.605 ±  97.769  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2MMF               thrpt       10  1966.092 ± 431.196  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2RAF               thrpt       10   364.694 ±  34.602  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.logbackAsyncFile        thrpt       10   258.220 ±   1.936  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.logbackFile             thrpt       10   560.958 ±  36.982  ops/ms


On Feb 9, 2017, at 1:39 PM, Matt Sicker <boards@gmail.com> wrote:

Also, there are some potential issues with logback-perf:

* JMH is way out of date (1.11.3 versus 1.17.4)
* Less warmup iterations than we do

Anyways, results for 32 threads (8 core environment):

Benchmark                           Mode  Cnt     Score    Error   Units
FileAppenderBenchmark.log4j1File   thrpt   10   695.774 ±  9.567  ops/ms
FileAppenderBenchmark.log4j2File   thrpt   10  1300.091 ± 17.579  ops/ms
FileAppenderBenchmark.log4j2RAF    thrpt   10  1365.118 ± 17.656  ops/ms
FileAppenderBenchmark.logbackFile  thrpt   10   766.294 ± 10.121  ops/ms

On 9 February 2017 at 14:37, Matt Sicker <boards@gmail.com> wrote:
That value does look messed up. I'll re-run the 32 thread tests. Also, I'm not on the logback lists yet, so I'll sign up this afternoon.

On 9 February 2017 at 14:35, Apache <ralph.goers@dslextreme.com> wrote:
What is up with that score for 32 threads?  That can’t possibly be correct.


On Feb 9, 2017, at 12:45 PM, Matt Sicker <boards@gmail.com> wrote:

I ran the logback-perf repo on the same AWS instance. Here's the CSV data. It appears as soon as more than one thread comes into play, log4j2 has better scores.

"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
"Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"

On 9 February 2017 at 13:02, Apache <ralph.goers@dslextreme.com> wrote:
You might try running Ceki’s benchmark project on AWS and publish those numbers here. He also asked people to publish numbers on the Logback user’s list so he can add them to his spreadsheet.

From your numbers and the numbers I’ve been getting, it is clear to me that the SSDs in Apple’s MacBook’s are pretty awesome. From the hardware Remko is listing I’d say his machine is about as old as my other MacBook except that he has an SSD that is slightly faster than my hard drive.


On Feb 9, 2017, at 11:12 AM, Matt Sicker <boards@gmail.com> wrote:

Ran on an AWS instance (CentOS 7.2), cpuinfo says 8-core Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz, not super sure about all the params involved in making the instance, but here's some data (same strangeness with MMF):

Benchmark                                         Mode  Samples     Score     Error   Units
o.a.l.l.p.j.FileAppenderBenchmark.julFile        thrpt       10    86.867 ±   4.502  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j1File     thrpt       10   671.156 ±   7.099  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2File     thrpt       10  1221.814 ±  22.130  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2MMF      thrpt       10  1178.407 ± 960.141  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2RAF      thrpt       10  1220.746 ±  34.421  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.logbackFile    thrpt       10   898.122 ±   8.128  ops/ms

On 9 February 2017 at 12:02, Matt Sicker <boards@gmail.com> wrote:
Run on a MacBook Pro (Retina, 15-inch, Mid 2015) 2.5 GHz Intel Core i7. Can find out more hardware specs if needed. I also noticed that the memory mapped file starts out fast and slows down over time (somewhat seen by the error value here).

Benchmark                                         Mode  Samples     Score     Error   Units
o.a.l.l.p.j.FileAppenderBenchmark.julFile        thrpt       10    96.540 ±   7.875  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j1File     thrpt       10   766.286 ±  11.461  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2File     thrpt       10  1787.620 ±  36.695  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2MMF      thrpt       10  1506.588 ± 956.354  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2RAF      thrpt       10  1934.966 ±  50.089  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.logbackFile    thrpt       10  1285.066 ±  12.674  ops/ms

On 9 February 2017 at 11:44, Remko Popma <remko.popma@gmail.com> wrote:
My results on Windows 10 64-bit laptop (java 1.8.0_51 64bit), i5-3317u CPU @ 1.70GHz (dual core with hyperthreading for 4 virtual cores), SSD hard disk:

java -jar log4j-perf/target/benchmarks.jar ".*FileAppenderBenchmark.*" -f 1 -wi 10 -i 20 -t 4 -tu ms

# Run complete. Total time: 00:03:58

Benchmark                                         Mode  Samples     Score     Error   Units
o.a.l.l.p.j.FileAppenderBenchmark.julFile        thrpt       20    37.646 ±   0.876  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j1File     thrpt       20   405.305 ±   6.596  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2File     thrpt       20   751.949 ±  16.055  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2MMF      thrpt       20  1250.666 ± 168.757  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2RAF      thrpt       20   728.743 ±  23.909  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.logbackFile    thrpt       20   676.926 ±  19.518  ops/ms

Logback config without immediateFlush=false:

# Run complete. Total time: 00:03:44

Benchmark                                         Mode  Samples     Score     Error   Units
o.a.l.l.p.j.FileAppenderBenchmark.julFile        thrpt       20    37.949 ±   1.220  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j1File     thrpt       20   404.042 ±   8.450  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2File     thrpt       20   690.393 ± 115.537  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2MMF      thrpt       20  1221.681 ±  82.205  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2RAF      thrpt       20   823.059 ±  41.512  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.logbackFile    thrpt       20    83.352 ±  11.911  ops/ms

On Fri, Feb 10, 2017 at 1:05 AM, Mikael Ståldal <mikael.staldal@magine.com> wrote:
I guess that with a memory mapped file, you leave it to the OS to do the best it can, and you lose any direct control over how it is actually done.

On Thu, Feb 9, 2017 at 4:52 PM, Apache <ralph.goers@dslextreme.com> wrote:
On my Mac Pro with the slower external SSD I now got:

Benchmark                                         Mode  Samples     Score     Error   Units
o.a.l.l.p.j.FileAppenderBenchmark.julFile        thrpt       10    73.739 ±   0.740  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j1File     thrpt       10   683.129 ±   9.407  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2File     thrpt       10   991.293 ± 193.049  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2MMF      thrpt       10  3072.250 ±  63.475  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2RAF      thrpt       10  1056.256 ± 137.673  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.logbackFile    thrpt       10   784.723 ± 153.226  ops/ms

and on the same machine with the faster internal SSD I got:

Benchmark                                         Mode  Samples     Score     Error   Units
o.a.l.l.p.j.FileAppenderBenchmark.julFile        thrpt       10    74.661 ±   0.232  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j1File     thrpt       10   647.041 ±   2.994  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2File     thrpt       10  1333.887 ±  13.921  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2MMF      thrpt       10  3025.726 ± 210.414  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2RAF      thrpt       10  1433.620 ±  11.194  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.logbackFile    thrpt       10  1026.319 ±  13.347  ops/ms

I will continue to run this on a few other configurations. I think I would also like to add the async appenders/loggers to this test so that one can see all the various options. 

It is really quite interesting to me to see how the memory mapped appender behaves so differently from one machine to another. I don’t know under what circumstances I would recommend using it though.


On Feb 9, 2017, at 7:03 AM, Apache <ralph.goers@dslextreme.com> wrote:

After modifying the configuration the new results on my laptop are:

Benchmark                                         Mode  Samples     Score      Error   Units
o.a.l.l.p.j.FileAppenderBenchmark.julFile        thrpt       10    92.580 ±    3.698  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j1File     thrpt       10   828.707 ±   55.006  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2File     thrpt       10  1647.230 ±  125.682  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2MMF      thrpt       10  1465.002 ± 1284.943  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.log4j2RAF      thrpt       10  1765.340 ±  149.707  ops/ms
o.a.l.l.p.j.FileAppenderBenchmark.logbackFile    thrpt       10  1192.594 ±   57.777  ops/ms

I will try the other machines later and post those results.


On Feb 9, 2017, at 5:22 AM, Apache <ralph.goers@dslextreme.com> wrote:

Ceki replied on twitter that the immediateFlush option is now a parameter of the appended, not the encoder, so it looks like the confit needs to be changed and the test rerun.


On Feb 9, 2017, at 3:08 AM, Remko Popma <remko.popma@gmail.com> wrote:

FYI, The write and flush methods in BufferedOutputStream are also synchronized, so we won't be able to do away with synchronization completely. 

In OutputStreamManager we synchronize multiple methods but these are nested calls. I thought reentrant synchronization had negligible overhead but I haven't measured this myself. 

Sent from my iPhone

On Feb 9, 2017, at 2:31, Apache <ralph.goers@dslextreme.com> wrote:

I’m pretty sure the problem we have is that a) we are synchronizing many methods and b) we are synchronizing more than just the write. Unfortunately, I can’t figure out how to reduce that based on how dispersed the code has gotten.


On Feb 8, 2017, at 10:14 AM, Apache <ralph.goers@dslextreme.com> wrote:

I tried to modify FileManager to just use a BufferedOutputStream but discovered I couldn’t as the layouts now require the ByteBuffer. 


On Feb 8, 2017, at 12:14 AM, Apache <ralph.goers@dslextreme.com> wrote:

The append method isn’t synchronized but the writeBytes method acquires a lock. His code is actually a lot simpler than ours in that it just uses a BufferedOutputStream and he only obtains the lock when he is writing to it.


On Feb 6, 2017, at 5:23 PM, Matt Sicker <boards@gmail.com> wrote:

I'm not sure if I'm looking in the right place, but a major difference now between Logback's appenders and Log4j's is that Logback isn't synchronized on the append method.

On 6 February 2017 at 18:18, Matt Sicker <boards@gmail.com> wrote:
Is this something we can improve performance on by implementing a file appender based on FileChannel or AsynchronousFileChannel instead of OutputStream?

On 6 February 2017 at 17:50, Apache <ralph.goers@dslextreme.com> wrote:
Ceki has updated his numbers to include those reported on the mailing list. https://docs.google.com/spreadsheets/d/1cpb5D7qnyye4W0RTlHUnXedYK98catNZytYIu5D91m0/edit#gid=0

I haven’t run the tests with Logback 1.2-SNAPSHOT but my numbers for my two MacBooks are at https://docs.google.com/spreadsheets/d/1L67IhmUVvyLBWtC6iq0TMj-j0vrbKsUKWuZV0Nlqisk/edit?usp=sharing


On Feb 6, 2017, at 9:33 AM, Apache <ralph.goers@dslextreme.com> wrote:

Yes, that is still the standard approach most people use and is the only way to provide a head-to-head comparison against the logging frameworks.


On Feb 6, 2017, at 8:02 AM, Matt Sicker <boards@gmail.com> wrote:

This is all in a synchronous appender, right? Either way, that's rather interesting.

On 6 February 2017 at 07:54, Apache <ralph.goers@dslextreme.com> wrote:
Someone posted numbers on the Logback user’s list that match mine.  It shows Logback 1.1.9 was pretty terrible, 1.1.10 is somewhat better and 1.2-SNAPSHOT is on par or slightly better than Log4j 2.


On Feb 5, 2017, at 3:25 PM, Matt Sicker <boards@gmail.com> wrote:

I think we need some comparisons on the log4j side: file appender with 256k buffer size, random access file appender with 256k buffer size (which appears to be the default), and memory mapped file appender. It'd be cool to see how these compose with async logging enabled in both log4j and logback.

On 5 February 2017 at 16:06, Apache <ralph.goers@dslextreme.com> wrote:
You should run the code at https://github.com/ceki/logback-perf to compare your results to Ceki’s.  You also should capture the cpubenchmark speed of your processor and get the speed of your hard drive. I used Blackmagic speed test on my Mac. I am capturing my results in a Google spreadsheet. I will post the like once I have it.


On Feb 5, 2017, at 1:48 PM, Gary Gregory <garydgregory@gmail.com> wrote:

If you want, I can run tests on Windows once the build works on Windows again.

Let me know what args/command line...


On Feb 5, 2017 11:58 AM, "Apache" <ralph.goers@dslextreme.com> wrote:
I guess my MacBook Pro must fit in the Slow CPU/Fast Hard drive category. With Logback 1.10 and -t 4  now get

Benchmark                                         Mode  Samples        Score       Error  Units
o.a.l.l.p.j.FileAppenderBenchmark.julFile        thrpt       20    98187.673 ±  4935.712  ops/s
o.a.l.l.p.j.FileAppenderBenchmark.log4j1File     thrpt       20   842374.496 ±  6762.712  ops/s
o.a.l.l.p.j.FileAppenderBenchmark.log4j2File     thrpt       20  1853062.583 ± 67032.225  ops/s
o.a.l.l.p.j.FileAppenderBenchmark.log4j2RAF      thrpt       20  2036011.226 ± 53208.281  ops/s
o.a.l.l.p.j.FileAppenderBenchmark.logbackFile    thrpt       20   999667.438 ± 12074.003  ops/s

I’ll have to try this on one my VMs at work. We don’t run anything directly on bare metal any more.


On Feb 5, 2017, at 9:40 AM, Apache <ralph.goers@dslextreme.com> wrote:

Ceki finally fixed some of the performance problems in the FileAppender. See https://logback.qos.ch/news.html and https://docs.google.com/spreadsheets/d/1cpb5D7qnyye4W0RTlHUnXedYK98catNZytYIu5D91m0/edit#gid=0. I suspect we have a few optimizations we can make.


Matt Sicker <boards@gmail.com>

Matt Sicker <boards@gmail.com>

Matt Sicker <boards@gmail.com>

Matt Sicker <boards@gmail.com>


Mikael Ståldal
Senior software developer

Magine TV
Grev Turegatan 3  | 114 46 Stockholm, Sweden  |   www.magine.com

Privileged and/or Confidential Information may be contained in this message. If you are not the addressee indicated in this message
(or responsible for delivery of the message to such a person), you may not copy or deliver this message to anyone. In such case, 
you should destroy this message and kindly notify the sender by reply email.   

Matt Sicker <boards@gmail.com>

Matt Sicker <boards@gmail.com>

Matt Sicker <boards@gmail.com>

Matt Sicker <boards@gmail.com>

Matt Sicker <boards@gmail.com>