cassandra-commits mailing list archives

From "Folke Behrens (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1214) Make standard IO the default
Date Mon, 09 Aug 2010 14:58:22 GMT


Folke Behrens commented on CASSANDRA-1214:

> How does the JNA approach behave if there is no C library (Windows?) or mlockall doesn't exist
> (OS X?)?
In the case of Mac OS X, an UnsatisfiedLinkError will be thrown. Windows? I don't know. Maybe
a JNA-specific exception, maybe a ULE, too. OSes can be easily detected with Platform.isXXX()
and dealt with accordingly.
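
To make the "detect and deal with it" idea concrete, here is a minimal sketch in plain Java. It is not taken from the patch: the class MlockallGuard, its Result enum and the isWindows() helper are hypothetical names, and the OS check on the os.name string only stands in for JNA's Platform.isXXX() so that the example compiles without the JNA jar. The native call is passed in as an IntSupplier, on the assumption that it returns 0 on success the way mlockall does.

```java
import java.util.Locale;
import java.util.function.IntSupplier;

// Hypothetical guard around a native mlockall binding (illustration only).
final class MlockallGuard {
    enum Result { LOCKED, UNSUPPORTED_OS, NOT_AVAILABLE, FAILED }

    // Stand-in for JNA's Platform.isWindows(); osName is normally
    // System.getProperty("os.name").
    static boolean isWindows(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    static Result tryLock(String osName, IntSupplier nativeMlockall) {
        if (isWindows(osName)) {
            // No attempt is made on Windows; the call is skipped entirely.
            return Result.UNSUPPORTED_OS;
        }
        try {
            // Assumption: the binding returns 0 on success, like mlockall(2).
            return nativeMlockall.getAsInt() == 0 ? Result.LOCKED : Result.FAILED;
        } catch (UnsatisfiedLinkError e) {
            // The symbol could not be resolved (the Mac OS X case above).
            return Result.NOT_AVAILABLE;
        }
    }
}
```

A caller would log anything other than LOCKED and carry on without locked memory, so a missing mlockall never prevents startup.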

> something as simple as "grab errno" became a holy mess of portability concerns.
Yes, but errno is a particularly hard case. The "inventors" messed up big time with this.
That's why the JNA developers provide two ways to check errno: you either mark your methods
with "throws LastErrorException" or you ask Native.getLastError(). This works under Windows,

> The proposed JNA patch seems to suffer from exactly this problem as far as I can see, making
> assumptions about what the concrete values are of MCL_CURRENT and MCL_FUTURE.
Theoretically, you're right. In practice, however, I can't find a single POSIX system that
assigns different values to MCL_CURRENT or MCL_FUTURE, and I think it's highly unlikely that
these will change in the future. If they do, Cassandra's code can be adjusted.
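
One way to keep that adjustment cheap is sketched below. This is only an illustration, not the patch: the class MlockallFlags and the system property names example.mcl_current and example.mcl_future are made up, and the defaults 1 and 2 are assumptions that should be checked against <sys/mman.h> on the target platform.

```java
// Hypothetical holder for the mlockall flag values (illustration only).
final class MlockallFlags {
    // Assumed defaults; verify against <sys/mman.h> on the target platform.
    static final int DEFAULT_MCL_CURRENT = 1;
    static final int DEFAULT_MCL_FUTURE = 2;

    // Integer.getInteger reads a system property and falls back to the
    // default when the property is unset or not a valid integer.
    static int mclCurrent() {
        return Integer.getInteger("example.mcl_current", DEFAULT_MCL_CURRENT);
    }

    static int mclFuture() {
        return Integer.getInteger("example.mcl_future", DEFAULT_MCL_FUTURE);
    }

    // Flags argument for "lock everything mapped now and in the future".
    static int lockAllFlags() {
        return mclCurrent() | mclFuture();
    }
}
```

On a platform whose headers used other values, starting the JVM with, say, -Dexample.mcl_current=0x2000 -Dexample.mcl_future=0x4000 would change the flags without a rebuild (the hexadecimal form is accepted).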

> As far as I can tell, once one has gotten over the initial one-time hurdle of using JNI and
> the associated building issues, you have a much more correct/standards-compliant access to
> the native platform than through JNA since you're in compile time with access to appropriate
> headers etc.
> Please do correct me if I'm wrong, since the idea of avoiding compile time/build issues is
> certainly very attractive and the reason why I tried to find an acceptable solution with JNA
> in the past.
You're absolutely right, and your JNI code is really superb. If Cassandra needs to bind a
couple more native functions I'd say JNI is the way to go. But not just yet.

> Make standard IO the default
> ----------------------------
>                 Key: CASSANDRA-1214
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: James Golick
>         Attachments: mlockall-jna.patch.txt, Read Throughput with mmap.jpg, trunk-1214.txt
> The way mmap()'d IO is handled in cassandra is dangerous. It allocates potentially massive
buffers without any care for bounding the total size of the program's buffers. As the node's
dataset grows, this *will* lead to swapping and instability.
> This is a dangerous and wrong default for a couple of reasons.
> 1) People are likely to test cassandra with the default settings. This issue is insidious
because it only appears when you have sufficient data in a certain node, there is absolutely
no way to control it, and it doesn't at all respect the memory limits that you give to the
> That can all be ascertained by reading the code, and people should certainly do their
homework, but nevertheless, cassandra should ship with sane defaults that don't break down
when you cross some magic unknown threshold.
> 2) It's deceptive. Unless you are extremely careful with capacity planning, you will
get bit by this. Most people won't really be able to use this in production, so why get them
excited about performance that they can't actually have?

