cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
Date Mon, 01 Dec 2014 23:29:15 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230675#comment-14230675
] 

Ariel Weisberg edited comment on CASSANDRA-7438 at 12/1/14 11:28 PM:
---------------------------------------------------------------------

Look pretty nice.

Suggestions:
* Push the stats into the segments and gather them the way you do free capacity and cleanup
count. You can drop the volatile (technically you will have to synchronize on read). Inside
each OffHeapMap put the stats members (and anything mutable) as the first declared fields.
In practice this can put them on the same cache line as the lock field in the object header.
It will also be just one flush at the end of the critical section. Stats collection should
be free so no reason not to leave it on all the time.
* I am not sure batch cleanup makes sense. When inserting an item into the cache would blow
the size requirement I would just evict elements until inserting it wouldn't. Is there a specific
efficiency you think you are going to get from doing it in batches?
* Cache is the wrong API to use since it doesn't allow lazy deserialization and zero copy.
Since entries are refcounted there is no need make a copy. Might be something to save for
later since everything upstream expects a POJO of some sort.
* Key buffer might be worth a thread local sized to a high watermark

Do we have a decent way to do line level code review? I can't  leave comments on github unless
there is a pull request. Line level stuff
* Don't catch exceptions and handle inside the map. Let them all propagate to the caller and
use try/finally to do cleanup. I know you have to wrap and rethrow some things, but avoid
where possible.
* Compare key compares 8 bytes at a time, how does it handle trailing bytes and alignment?
* Agrona has an Unsafe ByteBuffer implementation that looks like it makes a little better
use of various intrinsics then AbstractDataOutput. Does some other nifty stuff as well. https://github.com/real-logic/Agrona/blob/master/src/main/java/uk/co/real_logic/agrona/concurrent/UnsafeBuffer.java
* In OffHeapMap.touch lines 439 and 453 are not covered by tests. Coverage looks a little
weird in that a lot of the cases are always hit but some don't touch both branches. If lruTail
== hashEntryAddr maybe assert next is null.
* Rename mutating OffHeapMap lruNext and lruPrev to reflect that they mutate. In general rename
mutating methods to reflect they do that such as the two versions of first
* I don't see why the cache can't use CPU endianness since the key/value are just copied.
* Did you get the UTF encoded string stuff from somewhere? I see something similar in the
jdk, can you get that via inheritance?
* HashEntryInput, AbstractDataOutput  are low on the coverage scale and have no tests for
some pretty gnarly UTF8 stuff.
* Continuing on that theme there is a lot of unused code to satisfy the interfaces being implemented,
would be nice to avoid that.
* By hashing the key yourself you prevent caching the hash code in the POJO. Maybe hashes
should be 32-bits and provided by the POJO?
* If an allocation fails maybe throw OutOfMemoryError with a message
* If an entry is too large maybe return an error of some sort? Seems like caller should decide
if not caching is OK.
* In put, why catch VirtualMachineError and not error?  Seems like it wants a finally, and
it shouldn't throw checked exceptions.
* If a key serializer is necessary throw in the constructor and remove other checks
* Hot N could use a more thorough test?
* In practice how is hot N used in C*? When people save the cache to disk do they save the
entire cache?
* In the value loading case, I think there is some subtlety to the concurrency of invocations
to the loader in that it doesn't call it on all of them in a race. It might be a minor change
in behavior compared to Guava.
* Maybe do the value loading timing in nanoseconds? Performance is the same but precision
is better.
* OffHeapMap.Table.removeLink(long,long) has no test coverage of the second branch that walks
a bucket to find the previous entry
* I don't think storage for 16 million keys is enough? For 128 bytes per entry that is only
2 gigabytes. You would have to run a lot of segments which is probably fine, but that presents
a configuration issue. Maybe allow more than 24 bits of buckets in each segment?
* SegmentedCacheImpl contains duplicate code fro dereferencing and still has to delegate part
of the work to the OffHeapMap. Maybe keep it all in OffHeapMap?
* Unit test wise there are some things not tested. The value loader interface, various things
like putAll or invalidateAll.
* Release is not synchronized. Release should null pointers out so you get a good clean segfault.
Close should maybe lock and close one segment at a time and invalidate as part of that.








was (Author: aweisberg):
Look pretty nice.

Suggestions:
* Push the stats into the segments and gather them the way you do free capacity and cleanup
count. You can drop the volatile (technically you will have to synchronize on read). Inside
each OffHeapMap put the stats members (and anything mutable) as the first declared fields.
In practice this can put them on the same cache line as the lock field in the object header.
It will also be just one flush at the end of the critical section. Stats collection should
be free so no reason not to leave it on all the time.
* I am not sure batch cleanup makes sense. When inserting an item into the cache would blow
the size requirement I would just evict elements until inserting it wouldn't. Is there a specific
efficiency you think you are going to get from doing it in batches?
* Cache is the wrong API to use since it doesn't allow lazy deserialization and zero copy.
Since entries are refcounted there is no need make a copy. Might be something to save for
later since everything upstream expects a POJO of some sort.
* Key buffer might be worth a thread local sized to a high watermark

Do we have a decent way to do line level code review? I can't  leave comments on github unless
there is a pull request. Line level stuff
* Don't catch exceptions and handle inside the map. Let them all propagate to the caller and
use try/finally to do cleanup. I know you have to wrap and rethrow some things, but avoid
where possible.
* Compare key compares 8 bytes at a time, how does it handle trailing bytes and alignment?
* Agrona has an Unsafe ByteBuffer implementation that looks like it makes a little better
use of various intrinsics then AbstractDataOutput. Does some other nifty stuff as well. https://github.com/real-logic/Agrona/blob/master/src/main/java/uk/co/real_logic/agrona/concurrent/UnsafeBuffer.java
* In OffHeapMap.touch lines 439 and 453 are not covered by tests. Coverage looks a little
weird in that a lot of the cases are always hit but some don't touch both branches. If lruTail
== hashEntryAddr maybe assert next is null.
* Rename mutating OffHeapMap lruNext and lruPrev to reflect that they mutate. In general rename
mutating methods to reflect they do that such as the two versions of first
* I don't see why the cache can't use CPU endianness since the key/value are just copied.
* Did you get the UTF encoded string stuff from somewhere? I see something similar in the
jdk, can you get that via inheritance?
* HashEntryInput, AbstractDataOutput  are low on the coverage scale and have no tests for
some pretty gnarly UTF8 stuff.
* Continuing on that theme there is a lot of unused code to satisfy the interfaces being implemented,
would be nice to avoid that.
* By hashing the key yourself you prevent caching the hash code in the POJO. Maybe hashes
should be 32-bits and provided by the POJO?
* If an allocation fails maybe throw OutOfMemoryError with a message
* If an entry is too large maybe return an error of some sort? Seems like caller should decide
if not caching is OK.
* put on allocation failure calls removeInternal, but the key doesn't appear to be in the
map yet? Is that to handle the put invalidating the previous entry?
* In put, why catch VirtualMachineError and not error?  Seems like it wants a finally, and
it shouldn't throw checked exceptions.
* If a key serializer is necessary throw in the constructor and remove other checks
* Hot N could use a more thorough test?
* In practice how is hot N used in C*? When people save the cache to disk do they save the
entire cache?
* In the value loading case, I think there is some subtlety to the concurrency of invocations
to the loader in that it doesn't call it on all of them in a race. It might be a minor change
in behavior compared to Guava.
* Maybe do the value loading timing in nanoseconds? Performance is the same but precision
is better.
* OffHeapMap.Table.removeLink(long,long) has no test coverage of the second branch that walks
a bucket to find the previous entry
* I don't think storage for 16 million keys is enough? For 128 bytes per entry that is only
2 gigabytes. You would have to run a lot of segments which is probably fine, but that presents
a configuration issue. Maybe allow more than 24 bits of buckets in each segment?
* SegmentedCacheImpl contains duplicate code fro dereferencing and still has to delegate part
of the work to the OffHeapMap. Maybe keep it all in OffHeapMap?
* Unit test wise there are some things not tested. The value loader interface, various things
like putAll or invalidateAll.
* Release is not synchronized. Release should null pointers out so you get a good clean segfault.
Close should maybe lock and close one segment at a time and invalidate as part of that.







> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
>                 Key: CASSANDRA-7438
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Linux
>            Reporter: Vijay
>            Assignee: Vijay
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is partially off heap, keys are still stored in JVM heap as
BB, 
> * There is a higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better results, but
this requires careful tunning.
> * Overhead in Memory for the cache entries are relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off heap and
use JNI to interact with cache. We might want to ensure that the new implementation match
the existing API's (ICache), and the implementation needs to have safe memory access, low
overhead in memory and less memcpy's (As much as possible).
> We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message