cassandra-user mailing list archives

From Matthias Pfau
Subject Re: Storing pre-sorted data
Date Tue, 18 Oct 2011 21:31:05 GMT
Hi David,
Encrypting and decrypting data in Cassandra is not an option for us. 
However, I read your elaboration on best practices for adding a custom 
feature to Cassandra with great interest. Thank you!

Please see below for the answers to your questions.

Kind regards

On 10/18/2011 08:53 AM, David Jeske wrote:
> On Mon, Oct 17, 2011 at 2:39 AM, Matthias Pfau wrote:
>     We would be very happy if cassandra would give us an option to
>     maintain the sort order on our own (application logic). That is why
>     it would be interesting to hear from any of the developers if it
>     would be easily possible to add such a feature to cassandra.
> What you are describing above is option (b), you would do this by
> building your sort-order, encryption, and decryption into Cassandra. Let
> me elaborate...
> The database always has to know how to compute sort order for items.
> Deferring it to your code can only happen two ways, in-process, or
> out-of-process. Deferring sort-order comparisons to out-of-process code
> would have disastrous effects on performance, as they are used multiple
> times for every single operation the database does. Therefore, short of
> an application where performance is irrelevant, the feasible method to
> allow your code to maintain sort-order is "option b", to build your
> sort-order/encryption/decryption into the database. Cassandra would have
> to initialize it at startup to read your database.
> Cassandra is open-source, so you can do this work on your own right now.
> Aaron's message provided some pointers.
> If you do go this route, you'll probably want to separate your
> sort-order-and-encryption-handler into a separate JAR, and add some code
> to Cassandra to load-and-register your classes when the database starts.
> You'd submit this "stable data-format plug-in-API" patch to Cassandra,
> and hopefully find a way to get it accepted into the main codebase. This
> would make it easier for you to update to new versions, as you would
> be dependent only on the public API, rather than a private fork of
> Cassandra.
>     Otherwise, it seems like we have to implement something based on strategy
>     (a) because (b) is not feasible for us and (c) is a rather young
>     research topic which is slowly gaining more attention.
> Certainly (option a) is the most straightforward method if you wish to
> keep your codebase completely separate from your database (whether
> Cassandra or not). Whether this is an acceptable security risk or not is
> up to you.
> --------
> Pulling back from implementation issues, I wonder if you might share a
> bit more about the reason you need this functionality for your
> application. Here are a few questions I'm curious about:
> 1) Is the data all-encrypted with a single key, or do different records
> use different keys?

We are building a zero-knowledge application. The data (of this single 
CF) is encrypted with a single key.

> 2) If a single key, would adding a file/block/record-level encryption to
> Cassandra solve this problem? If not, why not? Is there something
> special about your encryption methods?

There is nothing special about our encryption methods, but we will never 
be able to encrypt or decrypt data on our server, as the keys will always 
remain on the clients. Therefore, we would not profit from built-in 
Cassandra encryption support. However, this would probably be a good 
feature for many other users.

> 3) Is the compression of the data somehow special, such that block-level
> compression (either zlib, snappy, or even a custom-implemented scheme)
> is not viable? If so, why?

No, we just have to compress before encryption, because compressing 
the ciphertext afterwards would gain very little.
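
To illustrate the ordering of compression and encryption, here is a minimal Python sketch. It is not our actual client code: `os.urandom` is only used as a stand-in for ciphertext (on the assumption that good ciphertext looks like random bytes), and the sample record and the helper name `compressed_ratio` are made up for the example.

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical, highly repetitive plaintext record.
plaintext = b"user=alice;status=active;" * 400

# Assumption: random bytes stand in for the output of a real cipher.
ciphertext_like = os.urandom(len(plaintext))

plain_ratio = compressed_ratio(plaintext)
cipher_ratio = compressed_ratio(ciphertext_like)

print(f"plaintext compresses to  {plain_ratio:.1%} of its size")
print(f"ciphertext compresses to {cipher_ratio:.1%} of its size")
```

The repetitive plaintext shrinks considerably, while the random-looking input does not shrink at all, which is why compression has to happen on the client before encryption.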

> 4) Is there something special about the sorting that makes it hard to
> expose the sort order to a database? (other than cassandra's lack of
> general composite key sorting)

No, Cassandra would be able to sort the data in unencrypted form. 
However, as the data is encrypted, we cannot make use of Cassandra's 
sorting capabilities.
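
A small Python sketch of the problem, under loud assumptions: `toy_encrypt` and `toy_decrypt` are hypothetical helpers built from SHA-256 purely for illustration (this is not a secure cipher and not what we use), and the column names are invented. It only shows that a byte-wise comparison of ciphertexts, which is all the server could do, does not reproduce the plaintext order.

```python
import hashlib
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key and nonce (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Illustration only: random nonce followed by plaintext XOR keystream."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Inverse of toy_encrypt: split off the nonce and XOR the rest."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


key = os.urandom(32)  # stays on the client in a zero-knowledge setup

# Hypothetical column names, already in the order the application wants.
names = [f"column-{i:03d}".encode() for i in range(20)]
encrypted = [toy_encrypt(key, n) for n in names]

# What a server-side byte comparator would produce from the ciphertexts.
server_order = sorted(encrypted)
decrypted_in_server_order = [toy_decrypt(key, b) for b in server_order]

# Only the client, which holds the key, can restore the intended order.
client_order = sorted(decrypted_in_server_order)

print("server order matches plaintext order:", decrypted_in_server_order == names)
print("client order matches plaintext order:", client_order == names)
```

The server-side order is effectively random with respect to the plaintext, so the sort order has to be produced by the client and handed to the database in some other form.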

Kind regards
