accumulo-user mailing list archives

From Josh Elser <>
Subject Re: compression of keys for a sequential scan over an inverted index
Date Thu, 22 Oct 2015 15:54:32 GMT
Jonathan Wonders wrote:
> I have been digging into some details of Accumulo to model the disk and
> network costs associated with various types of scan patterns and I have
> a few questions regarding compression.
> Assuming an inverted index table with rows following the pattern of
> <key><value><id>
> and a scan that specifies an exact key and value so as to constrain the
> range, it seems that the dominant factor in network utilization would
> be sending key-value pairs from the tablet server to the client and a
> secondary factor would be transmitting data from non-local RFiles
> (assuming no caching).

Sounds about right to me.
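
To make the scan pattern above concrete, here is an illustrative sketch (in Python rather than the Accumulo Java API) of what constraining the range to an exact key and value means for an inverted-index row of the form <key><value><id>: the scan covers every row sharing the prefix <key><value>. The `prefix_range` helper and the separator byte are assumptions for illustration, not Accumulo's actual range construction.

```python
# Illustrative sketch: an exact-key, exact-value scan over rows of the
# form <key><value><id> is a range over all rows sharing that prefix.
def prefix_range(prefix: bytes) -> tuple[bytes, bytes]:
    # End bound: increment the final byte so the half-open range
    # [start, end) covers every id suffix (simplified; ignores the
    # 0xFF rollover case).
    return prefix, prefix[:-1] + bytes([prefix[-1] + 1])

# Hypothetical row layout using NUL as a field separator.
start, end = prefix_range(b"term\x00value\x00")
```

Any row beginning with the prefix, such as `b"term\x00value\x00id42"`, sorts within `[start, end)`.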

> Is my understanding correct that the on-disk compression of this type of
> table is predominantly a function of the average number of differing
> bits between adjacent ids?  Or, has anyone observed a significant
> improvement with gz or lzo vs no additional compression?  I'm
> considering running some experiments to measure the difference for a few
> types of ids (uuid, snowflake-like, content based hashes), but I'm
> curious if anyone else has done similar experiments.

There is definitely a noticeable difference between gzip, lzo/snappy 
(the latter two are roughly on par in speed and size), and no 
compression. I don't know the exact deltas off the top of my head, but 
I would expect you to see noticeable differences.

You can also see differences in "density" depending on the 
lexicographical delta between adjacent keys. Sequential keys that 
differ very little will result in very dense RFiles (lots of keys per 
file), whereas keys that vary greatly will result in fewer keys per 
file. We've had some discussions about the run-length encoding in 
RFile lately -- have you stumbled on that?
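
A rough way to see how much there is for relative-key (shared-prefix) encoding to exploit: measure the prefix each sorted key shares with its predecessor. This sketch is illustrative only, not the RFile format; the row layout is a made-up inverted-index pattern.

```python
# Illustrative sketch: the bytes adjacent sorted keys share are what a
# prefix-encoding scheme in an RFile-like format can avoid storing.
def shared_prefix_len(a: bytes, b: bytes) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Hypothetical inverted-index rows: <key><value><id> with NUL separators.
keys = sorted(b"term\x00value\x00%08d" % i for i in range(1000))
savings = sum(shared_prefix_len(a, b) for a, b in zip(keys, keys[1:]))
total = sum(len(k) for k in keys)
print(f"prefix encoding could drop ~{savings / total:.0%} of key bytes")
```

With uuid-style ids at the end of the row, the shared prefix stops at the id boundary and the savings shrink accordingly.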

> Given a scan that specifies a range for an exact key and value, is there
> any transport compression performed for tablet server to client
> communication beyond the Key.compress method which appears to only
> compress equivalent rows, columns, etc as opposed to those that share a
> common prefix?

No, no compression algorithm is applied to the data before sending it. 
By default we use the TCompactProtocol from Thrift. If you're 
interested in the specifics, Thrift has some documentation on the 
protocol's encoding.
I do recall some experiments I tried previously in which a logical 
"object" in Accumulo was composed of multiple key-value pairs that 
were returned as one key-value pair (very similar to how the 
WholeRowIterator serializes things). In that case, I experimented with 
compressing the serialized key-value pairs before returning from the 
iterator stack. Sadly, I don't recall the exact takeaway, but I don't 
believe it was outright "bad" :)
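
The idea can be sketched like so. This is not the actual Accumulo iterator API or the WholeRowIterator's encoding, just an illustration of server-side batching plus compression, with the client doing the inverse:

```python
import zlib

# Sketch: length-prefix each key and value, concatenate, and compress
# the batch server-side; the client decompresses and splits it back up.
def pack_row(pairs: list[tuple[bytes, bytes]]) -> bytes:
    buf = bytearray()
    for k, v in pairs:
        buf += len(k).to_bytes(4, "big") + k
        buf += len(v).to_bytes(4, "big") + v
    return zlib.compress(bytes(buf))

def unpack_row(blob: bytes) -> list[tuple[bytes, bytes]]:
    buf = zlib.decompress(blob)
    pairs, i = [], 0
    while i < len(buf):
        klen = int.from_bytes(buf[i:i + 4], "big"); i += 4
        k = buf[i:i + klen]; i += klen
        vlen = int.from_bytes(buf[i:i + 4], "big"); i += 4
        v = buf[i:i + vlen]; i += vlen
        pairs.append((k, v))
    return pairs
```

Because adjacent keys in one row share long prefixes, the batch compresses far better than each pair would individually.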

> It seems possible to implement a more specialized compression algorithm
> with the iterator framework, performing the decompression on the client
> side, but I'm curious if it could lead to general scan performance
> improvements if the default compression also involved run-length encoding.

This would be really neat to test.

Some rigor in designing the experiment would be the first step, IMO. If 
you want to spearhead this, I'm sure there are many who would be happy 
to give feedback. It would be prudent to figure out what the variables 
in the "equation" are, specifically the distribution of key-value 
pairs -- the amount stored, on-disk compression, cache sizes, and the 
query workload. Figure out what you want to test, what you will tweak, 
and define a hypothesis to test.

You could consider using the continuous ingest framework for data 
generation and query workloads. You could also look at YCSB [1] for 
inspiration. Either would save you from having to generate the 
read/write workloads yourself.


> Any insight on this subject is much appreciated.
> V/R
> Jonathan
