accumulo-dev mailing list archives

From Adam Fuchs <>
Subject Re: peformance
Date Fri, 03 May 2013 18:20:05 GMT
Hey Drew,

This could be a very broad question, so I'll give a partial answer and
encourage you to come back for more details.

Impala is a mechanism that sits on top of HBase or HDFS that is designed to
filter and process large quantities of data. People generally like Impala
because it supports a subset of SQL and because it is optimized to reduce
the latency that might be incurred by starting up a job in a bulk
synchronous processing framework. Instead, it uses a series of daemon
processes and a custom API to reduce overhead.

With Accumulo, our approach to low-latency queries is generally to use a
table structure that incorporates some type of index. With appropriate
indexing techniques, Accumulo can achieve sub-second query latencies even
over multi-petabyte corpora. Some of these table designs are
described in the manual.
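
As a rough illustration of the index idea (my own toy sketch, not taken from
the manual and not the Accumulo API), the snippet below models a sorted
key-value table in memory and builds a term index over it. The names
SortedTable, index_document, and query are hypothetical. The point it tries to
show is that a lookup becomes a seek to one row of the index followed by a
short sequential read, rather than a pass over all of the data.

```python
import bisect


class SortedTable:
    """Toy stand-in for a sorted key-value table (hypothetical, in-memory only)."""

    def __init__(self):
        # Keys are (row, qualifier) tuples kept in sorted order.
        self._keys = []

    def put(self, row, qualifier):
        key = (row, qualifier)
        i = bisect.bisect_left(self._keys, key)
        # Skip duplicates so each (row, qualifier) pair appears once.
        if i == len(self._keys) or self._keys[i] != key:
            self._keys.insert(i, key)

    def scan_row(self, row):
        """Seek to the first key of `row`, then read until the row changes."""
        i = bisect.bisect_left(self._keys, (row, ""))
        qualifiers = []
        while i < len(self._keys) and self._keys[i][0] == row:
            qualifiers.append(self._keys[i][1])
            i += 1
        return qualifiers


def index_document(index, doc_id, text):
    """Write one index entry per distinct term: row = term, qualifier = doc id."""
    for term in set(text.lower().split()):
        index.put(term, doc_id)


def query(index, *terms):
    """Return the sorted ids of documents that contain every given term."""
    if not terms:
        return []
    matches = [set(index.scan_row(t.lower())) for t in terms]
    return sorted(set.intersection(*matches))


index = SortedTable()
index_document(index, "doc1", "Accumulo stores sorted keys")
index_document(index, "doc2", "Impala runs SQL over HDFS")
index_document(index, "doc3", "Accumulo tables can hold an index")

print(query(index, "accumulo"))
print(query(index, "accumulo", "index"))
```

In a real deployment the index would live in its own table (or its own
section of a table) and the matching document ids would then be used to fetch
the records themselves; the sketch leaves that second step out.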

Regarding the SQL piece, Accumulo does not natively support an SQL
interface. For that you would need to wrap it in a processing framework,
like Hive. To make a
shameless plug, Sqrrl also offers that functionality.


On Fri, May 3, 2013 at 12:39 PM, Drew Pierce <> wrote:

> does anyone have any anecdotal results (nothing formal) for queries to
> speak to the likes of impala and near low-latency.
> Sent from my Android
> Sorry if brief
