accumulo-user mailing list archives

From Marc Parisi <m...@accumulo.net>
Subject Re: Accumulo Direct Reader
Date Wed, 17 Oct 2012 14:03:24 GMT
RFileOperations.getInstance() returns an instance of FileOperations,
whose open reader method lets you open any arbitrary RFile. The harder
part may be locating the RFiles that hold a given row; however, this is
quite simple: go through the Metadata table and look up the RFiles
associated with the tablet that contains that row. By doing this you can
bypass the entire iterator stack. I have an example of this on my GitHub,
but in reality the methods I mentioned above are all you really need.
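
To make the Metadata lookup concrete, here is a minimal sketch of the
tablet-selection step only. It is an in-memory model and does not call
Accumulo: the class TabletFileLookup and its methods are hypothetical
names. It assumes the Metadata layout in which a tablet's row is
"<tableId>;<endRow>" (or "<tableId><" for the last tablet) and each RFile
appears as a column qualifier under the "file" family. It also assumes
ASCII rows, since Accumulo orders rows by bytes while Java Strings compare
by UTF-16 code unit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of one table's "file" entries from the
// Metadata table. Not part of the Accumulo API.
final class TabletFileLookup {
    // Tablet end row -> RFile paths listed for that tablet.
    private final TreeMap<String, List<String>> filesByEndRow = new TreeMap<>();
    // Files of the last tablet, which has no end row ("<tableId><").
    private final List<String> lastTabletFiles = new ArrayList<>();

    // metadataRow: e.g. "3;g" or "3<" (assumed layout, see lead-in).
    // fileQualifier: the RFile path stored as the column qualifier.
    void addMetadataEntry(String metadataRow, String fileQualifier) {
        int sep = metadataRow.indexOf(';');
        if (sep >= 0) {
            String endRow = metadataRow.substring(sep + 1);
            filesByEndRow.computeIfAbsent(endRow, k -> new ArrayList<>())
                         .add(fileQualifier);
        } else if (metadataRow.endsWith("<")) {
            lastTabletFiles.add(fileQualifier);
        } else {
            throw new IllegalArgumentException("Not a tablet row: " + metadataRow);
        }
    }

    // A tablet covers (previous end row, end row], so the tablet holding
    // `row` is the one with the smallest end row >= row. If there is no
    // such end row, the row falls in the last tablet.
    List<String> filesForRow(String row) {
        Map.Entry<String, List<String>> tablet = filesByEndRow.ceilingEntry(row);
        List<String> files = (tablet != null) ? tablet.getValue() : lastTabletFiles;
        return Collections.unmodifiableList(files);
    }

    public static void main(String[] args) {
        TabletFileLookup lookup = new TabletFileLookup();
        // Made-up sample entries for a table with id "3".
        lookup.addMetadataEntry("3;g", "/t-0001/F0000.rf");
        lookup.addMetadataEntry("3;p", "/t-0002/F0001.rf");
        lookup.addMetadataEntry("3<", "/default_tablet/F0002.rf");
        for (String row : new String[] {"apple", "g", "h", "zebra"}) {
            System.out.println(row + " -> " + lookup.filesForRow(row));
        }
    }
}
```

The paths returned for a row are the ones you would then hand to the
reader obtained through RFileOperations.getInstance(). In a real client
the entries would come from a scan of the Metadata table rather than from
hard-coded strings.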

On Wed, Oct 17, 2012 at 9:46 AM, Denis <denis@camfex.cz> wrote:

>     Hi.
>
>     I am thinking about creating a Direct Reader for Accumulo.
>
>     A library which has an API compatible with the Accumulo client but
> reads .rf-files directly from HDFS, bypassing tservers.
>
>     Motivation is:
>
>     1. To be able to quickly read stale data when the
> tserver is busy (with re-balancing, reading logs, etc.) or has just gone
> down and its tablets are not redistributed yet.
>
>     2. If the table is read-only or can afford eventual consistency,
> many readers can work in parallel with no bottleneck of tserver. Also,
> the table's data becomes local on three (number of HDFS replicas)
> servers instead of one.
>
>     3. Distribution of data: analysts can download .rf-files (even to
> a laptop) and run their software locally.
>
>     Any suggestions ?
>
>     Thanks.
>
