accumulo-user mailing list archives

From John Vines <john.w.vi...@ugov.gov>
Subject Re: querying the tablet server for given row (to get locality)?
Date Sun, 01 Jul 2012 18:37:09 GMT
The tablet location is stored in the !METADATA table with the column family
loc. You can use that information to have locality for your external
processes. Keep in mind that the master will migrate tablets around, so you
will have to periodically recheck to make sure your locality is still
present.
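
A rough sketch of the client side, in case it helps. It assumes you have
already read the loc entries for your table out of !METADATA and reduced them
to a map of tablet end row to "host:port"; that scan is not shown here. The
class name TabletLocationCache and its methods are made up for illustration
and are not part of the Accumulo API. It also compares rows as Java Strings,
which only matches Accumulo's byte ordering for plain ASCII row IDs.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the Accumulo client API.
final class TabletLocationCache {
    // Tablet end row (inclusive) -> "host:port" of the tablet server hosting it.
    private TreeMap<String, String> byEndRow = new TreeMap<>();
    // Location of the table's last tablet, which has no end row.
    private String lastTabletLocation;

    // Swap in a fresh snapshot, e.g. one rebuilt from a new scan of the
    // loc column family. Call this periodically, since tablets migrate.
    synchronized void reload(Map<String, String> locationsByEndRow, String lastLocation) {
        byEndRow = new TreeMap<>(locationsByEndRow);
        lastTabletLocation = lastLocation;
    }

    // A tablet holds the rows in (previous end row, end row], so the owner of
    // a row is the tablet with the smallest end row >= that row.
    // Assumption: String ordering stands in for Accumulo's byte ordering.
    synchronized String locate(String row) {
        Map.Entry<String, String> tablet = byEndRow.ceilingEntry(row);
        return tablet != null ? tablet.getValue() : lastTabletLocation;
    }

    public static void main(String[] args) {
        TabletLocationCache cache = new TabletLocationCache();
        // Made-up split points and server addresses.
        Map<String, String> snapshot = new TreeMap<>();
        snapshot.put("g", "ts1:9997");
        snapshot.put("p", "ts2:9997");
        cache.reload(snapshot, "ts3:9997");

        System.out.println(cache.locate("apple")); // falls in the tablet ending at "g"
        System.out.println(cache.locate("zebra")); // past the last split: last tablet
    }
}
```

Your Akka nodes could then route each row to the worker running on the host
that locate() returns, and call reload() on a timer to pick up migrations.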

John

On Sun, Jul 1, 2012 at 2:20 PM, William Slacum <wslacum@gmail.com> wrote:

> A tablet will contain at minimum one row. So, if you shard/partition,
> eventually your data will grow to the point that each tablet will
> essentially be one row.
> On Jul 1, 2012 2:17 PM, "Sukant Hajra" <qn2b6c2b9w@snkmail.com> wrote:
>
>> I've been considering using a distributed messaging service (Akka in my
>> case).
>> To get some throughput on ingesting data, I was going to shard computation
>> across multiple servers, but the backend is still Accumulo.
>>
>> What bothers me is that I don't know the mapping from row IDs to tablet
>> servers, so every one of my nodes is talking ostensibly to every tablet
>> server, which is a lot of needless network traffic.
>>
>> What I'd really like to do is collocate my computation on the relevant
>> tablet server to get the same benefits of locality Accumulo gets with HDFS.
>>
>> I feel Accumulo has to have this information internally, but I haven't dug
>> deeply into the source to see if it's exposed to Accumulo clients.  Is it
>> there?  If it is exposed, is it supported?
>>
>> Thanks for the help,
>> Sukant
>>
>
