accumulo-user mailing list archives

From Sean Busbey <busbey+li...@cloudera.com>
Subject Re: ISAM file location vs. read performance
Date Sun, 12 Jan 2014 23:17:39 GMT
On Sun, Jan 12, 2014 at 4:42 PM, William Slacum <
wilhelm.von.cloud@accumulo.net> wrote:

> Some data on short circuit reads would be great to have.
>
>
What kind of data are you looking for? Just HDFS read rates, or
specifically Accumulo read rates when it is set up to make use of
short-circuit reads?



> I'm unsure of how correct the "compaction leading to eventual locality"
> postulation is. It seems, to me at least, that in the case of a multi-block
> file, the file system would eventually try to distribute those blocks
> rather than leave them all on a single host.
>
>
>

I know in HBase setups, it's common to either disable the HDFS Balancer
entirely or just disable it for a namespace containing the part of the
filesystem that holds the HBase data. Otherwise, when blocks are moved off
to other hosts, you get degraded read performance until compaction happens
again and rewrites the files locally. I would expect the same thing ought
to be done for Accumulo.
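
To make the locality argument concrete, here is a minimal Python sketch. It
is not taken from any HBase or Accumulo tooling; the function name, the host
names, and the block layouts are all made up for illustration. It just
computes what fraction of a file's blocks still have a replica on the
server that reads them, before and after a hypothetical balancer run.

```python
def local_block_fraction(block_hosts, local_host):
    """Return the fraction of blocks with at least one replica on local_host.

    block_hosts is a list with one entry per block; each entry is the set of
    hosts holding a replica of that block.
    """
    if not block_hosts:
        return 0.0
    local = sum(1 for hosts in block_hosts if local_host in hosts)
    return local / len(block_hosts)


# Hypothetical layout right after a compaction on host "ts1": every block
# has one replica on the host that wrote the file.
after_compaction = [
    {"ts1", "dn4", "dn7"},
    {"ts1", "dn2", "dn5"},
    {"ts1", "dn3", "dn6"},
    {"ts1", "dn2", "dn8"},
]

# Hypothetical layout after a balancer run moved two of the local replicas
# to another host ("dn9").
after_balancer = [
    {"ts1", "dn4", "dn7"},
    {"dn9", "dn2", "dn5"},
    {"ts1", "dn3", "dn6"},
    {"dn9", "dn2", "dn8"},
]

print(local_block_fraction(after_compaction, "ts1"))  # 1.0
print(local_block_fraction(after_balancer, "ts1"))    # 0.5
```

In this made-up case, half of the reads from "ts1" would have to go over the
network until the next compaction rewrites the file. On a real cluster you
would take the per-block host lists from HDFS itself rather than hard-coding
them.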
