accumulo-user mailing list archives

From Billie Rinaldi <billie.rina...@gmail.com>
Subject Re: How does Accumulo process a r-files for bulk ingesting?
Date Wed, 07 Oct 2015 14:49:23 GMT
I think that calculation assumes that the table has exactly one tablet per
server.  If that's not the case, the bound would be lower, like
max(0, n - m * (number of tablets per server)).
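
A minimal sketch of the arithmetic discussed in this thread, in case it helps. The function name and parameters are hypothetical, not part of Accumulo, and it assumes every block of the r-file lives on the same m nodes and that each of those nodes hosts the same number of the affected tablets:

```python
def min_nonlocal_tablets(n, m, tablets_per_server=1):
    """Lower bound on the tablets that cannot read the r-file locally.

    n: number of tablets whose range intersects the r-file
    m: HDFS replication factor (nodes holding a copy of the file)
    tablets_per_server: affected tablets hosted on each server (assumed uniform)
    """
    if n < 0 or m < 1 or tablets_per_server < 1:
        raise ValueError("n must be >= 0; m and tablets_per_server must be >= 1")
    # At most m * tablets_per_server tablets can sit on a node with a replica.
    return max(0, n - m * tablets_per_server)


if __name__ == "__main__":
    print(min_nonlocal_tablets(10, 3))      # one tablet per server
    print(min_nonlocal_tablets(10, 3, 2))   # two tablets per server
```

With one tablet per server this reduces to the max(0, n - m) figure from the quoted message below; more tablets per server lowers the bound.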

On Wed, Oct 7, 2015 at 6:50 AM, Jeff Kubina <jeff.kubina@gmail.com> wrote:

> So if the HDFS has a replication factor of m and an r-file has a range
> that intersects n tablets, then data-locality will never be achieved for
> max(0,n-m) of the r-files, that is, they will never be on the same node as
> their tablet server until compaction, correct?
>
> --
> Jeff Kubina
> 410-988-4436
>
>
> On Wed, Oct 7, 2015 at 9:35 AM, Josh Elser <josh.elser@gmail.com> wrote:
>
>>
>> On Oct 7, 2015 8:47 AM, "Jeff Kubina" <jeff.kubina@gmail.com> wrote:
>> >
>> > How does Accumulo process an r-file for bulk ingesting when the key
>> range of an r-file is within one tablet's key range and when the key range
>> of an r-file spans two or more tablets?
>> >
>> > If the r-file is within one tablet's range I thought the file was "just
>> renamed" and added to the tablet's list of r-files. Is that correct?
>>
>> Bingo
>>
>> > If the key range of the r-file spans two or more tablets is the r-file
>> partitioned into separate r-files for each appropriate tablet server or are
>> the records "batch-written" to each appropriate tablet in memory?
>>
>> They're logically partitioned if memory serves (the files are not
>> rewritten). So you would see multiple entries in the metadata table for a
>> single file with certain offsets. No replaying of mutations by batch
>> writers.
>>
>
>
