accumulo-user mailing list archives

From Christopher <ctubb...@apache.org>
Subject Re: How does Accumulo process a r-files for bulk ingesting?
Date Fri, 16 Oct 2015 21:18:09 GMT
Sorry... I clicked "Send" too quickly. The answer is yes: the tablet for
the range 2-3 will also be assigned this file, because we only check the
bounds of the RFile when determining where to assign it. The bounds are
stored in the RFile's metadata within its footer, so we don't have to read
the whole file during assignment. The file will continue to be associated
with that tablet, even though that tablet doesn't use any of its data,
until that tablet compacts and removes its reference.
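
To illustrate the bounds-only check described above, here is a rough sketch in Python. It is not Accumulo's actual code: the function names, the (prev_end, end] tablet convention, and the sample rows are all assumptions made up for this example. It only shows why a tablet lying between two populated ranges still ends up with the file.

```python
from typing import List, Optional, Tuple

# Hypothetical model: a tablet covers rows in (prev_end, end];
# None means the tablet is unbounded on that side.
Tablet = Tuple[Optional[str], Optional[str]]


def tablets_from_splits(splits: List[str]) -> List[Tablet]:
    """Turn a list of split points into consecutive tablet ranges."""
    bounds: List[Optional[str]] = [None] + sorted(splits) + [None]
    return list(zip(bounds[:-1], bounds[1:]))


def overlaps(tablet: Tablet, first_row: str, last_row: str) -> bool:
    """True if the file's [first_row, last_row] bounds intersect the tablet."""
    prev_end, end = tablet
    starts_before_tablet_end = end is None or first_row <= end
    ends_after_tablet_start = prev_end is None or last_row > prev_end
    return starts_before_tablet_end and ends_after_tablet_start


def assign_by_bounds(splits: List[str], first_row: str, last_row: str) -> List[Tablet]:
    """Assign the file using only its first and last row, never its contents."""
    return [t for t in tablets_from_splits(splits) if overlaps(t, first_row, last_row)]


def tablets_with_data(splits: List[str], rows: List[str]) -> List[Tablet]:
    """For comparison: tablets that actually contain at least one row of the file."""
    def contains(t: Tablet, row: str) -> bool:
        prev_end, end = t
        return (prev_end is None or row > prev_end) and (end is None or row <= end)
    return [t for t in tablets_from_splits(splits) if any(contains(t, r) for r in rows)]


# Splits 0..4 as in the question; the file holds rows only in 1-2 and 3-4.
splits = ["0", "1", "2", "3", "4"]
file_rows = ["1.2", "1.8", "3.1", "3.9"]  # made-up sample rows

assigned = assign_by_bounds(splits, min(file_rows), max(file_rows))
populated = tablets_with_data(splits, file_rows)

print("assigned by bounds:", assigned)
print("actually holding data:", populated)
```

In this sketch the tablet ("2", "3") appears in the first list but not in the second, which matches the situation described above: it carries a reference to the file without using any of its data.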

On Fri, Oct 16, 2015 at 5:14 PM Christopher <ctubbsii@apache.org> wrote:

> Yes.
>
> On Fri, Oct 16, 2015 at 4:00 PM Jeff Kubina <jeff.kubina@gmail.com> wrote:
>
>> So if a table has splits 0, 1, 2, 3, 4 and I create an rfile with only
>> splits in the range 1-2 and 3-4, after bulk ingesting will the tablet with
>> range 2-3 also be assigned the rfile?
>>
>> --
>> Jeff Kubina
>> 410-988-4436
>>
>>
>> On Wed, Oct 7, 2015 at 12:05 PM, Jeff Kubina <jeff.kubina@gmail.com>
>> wrote:
>>
>>> Moved this thread to dev@.
>>>
>>> --
>>> Jeff Kubina
>>> 410-988-4436
>>>
>>>
>>> On Wed, Oct 7, 2015 at 11:18 AM, Josh Elser <josh.elser@gmail.com>
>>> wrote:
>>>
>>>> I'd say email is good for discussion on design (dev@ instead of user@
>>>> would probably be more appropriate though). JIRA works better when it gets
>>>> down to implementation of some design.
>>>>
>>>> That said, I don't think anyone is going to be upset with you whichever
>>>> you choose :)
>>>>
>>>> Jeff Kubina wrote:
>>>>
>>>>> Okay, shall I flesh out some details of the performance testing code in
>>>>> this thread or a JIRA?
>>>>>
>>>>> --
>>>>> Jeff Kubina
>>>>> 410-988-4436
>>>>>
>>>>>
>>>>> On Wed, Oct 7, 2015 at 10:57 AM, Josh Elser <josh.elser@gmail.com> wrote:
>>>>>
>>>>>     Jeff Kubina wrote:
>>>>>
>>>>>         So has testing been done to determine how much a lack of data
>>>>>         locality
>>>>>         of bulk ingest files affects query performance?
>>>>>
>>>>>
>>>>>     I am not aware of any such writeup. I'm positive we'd be happy to
>>>>>     accept any findings you observe.
>>>>>
>>>>>
>>>>>
>>>
>>
