accumulo-user mailing list archives

From Eric Newton <eric.new...@gmail.com>
Subject Fwd: Tuning & Compactions
Date Thu, 06 Dec 2012 21:07:39 GMT
Keith noted that my response didn't go back to the whole list.

-Eric

---------- Forwarded message ----------
From: Eric Newton <eric.newton@gmail.com>
Date: Tue, Dec 4, 2012 at 2:25 PM
Subject: Re: Tuning & Compactions
To: chris@burrell.me.uk


By "small indexes"... I mean they are small to read off disk.  If you write
a gigabyte of indexes, it's going to take some time to read them into RAM.
The index is a sub-set of all the keys in the RFile.  If you have lots of
keys in the index, the lookups can be faster, but it takes more time to
load those keys into RAM.  Keep your keys small, and try to keep the
sub-set of keys in the index small so that first lookup is fast.  A million
index keys for a billion key/values is not unreasonable.  We have used even
smaller ratios, especially when the files to be imported are constructed to
fit the current split points.
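
As a rough sanity check on that ratio, here is a minimal sketch of the
arithmetic. The 50-byte average key size is an assumption for illustration
only and does not come from this thread.

```python
# Back-of-the-envelope index sizing for the ratio mentioned above.
total_entries = 1_000_000_000   # key/values across the files
index_keys = 1_000_000          # keys kept in the RFile index
avg_key_bytes = 50              # ASSUMED average serialized key size

# How many key/values each index key has to cover.
entries_per_index_key = total_entries // index_keys

# Approximate bytes that must be read into RAM before the first lookup.
index_bytes = index_keys * avg_key_bytes

print(f"one index key per {entries_per_index_key} entries")
print(f"about {index_bytes / 1e6:.0f} MB of index to load")
```

Shrinking either the key size or the number of index keys shrinks the
amount that has to be loaded before the first lookup.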

You can have an effectively unlimited number of families and qualifiers.  However, if
you ever want to put families into locality groups, it's easier to
configure them if the number of families you want in the group is a small
number.  A group separates families by name.

Using the example from the Google Bigtable paper: you can store small
indexed items, like URLs, separately from large value items, like whole web
pages, which will give you faster search over the small items, while
logically keeping them in the same sorted index.  URLs would go into one
group, which would be stored separately from another group containing the
whole web page and maybe something like image data.  A search on URLs would
not need to decompress and skip over large values while scanning.  Further,
URLs are more similar to one another than they are to images, and so are
likely to compress better when stored together.

To complicate things further, Accumulo does not create separate files for
each family group, as implied in the BigTable paper.  They are stored in
separate sections of the RFile.  They are also created lazily: as the data
is re-written, they will gradually be organized according to the locality
group specifications.  You can force a re-write, if you like.
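
The idea of grouping families into separate sections can be illustrated
with a toy model. This is only a sketch in plain Python: the group names,
family names, and the write_sections helper are hypothetical, and it does
not reflect the actual RFile layout or any Accumulo API.

```python
from collections import defaultdict

# HYPOTHETICAL group spec: group name -> column families in that group.
LOCALITY_GROUPS = {"urls": {"url"}, "content": {"page", "image"}}

def group_for(family):
    """Return the locality group a family belongs to, or 'default'."""
    for name, families in LOCALITY_GROUPS.items():
        if family in families:
            return name
    return "default"

def write_sections(entries):
    """Split (row, family, qualifier, value) entries into one sorted
    section per locality group."""
    sections = defaultdict(list)
    for entry in sorted(entries):
        sections[group_for(entry[1])].append(entry)
    return dict(sections)

entries = [
    ("com.example/a", "page", "html", "<html>large page body</html>"),
    ("com.example/a", "url", "anchor", "http://example.com/a"),
    ("com.example/b", "url", "anchor", "http://example.com/b"),
    ("com.example/b", "image", "png", "large image bytes"),
]
sections = write_sections(entries)

# A scan over URLs reads only the small section and never touches the
# large page and image values.
for row, family, qualifier, value in sections["urls"]:
    print(row, family, qualifier, value)
```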

If you find yourself wanting to put extensions in the column family that
have nothing to do with locality groups, just move them over to the column
qualifier.  We put carefully structured, binary data in the column
qualifier all the time.
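
One way to build such a structured qualifier is sketched below. The field
layout (a timestamp plus a sequence number) and the function names are
assumptions made up for illustration; the only point is that fixed-width
big-endian fields keep byte-wise sort order in line with numeric order.

```python
import struct

# ASSUMED layout: unsigned 64-bit timestamp, then unsigned 32-bit sequence.
# ">" selects big-endian with no padding, so comparing the packed bytes
# lexicographically gives the same order as comparing the numbers.
QUALIFIER_FORMAT = ">QI"

def make_qualifier(timestamp_s, sequence):
    """Pack the two fields into a 12-byte qualifier."""
    return struct.pack(QUALIFIER_FORMAT, timestamp_s, sequence)

def parse_qualifier(raw):
    """Unpack a qualifier back into (timestamp_s, sequence)."""
    return struct.unpack(QUALIFIER_FORMAT, raw)

q1 = make_qualifier(1354655100, 7)
q2 = make_qualifier(1354655100, 8)

print(len(q1))                # packed size in bytes
print(parse_qualifier(q1))    # round trip back to the fields
print(q1 < q2)                # byte order follows numeric order
```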

-Eric



On Tue, Dec 4, 2012 at 1:06 PM, Chris Burrell <chris@burrell.me.uk> wrote:

> Thanks for all the comments below. Very helpful!
>
> On the last point, around "small indexes", do you mean that your set of keys
> is small, but has many column families and column qualifiers? What order
> of magnitude would you consider to be small? A few million keys/billion
> keys? Or, put another way, keys with 10s/100s of column families/qualifiers?
>
> I have another question around the use of column families and qualifiers.
> Would it be good or bad practice to have many column families/qualifiers
> per row?  I was just wondering if there would be any point in using these
> almost as extensions to the keys, i.e. the column family/qualifier would
> end up being the last part of the key. I understand column families can
> also be used to control how the data gets stored to maximize scanning too.
> I was just wondering if there would be drawbacks on having many of these.
>
> Chris
>
>
>
> On 28 November 2012 20:31, Eric Newton <eric.newton@gmail.com> wrote:
>
>> Some comments inlined below:
>>
>> On Wed, Nov 28, 2012 at 2:49 PM, Chris Burrell <chris@burrell.me.uk> wrote:
>>
>>> Hi
>>>
>>> I am trialling Accumulo on a small (tiny) cluster and wondering what the
>>> best way to tune it would be. I have 1 master + 2 tservers. The master has
>>> 8 GB of RAM and the tservers have 16 GB each.
>>>
>>> I have set the walog size to 2 GB with an external memory map of 9 GB.
>>> The ratio is still at its default of 3. I've also upped the heap size of
>>> each tserver to 2 GB.
>>>
>>> I'm trying to achieve high-speed ingest via batch writers held on
>>> several other servers. I'm loading two separate tables.
>>>
>>> Here are some questions I have:
>>> - Does the config above sound sensible? or overkill?
>>>
>>
>> Looks good to me, assuming you aren't doing other things (like
>> map/reduce) on the machines.
>>
>>
>>> - Is it preferable to have more servers with lower specs?
>>>
>> Yes.  Mostly to get more drives.
>>
>>
>>> - Is this the best way to maximise use of the memory?
>>>
>> It's not bad.  You may want to have larger block caches and a smaller
>> in-memory map.  But if you want to write-mostly, read-little, this is good.
>>
>>
>>> - Does the fact that I have 3x2 GB walogs mean that the remaining 3 GB in
>>> the external memory map can be used while compactions occur?
>>>
>>
>> Yes.  You will want to increase the size or number of logs.  With that
>> many servers, failures will hopefully be very rare.  I would go with
>> changing 3 to 8.  Having lots of logs on a tablet is no big deal if you
>> have disk space, and don't expect many failures.
>>
>>
>>> - When minor compactions occur, does this halt ingest on that particular
>>> tablet? or tablet server?
>>>
>> Only if memory fills before the compactions finish. The monitor page will
>> indicate this by displaying "hold time."  When this happens the tserver
>> will self-tune and start minor compactions earlier with future ingest.
>>
>>
>>> - I have pre-split the tables six-ways, but not entirely sure if that's
>>> preferable if I only have 2 servers while trying it out? Perhaps 2 ways
>>> might be better?
>>>
>> Not for that reason, but to be able to use more cores concurrently.  Aim
>> for 50-100 tablets/node.
>>
>>
>>> - Does the batch upload through the shell client give significantly
>>> better performance stats?
>>>
>>
>> Using map/reduce to create RFiles is more efficient. But it also
>> increases latency: you can only see the data when the whole file is loaded.
>>
>> When a file is batch-loaded, its index is read, and the file is assigned
>> to matching tablets.  With small indexes, you can batch-load terabytes in
>> minutes.
>>
>> -Eric
>>
>>
>
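
The write-ahead log arithmetic from the quoted exchange (2 GB logs against
a 9 GB in-memory map, with the log count at 3 or raised to 8) can be
checked with a small sketch. The numbers come from the thread; the helper
name is hypothetical, and the sketch only does the sums without modelling
how the tserver schedules compactions.

```python
walog_size_gb = 2      # size of each write-ahead log, from the thread
memory_map_gb = 9      # in-memory map size, from the thread

def logged_gb(log_count):
    """Data covered by `log_count` full write-ahead logs, in GB."""
    return log_count * walog_size_gb

for log_count in (3, 8):
    covered = logged_gb(log_count)
    spare = memory_map_gb - covered
    if spare >= 0:
        print(f"{log_count} logs cover {covered} GB; "
              f"{spare} GB of the map is spare")
    else:
        print(f"{log_count} logs cover {covered} GB; "
              f"more than the {memory_map_gb} GB map")
```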
