accumulo-user mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: Write ahead log growth and reduction
Date Wed, 11 Apr 2012 15:18:41 GMT
When data is written to Accumulo, it is written both to memory and to the
write-ahead logs.  The data in memory is sorted, while the data in the
write-ahead logs is written as-is (unsorted).  Once the data in memory
is flushed to HDFS, the write-ahead logs that contain that data
are no longer needed.
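
A rough way to picture this is the toy Python model below. It is only a sketch of the bookkeeping described in this thread, not Accumulo code: the names ToyTablet, ToyServer, roll_wal and gc are all made up for illustration, and keeping the currently open log during gc is a simplifying assumption of the toy.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTablet:
    """Hypothetical stand-in for a tablet; not Accumulo's real class."""
    memory: dict = field(default_factory=dict)   # unflushed data, read back in sorted order
    wal_refs: set = field(default_factory=set)   # logs that still hold this tablet's unflushed data
    flushed: list = field(default_factory=list)  # sorted batches standing in for files on HDFS


class ToyServer:
    """Hypothetical model of one server writing to memory and a write-ahead log."""

    def __init__(self):
        self.current_wal = "wal-1"
        self.wals = {self.current_wal: []}  # log name -> entries in arrival (unsorted) order
        self.tablets = {}

    def write(self, tablet_name, key, value):
        tablet = self.tablets.setdefault(tablet_name, ToyTablet())
        # Every write goes to the log as-is and to the tablet's memory.
        self.wals[self.current_wal].append((tablet_name, key, value))
        tablet.memory[key] = value
        tablet.wal_refs.add(self.current_wal)

    def roll_wal(self, name):
        # Start appending to a new log; older logs stay until unreferenced.
        self.current_wal = name
        self.wals.setdefault(name, [])

    def flush(self, tablet_name):
        tablet = self.tablets[tablet_name]
        # Memory is written out sorted; the tablet then drops its log references.
        tablet.flushed.append(sorted(tablet.memory.items()))
        tablet.memory.clear()
        tablet.wal_refs.clear()

    def gc(self):
        # Delete logs that no tablet references (toy rule: keep the open log).
        referenced = set().union(*(t.wal_refs for t in self.tablets.values()))
        deleted = []
        for name in sorted(self.wals):
            if name not in referenced and name != self.current_wal:
                del self.wals[name]
                deleted.append(name)
        return deleted


server = ToyServer()
server.write("t1", "b", 1)
server.write("t2", "z", 2)
server.write("t1", "a", 3)
server.roll_wal("wal-2")
server.write("t2", "y", 4)

server.flush("t1")
print(server.gc())  # [] because t2 still references wal-1
server.flush("t2")
print(server.gc())  # ['wal-1'] once no tablet references it
```

The point of the model is that a log can only go away after every tablet with data in it has flushed, so a single tablet that never flushes can keep many logs alive.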

On Wed, Apr 11, 2012 at 11:12 AM, Eric Newton <eric.newton@gmail.com> wrote:
> Logs are only put into HDFS during a recovery.
>
> Flush removes references to WALs, and the accumulo gc will ask the loggers
> to delete them when there are no references to them.
>
> -Eric
>
>
> On Wed, Apr 11, 2012 at 11:02 AM, Kristopher Kane <kkane.list@gmail.com>
> wrote:
>>
>>
>>
>> On Wed, Apr 11, 2012 at 10:48 AM, Keith Turner <keith@deenlo.com> wrote:
>>>
>>> How big is the partition?  Are the same number of logger servers
>>> running as tablet servers?
>>>
>>>
>>> You can scan the metadata table to look for tablets that have a lot of
>>> write-ahead logs. I think the command below will show you how many
>>> write-ahead logs each tablet has.  Look for any tablets that have too
>>> many. I think it should sort the tablets with the most logs to the
>>> top, but I'm not positive.
>>>
>>>   ./bin/accumulo shell -u root -p secret -e 'scan -t !METADATA -c log' | cut -f 1 -d ' ' | uniq -c | sort -r -n
>>>
>>> I think the following command will show you how many active logs each
>>> logger has.  The counts should be even across loggers.
>>>
>>>   ./bin/accumulo shell -u root -p secret -e 'scan -t !METADATA -c log' | cut -f 2 -d ' ' | cut -d ':' -f 2 | sort | uniq -c
>>>
>>> You can use the "flush -p" command in the shell to force data in
>>> memory to disk and stop referencing write-ahead logs.  Maybe execute
>>> the commands above before and after flushing.
>>>
>>> Keith
>>>
>>>
>>
>>
>>
>> Thanks for the replies.  I read about the flush command in the docs but
>> didn't make the connection between "memory" and the write-ahead logs.  Is
>> that correct?  Does flush write the write-ahead log data to HDFS?
>>
>> Thanks!
>>
>> -Kris
>
>
