accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: why not check TTL interval
Date Sat, 06 Jun 2015 06:36:13 GMT
True, there would be a manual step required if you had new tables you
were adding (your original message said you only had one).

I'm not sure what you mean by interval-automatic by ttl value.
Compaction is the only operation which will trigger the iterators and
remove data past the TTL. Thus, the most logical feature to consider
adding would be scheduled compactions. You could then configure
compactions to run on certain intervals for each table. I think this
would be a good feature.
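
As a rough illustration of what interval-based compactions could look like when driven from a client rather than from crontab, here is a minimal sketch. The class `ScheduledCompactions` and its two callbacks are hypothetical and not part of Accumulo. The assumption is that `listTables` would wrap something like `connector.tableOperations().list()` and `compactTable` something like `connector.tableOperations().compact(table, null, null, true, true)`. Because the table list is re-read on every pass, a newly added table is picked up without a manual step.

```java
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not an Accumulo class.
final class ScheduledCompactions {
    // Assumed to return the current table names (e.g. via TableOperations.list()).
    private final Supplier<Set<String>> listTables;
    // Assumed to request a full compaction of one table (e.g. via TableOperations.compact()).
    private final Consumer<String> compactTable;

    ScheduledCompactions(Supplier<Set<String>> listTables, Consumer<String> compactTable) {
        this.listTables = listTables;
        this.compactTable = compactTable;
    }

    // One pass: re-list the tables so new ones are included, then compact each.
    // Returns the number of tables for which the compaction call succeeded.
    int runOnce() {
        int compacted = 0;
        for (String table : listTables.get()) {
            try {
                compactTable.accept(table);
                compacted++;
            } catch (RuntimeException e) {
                // Keep going so one failing table does not block the others.
                System.err.println("Compaction failed for " + table + ": " + e);
            }
        }
        return compacted;
    }

    // Repeats runOnce() at a fixed period; the caller shuts the returned pool down.
    ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        pool.scheduleAtFixedRate(() -> {
            try {
                runOnce();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs.
                System.err.println("Compaction pass failed: " + e);
            }
        }, period, period, unit);
        return pool;
    }
}
```

In practice the table list would probably need filtering (system tables, or tables without an age-off iterator), and a single shared period is a simplification; per-table intervals would need one schedule per table.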

The only notion of automatic-compactions we have now (that I'm aware
of) is relative to mutations being written to a table
(table.compaction.major.everything.idle).

On Sat, Jun 6, 2015 at 1:49 AM, Lu Qin <luq.java@gmail.com> wrote:
> Accumulo runs minor and major compactions based on a threshold value, so why not
> run age-off automatically at intervals based on the TTL value?
> If I do it with crontab, then whenever I add a new table I must update my crontab.
>
>
>> On June 6, 2015, at 13:37, Josh Elser <josh.elser@gmail.com> wrote:
>>
>> The decrease in performance you see is probably because the iterator must read a
>> significant amount of old data. If you don't write new data to a table, Accumulo will not
>> run any compactions and no data will age-off in the files on HDFS.
>>
>> I think it would be fairly common to use crontab to regularly schedule compactions
>> over your table so that data is automatically deleted (e.g. nightly). Accumulo doesn't contain
>> any means to automate this internally.
>>
>> Lu Qin wrote:
>>> I have a big table with about 38B entries, and I set an ageoff iterator on it with a TTL
>>> of about 3 days. I set the iteratorPriority to 10 and applied it in all scopes.
>>>
>>> I stopped writing data to it about a week ago, and now when I scan it, the scan takes a very long time.
>>> The monitor page shows a scan speed of 80w (about 800,000) entries/s.
>>>
>>> I think the ageoff iterator is different from other iterators: if all the data is past the
>>> TTL, then when I scan the table it will read all the data in the table and decide to remove it, right?
>>> Why not do this at regular intervals?
>>>
>>> Thanks
>
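
The behaviour discussed in this thread can be illustrated with a simplified model. It is only a sketch under stated assumptions: `AgeOffSketch` is a hypothetical class, entries are reduced to bare timestamps, and the expiry rule (age strictly greater than the TTL) is an assumption about how an age-off filter decides, not a copy of Accumulo's implementation. It shows why a scan over fully expired data is slow yet returns nothing: the filter still has to read every stored entry, and only a compaction rewrites the stored data without them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of TTL age-off; not Accumulo code.
final class AgeOffSketch {
    // Assumed rule: an entry is expired once its age exceeds the TTL.
    static boolean isExpired(long entryTimestampMs, long nowMs, long ttlMs) {
        return nowMs - entryTimestampMs > ttlMs;
    }

    // Scan: reads every stored entry and returns only the live ones.
    // The stored list itself is left untouched, so the next scan pays the same cost.
    static List<Long> scan(List<Long> stored, long nowMs, long ttlMs) {
        List<Long> live = new ArrayList<>();
        for (long ts : stored) {
            if (!isExpired(ts, nowMs, ttlMs)) {
                live.add(ts);
            }
        }
        return live;
    }

    // Compaction: applies the same filter, but the result replaces the stored data,
    // so expired entries are gone for all later scans.
    static List<Long> compact(List<Long> stored, long nowMs, long ttlMs) {
        return scan(stored, nowMs, ttlMs);
    }

    public static void main(String[] args) {
        long day = 24L * 60 * 60 * 1000;
        long now = 10 * day;
        long ttl = 3 * day;
        List<Long> stored = new ArrayList<>();
        stored.add(now - 7 * day);
        stored.add(now - 5 * day);

        System.out.println("scan returned: " + scan(stored, now, ttl).size());
        System.out.println("still stored after scan: " + stored.size());
        stored = compact(stored, now, ttl);
        System.out.println("stored after compaction: " + stored.size());
    }
}
```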
