accumulo-user mailing list archives

From John Vines <vi...@apache.org>
Subject Re: question on AgeOffFilter
Date Wed, 10 Jun 2015 18:10:04 GMT
First question -

You only attached the age-off filter at the minor compaction (minc) scope, not
at the scan scope (nor at the majc scope). This is why you still see it.

Second question -

The currentTime option is more applicable when you attach the filter at scan
time, I think, to ensure you have a consistent view across all servers.
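
As a rough illustration of the age-off decision, here is a simplified Python model. It is not Accumulo's Java source; the function name `accepts` and its parameters are made up for this sketch. An entry is kept while its age, measured against a reference time, does not exceed the ttl. The sketch assumes that when currentTime is left blank the filter falls back to the wall clock at the moment it is initialized for a scan or compaction, not the moment setiter was run, which would be consistent with the second entry in the quoted example being aged off.

```python
import time


def accepts(timestamp_ms, ttl_ms, current_time_ms=None):
    """Return True if an entry with this timestamp should be kept.

    Hypothetical model of an age-off check, not the real AgeOffFilter.
    """
    if current_time_ms is None:
        # Assumption: no fixed currentTime was configured, so read the clock now.
        current_time_ms = int(time.time() * 1000)
    # Keep the entry only while its age is within the ttl.
    return current_time_ms - timestamp_ms <= ttl_ms


now = int(time.time() * 1000)
two_hours_ago = now - 2 * 60 * 60 * 1000

print(accepts(two_hours_ago, 30_000))  # entry older than the ttl
print(accepts(now, 30_000))            # entry written just now
# A fixed reference time gives the same answer no matter when it is evaluated.
print(accepts(now, 30_000, current_time_ms=now + 60_000))
```

Passing an explicit reference time makes the outcome independent of each server's clock, which is the consistency point made above.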

Third question -

Attach the iterator at the scan and major compaction scopes and you should be
fine.
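
Since the ttl parameter is given in milliseconds (per the setiter prompt quoted below), a six-month retention has to be converted first. A minimal sketch of that arithmetic, assuming "six months" is approximated as 180 days (that approximation is mine, not something from the thread):

```python
from datetime import timedelta

# Assumption: six months is treated as 180 days.
SIX_MONTHS = timedelta(days=180)

# Integer division by one millisecond yields the ttl as a whole number of ms.
ttl_ms = SIX_MONTHS // timedelta(milliseconds=1)

print(ttl_ms)  # 15552000000
```

That value is what would be entered at the ttl prompt.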

On Wed, Jun 10, 2015 at 1:54 PM z11373 <z11373@outlook.com> wrote:

> Hi,
> I have questions on using AgeOffFilter. Earlier I ran the following:
>
> root@dev> createtable testiter
> root@dev testiter> insert 1 cf col1 foo
> root@dev testiter> scan
> 1 cf:col1 []    foo
>
> Then two hours later, I ran:
>
> root@dev testiter> setiter -ageoff -t testiter -p 15 -minc
> AgeOffFilter removes entries with timestamps more than <ttl> milliseconds
> old
> ----------> set AgeOffFilter parameter negate, default false keeps k/v that
> pass accept method, true rejects k/v that pass accept method:
> ----------> set AgeOffFilter parameter ttl, time to live (milliseconds):
> 30000
> ----------> set AgeOffFilter parameter currentTime, if set, use the given
> value as the absolute time in milliseconds as the current time of day:
>
> root@dev testiter> scan
> 1 cf:col1 []    foo
>
> root@dev testiter> flush -w
> 2015-06-10 17:10:12,124 [shell.Shell] INFO : Flush of table testiter
> completed.
>
> root@dev testiter> scan
> 1 cf:col1 []    foo
>
> *First question*: why does that key/value still exist? I'd think that since
> I set the TTL to 30 seconds, and that key/value was created more than 2
> hours ago, it should be gone after the table flush (minc).
>
> Then later I did the following:
>
> root@dev testiter> insert 2 cf col1 bar
> root@dev testiter> scan
> 1 cf:col1 []    foo
> 2 cf:col1 []    bar
>
> I waited for more than 30 seconds, then ran:
>
> root@dev testiter> flush -w
> 2015-06-10 17:16:38,903 [shell.Shell] INFO : Flush of table testiter
> completed.
> root@dev testiter> scan
> 1 cf:col1 []    foo
>
> This is correct, as the second key/value pair no longer exists, but why is
> the first one still there?
>
> *Second question*: I still don't fully understand the currentTime argument.
> Since I didn't specify any long value (when prompted), I'd assume it took
> the current time when I set the iterator. Is that true? I am not sure,
> because if that were the case, key/value items inserted later wouldn't get
> aged off, since they would have a later timestamp than the value set by the
> iterator. That is also not what happened in my example above (in which the
> second item was gone). I hope someone can enlighten me on this.
>
> *Third question*, which is related to the second: I want data in a table to
> be retained for 6 months, i.e. if compaction runs every day, all key/value
> items with a timestamp more than six months older than that day should be
> gone. How can I achieve this? I guess that AgeOffFilter is the right way to
> do it, but the results from #1 and #2 above are confusing me, and I think it
> doesn't work the way I wanted.
>
>
> Thanks,
> Z
>
>
>
> --
> View this message in context:
> http://apache-accumulo.1065345.n5.nabble.com/question-on-AgeOffFilter-tp14386.html
> Sent from the Users mailing list archive at Nabble.com.
>
