accumulo-user mailing list archives

From David Medinets <david.medin...@gmail.com>
Subject Using Iterator To Toss Unchanged Values
Date Thu, 12 Jul 2012 12:47:41 GMT
I'd like to track field-level changes for a given record (say,
author). So I create a table without a VersioningIterator and
insert a few records:

insert "JOHN" "ATTRIBUTE" "AGE" "34"
insert "JOHN" "ATTRIBUTE" "HEIGHT" "67"
insert "JOHN" "BOOKS" "TITLE" "THE RISE OF ACCUMULO"

Next, some ingest process runs and does this:

insert "JOHN" "ATTRIBUTE" "AGE" "34"

Since there is no VersioningIterator, there are now two AGE entries,
both with "34" as the value.

I would like a DropUnchangedValueIterator which removes the last
inserted record when its value is unchanged. Removing the last record
lets me use the n-1 timestamp as a LastUpdated value for the key-value
pair. But is it true that as soon as a record is deleted, the previous
records are no longer available? What if the timestamp is set to
MAX-timestamp so the records are sorted backwards? Does that avoid the
blocking tombstones? I'd look at the source code before asking, but I
don't have that luxury for the next week or two and the question is
rattling around my head.
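
As a rough sketch of the filtering described above, here is the logic in
plain Java. It does not use any Accumulo classes and is not a real
iterator implementation: Entry and dropUnchanged are hypothetical names,
and the only assumption carried over from Accumulo is that entries for
the same column arrive newest timestamp first.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch: no Accumulo classes are used. Entry and dropUnchanged
// are hypothetical names that only model the filtering logic.
final class DropUnchangedSketch {

    // One key-value pair: row, column family, column qualifier, timestamp, value.
    static final class Entry {
        final String row, family, qualifier, value;
        final long timestamp;

        Entry(String row, String family, String qualifier, long timestamp, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }

        // True when both entries are versions of the same row/family/qualifier.
        boolean sameColumn(Entry other) {
            return row.equals(other.row)
                    && family.equals(other.family)
                    && qualifier.equals(other.qualifier);
        }

        @Override
        public String toString() {
            return row + " " + family + ":" + qualifier + " @" + timestamp + " = " + value;
        }
    }

    // Input is assumed to be sorted by row, family, qualifier, then newest
    // timestamp first. A version is dropped when the next older version of
    // the same column holds the same value.
    static List<Entry> dropUnchanged(List<Entry> sorted) {
        List<Entry> kept = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            Entry current = sorted.get(i);
            boolean hasOlder = i + 1 < sorted.size()
                    && sorted.get(i + 1).sameColumn(current);
            if (hasOlder && sorted.get(i + 1).value.equals(current.value)) {
                continue; // re-insert of an unchanged value: skip it
            }
            kept.add(current);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Entry> scan = new ArrayList<>();
        scan.add(new Entry("JOHN", "ATTRIBUTE", "AGE", 2L, "34"));    // re-insert
        scan.add(new Entry("JOHN", "ATTRIBUTE", "AGE", 1L, "34"));    // original
        scan.add(new Entry("JOHN", "ATTRIBUTE", "HEIGHT", 1L, "67"));
        scan.add(new Entry("JOHN", "BOOKS", "TITLE", 1L, "THE RISE OF ACCUMULO"));
        for (Entry e : dropUnchanged(scan)) {
            System.out.println(e);
        }
    }
}
```

With the example records, the AGE entry at timestamp 2 is skipped and the
one at timestamp 1 survives, so its timestamp can serve as LastUpdated.
Nothing is deleted in this sketch; the unchanged version is only hidden
from the result, so no tombstone is involved.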

Naturally, I could query the database before the ingest insert. But,
referring to slide 19 in Adam's presentation at
http://people.apache.org/~afuchs/slides/accumulo_table_design.pdf, the
read-modify-write design is not optimal.
