accumulo-dev mailing list archives

From Aaron Cordova
Subject Re: [jira] [Commented] (ACCUMULO-227) Improve in memory map counts to provide cell level uniqueness for repeated columns in mutation
Date Thu, 22 Dec 2011 21:49:25 GMT
I think it's fine to consider different versions of 'identical keys', meaning keys with the
same row, colfam, colqual, because in that case the implementation still treats two keys that
differ only by timestamp as two unique keys. But I don't think we should allow multiple
identical _versions_ of identical keys, to use your terminology. I think we should throw all
but one away if the user does happen to try to insert them. If the user wants to aggregate
across values, he or she must use different version numbers or timestamps or whatever.
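
To make the 'throw all but one away' rule concrete, here is a minimal sketch in plain Java.
It does not use any Accumulo classes: VersionDedupSketch and its methods are hypothetical
names, and keeping the most recently inserted value is an assumption, since the rule above
does not say which copy should survive.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not an Accumulo class.
final class VersionDedupSketch {
    // Maps (row, colfam, colqual, timestamp) -> value.
    private final Map<List<Object>, String> cells = new LinkedHashMap<>();

    private static List<Object> key(String row, String colFam, String colQual, long ts) {
        return Arrays.<Object>asList(row, colFam, colQual, ts);
    }

    // Inserting the same key and timestamp twice keeps a single entry.
    // Assumption: the later value replaces the earlier one.
    void insert(String row, String colFam, String colQual, long ts, String value) {
        cells.put(key(row, colFam, colQual, ts), value);
    }

    String get(String row, String colFam, String colQual, long ts) {
        return cells.get(key(row, colFam, colQual, ts));
    }

    int size() {
        return cells.size();
    }
}
```

With this rule, two inserts that differ only by timestamp remain two entries, while two
inserts with the same timestamp collapse into one.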

If generating unique timestamps within mutations that want to perform several updates to the
same row, colfam, colqual is a problem, why don't we allow the user to 'put()' multiple updates
into a mutation, and then have the server assign slightly different timestamps to the identical
row, colfam, colqual triples that are found in that mutation? Would that make everyone happy?
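
A rough sketch of that server-side assignment, in plain Java rather than against the real
Accumulo classes. TimestampSketch, Update, assignTimestamps and baseTime are all hypothetical
names, and using baseTime + n for the n-th repeat of a column is just one possible reading of
'slightly different timestamps'.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not an Accumulo class.
final class TimestampSketch {

    // One column update inside a mutation; the row is shared by the whole mutation.
    static final class Update {
        final String colFam;
        final String colQual;
        final String value;
        long timestamp; // filled in by assignTimestamps

        Update(String colFam, String colQual, String value) {
            this.colFam = colFam;
            this.colQual = colQual;
            this.value = value;
        }
    }

    // The first update to a column gets baseTime, the next update to the same
    // column gets baseTime + 1, and so on. Columns that appear only once in
    // the mutation all get baseTime.
    static void assignTimestamps(List<Update> updates, long baseTime) {
        Map<List<String>, Integer> seen = new HashMap<>();
        for (Update u : updates) {
            List<String> column = Arrays.asList(u.colFam, u.colQual);
            int repeat = seen.merge(column, 1, Integer::sum) - 1;
            u.timestamp = baseTime + repeat;
        }
    }
}
```

In this sketch later puts get larger timestamps, so the last put for a column ends up as the
newest version. A real server would presumably also have to keep the base time of the next
mutation ahead of any offsets handed out here; that part is not shown.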

On Dec 22, 2011, at 4:35 PM, Keith Turner wrote:

> Bigtable has versions.  Does the Bigtable paper actually describe
> the behavior of inserting two identical keys at different times when
> the table is set to show two versions?  If these keys were in two
> separate map files/sstables then something would have to make a
> decision to suppress one of them.  I am not sure the Bigtable paper
> got that specific.  You could suppress one of the keys, or just
> consider them to be two versions.  We have been considering them to be
> versions.