Subject: Re: [jira] [Commented] (ACCUMULO-227) Improve in memory map counts to provide cell level uniqueness for repeated columns in mutation
From: Aaron Cordova
To: accumulo-dev@incubator.apache.org
Date: Thu, 22 Dec 2011 16:49:25 -0500
Message-Id: <87F7F0EB-827E-483B-A510-EE450EEAA5BA@cordovas.org>
References: <329427523.36572.1324494691127.JavaMail.tomcat@hel.zones.apache.org> <501796898.39732.1324575090792.JavaMail.tomcat@hel.zones.apache.org>

I think it's fine to consider different versions of 'identical keys',
meaning row,colfam,colqual, because in that case the implementation
still treats two keys that only differ by timestamp as two unique keys.
But I don't think we should allow multiple identical _versions_ of
identical keys, to use your terminology. I think we should throw all but
one away if the user does happen to try to insert them and if the user
wants to aggregate across values, he or she must use different version
numbers or timestamps or whatever.

If generating unique timestamps within mutations that want to perform
several updates to the same row,colfam,colqual is a problem, why don't
we allow the user to 'put()' multiple updates into a mutation, and on
the server then assign slightly different timestamps to the identical
row,colfam,colqual triples that are found in a mutation. Would that make
everyone happy?

On Dec 22, 2011, at 4:35 PM, Keith Turner wrote:

> Big table has versions. Does the big table paper actually describe
> the behavior of inserting two identical keys at different times when
> the table is set to show two versions?
> If these keys were in two separate map files/sstables then something
> would have to make a decision to suppress one of them. I am not sure
> the big table paper got that specific. You could suppress one of the
> keys, or just consider them to be two versions. We have been
> considering them to be versions.
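
To make the server-side timestamp idea above concrete, here is a minimal
sketch in Python. It is not Accumulo code: the Update type, the
assign_unique_timestamps function, and the choice to bump the timestamp
by one for each repeat are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Update:
    """One put() inside a mutation (hypothetical type, not Accumulo's API)."""
    row: str
    colfam: str
    colqual: str
    value: bytes
    timestamp: int = 0  # assigned on the server in this sketch


def assign_unique_timestamps(updates: List[Update], base_ts: int) -> List[Update]:
    """Give repeated (row, colfam, colqual) triples distinct timestamps.

    The n-th occurrence of a triple (counting from zero) gets base_ts + n,
    so later puts in the same mutation look like newer versions.
    The step of 1 is an assumption, not something the thread specifies.
    """
    seen = defaultdict(int)  # triple -> occurrences seen so far
    stamped = []
    for u in updates:
        key = (u.row, u.colfam, u.colqual)
        stamped.append(replace(u, timestamp=base_ts + seen[key]))
        seen[key] += 1
    return stamped


# Two puts to the same triple plus one put to a different column.
mutation = [
    Update("r1", "cf", "count", b"1"),
    Update("r1", "cf", "count", b"1"),
    Update("r1", "cf", "other", b"x"),
]
stamped = assign_unique_timestamps(mutation, base_ts=1000)
```

With this, the two puts to r1/cf/count end up as two distinct versions
rather than two identical keys. A real implementation would presumably
also have to make sure the bumped timestamps don't collide with those
handed out to later mutations; the sketch ignores that.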