From: Shawn Quinn
To: user@hbase.apache.org, lars hofhansl
Date: Tue, 27 Mar 2012 13:33:54 -0400
Subject: Re: Still Seeing Old Data After a Delete

Hi Lars,

Thanks for the quick reply! In this case we were doing a column delete like so:

    Delete delete = new Delete(rowKey);
    delete.deleteColumn(Bytes.toBytes("thing"), Bytes.toBytes(value));
    table.delete(delete);

However, your response caused me to notice the "Delete.deleteColumns()" method in the JavaDoc, as distinct from "Delete.deleteColumn()". Calling "deleteColumns" instead of "deleteColumn" fixes the problem we were seeing.

That wasn't immediately obvious to me after reading the book, but after reading the JavaDoc I now understand the distinction between the two methods. I may be the only one who missed that at first, but in case others have a similar confusion it might be worth a comment in the book that "deleteColumn()" is really only for deleting a single version and "deleteColumns()" is for deleting all versions. E.g. the second type noted in the book is currently listed as "Delete column: for all versions of a column". But, from the API perspective, that's really the "deleteColumns()" method. (Whereas my incorrect intuition, when just looking at the API, was that the "deleteColumns()" method would likely be for deleting multiple different columns.)

Thanks again for the quick follow up,

-Shawn

On Tue, Mar 27, 2012 at 1:19 PM, lars hofhansl wrote:

> Hey Shawn,
>
> how exactly did you delete the column?
> There are three types of delete markers: family, column, version.
> Your observation would be consistent with having used a version delete
> marker, which just marks a specific version (the latest by default) for
> delete.
>
> Check out the HBase Reference Guide:
> http://hbase.apache.org/book.html#version.delete
>
> Also, if you don't mind the plug see a more detailed discussion here:
> http://hadoop-hbase.blogspot.com/2011/12/deletion-in-hbase.html
>
> -- Lars
>
>
> ----- Original Message -----
> From: Shawn Quinn
> To: user@hbase.apache.org
> Cc:
> Sent: Tuesday, March 27, 2012 10:01 AM
> Subject: Still Seeing Old Data After a Delete
>
> Hello,
>
> In a couple of situations we were noticing some odd problems with old data
> appearing in the application, and I finally found a reproducible scenario.
> Here's what we're seeing in one basic case:
>
> 1. Using a scan in hbase shell, one of our column cells (both the column
> name and value are simple longs) looks like so:
>
> column=thing:\x00\x00\x00\x00\x00\x00\x00\x02, timestamp=1332795701976,
> value=\x00\x00\x00\x00\x00\x00\x00s
>
> 2. If we then use a "Put" to update that cell to a new value, it looks as
> we'd expect, like so:
>
> column=thing:\x00\x00\x00\x00\x00\x00\x00\x02, timestamp=1332866682295,
> value=\x00\x00\x00\x00\x00\x00\x00u
>
> 3. If we then use a "Delete" to remove that column, instead of the column
> no longer being included in the scan we instead see the following again:
>
> column=thing:\x00\x00\x00\x00\x00\x00\x00\x02, timestamp=1332795701976,
> value=\x00\x00\x00\x00\x00\x00\x00s
>
> So, for some reason, at least in this case, the tombstone/delete marker
> doesn't appear to be preventing new scans from seeing the old data.
>
> Note that this is a small development cluster of HBase (version:
> hbase-0.90.4-cdh3u2) which contains one master and three region servers,
> and I have confirmed that the clocks are synchronized properly between the
> four machines. Also note that we're using the Java client API to run the
> Put/Delete commands noted above.
>
> Any ideas on how old data could still appear in a Get/Scan like this, and
> if there are any workarounds we could try? I saw HBASE-4536, but after
> reading that thread it didn't seem pertinent to this more basic scenario.
>
> Thanks in advance for any pointers!
>
> -Shawn
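
For anyone reading this thread later, the behaviour discussed above can be reproduced without a cluster using a toy model. The sketch below is not HBase code: the class ToyColumn and its methods are hypothetical names made up for illustration, and the model assumes only what the thread states, namely that a version delete marker hides one version (the latest by default) while a column delete marker hides all versions. It replays the put, put, delete sequence from the original report using the same timestamps.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model of a single versioned column. This is NOT HBase code; the class
// and method names are hypothetical and only mimic the marker semantics
// described in the thread.
class ToyColumn {
    // timestamp -> value, kept sorted so the newest version can be found
    private final TreeMap<Long, String> versions = new TreeMap<>();
    // timestamps hidden by a "version" delete marker
    private final Set<Long> versionMarkers = new HashSet<>();
    // a "column" delete marker hides every version at or below this timestamp
    private long columnMarker = Long.MIN_VALUE;

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    private boolean visible(long timestamp) {
        return timestamp > columnMarker && !versionMarkers.contains(timestamp);
    }

    // Returns the newest version not hidden by a marker, or null if none.
    String get() {
        for (Map.Entry<Long, String> e : versions.descendingMap().entrySet()) {
            if (visible(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    // Analogue of the version delete marker (what deleteColumn() produced
    // in the thread): hides only the newest visible version.
    void deleteLatestVersion() {
        for (Long ts : versions.descendingKeySet()) {
            if (visible(ts)) {
                versionMarkers.add(ts);
                return;
            }
        }
    }

    // Analogue of the column delete marker (what deleteColumns() produced
    // in the thread): hides all versions at or below upTo.
    void deleteAllVersions(long upTo) {
        columnMarker = Math.max(columnMarker, upTo);
    }

    public static void main(String[] args) {
        ToyColumn versionDelete = new ToyColumn();
        versionDelete.put(1332795701976L, "s");
        versionDelete.put(1332866682295L, "u");
        versionDelete.deleteLatestVersion();
        System.out.println(versionDelete.get()); // prints "s": the old value resurfaces

        ToyColumn columnDelete = new ToyColumn();
        columnDelete.put(1332795701976L, "s");
        columnDelete.put(1332866682295L, "u");
        columnDelete.deleteAllVersions(1332866682295L);
        System.out.println(columnDelete.get()); // prints "null": nothing is visible
    }
}
```

The first case matches step 3 of the original report: hiding only the newest version exposes the older one. The second case matches the outcome after switching to "deleteColumns()".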