Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Reply-To: user@hbase.apache.org
MIME-Version: 1.0
From: ramkrishna vasudevan
Date: Thu, 25 Aug 2016 15:11:51 +0530
Subject: Re: How to get Last 1000 records from 1 millions records
To: "user@hbase.apache.org"
Content-Type: multipart/alternative; boundary=001a113e8ab21c3562053ae23514
archived-at: Thu, 25 Aug 2016 09:41:58 -0000

--001a113e8ab21c3562053ae23514
Content-Type: text/plain; charset=UTF-8

Hi Manjeet,

For your first question, regarding fetching the last 1000 records:

First, in your scan, set the start row to the bytes corresponding to (A_9811111111_) and let the stop row be those same bytes with 1 added to the last byte. I mean, add 1 to the last byte of what comes out of (A_9811111111_). This ensures that you scan only the rows corresponding to (A_9811111111_).

Just thinking aloud: the first thing I can see is that it may be easier to do this with CPs (coprocessors) than with Filters, because filters deal with individual cells or a single row. Accumulating the results and maintaining the last 10k records may be difficult there. I would have to look in detail to see whether it is possible.

Do you know the number of columns you have? If there are multiple columns then it is quite tricky. But if you have only one column per row, or you want only the row keys, you can implement a user coprocessor and in it implement preStoreScannerOpen().

Take, for example, the case where you have only one family. In your preStoreScannerOpen() you create your own StoreScanner, and in StoreScanner.next() you skip all KeyValues and, during that process, keep collecting your cells. Make sure you collect the cells row-wise by adding them to a list, and keep only the latest 10000 cells in the list at any time. Every time, check whether the row has reached the stopRow that is set in the scan (so maybe it has moved on to A_9811111112_). Once you see this condition, replace the list given by the StoreScanner.next() call with the list that you have collected and send it to the client.

I have not tried this yet, but it can give you an idea of how to approach it with CPs. With filters I am not sure, as I said; I need to read the flow and see whether there are any APIs that can mimic the above.

PS. Don't take this as a working algo.
There may be reasons why it may not work, but you can read about CPs to see if something like the above can work out.

Regards
Ram

On Thu, Aug 25, 2016 at 2:16 PM, Manjeet Singh wrote:

> Hi All
>
> I have one another question for same case
>
> below is my sample Hbase data as we all know that hbase store data on the basis of rowkey (sorted)
> below is IP as we can see 2.168.129.81_1 is in last what I am expecting it should come just after 1.168.129.81_2
>
> 1.168.129.81_0    column=c2:D_com.stackoverflow/questions/4, timestamp=1472104396288, value=4
> 1.168.129.81_1    column=c2:D_com.stackoverflow/questions/1, timestamp=1472104396288, value=1
> 1.168.129.81_1    column=c2:D_com.stackoverflow/questions/2, timestamp=1472104396288, value=2
> 1.168.129.81_2    column=c2:D_com.stackoverflow/questions/0, timestamp=1472104396288, value=0
> 192.168.129.81_1  column=c2:D_com.stackoverflow/questions/2, timestamp=1472104386671, value=2
> 192.168.129.81_1  column=c2:D_com.stackoverflow/questions/4, timestamp=1472104386671, value=4
> 192.168.129.81_2  column=c2:D_com.stackoverflow/questions/1, timestamp=1472104386671, value=1
> 192.168.129.81_3  column=c2:D_com.stackoverflow/questions/0, timestamp=1472104386671, value=0
> 192.168.129.81_3  column=c2:D_com.stackoverflow/questions/3, timestamp=1472104386671, value=3
> 2.168.129.81_1    column=c2:D_com.stackoverflow/questions/0, timestamp=1472104404609, value=0
> 2.168.129.81_1    column=c2:D_com.stackoverflow/questions/1, timestamp=1472104404609, value=1
> 2.168.129.81_1    column=c2:D_com.stackoverflow/questions/2, timestamp=1472104404609, value=2
> 2.168.129.81_3    column=c2:D_com.stackoverflow/questions/4, timestamp=1472104404609, value=4
>
> On Thu, Aug 25, 2016 at 12:36 PM, Manjeet Singh <manjeet.chandhok@gmail.com> wrote:
>
> > I am using some logical salt say I have mobile number in my row key so I am using some algo and fitting this mobile number into some ASCII char
> > So
> > each time I know what will be the salt so it's clear to me and it will never change the order
> > example
> > if based on my algo I get A for 9811111111
> > so each time it will always return me A for 9811111111
> > so if I have my row key Like
> > A_9811111111_101
> > A_9811111111_102
> > A_9811111111_103
> > A_9811111111_104
> > A_9811111111_105
> > A_9811111111_106
> > A_9811111111_107
> > A_9811111111_108
> >
> > it will sort my row key in same manner as showing above now these are millions of record now i want to get last 10000 records
> > is there any way to get it, my concern is to perform all calculation on server side not client side.
> >
> > Thanks
> > Manjeet
> >
> > On Thu, Aug 25, 2016 at 1:06 AM, Esteban Gutierrez wrote:
> >
> >> As long as new rows are added to the latest region that "might" work. But
> >> if the table is using hashed keys or rows are added randomly to the table
> >> then retrieving the last million will be trickier and you will have to scan
> >> based on timestamp (if not modified) and then filter one more time.
> >>
> >> esteban.
> >>
> >> --
> >> Cloudera, Inc.
> >>
> >> On Wed, Aug 24, 2016 at 12:31 PM, Ted Yu wrote:
> >>
> >> > The following API should help in your case:
> >> >
> >> > public Scan setReversed(boolean reversed) {
> >> >
> >> > Cheers
> >> >
> >> > On Wed, Aug 24, 2016 at 12:05 PM, Manjeet Singh <manjeet.chandhok@gmail.com> wrote:
> >> >
> >> > > Hi all
> >> > >
> >> > > Hbase didn't provide sorting on column but rowkey store in sorted form
> >> > > like small value first and greater value last
> >> > >
> >> > > example
> >> > > 1
> >> > > 2
> >> > > 3
> >> > > 4
> >> > > 5
> >> > > 6
> >> > > 7
> >> > > and so on
> >> > >
> >> > > Assume I have 1 million records but i want to look last 1000 records only
> >> > > Is there any way to do this? I don't want to perform any calculation on
> >> > > client side so may be any filter can help on it?
> >> > >
> >> > > Thanks
> >> > > Manjeet
> >> > >
> >> > > --
> >> > > luv all
> >> > >
> >> >
> >>
> >
> > --
> > luv all
> >
>
> --
> luv all

--001a113e8ab21c3562053ae23514--
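
A minimal Java sketch of the two steps described in the reply at the top of this message: deriving the stop row by adding 1 to the last byte of the start-row prefix, and keeping only the latest N row keys while walking the sorted range. It uses only the JDK (Java 9 or later, for Arrays.compareUnsigned) and not the HBase API. The class LastRowsSketch and its methods stopRowForPrefix and lastRows are hypothetical names chosen for illustration, and the in-memory list of keys stands in for a real scanner, so treat it as an illustration of the idea rather than a working coprocessor.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical helper class; not part of HBase.
final class LastRowsSketch {

    /**
     * Returns a copy of the prefix with 1 added to its last byte, to be used
     * as an exclusive stop row. Assumes a non-empty prefix whose last byte is
     * not 0xFF; a real implementation would have to handle that carry.
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        byte[] stop = Arrays.copyOf(prefix, prefix.length);
        stop[stop.length - 1]++;
        return stop;
    }

    /**
     * Walks row keys that are already in sorted order and keeps only the
     * latest `limit` keys that fall in [startRow, stopRow).
     */
    static List<String> lastRows(Iterable<String> sortedRowKeys,
                                 byte[] startRow, byte[] stopRow, int limit) {
        if (limit <= 0) {
            return new ArrayList<>();
        }
        Deque<String> window = new ArrayDeque<>(limit);
        for (String key : sortedRowKeys) {
            byte[] row = key.getBytes(StandardCharsets.UTF_8);
            // Compare as unsigned bytes, the same ordering used for row keys.
            if (Arrays.compareUnsigned(row, startRow) < 0) {
                continue; // still before the start row
            }
            if (Arrays.compareUnsigned(row, stopRow) >= 0) {
                break; // reached the stop row, nothing further can match
            }
            if (window.size() == limit) {
                window.removeFirst(); // drop the oldest collected key
            }
            window.addLast(key);
        }
        return new ArrayList<>(window);
    }
}
```

With the row keys from the thread, stopRowForPrefix applied to the bytes of A_9811111111_ gives A_9811111111` (the underscore is byte 0x5F, so the next byte is 0x60), and lastRows with a limit of 3 over A_9811111111_101 to A_9811111111_108 keeps the keys ending in 106, 107 and 108.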