Reply-To: dev@accumulo.apache.org
References: <1447081229522-15484.post@n5.nabble.com> <5640BAD9.4090301@gmail.com>
Date: Mon, 9 Nov 2015 10:45:36 -0500
Subject: Re: total table rows
From: William Slacum
To: dev
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a1140867802a6af05241d7ca3

--001a1140867802a6af05241d7ca3
Content-Type: text/plain; charset=UTF-8

Pranked... you can't use a CountingIterator, because it can't be init'd.
Can we get rid of that limitation?

On Mon, Nov 9, 2015 at 10:43 AM, William Slacum wrote:

> An iterator stack of FirstEntryInRowIterator + CountingIterator will
> return the count of rows in each tablet, which can then be combined on the
> client side.
>
> On Mon, Nov 9, 2015 at 10:25 AM, Josh Elser wrote:
>
>> Yeah, there's no explicit tracking of all rows in Accumulo, you're stuck
>> with enumerating them (or explicitly tracking them yourself at ingest time).
>>
>> The easiest approach you can take is probably using the
>> FirstEntryInRowIterator and counting each row on the client-side.
>>
>> You could do another summation in a second iterator but this is a little
>> tricky to get correct. I tried to touch on this a little in a blog post[1].
>> If this is a one-off question you want to answer, doing the summation on
>> the client side is likely not to take excessively longer than a server-side
>> summation.
>>
>> [1]
>> https://blogs.apache.org/accumulo/entry/thinking_about_reads_over_accumulo
>>
>>
>> z11373 wrote:
>>
>>> I want to get the total number of rows in a table (it likely has more
>>> than 100M rows). I think that to get that information, Accumulo would
>>> have to iterate over all rows :-( This may not be a typical Accumulo
>>> scenario.
>>>
>>> Is there a more efficient way to get the total number of rows in a table?
>>> When Accumulo iterates over those items, does that mean it will pull the
>>> data to the client? If yes, is there a way to ask it to return just the
>>> number, since that's the only data I care about?
>>>
>>> Thanks,
>>> Z
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-accumulo.1065345.n5.nabble.com/total-table-rows-tp15484.html
>>> Sent from the Developers mailing list archive at Nabble.com.
>>>
>>
>
--001a1140867802a6af05241d7ca3--
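For anyone skimming the thread, here is a rough sketch of the two counting strategies discussed above. It is a plain-Python model, not Accumulo code: the TABLETS data, the tuple layout, and every function name are made up for illustration, and it assumes a row never spans two tablets. first_entry_in_row only imitates the effect that FirstEntryInRowIterator is described as having (one entry kept per row). The first function counts those entries on the client; the second has each tablet report one partial count, which the client then sums.

```python
from itertools import groupby

# Hypothetical stand-in for a table: each tablet is a sorted list of
# (row, column, value) entries, and a row never spans two tablets.
TABLETS = [
    [("row1", "cf:a", "1"), ("row1", "cf:b", "2"), ("row2", "cf:a", "3")],
    [("row3", "cf:a", "4"), ("row3", "cf:b", "5"), ("row3", "cf:c", "6")],
    [],
]


def first_entry_in_row(entries):
    """Yield only the first entry of each row (imitates the effect of
    a first-entry-in-row filter; this is not the Accumulo class)."""
    for _row, group in groupby(entries, key=lambda entry: entry[0]):
        yield next(group)


def count_rows_client_side(tablets):
    """One entry per row is sent back and the client counts them."""
    return sum(1 for tablet in tablets for _ in first_entry_in_row(tablet))


def count_rows_per_tablet(tablets):
    """Each tablet reports a single partial count; the client adds them up."""
    partial_counts = [
        sum(1 for _ in first_entry_in_row(tablet)) for tablet in tablets
    ]
    return partial_counts, sum(partial_counts)


if __name__ == "__main__":
    print(count_rows_client_side(TABLETS))
    print(count_rows_per_tablet(TABLETS))
```

In this model the client-side variant transfers one entry per row, while the per-tablet variant transfers one number per tablet. That is the trade-off Josh describes: for a one-off count the simpler client-side sum may be good enough.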