Date: Thu, 26 Feb 2015 14:23:06 +0000 (UTC)
From: "Hadoop QA (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-13071) Hbase Streaming Scan Feature


    [ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338439#comment-14338439 ]

Hadoop QA commented on HBASE-13071:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12701068/HBASE-13071-v1.patch
  against master branch at commit 1c957b65b16a8706caee140c18b84ea48a0dc0aa.
  ATTACHMENT ID: 12701068

    {color:red}-1 @author{color}.  The patch appears to contain 2 @author tags which the Hadoop community has agreed to not allow in code contributions.

    {color:green}+1 tests included{color}.  The patch appears to include 4 new or modified tests.

    {color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12978//console

This message is automatically generated.

> Hbase Streaming Scan Feature
> ----------------------------
>
>                 Key: HBASE-13071
>                 URL: https://issues.apache.org/jira/browse/HBASE-13071
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.98.11
>            Reporter: Eshcar Hillel
>         Attachments: HBASE-13071-v1.patch, HBaseStreamingScanDesign.pdf, HbaseStreamingScanEvaluation.pdf
>
>
> A scan operation iterates over all rows of a table or a subrange of the table. Because data is served to the client synchronously, the application traverses the data more slowly than it could: the overall processing time increases, and the time the application waits for the next piece of data can vary widely.
> The scanner next() method at the client side invokes an RPC to the regionserver and then stores the results in a cache. The application can specify how many rows are transmitted per RPC; by default this is set to 100 rows.
> The cache can be considered a producer-consumer queue, where the hbase client pushes data to the queue and the application consumes it. Currently this queue is synchronous, i.e., blocking. More specifically, when the application has consumed all the data in the cache (so the cache is empty), the hbase client retrieves additional data from the server and refills the cache. During this time the application is blocked.
> Under the assumption that the application processing time can be balanced by the time it takes to retrieve the data, an asynchronous approach can reduce the time the application spends waiting for data.
> We attach a design document.
> We also have a patch that is based on a private branch, and some evaluation results of this code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
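
The producer-consumer arrangement described in the issue can be illustrated with a small, self-contained sketch. This is not the HBase client API and not the attached patch: PrefetchingIterator, BatchSource, and the capacity parameter are hypothetical names chosen for illustration, and BatchSource merely stands in for the scanner RPC that returns a batch of rows. The sketch assumes rows are non-null and that an empty batch marks the end of the scan. A background thread keeps refilling a bounded queue while the application consumes from it, so the application blocks only when the queue is actually empty.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch of an asynchronous scan cache; not part of HBase. */
final class PrefetchingIterator<T> implements Iterator<T> {

    /** Stand-in for one scanner RPC: returns the next batch, or an empty list when done. */
    interface BatchSource<T> {
        List<T> nextBatch() throws Exception;
    }

    /** Sentinel that the producer enqueues after the last row (or after a failure). */
    private static final Object END = new Object();

    private final BlockingQueue<Object> queue;
    private volatile Throwable failure;
    private Object pending; // element taken from the queue but not yet returned

    PrefetchingIterator(BatchSource<T> source, int capacity) {
        // The bounded queue caps how far the producer may run ahead of the consumer.
        this.queue = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                List<T> batch = source.nextBatch();
                while (!batch.isEmpty()) {
                    for (T row : batch) {
                        queue.put(row); // blocks only while the queue is full
                    }
                    batch = source.nextBatch();
                }
            } catch (Throwable t) {
                failure = t; // surfaced to the consumer once it reaches END
            } finally {
                try {
                    queue.put(END);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "scan-prefetcher");
        producer.setDaemon(true);
        producer.start();
    }

    @Override
    public boolean hasNext() {
        if (pending == null) {
            try {
                pending = queue.take(); // blocks only while the queue is empty
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for rows", e);
            }
        }
        if (pending == END) {
            if (failure != null) {
                throw new IllegalStateException("prefetch failed", failure);
            }
            return false;
        }
        return true;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T row = (T) pending;
        pending = null;
        return row;
    }
}
```

An application would wrap its batch source once and then iterate with hasNext() and next() as usual. The queue capacity is the knob that trades client memory for how far ahead the producer may fetch.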