Return-Path: X-Original-To: apmail-apex-dev-archive@minotaur.apache.org Delivered-To: apmail-apex-dev-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id D3B6E19210 for ; Tue, 26 Apr 2016 12:15:16 +0000 (UTC) Received: (qmail 96815 invoked by uid 500); 26 Apr 2016 12:15:16 -0000 Delivered-To: apmail-apex-dev-archive@apex.apache.org Received: (qmail 96748 invoked by uid 500); 26 Apr 2016 12:15:16 -0000 Mailing-List: contact dev-help@apex.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@apex.incubator.apache.org Delivered-To: mailing list dev@apex.incubator.apache.org Received: (qmail 96737 invoked by uid 99); 26 Apr 2016 12:15:16 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 26 Apr 2016 12:15:16 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 4CDA4C0715 for ; Tue, 26 Apr 2016 12:15:16 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -3.221 X-Spam-Level: X-Spam-Status: No, score=-3.221 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, KAM_LAZY_DOMAIN_SECURITY=1, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-0.001] autolearn=disabled Received: from mx2-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id zDaGPnwnEA7O for ; Tue, 26 Apr 2016 12:15:15 +0000 (UTC) Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx2-lw-eu.apache.org (ASF Mail Server at mx2-lw-eu.apache.org) with SMTP id 5361B5FAEB for ; Tue, 26 Apr 2016 12:15:14 +0000 (UTC) Received: (qmail 95849 invoked by uid 99); 26 Apr 2016 12:15:13 -0000 Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 26 Apr 2016 12:15:13 +0000 Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 319772C1F73 for ; Tue, 26 Apr 2016 12:15:13 +0000 (UTC) Date: Tue, 26 Apr 2016 12:15:13 +0000 (UTC) From: "ASF GitHub Bot (JIRA)" To: dev@apex.incubator.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (APEXMALHAR-1957) Improve HBasePOJOInputOperator with support for threaded read MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/APEXMALHAR-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257973#comment-15257973 ] ASF GitHub Bot commented on APEXMALHAR-1957: -------------------------------------------- Github user sandeepdeshmukh commented on a diff in the pull request: https://github.com/apache/incubator-apex-malhar/pull/212#discussion_r61074411 --- Diff: contrib/src/main/java/com/datatorrent/contrib/hbase/HBaseScanOperator.java --- @@ -40,27 +51,106 @@ * @tags hbase, scan, input operator * @since 0.3.2 */ -public abstract class HBaseScanOperator extends HBaseInputOperator +public abstract class HBaseScanOperator extends HBaseInputOperator implements Operator.ActivationListener { + public static final int DEF_HINT_SCAN_LOOKAHEAD = 2; + public static final int DEF_QUEUE_SIZE = 1000; + public static final int DEF_SLEEP_MILLIS = 10; + + private String startRow; + private String endRow; + private String lastReadRow; + private int hintScanLookahead = DEF_HINT_SCAN_LOOKAHEAD; + private int queueSize = DEF_QUEUE_SIZE; + private int sleepMillis = DEF_SLEEP_MILLIS; + private Queue resultQueue; + + @AutoMetric + protected long tuplesRead; + + // Transients + protected transient Scan scan; + protected transient ResultScanner scanner; + protected transient Thread readThread; + + @Override + public void setup(OperatorContext context) + { + super.setup(context); + resultQueue = Queues.newLinkedBlockingQueue(queueSize); + } + + @Override + public void activate(Context context) + { + startReadThread(); + } + + protected void startReadThread() + { + try { + scan = operationScan(); + scanner = getStore().getTable().getScanner(scan); + } catch (IOException e) { + throw new RuntimeException(e); + } + readThread = new Thread(new Runnable() { + @Override + public void run() + { + try { + Result result; + while ((result = scanner.next()) != null) { + while (!resultQueue.offer(result)) { + Thread.sleep(sleepMillis); + } + } + } catch (Exception e) { + logger.debug("Exception in fetching results {}", e.getMessage()); + throw new RuntimeException(e); + } finally { + scanner.close(); + } + } + }); + readThread.start(); + } + + @Override + public void beginWindow(long windowId) + { + super.beginWindow(windowId); + tuplesRead = 0; + } @Override public void emitTuples() { + if (!readThread.isAlive() && resultQueue.isEmpty()) { + startReadThread(); --- End diff -- Better to throw RuntimeException and redeploy the operator. > Improve HBasePOJOInputOperator with support for threaded read > ------------------------------------------------------------- > > Key: APEXMALHAR-1957 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-1957 > Project: Apache Apex Malhar > Issue Type: Task > Reporter: Bhupesh Chawda > Assignee: Bhupesh Chawda > > Add the following support to Hbase POJO Input Operator: > * Add support for threaded read > * Allow to specify a set of "column family: column" and fetch data only for these columns > * Allow to specify an end row key to stop scanning > * Add metrics -- This message was sent by Atlassian JIRA (v6.3.4#6332)