Date: Thu, 3 Nov 2016 17:04:58 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@carbondata.incubator.apache.org
Subject: [jira] [Commented] (CARBONDATA-308) Use CarbonInputFormat in CarbonScanRDD compute

    [ https://issues.apache.org/jira/browse/CARBONDATA-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15633470#comment-15633470 ]

ASF GitHub Bot commented on CARBONDATA-308:
-------------------------------------------

Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/262#discussion_r86393676

    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
    @@ -311,80 +278,6 @@ private void addSegmentsIfEmpty(JobContext job, AbsoluteTableIdentifier absolute
         return result;
       }

    -  /**
    -   * get total number of rows. Same as count(*)
    -   *
    -   * @throws IOException
    -   * @throws IndexBuilderException
    -   */
    -  public long getRowCount(JobContext job) throws IOException, IndexBuilderException {
    --- End diff --

    This method is useful for count(*) queries because the number of rows can be returned from the driver itself. Currently we push the count down to the executors, so it is better to keep this method; it will be useful.

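To make the point of the review comment concrete: an unfiltered count(*) can be answered on the driver by summing row counts that are already recorded in per-block metadata, so no executor task has to run. The following is only a minimal sketch under that assumption. `BlockMeta` and `rowCount` are hypothetical names chosen for illustration and are not CarbonData classes or the body of the removed `getRowCount` method.

```java
import java.util.Arrays;
import java.util.List;

public class DriverRowCount {
    // Hypothetical stand-in for per-block index metadata; not a CarbonData class.
    static final class BlockMeta {
        final String path;
        final long numRows;

        BlockMeta(String path, long numRows) {
            this.path = path;
            this.numRows = numRows;
        }
    }

    // Sum the row counts recorded in the block metadata, entirely on the driver.
    static long rowCount(List<BlockMeta> blocks) {
        long total = 0L;
        for (BlockMeta block : blocks) {
            total += block.numRows;
        }
        return total;
    }

    public static void main(String[] args) {
        List<BlockMeta> blocks = Arrays.asList(
                new BlockMeta("part-0", 10L),
                new BlockMeta("part-1", 32L));
        System.out.println(rowCount(blocks));
    }
}
```

The sketch covers only the unfiltered case; with a filter condition the rows still have to be read.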
> Use CarbonInputFormat in CarbonScanRDD compute
> ----------------------------------------------
>
>                 Key: CARBONDATA-308
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-308
>             Project: CarbonData
>          Issue Type: Sub-task
>          Components: spark-integration
>            Reporter: Jacky Li
>            Assignee: Jacky Li
>             Fix For: 0.2.0-incubating
>
>
> Take CarbonScanRDD as the target RDD and modify it as follows:
> 1. On the driver side, only getSplit is required, so only the filter condition is needed and there is no need to create a full QueryModel object. The creation of QueryModel can therefore move from the driver side to the executor side.
> 2. Use CarbonInputFormat.createRecordReader in CarbonScanRDD.compute instead of using QueryExecutor directly.
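
The two steps in the issue description can be sketched roughly as follows. This is a self-contained illustration, not the CarbonData implementation: `Split`, `QueryModel`, `getSplits`, `createRecordReader` and `compute` here are hypothetical stand-ins that only mirror the names used in the issue, and rows are modelled as plain `int` values. The driver-side method takes nothing but the filter, while the query model is built inside the executor-side reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.IntPredicate;

public class ScanSketch {
    // Hypothetical split: the rows one task will read.
    static final class Split {
        final int[] rows;
        Split(int[] rows) { this.rows = rows; }
    }

    // Hypothetical query model: created only where rows are actually read.
    static final class QueryModel {
        final IntPredicate filter;
        QueryModel(IntPredicate filter) { this.filter = filter; }
    }

    // Driver side: only the filter is needed to decide which blocks become splits.
    // Scanning the block is a stand-in; a real system would consult index metadata.
    static List<Split> getSplits(List<int[]> blocks, IntPredicate filter) {
        List<Split> splits = new ArrayList<>();
        for (int[] block : blocks) {
            if (Arrays.stream(block).anyMatch(filter)) {
                splits.add(new Split(block));
            }
        }
        return splits;
    }

    // Executor side: build the query model here, then return a reader over one split.
    static Iterator<Integer> createRecordReader(Split split, IntPredicate filter) {
        QueryModel model = new QueryModel(filter);
        return Arrays.stream(split.rows).filter(model.filter).boxed().iterator();
    }

    // Stand-in for the task body: drain the record reader for one split.
    static List<Integer> compute(Split split, IntPredicate filter) {
        List<Integer> out = new ArrayList<>();
        createRecordReader(split, filter).forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        IntPredicate filter = v -> v > 5;
        List<int[]> blocks = Arrays.asList(new int[] {1, 2, 3}, new int[] {4, 6, 8});
        for (Split split : getSplits(blocks, filter)) {
            System.out.println(compute(split, filter));
        }
    }
}
```

In this arrangement the driver never holds a `QueryModel`, which is the separation that step 1 of the issue asks for.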