carbondata-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CARBONDATA-308) Use CarbonInputFormat in CarbonScanRDD compute
Date Thu, 03 Nov 2016 17:04:58 GMT

    [ https://issues.apache.org/jira/browse/CARBONDATA-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15633470#comment-15633470 ]

ASF GitHub Bot commented on CARBONDATA-308:
-------------------------------------------

Github user kumarvishal09 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/262#discussion_r86393676
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
    @@ -311,80 +278,6 @@ private void addSegmentsIfEmpty(JobContext job, AbsoluteTableIdentifier absolute
         return result;
       }
     
    -  /**
    -   * get total number of rows. Same as count(*)
    -   *
    -   * @throws IOException
    -   * @throws IndexBuilderException
    -   */
    -  public long getRowCount(JobContext job) throws IOException, IndexBuilderException {
    --- End diff --
    
    This method is useful for count(*) queries because it lets us return the number of rows from
    the driver itself. Currently we push the count down to the executors. It is better to keep
    this method; it will be useful.
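
    To illustrate the point above, here is a minimal sketch of answering count(*) on the driver by
    summing row counts taken from block metadata, without scanning any data. It is not the removed
    getRowCount implementation: the BlockMeta class and its fields are hypothetical stand-ins for
    whatever index metadata the driver already holds.

```java
import java.util.Arrays;
import java.util.List;

public class DriverSideRowCount {

  /** Hypothetical stand-in for per-block metadata already loaded on the driver. */
  public static final class BlockMeta {
    final String segmentId;
    final long rowCount;

    public BlockMeta(String segmentId, long rowCount) {
      this.segmentId = segmentId;
      this.rowCount = rowCount;
    }
  }

  /** Sums row counts from metadata only; no data pages are read. */
  public static long getRowCount(List<BlockMeta> blocks) {
    long total = 0L;
    for (BlockMeta block : blocks) {
      total += block.rowCount;
    }
    return total;
  }

  public static void main(String[] args) {
    List<BlockMeta> blocks = Arrays.asList(
        new BlockMeta("0", 1000L),
        new BlockMeta("0", 250L),
        new BlockMeta("1", 4096L));
    System.out.println(getRowCount(blocks)); // prints 5346
  }
}
```

    Under these assumptions the driver never schedules a task for count(*); the cost is one pass
    over metadata that is already in memory.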


> Use CarbonInputFormat in CarbonScanRDD compute
> ----------------------------------------------
>
>                 Key: CARBONDATA-308
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-308
>             Project: CarbonData
>          Issue Type: Sub-task
>          Components: spark-integration
>            Reporter: Jacky Li
>            Assignee: Jacky Li
>             Fix For: 0.2.0-incubating
>
>
> Take CarbonScanRDD as the target RDD and modify it as follows:
> 1. On the driver side, only getSplit is required, so only the filter condition is needed.
>    There is no need to create a full QueryModel object there, so creation of the QueryModel
>    can move from the driver side to the executor side.
> 2. Use CarbonInputFormat.createRecordReader in CarbonScanRDD.compute instead of using
>    QueryExecutor directly.
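
The two steps in the issue description can be sketched roughly as follows. This is a toy model, not CarbonData code: Split, QueryModel, getSplits, and compute are hypothetical names chosen to mirror the description, and the "filter" is simply a minimum value. The only point is that split pruning needs just the filter, while the full query model is built where the records are read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ScanFlowSketch {

  /** Hypothetical split: a chunk of values plus the max value used for pruning. */
  public static final class Split {
    final int[] values;
    final int maxValue;

    public Split(int... values) {
      this.values = values;
      int max = Integer.MIN_VALUE;
      for (int v : values) {
        max = Math.max(max, v);
      }
      this.maxValue = max;
    }
  }

  /** Hypothetical query model; in this sketch it is created only on the executor side. */
  static final class QueryModel {
    final int filterMin;

    QueryModel(int filterMin) {
      this.filterMin = filterMin;
    }
  }

  /** Driver side: prune splits using only the filter condition. */
  public static List<Split> getSplits(List<Split> allSplits, int filterMin) {
    List<Split> selected = new ArrayList<>();
    for (Split split : allSplits) {
      if (split.maxValue >= filterMin) {
        selected.add(split);
      }
    }
    return selected;
  }

  /** Executor side: build the query model here, then read matching records. */
  public static List<Integer> compute(Split split, int filterMin) {
    QueryModel model = new QueryModel(filterMin);
    List<Integer> rows = new ArrayList<>();
    for (int v : split.values) {
      if (v >= model.filterMin) {
        rows.add(v);
      }
    }
    return rows;
  }

  /** Runs the driver step, then the executor step for each surviving split. */
  public static List<Integer> run(List<Split> allSplits, int filterMin) {
    List<Integer> result = new ArrayList<>();
    for (Split split : getSplits(allSplits, filterMin)) {
      result.addAll(compute(split, filterMin));
    }
    return result;
  }

  public static void main(String[] args) {
    List<Split> splits = Arrays.asList(
        new Split(1, 2, 3), new Split(10, 20), new Split(5, 50));
    System.out.println(run(splits, 10)); // prints [10, 20, 50]
  }
}
```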



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
