carbondata-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CARBONDATA-301) 6. Add SortProcessorStep which sorts the data as per dimension order and write the sorted files to temp location.
Date Fri, 21 Oct 2016 16:13:59 GMT

    [ https://issues.apache.org/jira/browse/CARBONDATA-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15595554#comment-15595554 ]

ASF GitHub Bot commented on CARBONDATA-301:
-------------------------------------------

Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/247#discussion_r84510340
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/sort/impl/CarbonParallelReadMergeSorterImpl.java ---
    @@ -0,0 +1,223 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied.  See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +package org.apache.carbondata.processing.newflow.sort.impl;
    +
    +import java.io.File;
    +import java.util.Iterator;
    +import java.util.concurrent.Callable;
    +import java.util.concurrent.ExecutorService;
    +import java.util.concurrent.Executors;
    +import java.util.concurrent.TimeUnit;
    +
    +import org.apache.carbondata.common.CarbonIterator;
    +import org.apache.carbondata.common.logging.LogService;
    +import org.apache.carbondata.common.logging.LogServiceFactory;
    +import org.apache.carbondata.core.constants.CarbonCommonConstants;
    +import org.apache.carbondata.core.util.CarbonTimeStatisticsFactory;
    +import org.apache.carbondata.processing.newflow.DataField;
    +import org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException;
    +import org.apache.carbondata.processing.newflow.row.CarbonRow;
    +import org.apache.carbondata.processing.newflow.row.CarbonRowBatch;
    +import org.apache.carbondata.processing.newflow.sort.CarbonSorter;
    +import org.apache.carbondata.processing.sortandgroupby.exception.CarbonSortKeyAndGroupByException;
    +import org.apache.carbondata.processing.sortandgroupby.sortdata.SortDataRows;
    +import org.apache.carbondata.processing.sortandgroupby.sortdata.SortIntermediateFileMerger;
    +import org.apache.carbondata.processing.sortandgroupby.sortdata.SortParameters;
    +import org.apache.carbondata.processing.store.SingleThreadFinalSortFilesMerger;
    +import org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException;
    +import org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
    +
    +/**
    + * It parallely reads data from array of iterates and do merge sort.
    + * First it sorts the data and write to temp files. These temp files will be merge sorted to get
    + * final merge sort result.
    + */
    +public class CarbonParallelReadMergeSorterImpl implements CarbonSorter {
    +
    +  private static final LogService LOGGER =
    +      LogServiceFactory.getLogService(CarbonParallelReadMergeSorterImpl.class.getName());
    +
    +  private SortParameters sortParameters;
    +
    +  private SortIntermediateFileMerger intermediateFileMerger;
    +
    +  private ExecutorService executorService;
    +
    +  private SingleThreadFinalSortFilesMerger finalMerger;
    +
    +  private DataField[] inputDataFields;
    +
    +  public CarbonParallelReadMergeSorterImpl(DataField[] inputDataFields) {
    +    this.inputDataFields = inputDataFields;
    +  }
    +
    +  @Override
    +  public void initialize(SortParameters sortParameters) {
    +    this.sortParameters = sortParameters;
    +    intermediateFileMerger = new SortIntermediateFileMerger(sortParameters);
    +    String storeLocation = CarbonDataProcessorUtil
    --- End diff --
    
    I guess PR 217 has not been merged yet; once it is merged, those changes will be reflected here. JIRA 287 should be sufficient to track it.


> 6. Add SortProcessorStep which sorts the data as per dimension order and write the sorted files to temp location.
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-301
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-301
>             Project: CarbonData
>          Issue Type: Sub-task
>            Reporter: Ravindra Pesala
>            Assignee: Ravindra Pesala
>             Fix For: 0.2.0-incubating
>
>
> Add SortProcessorStep which sorts the data as per dimension order and write the sorted files to temp location.
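
For illustration, the flow this sub-task describes (read several inputs in parallel, sort each one, write the sorted rows to temp files, then merge those files) might be sketched roughly as below. This is a minimal sketch and not the CarbonData implementation: the class ParallelSortSketch, its method names, the use of plain integers as rows, and the one-thread-per-input pool are all assumptions made for the example. The real step works on CarbonRow batches and its own sort and merger classes.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of a parallel sort-to-temp-files step followed by a merge. */
public class ParallelSortSketch {

  /** Drains one input iterator, sorts its rows in memory, and writes them to a temp file. */
  static Path sortToTempFile(Iterator<Integer> input, Path tempDir) throws IOException {
    List<Integer> rows = new ArrayList<>();
    while (input.hasNext()) {
      rows.add(input.next());
    }
    Collections.sort(rows);
    Path file = Files.createTempFile(tempDir, "sorted", ".tmp");
    try (BufferedWriter writer = Files.newBufferedWriter(file)) {
      for (Integer row : rows) {
        writer.write(row.toString());
        writer.newLine();
      }
    }
    return file;
  }

  /** One cursor per temp file, ordered by the row it currently points at. */
  static final class Cursor implements Comparable<Cursor> {
    final BufferedReader reader;
    int current;

    Cursor(BufferedReader reader) {
      this.reader = reader;
    }

    /** Moves to the next row; closes the reader and returns false at end of file. */
    boolean advance() throws IOException {
      String line = reader.readLine();
      if (line == null) {
        reader.close();
        return false;
      }
      current = Integer.parseInt(line);
      return true;
    }

    @Override
    public int compareTo(Cursor other) {
      return Integer.compare(current, other.current);
    }
  }

  /** K-way merge: repeatedly takes the smallest current row across all sorted files. */
  static List<Integer> mergeSortedFiles(List<Path> files) throws IOException {
    PriorityQueue<Cursor> heap = new PriorityQueue<>();
    for (Path file : files) {
      Cursor cursor = new Cursor(Files.newBufferedReader(file));
      if (cursor.advance()) {
        heap.add(cursor);
      }
    }
    List<Integer> merged = new ArrayList<>();
    while (!heap.isEmpty()) {
      Cursor smallest = heap.poll();
      merged.add(smallest.current);
      if (smallest.advance()) {
        heap.add(smallest);
      }
    }
    return merged;
  }

  /** Sorts every input on its own thread, then merges the resulting temp files. */
  static List<Integer> sort(List<Iterator<Integer>> inputs) throws Exception {
    Path tempDir = Files.createTempDirectory("sort-sketch");
    ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, inputs.size()));
    try {
      List<Future<Path>> futures = new ArrayList<>();
      for (Iterator<Integer> input : inputs) {
        futures.add(executor.submit(() -> sortToTempFile(input, tempDir)));
      }
      List<Path> files = new ArrayList<>();
      for (Future<Path> future : futures) {
        files.add(future.get());
      }
      List<Integer> merged = mergeSortedFiles(files);
      // Clean up the temp files once the merge has consumed them.
      for (Path file : files) {
        Files.deleteIfExists(file);
      }
      Files.deleteIfExists(tempDir);
      return merged;
    } finally {
      executor.shutdown();
    }
  }
}
```

The sketch collects the merged rows into a list only to keep it short; a real step would more likely expose the merge as an iterator so that the full result never has to sit in memory.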



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
