spark-reviews mailing list archives

From cloud-fan <...@git.apache.org>
Subject [GitHub] spark pull request #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFa...
Date Mon, 29 Jan 2018 13:13:02 GMT
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20397#discussion_r164425992
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataReaderFactory.java ---
    @@ -22,21 +22,23 @@
     import org.apache.spark.annotation.InterfaceStability;
     
     /**
    - * A read task returned by {@link DataSourceV2Reader#createReadTasks()} and is responsible for
    - * creating the actual data reader. The relationship between {@link ReadTask} and {@link DataReader}
    + * A reader factory returned by {@link DataSourceV2Reader#createDataReaderFactories()} and is
    + * responsible for creating the actual data reader. The relationship between
    + * {@link DataReaderFactory} and {@link DataReader}
      * is similar to the relationship between {@link Iterable} and {@link java.util.Iterator}.
      *
    - * Note that, the read task will be serialized and sent to executors, then the data reader will be
    - * created on executors and do the actual reading. So {@link ReadTask} must be serializable and
    - * {@link DataReader} doesn't need to be.
    + * Note that, the reader factory will be serialized and sent to executors, then the data reader
    + * will be created on executors and do the actual reading. So {@link DataReaderFactory} must be
    + * serializable and {@link DataReader} doesn't need to be.
      */
     @InterfaceStability.Evolving
    -public interface ReadTask<T> extends Serializable {
    +public interface DataReaderFactory<T> extends Serializable {
     
       /**
    -   * The preferred locations where this read task can run faster, but Spark does not guarantee that
    -   * this task will always run on these locations. The implementations should make sure that it can
    -   * be run on any location. The location is a string representing the host name.
    +   * The preferred locations where this data reader returned by this reader factory can run faster,
    +   * but Spark does not guarantee that this task will always run on these locations.
    --- End diff --
    
    `not guarantee to always run the data reader on these locations.` 
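
For context, the contract the Javadoc above describes (a serializable factory that is shipped to executors, and a non-serializable reader created there) can be sketched roughly as follows. This is only an illustration under stated assumptions: `SimpleDataReader`, `SimpleDataReaderFactory`, `RangeReaderFactory`, and `roundTrip` are hypothetical, simplified stand-ins and not the interfaces in this PR, and the Java serialization round trip merely imitates sending the factory to an executor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class ReaderFactorySketch {

  // Hypothetical stand-in for a data reader: created on the executor, never serialized.
  interface SimpleDataReader<T> extends Closeable {
    boolean next();
    T get();
  }

  // Hypothetical stand-in for a reader factory: must be serializable because it is shipped.
  interface SimpleDataReaderFactory<T> extends Serializable {
    // Host names are only a hint; the factory must work on any location.
    default String[] preferredLocations() {
      return new String[0];
    }

    SimpleDataReader<T> createDataReader();
  }

  // Example factory producing the integers in [start, end).
  static final class RangeReaderFactory implements SimpleDataReaderFactory<Integer> {
    private static final long serialVersionUID = 1L;
    private final int start;
    private final int end;

    RangeReaderFactory(int start, int end) {
      this.start = start;
      this.end = end;
    }

    @Override
    public SimpleDataReader<Integer> createDataReader() {
      // The reader holds mutable cursor state and is not Serializable.
      return new SimpleDataReader<Integer>() {
        private int current = start - 1;

        @Override
        public boolean next() {
          current++;
          return current < end;
        }

        @Override
        public Integer get() {
          return current;
        }

        @Override
        public void close() {
          // Nothing to release in this sketch.
        }
      };
    }
  }

  // Imitates driver-to-executor shipping with plain Java serialization.
  static <T> SimpleDataReaderFactory<T> roundTrip(SimpleDataReaderFactory<T> factory)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(factory);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      @SuppressWarnings("unchecked")
      SimpleDataReaderFactory<T> copy = (SimpleDataReaderFactory<T>) in.readObject();
      return copy;
    }
  }

  // Drains a reader created from the factory, as an executor-side task might.
  static <T> List<T> readAll(SimpleDataReaderFactory<T> factory) throws IOException {
    List<T> rows = new ArrayList<>();
    try (SimpleDataReader<T> reader = factory.createDataReader()) {
      while (reader.next()) {
        rows.add(reader.get());
      }
    }
    return rows;
  }

  public static void main(String[] args) throws Exception {
    SimpleDataReaderFactory<Integer> onExecutor = roundTrip(new RangeReaderFactory(0, 3));
    System.out.println(readAll(onExecutor)); // prints [0, 1, 2]
  }
}
```

Only the factory crosses the serialization boundary in this sketch; the reader is created afterwards, which is why it can hold non-serializable state such as a cursor or an open connection.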

