spark-issues mailing list archives

From "Takeshi Yamamuro (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-8000) SQLContext.read.load() should be able to auto-detect input data
Date Wed, 17 Feb 2016 11:33:18 GMT

    [ https://issues.apache.org/jira/browse/SPARK-8000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15150348#comment-15150348 ]

Takeshi Yamamuro commented on SPARK-8000:
-----------------------------------------

Does this ticket also cover files exported by other systems, such as Impala?
I'm not sure how Spark could automatically detect the format of such unknown files.
One idea: we could try to read a format-specific header from each file and detect the format in ResolvedDataSource.
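
To make the header idea concrete, here is a rough sketch in Python of what such detection could look like. It is not Spark code and not a proposed patch: `sniff_format` and `sniff_file` are hypothetical names, and the JSON and CSV checks are simple heuristics I am assuming for illustration, since those formats have no magic bytes. The only hard facts used are that Parquet files begin with the bytes `PAR1` and ORC files begin with `ORC`.

```python
import json

PARQUET_MAGIC = b"PAR1"  # Parquet files begin (and end) with these 4 bytes
ORC_MAGIC = b"ORC"       # ORC files begin with these 3 bytes


def sniff_format(head: bytes) -> str:
    """Guess a data source name from the leading bytes of a file.

    Hypothetical helper, not a Spark API. Returns one of
    "parquet", "orc", "json", "csv", or "unknown".
    """
    # Binary formats first: they carry an unambiguous magic prefix.
    if head.startswith(PARQUET_MAGIC):
        return "parquet"
    if head.startswith(ORC_MAGIC):
        return "orc"

    # Text formats: inspect only the first line, so a buffer that was
    # truncated later in the file does not matter.
    first_line_bytes = head.split(b"\n", 1)[0]
    try:
        first_line = first_line_bytes.decode("utf-8").strip()
    except UnicodeDecodeError:
        return "unknown"
    if not first_line:
        return "unknown"

    # Heuristic (assumption): line-delimited JSON, one document per line.
    if first_line.startswith(("{", "[")):
        try:
            json.loads(first_line)
            return "json"
        except ValueError:
            pass

    # Heuristic (assumption): a comma in the first line suggests CSV.
    if "," in first_line:
        return "csv"
    return "unknown"


def sniff_file(path: str, num_bytes: int = 4096) -> str:
    """Read the first num_bytes of a local file and guess its format."""
    with open(path, "rb") as f:
        return sniff_format(f.read(num_bytes))
```

The text heuristics can misfire (for example, a CSV file whose first field happens to be `ORC` would be reported as ORC), so any real implementation would likely need a fallback or an explicit user override.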

> SQLContext.read.load() should be able to auto-detect input data
> ---------------------------------------------------------------
>
>                 Key: SPARK-8000
>                 URL: https://issues.apache.org/jira/browse/SPARK-8000
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Reynold Xin
>
> If it is a parquet file, use parquet. If it is a JSON file, use JSON. If it is an ORC file, use ORC. If it is a CSV file, use CSV.
> Maybe Spark SQL can also write an output metadata file to specify the schema & data source that's used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

