spark-issues mailing list archives

From "Takeshi Yamamuro (JIRA)" <>
Subject [jira] [Commented] (SPARK-8000) should be able to auto-detect input data
Date Wed, 17 Feb 2016 11:33:18 GMT


Takeshi Yamamuro commented on SPARK-8000:

Does this ticket also cover files exported by other systems, such as Impala?
I'm not sure how to automatically detect these kinds of unknown files in Spark.
One idea: read a format-specific header from the file and detect the format in ResolvedDataSource.

> should be able to auto-detect input data
> ---------------------------------------------------------------
>                 Key: SPARK-8000
>                 URL:
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Reynold Xin
> If it is a Parquet file, use Parquet. If it is a JSON file, use JSON. If it is an ORC
> file, use ORC. If it is a CSV file, use CSV.
> Maybe Spark SQL can also write an output metadata file to specify the schema & data
> source that's used.

This message was sent by Atlassian JIRA

