spark-user mailing list archives

From Yana Kadiyska <>
Subject Re: SQLcontext changing String field to Long
Date Sat, 10 Oct 2015 14:55:56 GMT
Can you show the output of df.printSchema()? Just a guess, but I think I ran
into something similar with a column that was also part of a path in Parquet.
E.g. we had an account_id column in the Parquet file data itself, of type
string, but we also named the files in the following manner:
/somepath/account_id=.../file.parquet. Since Spark uses the paths for
partition discovery, it inferred that account_id was a numeric type, and
upon reading the data we ran into the exception you're describing
(this was in Spark 1.4).
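
To illustrate the kind of inference I mean, here is a rough, self-contained
Java sketch. It is a simplified model only, not Spark's actual
implementation: the class and method names are hypothetical, and the value
12345 in the path is made up. It shows how a key=value directory segment
whose value looks numeric can end up typed as a long even though the column
stored in the file is a string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of partition discovery; NOT Spark's actual implementation.
class PartitionTypeSketch {

    // Guess a type name for a partition value the way a naive inference would:
    // anything that parses as a long is treated as numeric, the rest as string.
    static String inferType(String rawValue) {
        try {
            Long.parseLong(rawValue);
            return "LongType";
        } catch (NumberFormatException e) {
            return "StringType";
        }
    }

    // Collect "name=value" directory segments from a path and infer a type for each.
    static Map<String, String> inferPartitionColumns(String path) {
        Map<String, String> columns = new LinkedHashMap<>();
        for (String segment : path.split("/")) {
            int eq = segment.indexOf('=');
            if (eq > 0) {
                columns.put(segment.substring(0, eq), inferType(segment.substring(eq + 1)));
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // Hypothetical path; the value 12345 is made up for illustration.
        System.out.println(inferPartitionColumns("/somepath/account_id=12345/file.parquet"));
    }
}
```

If this turns out to be your case, my understanding is that Spark 1.5
documents a setting, spark.sql.sources.partitionColumnTypeInference.enabled,
which can be set to false so that partition columns are read as strings. I
have not verified that it resolves this particular exception.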

On Fri, Oct 9, 2015 at 7:55 PM, Abhisheks <> wrote:

> Hi there,
> I have saved my records in Parquet format and am using Spark 1.5. But
> when I try to fetch the columns, it throws the exception
> java.lang.ClassCastException: java.lang.Long cannot be cast to
> org.apache.spark.unsafe.types.UTF8String.
> This field is saved as a String when writing the Parquet file. Here is
> the sample code and its output:
>"troubling thing is ::" +
> sqlContext.sql(fileSelectQuery).schema().toString());
> DataFrame df= sqlContext.sql(fileSelectQuery);
> JavaRDD<Row> rdd2 = df.toJavaRDD();
> The first line in the code (the logger call) prints this:
> troubling thing is ::StructType(StructField(batch_id,StringType,true))
> But the moment after that, the exception comes up.
> Any idea why it is treating the field as a Long? (One unusual thing about
> the column is that its value is always a number, e.g. a timestamp.)
> Any help is appreciated.