hadoop-mapreduce-user mailing list archives

From siva kumar <siva165...@gmail.com>
Subject Spark streaming
Date Wed, 10 Feb 2016 12:34:07 GMT
Hi,
I'm pulling some Twitter data and trying to save it into a
persistent table. This is the code I wrote:

case class Tweet(createdAt: Long, text: String)
twt.map(status =>
  Tweet(status.getCreatedAt().getTime() / 1000, status.getText())
).foreachRDD(rdd =>
  rdd.toDF().saveAsTable("stream", SaveMode.Append)
)
When I go to spark-sql and check, I can see that the table has been created.
But when I try to retrieve data from it, I get the error below:


java.lang.RuntimeException:
file:/user/hive/warehouse/stream/_temporary/0/_temporary/attempt_201602101609_0383_r_000014_0/part-r-00664.parquet
is not a Parquet file (too small)

Is this the correct way to store streaming data in a persistent table?

Any help would be appreciated.
Thanks in advance,
Siva.
