spark-issues mailing list archives

From "john (JIRA)" <>
Subject [jira] [Commented] (SPARK-26000) Missing block when reading HDFS Data from Cloudera Manager
Date Sat, 10 Nov 2018 02:49:00 GMT


john commented on SPARK-26000:

I have Cloudera Manager in environment A, which hosts the HDFS component, and Spark in
environment B. I am doing a very simple write to and read from HDFS. Writing to the
Cloudera Manager HDFS works as expected, but when reading the data back I get the errors below:


"java.lang.reflect.InvocationTargetException" Caused By: "org.apache.spark.sql.AnalysisException:
Unable to infer schema for Parquet. It must be specified manually.;"

Caused By: " 60000 millis timeout while waiting for channel
to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/SparkNode_IP_PORT_NoO remote=/NameNode:50010:"

Java sample code:

// writing

// read

> Missing block when reading HDFS Data from Cloudera Manager
> ----------------------------------------------------------
>                 Key: SPARK-26000
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.2
>            Reporter: john
>            Priority: Major
> I am able to write to Cloudera Manager HDFS through open-source Spark, which runs separately,
> but I am not able to read the Cloudera Manager HDFS data.
> I am getting missing block location and socket timeout errors.
