spark-issues mailing list archives

From "Sean Owen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-17572) Write.df is failing on spark cluster
Date Sat, 17 Sep 2016 15:24:20 GMT

    [ https://issues.apache.org/jira/browse/SPARK-17572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15499201#comment-15499201 ]

Sean Owen commented on SPARK-17572:
-----------------------------------

I don't know the cause; it may not have to do with permissions. But it is an NFS error, right?
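If you want to isolate it, one quick check outside Spark (a minimal sketch; the test paths here are made up, so substitute ones on your mount) is whether a plain rename succeeds on the NFS partition, since the commitJob step in the stack trace below is ultimately a local-filesystem rename:

    # Minimal sketch, outside Spark: does a plain rename work on the NFS mount?
    # Paths are hypothetical; point them at a directory on your /nfspartition share.
    src <- "/nfspartition/sankar/rename_test_src.csv"
    dst <- "/nfspartition/sankar/rename_test_dst.csv"
    writeLines("a,b,c", src)        # create a small test file
    ok <- file.rename(src, dst)     # base R rename; returns TRUE on success
    cat("rename succeeded:", ok, "\n")
    if (ok) file.remove(dst)        # clean up if it worked

If that rename fails, or only fails for large files, the problem is in the mount rather than in Spark.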

> Write.df is failing on spark cluster
> ------------------------------------
>
>                 Key: SPARK-17572
>                 URL: https://issues.apache.org/jira/browse/SPARK-17572
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>    Affects Versions: 2.0.0
>            Reporter: Sankar Mittapally
>
> Hi,
> We have a Spark cluster with four nodes; all four nodes share an NFS partition (there is no HDFS), and we have the same uid on all servers. When we try to write data, we get the following exceptions. I am not sure whether this is an error, or whether I will lose data in the output.
> The command I am using to save the data:
> saveDF(banking_l1_1,"banking_l1_v2.csv",source="csv",mode="append",schema="true")
> 16/09/17 08:03:28 ERROR InsertIntoHadoopFsRelationCommand: Aborting job.
> java.io.IOException: Failed to rename DeprecatedRawLocalFileStatus{path=file:/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv; isDirectory=false; length=436486316; replication=1; blocksize=33554432; modification_time=1474099400000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to file:/nfspartition/sankar/banking_l1_v2.csv/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:371)
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:384)
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJob(FileOutputCommitter.java:326)
>     at org.apache.spark.sql.execution.datasources.BaseWriterContainer.commitJob(WriterContainer.scala:222)
>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:144)
>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>     at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>     at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>     at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>     at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
>     at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
>     at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:487)
>     at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
>     at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:194)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
>     at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:86)
>     at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:38)
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>     at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>     at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>     at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>     at java.lang.Thread.run(Thread.java:745)
> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/.part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv.crc]: it still exists.
> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv]: it still exists.
> 16/09/17 08:03:28 ERROR DefaultWriterContainer: Job job_201609170803_0000 aborted.
> 16/09/17 08:03:28 ERROR RBackendHandler: save on 625 failed
> Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) : 
>   org.apache.spark.SparkException: Job aborted.
>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:149)
>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>     at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.doE
> Thanks
> Sankar
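
For what it's worth, the failure is in FileOutputCommitter.mergePaths at job commit, i.e. the rename of task output into the final output directory. A commonly suggested mitigation, which I have not verified on NFS and so is only a sketch, is the v2 commit algorithm available with Hadoop 2.7+, which moves task output into the destination at task commit and skips this particular job-commit rename:

    # Sketch only: enable the v2 output committer (requires Hadoop 2.7+).
    # Whether this avoids the NFS rename failure here is an assumption.
    library(SparkR)
    sparkR.session(sparkConfig = list(
      "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version" = "2"
    ))
    saveDF(banking_l1_1, "banking_l1_v2.csv", source = "csv", mode = "append")

Incidentally, I believe the csv source expects header="true" rather than schema="true", though that should be unrelated to the rename failure.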




