spark-user mailing list archives

From nayan sharma <nayansharm...@gmail.com>
Subject Re: Spark Druid Ingestion
Date Thu, 22 Mar 2018 07:37:27 GMT
Hey Jorge,

Thanks for responding.

Can you elaborate on the user permission part? HDFS or local?

As of now, the HDFS path hdfs://n2pl-pa-hdn220.xxx.xxx:8020/user/yarn/.sparkStaging/application_1521457397747_0013/__spark_libs__8247917347016008883.zip already has full access for the yarn user, and my job is also running as that same user.
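
For reference, here is a minimal sketch (Scala, standard Hadoop FileSystem API) of how the owner and permissions on the staging directory can be double-checked programmatically; the path below is only an example and needs to be replaced with the one from the job logs:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    object StagingPermCheck {
      def main(args: Array[String]): Unit = {
        // Picks up core-site.xml / hdfs-site.xml from the classpath.
        val conf = new Configuration()
        val fs = FileSystem.get(conf)
        // Example path only; substitute the staging dir from the job logs.
        val staging = new Path("/user/yarn/.sparkStaging")
        val status = fs.getFileStatus(staging)
        println(s"owner=${status.getOwner} group=${status.getGroup} perm=${status.getPermission}")
        fs.close()
      }
    }

The same check can of course be done with a plain hdfs dfs -ls on the path.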


Thanks,
Nayan


> On Mar 22, 2018, at 12:54 PM, Jorge Machado <jomach@me.com> wrote:
> 
> Seems to me like a permissions problem! Can you check your user/folder permissions?
> 
> Jorge Machado
> 
>> On 22 Mar 2018, at 08:21, nayan sharma <nayansharma13@gmail.com> wrote:
>> 
>> Hi All,
>> Druid uses Hadoop MapReduce to ingest batch data, but I am trying Spark for ingesting data into Druid, taking https://github.com/metamx/druid-spark-batch as a reference.
>> But we are stuck on the following error.
>> Application log:
>> 2018-03-20T07:54:28,782 INFO [task-runner-0-priority-0] org.apache.spark.deploy.yarn.Client - Will allocate AM container, with 896 MB memory including 384 MB overhead
>> 2018-03-20T07:54:28,782 INFO [task-runner-0-priority-0] org.apache.spark.deploy.yarn.Client - Setting up container launch context for our AM
>> 2018-03-20T07:54:28,785 INFO [task-runner-0-priority-0] org.apache.spark.deploy.yarn.Client - Setting up the launch environment for our AM container
>> 2018-03-20T07:54:28,793 INFO [task-runner-0-priority-0] org.apache.spark.deploy.yarn.Client - Preparing resources for our AM container
>> 2018-03-20T07:54:29,364 WARN [task-runner-0-priority-0] org.apache.spark.deploy.yarn.Client - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
>> 2018-03-20T07:54:29,371 INFO [task-runner-0-priority-0] org.apache.spark.deploy.yarn.Client - Uploading resource file:/hdfs1/druid-0.11.0/var/tmp/spark-49af67df-1a21-4790-a02b-c737c7a44946/__spark_libs__8247917347016008883.zip -> hdfs://n2pl-pa-hdn220.xxx.xxx:8020/user/yarn/.sparkStaging/application_1521457397747_0013/__spark_libs__8247917347016008883.zip
>> 2018-03-20T07:54:29,607 INFO [task-runner-0-priority-0] org.apache.spark.deploy.yarn.Client - Uploading resource file:/hdfs1/druid-0.11.0/var/tmp/spark-49af67df-1a21-4790-a02b-c737c7a44946/__spark_conf__2240950972346324291.zip -> hdfs://n2pl-pa-hdn220.xxx.xxx:8020/user/yarn/.sparkStaging/application_1521457397747_0013/__spark_conf__.zip
>> 2018-03-20T07:54:29,673 INFO [task-runner-0-priority-0] org.apache.spark.SecurityManager - Changing view acls to: yarn
>> 2018-03-20T07:54:29,673 INFO [task-runner-0-priority-0] org.apache.spark.SecurityManager - Changing modify acls to: yarn
>> 2018-03-20T07:54:29,673 INFO [task-runner-0-priority-0] org.apache.spark.SecurityManager - Changing view acls groups to: 
>> 2018-03-20T07:54:29,673 INFO [task-runner-0-priority-0] org.apache.spark.SecurityManager - Changing modify acls groups to: 
>> 2018-03-20T07:54:29,673 INFO [task-runner-0-priority-0] org.apache.spark.SecurityManager - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn); groups with view permissions: Set(); users  with modify permissions: Set(yarn); groups with modify permissions: Set()
>> 2018-03-20T07:54:29,679 INFO [task-runner-0-priority-0] org.apache.spark.deploy.yarn.Client - Submitting application application_1521457397747_0013 to ResourceManager
>> 2018-03-20T07:54:29,709 INFO [task-runner-0-priority-0] org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1521457397747_0013
>> 2018-03-20T07:54:29,713 INFO [task-runner-0-priority-0] org.apache.spark.scheduler.cluster.SchedulerExtensionServices - Starting Yarn extension services with app application_1521457397747_0013 and attemptId None
>> 2018-03-20T07:54:30,722 INFO [task-runner-0-priority-0] org.apache.spark.deploy.yarn.Client - Application report for application_1521457397747_0013 (state: FAILED)
>> 2018-03-20T07:54:30,729 INFO [task-runner-0-priority-0] org.apache.spark.deploy.yarn.Client - 
>> 	 client token: N/A
>> 	 diagnostics: Application application_1521457397747_0013 failed 2 times due to AM Container for appattempt_1521457397747_0013_000002 exited with  exitCode: -1000
>> For more detailed output, check the application tracking page: http://n-pa-hdn220.xxx.xxxx:8088/cluster/app/application_1521457397747_0013 Then click on links to logs of each attempt.
>> Diagnostics: No such file or directory
>> ENOENT: No such file or directory
>> 	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
>> 	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
>> 	at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:756)
>> 	at org.apache.hadoop.fs.DelegateToFileSystem.setPermission(DelegateToFileSystem.java:211)
>> 	at org.apache.hadoop.fs.FilterFs.setPermission(FilterFs.java:252)
>> 	at org.apache.hadoop.fs.FileContext$11.next(FileContext.java:1003)
>> 	at org.apache.hadoop.fs.FileContext$11.next(FileContext.java:999)
>> 	at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>> 	at org.apache.hadoop.fs.FileContext.setPermission(FileContext.java:1006)
>> 	at org.apache.hadoop.yarn.util.FSDownload$3.run(FSDownload.java:421)
>> 	at org.apache.hadoop.yarn.util.FSDownload$3.run(FSDownload.java:419)
>> 	at java.security.AccessController.doPrivileged(Native Method)
>> 	at javax.security.auth.Subject.doAs(Subject.java:422)
>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>> 	at org.apache.hadoop.yarn.util.FSDownload.changePermissions(FSDownload.java:419)
>> 	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:365)
>> 	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
>> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>> 	at java.lang.Thread.run(Thread.java:748)
>> 
>> 
>> As far as I can understand, something is going wrong in the job submission through YARN.
>> 
>> It runs on my local machine, but on the HDP cluster it fails with this error.
>> 
>> 
>> <yarnlogs.txt>
>> 
>> Thanks,
>> Nayan
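
One observation on the stack trace quoted above: the ENOENT is raised from org.apache.hadoop.fs.RawLocalFileSystem.setPermission inside FSDownload.changePermissions, i.e. while the NodeManager is localizing the staged resources onto its own local disk, not while reading from HDFS. So the HDFS staging permissions may well be fine, and the thing to check is that every directory listed in yarn.nodemanager.local-dirs actually exists and is writable by the yarn user on each node. Below is a minimal sketch of that check (Scala, standard Hadoop/YARN client API; run on each NodeManager host with yarn-site.xml on the classpath). It is a guess at the cause, not a confirmed fix:

    import java.io.File
    import org.apache.hadoop.yarn.conf.YarnConfiguration

    object LocalDirCheck {
      def main(args: Array[String]): Unit = {
        // Loads yarn-default.xml and yarn-site.xml from the classpath.
        val conf = new YarnConfiguration()
        // yarn.nodemanager.local-dirs: where the NM localizes container resources.
        val dirs = conf.getTrimmedStrings(YarnConfiguration.NM_LOCAL_DIRS)
        dirs.foreach { d =>
          val f = new File(d)
          println(s"$d exists=${f.exists} writable=${f.canWrite}")
        }
      }
    }

Unrelated to the failure itself: the WARN about spark.yarn.jars / spark.yarn.archive earlier in the log only means Spark re-uploads its libraries from SPARK_HOME on every submit; pointing spark.yarn.archive at a pre-uploaded zip of the Spark jars on HDFS makes that warning go away.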

