kylin-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From " Kaige Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-3201) java.lang.RuntimeException: native lz4 library not available
Date Mon, 29 Jan 2018 05:40:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16342919#comment-16342919
] 

 Kaige Liu commented on KYLIN-3201:
-----------------------------------

Try this:

Add "*kylin.engine.spark-conf.executor.extraJavaOptions -Dhdp.version=2.6.2.3-1 Djava.library.path=/usr/hdp/current/hadoop-client/lib/native/*"
in kylin.propertes

Above example is for HDP 2.6. Modify it according to your Hadoop distribution.

 

> java.lang.RuntimeException: native lz4 library not available
> ------------------------------------------------------------
>
>                 Key: KYLIN-3201
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3201
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: v2.1.0
>            Reporter: Keith Chen
>            Priority: Critical
>
> When i build cube with spark , the job was failed.
> It report some exceptions about lz4, but i do not set lz4 properties.
>  
> 18/01/26 13:18:40 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, executor
1): java.lang.RuntimeException: native lz4 library not available
>  at org.apache.hadoop.io.compress.Lz4Codec.getDecompressorType(Lz4Codec.java:195)
>  at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176)
>  at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1983)
>  at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1878)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1827)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1841)
>  at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:49)
>  at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:64)
>  at org.apache.spark.rdd.HadoopRDD$$anon$1.liftedTree1$1(HadoopRDD.scala:252)
>  at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:251)
>  at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>  at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:102)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>  at org.apache.spark.scheduler.Task.run(Task.scala:99)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  
> java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkCubingByLayer
>  at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
>  at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
>  Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2
in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 in stage 0.0 (TID 9, executor
1): java.lang.RuntimeException: native lz4 library not available
>  at org.apache.hadoop.io.compress.Lz4Codec.getDecompressorType(Lz4Codec.java:195)
>  at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176)
>  at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1983)
>  at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1878)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1827)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1841)
>  at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:49)
>  at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:64)
>  at org.apache.spark.rdd.HadoopRDD$$anon$1.liftedTree1$1(HadoopRDD.scala:252)
>  at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:251)
>  at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>  at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:102)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>  at org.apache.spark.scheduler.Task.run(Task.scala:99)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Driver stacktrace:
>  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422)
>  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
>  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
>  at scala.Option.foreach(Option.scala:257)
>  at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:802)
>  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1650)
>  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)
>  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)
>  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:1925)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:1938)
>  at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958)
>  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1158)
>  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1085)
>  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1085)
>  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>  at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
>  at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1085)
>  at org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopDataset(JavaPairRDD.scala:831)
>  at org.apache.kylin.engine.spark.SparkCubingByLayer.saveToHDFS(SparkCubingByLayer.java:238)
>  at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:192)
>  at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
>  ... 6 more
>  Caused by: java.lang.RuntimeException: native lz4 library not available
>  at org.apache.hadoop.io.compress.Lz4Codec.getDecompressorType(Lz4Codec.java:195)
>  at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176)
>  at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1983)
>  at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1878)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1827)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1841)
>  at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:49)
>  at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:64)
>  at org.apache.spark.rdd.HadoopRDD$$anon$1.liftedTree1$1(HadoopRDD.scala:252)
>  at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:251)
>  at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211)
>  at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:102)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>  at org.apache.spark.scheduler.Task.run(Task.scala:99)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message