horn-dev mailing list archives

From Thomas Jungblut <thomas.jungb...@gmail.com>
Subject Re: Sample training OutOfMemory failure
Date Sun, 01 Nov 2015 23:53:15 GMT
How can that use so much memory?
The matrix is 20k x 16; that should be at most 2-3 MB.
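
A quick back-of-the-envelope check (just a sketch, assuming plain 8-byte
doubles and ignoring per-row object overhead):

    // 20,000 rows x 16 columns of 8-byte doubles
    long bytes = 20000L * 16 * 8;                    // 2,560,000 bytes
    System.out.println(bytes / (1024.0 * 1024.0));   // ~2.44 MB

Even with the DenseDoubleVector object overhead per row, that stays in the
low single-digit megabytes.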

Can you please share your code?
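
For reference, the max-heap setting Edward mentions below is just the
standard JVM argument in the Run Configuration's VM arguments, e.g. (the 2g
value is only an example):

    -Xmx2g

but with a matrix this small you shouldn't need to raise it at all.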

On 1 November 2015 at 23:48, Edward J. Yoon <edwardyoon@apache.org> wrote:

> Hello,
>
> Sorry for the inconvenience. Can you try running your program in your IDE,
> such as Eclipse? You can set the max heap option in the Run configuration;
> then it will work.
>
> To run it on your cluster, you need to install the Apache Hama
> cluster[1] at this moment.
>
> 1. wiki.apache.org/hama/GettingStarted
>
>
> On Mon, Nov 2, 2015 at 5:18 AM, Babaeizadeh Malamir, Mohammad
> <mb2@illinois.edu> wrote:
> >
> > Hello,
> >
> > I'm trying to run the sample code for training a neural network and it's
> failing with the following error. I'm not sure if I have the right input
> file format though, as I couldn't find any example or documentation. I'm
> using a conversion of the UCI letter recognition dataset
> <https://archive.ics.uci.edu/ml/datasets/Letter+Recognition> to a
> SequenceFile.
> >
> > I also tried increasing the Hadoop max heap memory to 2G but that didn't
> help either.
> >
> > Any help is appreciated.
> > Thanks,
> > MBZZ
> >
> >
> >
> > 15/11/01 04:22:05 WARN util.NativeCodeLoader: Unable to load
> native-hadoop library for your platform... using builtin-java classes where
> applicable
> > 15/11/01 04:22:07 INFO mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> org.mortbay.log.Slf4jLog
> > 15/11/01 04:22:07 INFO mortbay.log: Number of tasks: 5
> >
> > 15/11/01 04:22:07 INFO bsp.FileInputFormat: Total input paths to process
> : 1
> > 15/11/01 04:22:07 INFO bsp.BSPJobClient: Run pre-partitioning job
> > 15/11/01 04:22:07 INFO Configuration.deprecation: user.name is
> deprecated. Instead, use mapreduce.job.user.name
> > 15/11/01 04:22:07 WARN conf.Configuration:
> org.apache.hadoop.fs.ChecksumFileSystem$FSDataBoundedInputStream@5cc6574b:
> an attempt to override final parameter:
> mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> > 15/11/01 04:22:07 WARN conf.Configuration:
> org.apache.hadoop.fs.ChecksumFileSystem$FSDataBoundedInputStream@5cc6574b:
> an attempt to override final parameter:
> mapreduce.job.end-notification.max.attempts;  Ignoring.
> > 15/11/01 04:22:07 INFO Configuration.deprecation: user.name is
> deprecated. Instead, use mapreduce.job.user.name
> > 15/11/01 04:22:07 INFO bsp.BSPJobClient: Running job:
> job_localrunner_0001
> > 15/11/01 04:22:08 INFO bsp.LocalBSPRunner: Setting up a new barrier for
> 1 tasks!
> > 15/11/01 04:22:08 INFO mortbay.log: Begin to train
> > 15/11/01 04:22:08 INFO mortbay.log: End of training, number of
> iterations: 1.
> >
> > 15/11/01 04:22:08 INFO mortbay.log: Write model back to hdfs://
> 172.16.124.131:8020/mbz/models.csv
> >
> > 15/11/01 04:22:08 INFO Configuration.deprecation:
> mapred.cache.localFiles is deprecated. Instead, use
> mapreduce.job.cache.local.files
> > 15/11/01 04:22:08 ERROR bsp.LocalBSPRunner: Exception during BSP
> execution!
> > java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError:
> Java heap space
> >                 at
> java.util.concurrent.FutureTask.report(FutureTask.java:122)
> >                 at
> java.util.concurrent.FutureTask.get(FutureTask.java:188)
> >                 at
> org.apache.hama.bsp.LocalBSPRunner$ThreadObserver.run(LocalBSPRunner.java:313)
> >                 at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.lang.OutOfMemoryError: Java heap space
> >                 at
> org.apache.hama.commons.math.DenseDoubleVector.<init>(DenseDoubleVector.java:44)
> >                 at
> org.apache.hama.commons.io.VectorWritable.readVector(VectorWritable.java:118)
> >                 at
> org.apache.hama.commons.io.VectorWritable.readFields(VectorWritable.java:55)
> >                 at
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
> >                 at
> org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
> >                 at
> org.apache.hadoop.io.SequenceFile$Reader.deserializeValue(SequenceFile.java:2247)
> >                 at
> org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:2220)
> >                 at
> org.apache.hama.bsp.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:111)
> >                 at
> org.apache.hama.bsp.SequenceFileRecordReader.next(SequenceFileRecordReader.java:87)
> >                 at
> org.apache.hama.bsp.TrackedRecordReader.moveToNext(TrackedRecordReader.java:63)
> >                 at
> org.apache.hama.bsp.TrackedRecordReader.next(TrackedRecordReader.java:49)
> >                 at
> org.apache.hama.bsp.BSPPeerImpl.readNext(BSPPeerImpl.java:634)
> >                 at
> org.apache.hama.ml.ann.SmallLayeredNeuralNetworkTrainer.calculateUpdates(SmallLayeredNeuralNetworkTrainer.java:156)
> >                 at
> org.apache.hama.ml.ann.SmallLayeredNeuralNetworkTrainer.bsp(SmallLayeredNeuralNetworkTrainer.java:106)
> >                 at
> org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:256)
> >                 at
> org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:286)
> >                 at
> org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:211)
> >                 at
> java.util.concurrent.FutureTask.run(FutureTask.java:262)
> >                 at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >                 at
> java.util.concurrent.FutureTask.run(FutureTask.java:262)
> >                 at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >                 at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >                 ... 1 more
> > 15/11/01 04:22:10 INFO bsp.BSPJobClient: Current supersteps number: 0
> > 15/11/01 04:22:10 INFO bsp.BSPJobClient: Job failed.
> > 15/11/01 04:22:10 INFO mortbay.log: Reload model from hdfs://
> 172.16.124.131:8020/mbz/models.csv.
>
>
>
> --
> Best Regards, Edward J. Yoon
>
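
On the input format question above: it depends on what your converter wrote,
but a minimal sketch of producing a SequenceFile of vector rows for Hama's ML
code could look like the following. The LongWritable key, the output path,
and the "16 features followed by the label" layout are assumptions here, not
taken from your setup, so please double-check them against the trainer you
are using.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hama.commons.io.VectorWritable;
    import org.apache.hama.commons.math.DenseDoubleVector;

    public class WriteTrainingData {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Hypothetical output path; point it wherever your job reads from.
        Path out = new Path("/tmp/letter-recognition.seq");

        SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, conf, out, LongWritable.class, VectorWritable.class);
        try {
          long row = 0;
          // One record per instance: 16 numeric features plus the label,
          // all stored as doubles (assumed layout).
          double[] instance = new double[17];
          // ... fill `instance` from one line of the UCI letter data ...
          writer.append(new LongWritable(row++),
              new VectorWritable(new DenseDoubleVector(instance)));
        } finally {
          writer.close();
        }
      }
    }

If the vectors you actually wrote are much wider than that, it would also
explain the OutOfMemoryError above.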
