flink-dev mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: KMeans job gets stuck and never completes
Date Sun, 22 Jun 2014 12:19:00 GMT
Workers waiting in "LocalBufferPool.requestBuffer()" is usually a sign of
a distributed deadlock.
Can you send me some instructions on how to get the same input data you
have (download URL? generator settings?) and which configuration parameters
you are using (max iteration limit, k, ?) when calling the K-Means example?
I would like to try it on our cluster.
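
To illustrate why a thread parked in a buffer request shows up as TIMED_WAITING
on an ArrayDeque monitor, here is a minimal sketch of a fixed-size buffer pool
with a timed wait. The class and method names (BufferPoolSketch, Pool, request,
recycle) are hypothetical and are not the Stratosphere LocalBufferPool code. It
only shows the general pattern: if no other task ever recycles a buffer, the
requester keeps waiting.

```java
import java.util.ArrayDeque;

public class BufferPoolSketch {

    /** Hypothetical fixed-size pool; NOT the Stratosphere LocalBufferPool. */
    public static final class Pool {
        private final ArrayDeque<byte[]> free = new ArrayDeque<>();

        public Pool(int numBuffers, int bufferSize) {
            for (int i = 0; i < numBuffers; i++) {
                free.add(new byte[bufferSize]);
            }
        }

        /** Waits up to timeoutMillis for a free buffer; returns null on timeout. */
        public byte[] request(long timeoutMillis) throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            synchronized (free) {
                while (free.isEmpty()) {
                    long remaining = deadline - System.currentTimeMillis();
                    if (remaining <= 0) {
                        return null; // nobody recycled a buffer in time
                    }
                    // A thread parked here is reported as TIMED_WAITING
                    // (on object monitor) in a thread dump.
                    free.wait(remaining);
                }
                return free.poll();
            }
        }

        /** Returns a buffer to the pool and wakes up one waiting requester. */
        public void recycle(byte[] buffer) {
            synchronized (free) {
                free.add(buffer);
                free.notify();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Pool pool = new Pool(1, 32);
        byte[] held = pool.request(100);
        // The only buffer is held and never recycled, so this request times out.
        byte[] second = pool.request(100);
        System.out.println("second request got buffer: " + (second != null));
        pool.recycle(held);
        System.out.println("after recycle got buffer: " + (pool.request(100) != null));
    }
}
```

In a distributed deadlock the buffer that would unblock the requester is held
by a task that is itself waiting, so a larger pool can hide the problem without
removing the underlying cycle.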

Just out of curiosity, what hardware are you using? Is it the IBM Power
cluster at TU Berlin?

Robert


On Sun, Jun 22, 2014 at 1:53 PM, Sebastian Schelter <ssc.open@googlemail.com> wrote:

> You could try to increase the number of buffers available to the network
> stack. That solved similar problems for me in the past.
>
> -s
> On 22.06.2014 13:48, "José Luis López Pino" <jllopezpino@gmail.com> wrote:
>
> > It seems like the thread reading the points file is blocked, waiting for a
> > buffer from the global buffer pool that never arrives. What could be
> > causing this?
> >
> >    java.lang.Thread.State: TIMED_WAITING (on object monitor)
> >      at java.lang.Object.wait(Native Method)
> >      - waiting on <0x6b985888> (a java.util.ArrayDeque)
> >      at eu.stratosphere.runtime.io.network.bufferprovider.LocalBufferPool.requestBuffer(LocalBufferPool.java:160)
> >      - locked <0x6b985888> (a java.util.ArrayDeque)
> >      at eu.stratosphere.runtime.io.network.bufferprovider.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:101)
> >      at eu.stratosphere.runtime.io.gates.InputGate.requestBufferBlocking(InputGate.java:333)
> >      at eu.stratosphere.runtime.io.channels.InputChannel.requestBufferBlocking(InputChannel.java:426)
> >      at eu.stratosphere.runtime.io.network.ChannelManager.dispatchFromOutputChannel(ChannelManager.java:441)
> >      at eu.stratosphere.runtime.io.channels.OutputChannel.sendBuffer(OutputChannel.java:74)
> >      at eu.stratosphere.runtime.io.gates.OutputGate.sendBuffer(OutputGate.java:49)
> >      at eu.stratosphere.runtime.io.api.BufferWriter.sendBuffer(BufferWriter.java:35)
> >      at eu.stratosphere.runtime.io.api.RecordWriter.emit(RecordWriter.java:96)
> >      at eu.stratosphere.pact.runtime.shipping.OutputCollector.collect(OutputCollector.java:82)
> >      at eu.stratosphere.pact.runtime.task.chaining.ChainedMapDriver.collect(ChainedMapDriver.java:71)
> >      at eu.stratosphere.pact.runtime.task.DataSourceTask.invoke(DataSourceTask.java:228)
> >      at eu.stratosphere.nephele.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:284)
> >      at java.lang.Thread.run(Thread.java:744)
> >
> >
> > Thanks for your help Sebastian.
> >
> > Regards // Saludos // Mit Freundlichen Grüßen // Bien cordialement,
> > Pino
> >
> >
> > On 22 June 2014 13:38, Sebastian Schelter <ssc.open@googlemail.com> wrote:
> >
> > > Have you looked at a jstack dump on one of the workers? That typically
> > > helps to find out where the processes are stuck.
> > >
> > > -s
> > > On 22.06.2014 13:32, "José Luis López Pino" <jllopezpino@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm running the KMeans Java and Scala examples on two nodes. They work
> > > > fine with very small files (3 MB), but when I try files of 30 MB or
> > > > bigger the process never ends. After several hours, the DataChain
> > > > process that is reading the input points is still working.
> > > >
> > > > I have tried way bigger files in the same environment before and had
> > > > no issue. I have already tried the following:
> > > > - Checked that the process is not stuck using all the CPU time.
> > > > - Formatted the datanodes.
> > > > - Compiled the latest version available on GitHub.
> > > > - Enabled the debug log mode, which doesn't give any additional
> > > >   information.
> > > >
> > > > Could someone give me a hint on where to look? Thanks for your help!
> > > >
> > > > Regards // Saludos // Mit Freundlichen Grüßen // Bien cordialement,
> > > > Pino
> > > >
> > >
> >
>
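
The jstack suggestion from earlier in the thread can also be approximated from
inside a JVM with the standard java.lang.management API. The sketch below lists
the threads whose stack currently contains a given method name, for example
"requestBuffer". The class name StuckThreadFinder and the helper threadsIn are
hypothetical; only ThreadMXBean.dumpAllThreads and the ThreadInfo accessors are
JDK calls. It inspects a single JVM, so it cannot by itself prove a deadlock
that spans several workers.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class StuckThreadFinder {

    /**
     * Returns "name [state]" for every live thread in this JVM whose current
     * stack contains a frame with the given method name.
     */
    public static List<String> threadsIn(String methodName) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> hits = new ArrayList<>();
        // false, false: skip monitor and synchronizer details, keep stack traces
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue;
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                if (frame.getMethodName().equals(methodName)) {
                    hits.add(info.getThreadName() + " [" + info.getThreadState() + "]");
                    break; // one entry per thread is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Example: look for threads parked in a buffer request.
        for (String hit : threadsIn("requestBuffer")) {
            System.out.println(hit);
        }
    }
}
```

Running the same check on every worker and comparing which threads stay in the
same frame across several dumps gives roughly the information a series of
jstack dumps would.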
