giraph-user mailing list archives

From José Luis Larroque <larroques...@gmail.com>
Subject Re: Lease renewer daemon for [] with renew id 1 exited - HDFS Issue regarding Giraph?
Date Tue, 30 Aug 2016 15:47:58 GMT
Each time I use a combiner, it throws that exception, but running the same test without it doesn't.

Test with combiner:

/home/hadoop/bin/yarn jar /home/hadoop/giraph/giraph.jar
ar.edu.info.unlp.tesina.lectura.grafo.algoritmos.masivos.BusquedaDeCaminosNavegacionalesWikiquotesMasivo
/tmp/vertices.txt 4 -@- 1
ar.edu.info.unlp.tesina.lectura.grafo.BusquedaDeCaminosNavegacionalesWikiquote
-vif
ar.edu.info.unlp.tesina.vertice.estructuras.IdTextWithComplexValueInputFormat
-vip /user/hduser/input/grafo-wikipedia.txt -vof
ar.edu.info.unlp.tesina.vertice.estructuras.IdTextWithComplexValueOutputFormat
-op /user/hduser/output/caminosNavegacionales -w 4 -yh 120000 -ca
giraph.useOutOfCoreMessages=true,giraph.metrics.enable=true,giraph.isStaticGraph=true,giraph.numComputeThreads=4,giraph.async.message.store=true,giraph.async.message.store.threads=11,giraph.messageCombinerClass=ar.edu.info.unlp.tesina.lectura.grafo.algoritmos.masivos.CombinadorDeMensajes,giraph.clientReceiveBufferSize=327680,giraph.clientSendBufferSize=5242880


Test without it:

/home/hadoop/bin/yarn jar /home/hadoop/giraph/giraph.jar
ar.edu.info.unlp.tesina.lectura.grafo.algoritmos.masivos.BusquedaDeCaminosNavegacionalesWikiquotesMasivo
/tmp/vertices.txt 4 -@- 1
ar.edu.info.unlp.tesina.lectura.grafo.BusquedaDeCaminosNavegacionalesWikiquote
-vif
ar.edu.info.unlp.tesina.vertice.estructuras.IdTextWithComplexValueInputFormat
-vip /user/hduser/input/grafo-wikipedia.txt -vof
ar.edu.info.unlp.tesina.vertice.estructuras.IdTextWithComplexValueOutputFormat
-op /user/hduser/output/caminosNavegacionales -w 4 -yh 120000 -ca
giraph.useOutOfCoreMessages=true,giraph.metrics.enable=true,giraph.isStaticGraph=true,giraph.numComputeThreads=4,giraph.clientReceiveBufferSize=327680,giraph.clientSendBufferSize=5242880
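
For reference, CombinadorDeMensajes is just an implementation of Giraph's MessageCombiner interface (that's what giraph.messageCombinerClass expects). A minimal sketch of that shape, with placeholder Writable types instead of my real vertex id and message value classes, looks like this:

import org.apache.giraph.combiner.MessageCombiner;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;

// Sketch only: LongWritable/DoubleWritable are placeholders, my actual
// combiner works on my own id and message value classes.
public class CombinadorDeMensajesSketch
    implements MessageCombiner<LongWritable, DoubleWritable> {

  @Override
  public void combine(LongWritable vertexIndex,
      DoubleWritable originalMessage, DoubleWritable messageToCombine) {
    // Fold the incoming message into the accumulated one,
    // e.g. keep the minimum value seen so far.
    if (messageToCombine.get() < originalMessage.get()) {
      originalMessage.set(messageToCombine.get());
    }
  }

  @Override
  public DoubleWritable createInitialMessage() {
    // Neutral element for the combine above.
    return new DoubleWritable(Double.MAX_VALUE);
  }
}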


I can't understand why the combiner causes this problem. Without it, containers get killed for using too much memory when making requests, and the options giraph.maxNumberOfOpenRequests and giraph.waitRequestsForConfirmation don't help to stop this from happening either.
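
For completeness, this is roughly how I added those two options to the run without the combiner (the values below are only an example of what I tried, not a recommendation):

-ca giraph.maxNumberOfOpenRequests=1000,giraph.waitRequestsForConfirmation=true

appended to the same -ca list shown above; the containers still got killed.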


I don't know what else I should do. Any help?


Bye

Jose


2016-08-29 23:46 GMT-03:00 José Luis Larroque <larroquester@gmail.com>:

> Hi guys,
>
> I have an application that gives me the following error:
>
> Lease renewer daemon for [] with renew id 1 exited
>
> This happens in superstep 2 of a Giraph application. For some reason, after finishing a superstep, the worker doesn't do much; it only gives this as output:
>
>
> 16/08/30 00:48:35 INFO netty.NettyClient: waitAllRequests: Finished all requests. MBytes/sec received = 0.0041, MBytesReceived = 0, ave received req MBytes = 0, secs waited = 0.002
> MBytes/sec sent = 0.0079, MBytesSent = 0, ave sent req MBytes = 0, secs waited = 0.002
> 16/08/30 00:48:35 DEBUG aggregators.OwnerAggregatorServerData: reset: Ready for next superstep
> 16/08/30 00:48:35 DEBUG worker.WorkerAggregatorHandler: finishSuperstep: Aggregators finished
> 16/08/30 00:48:56 DEBUG hdfs.LeaseRenewer: Lease renewer daemon for [] with renew id 1 executed
> 16/08/30 00:48:56 DEBUG hdfs.LeaseRenewer: Lease renewer daemon for [] with renew id 1 expired
> 16/08/30 00:48:56 DEBUG hdfs.LeaseRenewer: *Lease renewer daemon for [] with renew id 1 exited*
>
> The only thing I know about this is that HDFS creates a daemon which, for some reason, ends up in the finally clause below (note that the "exited" message is logged there whether the daemon finished normally or was interrupted):
>
> synchronized void put(final String src, final DFSOutputStream out,
>     final DFSClient dfsc) {
>   if (dfsc.isClientRunning()) {
>     if (!isRunning() || isRenewerExpired()) {
>       // start a new daemon with a new id.
>       final int id = ++currentId;
>       daemon = new Daemon(new Runnable() {
>         @Override
>         public void run() {
>           try {
>             if (LOG.isDebugEnabled()) {
>               LOG.debug("Lease renewer daemon for " + clientsString()
>                   + " with renew id " + id + " started");
>             }
>             LeaseRenewer.this.run(id);
>           } catch (InterruptedException e) {
>             if (LOG.isDebugEnabled()) {
>               LOG.debug(LeaseRenewer.this.getClass().getSimpleName()
>                   + " is interrupted.", e);
>             }
>           } finally {
>             synchronized (LeaseRenewer.this) {
>               Factory.INSTANCE.remove(LeaseRenewer.this);
>             }
>             if (LOG.isDebugEnabled()) {
>               // this is the line that shows up in my log
>               LOG.debug("Lease renewer daemon for " + clientsString()
>                   + " with renew id " + id + " exited");
>             }
>           }
>         }
>
>         @Override
>         public String toString() {
>           return String.valueOf(LeaseRenewer.this);
>         }
>       });
>       daemon.start();
>     }
>     dfsc.putFileBeingWritten(src, out);
>     emptyTime = Long.MAX_VALUE;
>   }
> }
>
>
> In the same worker, a little earlier, the worker had created the HDFS daemon successfully:
>
> 1970,writeDone=true,writeSuccess=true).  Waiting on 77 requests
> *16/08/30 00:48:26 DEBUG hdfs.LeaseRenewer: Lease renewer daemon for [] with renew id 1 executed*
> 16/08/30 00:48:26 DEBUG handler.RequestEncoder: write: Client 4, requestId 534, size = 524891, SEND_WORKER_MESSAGES_REQUEST took 15943119 ns
> 16/08/30 00:48:26 DEBUG netty.InboundByteCounter: channelRead: [id: 0x74b0ba68, /172.31.45.214:56039 => /172.31.45.213:30004] buffer size = 262144, total bytes = 720996472
> 16/08/30 00:48:26 DEBUG netty.OutboundByteCounter: write: [id: 0x9db0a2f6, /172.31.45.213:44192 => ip-172-31-45-213.ec2.internal/172.31.45.213:30000] buffer size = 524891, total bytes = 555502090
> 16/08/30 00:48:26 DEBUG netty.InboundByteCounter: channelRead: [id: 0x74b0ba68, /172.31.45.214:56039 => /172.31.45.213:30004] buffer size = 262144, total bytes = 721258616
> 16/08/30 00:48:26 DEBUG netty.InboundByteCounter: channelRead: [id: 0x74b0ba68, /172.31.45.214:56039 => /172.31.45.213:30004] buffer size = 262144, total bytes = 721520760
> 16/08/30 00:48:26 DEBUG netty.InboundByteCounter: channelRead: [id: 0x74b0ba68, /172.31.45.214:56039 => /172.31.45.213:30004] buffer size = 262144, total bytes = 721782904
> 16/08/30 00:48:26 DEBUG netty.InboundByteCounter: channelRead: [id: 0xb5d6ea70, /172.31.45.213:46826 => /172.31.45.213:30004] buffer size = 66511, total bytes = 721849415
> 16/08/30 00:48:26 DEBUG handler.RequestDecoder: decode: Client 0, requestId 493, SEND_WORKER_MESSAGES_REQUEST, with size 524888 took 581852 ns
> 16/08/30 00:48:26 DEBUG handler.RequestServerHandler: messageReceived: Processing client 0, requestId 493, SEND_WORKER_MESSAGES_REQUEST took 31752 ns
> 16/08/30 00:48:26 DEBUG netty.OutboundByteCounter: write: [id: 0xb5d6ea70, /172.31.45.213:46826 => /172.31.45.213:30004] buffer size = 13, total bytes = 18876
> 16/08/30 00:48:26 DEBUG netty.InboundByteCounter: channelRead: [id: 0x9db0a2f6, /172.31.45.213:44192 => ip-172-31-45-213.ec2.internal/172.31.45.213:30000] buffer size = 13, total bytes = 13013
> 16/08/30 00:48:26 DEBUG handler.ResponseClientHandler: messageReceived: Completed (taskId = 0)(reqId=459,destAddr=ip-172-31-45-213.ec2.internal:30000,elapsedNanos=1481591427,started=Thu Jan 01 01:49:56 UTC 1970,writeDone=true,writeSuccess=true).  Waiting on 77 requests
> 16/08/30 00:48:26 DEBUG netty.InboundByteCounter: channelRead: [id: 0x74b0ba68, /172.31.45.214:56039 => /172.31.45.213:30004] buffer size = 262144, total bytes = 722111559
> 16/08/30 00:48:26 DEBUG netty.InboundByteCounter: channelRead: [id: 0x74b0ba68, /172.31.45.214:56039 => /172.31.45.213:30004] buffer size = 262144, total bytes = 722373703
> 16/08/30 00:48:26 DEBUG netty.InboundByteCounter: channelRead: [id: 0x74b0ba68, /172.31.45.214:56039 => /172.31.45.213:30004] buffer size = 262144, total bytes = 722635847
> 16/08/30 00:48:26 DEBUG netty.InboundByteCounter: channelRead: [id: 0x74b0ba68, /172.31.45.214:56039 => /172.31.45.213:30004] buffer size = 252147, total bytes = 722887994
>
>
> Any help will be greatly appreciated!!
>
> The only reference to this error that I could find is on the Spark user list, and it confuses me a lot :D
>
> Bye!
>
> Jose
>
>
>
>
>
