zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From tobe <tobeg3oo...@gmail.com>
Subject Re: Too many "Broken pipe" in ZooKeeper log
Date Wed, 17 Sep 2014 08:55:42 GMT
@patrick Thanks for your concern and I feel sorry about this. The reason of
my problem is nothing about ZooKeeper. It's TaoKeeper which is used to
monitor ZooKeeper and send these commands periodically. Large number of
watches(more than 700K) may cause Broken Pipe when using wchs to get the
detailed status of znodes.

The large number of znodes are from HBase. We will store all the files for
replication in ZooKeeper and it will be large when the replication is out
of date. And I'm wondering if the HBase cluster scales to thousands of
servers, can ZooKeeper handle all the meta data of HBase?


On Wed, Sep 17, 2014 at 2:47 PM, tobe <tobeg3oogle@gmail.com> wrote:

> @patrick The ports of these commands are sequential. Here're 48668, 48669,
> 48670. So is it the periodically check of ZooKeeper? We mainly use
> ZooKeeper for HDFS and HBase. Other processes in this server will not sent
> 4lw to ZooKeeper.
>
>
> On Wed, Sep 17, 2014 at 2:19 PM, tobe <tobeg3oogle@gmail.com> wrote:
>
>>
>> I have found that the command is called from the same host and run
>> `stat`, `wchs`, `wchc` and `cons` as well. Here'is log from another
>> ZooKeeper cluster. It works well because it doesn't have a lot of znodes.
>> But I wonder why it call the commands frequently? We're using
>> ZooKeeper-3.4.4 and any patch about this?
>>
>> Log from 10.0.4.161 ZooKeeper:
>> 2014-09-17,08:26:38,336 INFO org.apache.zookeeper.server.NIOServerCnxn:
>> [myid:0] Processing stat command from /10.0.4.161:48668
>> 2014-09-17,08:26:38,336 INFO org.apache.zookeeper.server.NIOServerCnxn:
>> [myid:0] Stat command output
>> 2014-09-17,08:26:38,337 INFO org.apache.zookeeper.server.NIOServerCnxn:
>> [myid:0] Closed socket connection for client /10.0.4.161:48668 (no
>> session established for client)
>> 2014-09-17,08:26:38,398 INFO
>> org.apache.zookeeper.server.NIOServerCnxnFactory: [myid:0] Accepted socket
>> connection from /10.0.4.172:62393
>> 2014-09-17,08:26:38,398 INFO org.apache.zookeeper.server.ZooKeeperServer:
>> [myid:0] Client attempting to establish new session at /10.0.4.172:62393
>> 2014-09-17,08:26:38,399 INFO org.apache.zookeeper.server.ZooKeeperServer:
>> [myid:0] Established session 0x4833538b30989d with negotiated timeout 30000
>> for client /10.0.4.172:62393
>> 2014-09-17,08:26:38,400 INFO
>> org.apache.zookeeper.server.auth.SaslServerCallbackHandler: [myid:0]
>> Successfully authenticated client:
>> authenticationID=hbase_srv/hadoop@XIAOMI.HADOOP;
>> authorizationID=hbase_srv/hadoop@XIAOMI.HADOOP.
>> 2014-09-17,08:26:38,400 INFO
>> org.apache.zookeeper.server.auth.SaslServerCallbackHandler: [myid:0]
>> Setting authorizedID: hbase_srv
>> 2014-09-17,08:26:38,400 INFO org.apache.zookeeper.server.ZooKeeperServer:
>> [myid:0] adding SASL authorization for authorizationID: hbase_srv
>> 2014-09-17,08:26:38,403 INFO org.apache.zookeeper.server.NIOServerCnxn:
>> [myid:0] Closed socket connection for client /10.0.4.172:62393 which had
>> sessionid 0x4833538b30989d
>> 2014-09-17,08:26:38,526 INFO
>> org.apache.zookeeper.server.NIOServerCnxnFactory: [myid:0] Accepted socket
>> connection from /10.0.4.161:48669
>> 2014-09-17,08:26:38,526 INFO org.apache.zookeeper.server.NIOServerCnxn:
>> [myid:0] Processing wchs command from /10.0.4.161:48669
>> 2014-09-17,08:26:38,527 INFO org.apache.zookeeper.server.NIOServerCnxn:
>> [myid:0] Closed socket connection for client /10.0.4.161:48669 (no
>> session established for client)
>> 2014-09-17,08:26:38,587 INFO
>> org.apache.zookeeper.server.NIOServerCnxnFactory: [myid:0] Accepted socket
>> connection from /10.0.4.171:50026
>> 2014-09-17,08:26:38,588 INFO org.apache.zookeeper.server.ZooKeeperServer:
>> [myid:0] Client attempting to establish new session at /10.0.4.171:50026
>> 2014-09-17,08:26:38,588 INFO org.apache.zookeeper.server.ZooKeeperServer:
>> [myid:0] Established session 0x4833538b30989e with negotiated timeout 30000
>> for client /10.0.4.171:50026
>> 2014-09-17,08:26:38,589 INFO
>> org.apache.zookeeper.server.auth.SaslServerCallbackHandler: [myid:0]
>> Successfully authenticated client:
>> authenticationID=hbase_srv/hadoop@XIAOMI.HADOOP;
>> authorizationID=hbase_srv/hadoop@XIAOMI.HADOOP.
>> 2014-09-17,08:26:38,589 INFO
>> org.apache.zookeeper.server.auth.SaslServerCallbackHandler: [myid:0]
>> Setting authorizedID: hbase_srv
>> 2014-09-17,08:26:38,589 INFO org.apache.zookeeper.server.ZooKeeperServer:
>> [myid:0] adding SASL authorization for authorizationID: hbase_srv
>> 2014-09-17,08:26:38,592 INFO org.apache.zookeeper.server.NIOServerCnxn:
>> [myid:0] Closed socket connection for client /10.0.4.171:50026 which had
>> sessionid 0x4833538b30989e
>> 2014-09-17,08:26:38,613 INFO
>> org.apache.zookeeper.server.NIOServerCnxnFactory: [myid:0] Accepted socket
>> connection from /10.0.4.172:62397
>> 2014-09-17,08:26:38,614 INFO org.apache.zookeeper.server.ZooKeeperServer:
>> [myid:0] Client attempting to establish new session at /10.0.4.172:62397
>> 2014-09-17,08:26:38,614 INFO org.apache.zookeeper.server.ZooKeeperServer:
>> [myid:0] Established session 0x4833538b30989f with negotiated timeout 30000
>> for client /10.0.4.172:62397
>> 2014-09-17,08:26:38,615 INFO
>> org.apache.zookeeper.server.auth.SaslServerCallbackHandler: [myid:0]
>> Successfully authenticated client:
>> authenticationID=hbase_srv/hadoop@XIAOMI.HADOOP;
>> authorizationID=hbase_srv/hadoop@XIAOMI.HADOOP.
>> 2014-09-17,08:26:38,615 INFO
>> org.apache.zookeeper.server.auth.SaslServerCallbackHandler: [myid:0]
>> Setting authorizedID: hbase_srv
>> 2014-09-17,08:26:38,615 INFO org.apache.zookeeper.server.ZooKeeperServer:
>> [myid:0] adding SASL authorization for authorizationID: hbase_srv
>> 2014-09-17,08:26:38,618 INFO org.apache.zookeeper.server.NIOServerCnxn:
>> [myid:0] Closed socket connection for client /10.0.4.172:62397 which had
>> sessionid 0x4833538b30989f
>> 2014-09-17,08:26:38,707 INFO
>> org.apache.zookeeper.server.NIOServerCnxnFactory: [myid:0] Accepted socket
>> connection from /10.0.4.161:48670
>> 2014-09-17,08:26:38,708 INFO org.apache.zookeeper.server.NIOServerCnxn:
>> [myid:0] Processing wchc command from /10.0.4.161:48670
>>
>>
>> On Wed, Sep 17, 2014 at 1:00 PM, Patrick Hunt <phunt@apache.org> wrote:
>>
>>> Looks like a client is calling the "dump" 4lw and not waiting for the
>>> results before closing the socket. Try tracking down where that
>>> command is called from. The logs should have something like this in
>>> it:
>>>
>>> 2014-09-16 21:59:11,407 [myid:1] - INFO
>>> [NIOWorkerThread-2:NIOServerCnxn@835] - Processing dump command from
>>> /127.0.0.1:51740
>>>
>>> (note the ip address)
>>>
>>> Patrick
>>>
>>> On Tue, Sep 16, 2014 at 7:57 PM, tobe <tobeg3oogle@gmail.com> wrote:
>>> > Can any one help to explain why I get so many "Broken pipe IOE" in
>>> > ZooKeeper log?
>>> >
>>> > ZooKeeper throws this exception almost every minute. I don't think we
>>> use
>>> > the four letter command to dumpWatches so frequently. So what does this
>>> > mean?
>>> >
>>> > 2014-09-17,10:52:09,179 ERROR
>>> org.apache.zookeeper.server.NIOServerCnxn:
>>> > [myid:0] Error sending data synchronously
>>> > java.io.IOException: Broken pipe
>>> >         at sun.nio.ch.FileDispatcher.write0(Native Method)
>>> >         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
>>> >         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:69)
>>> >         at sun.nio.ch.IOUtil.write(IOUtil.java:40)
>>> >         at
>>> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:336)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.NIOServerCnxn.sendBufferSync(NIOServerCnxn.java:138)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.checkFlush(NIOServerCnxn.java:453)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.write(NIOServerCnxn.java:474)
>>> >         at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:111)
>>> >         at java.io.BufferedWriter.write(BufferedWriter.java:212)
>>> >         at java.io.PrintWriter.write(PrintWriter.java:412)
>>> >         at java.io.PrintWriter.write(PrintWriter.java:429)
>>> >         at java.io.PrintWriter.print(PrintWriter.java:559)
>>> >         at java.io.PrintWriter.println(PrintWriter.java:695)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.WatchManager.dumpWatches(WatchManager.java:166)
>>> >         at
>>> > org.apache.zookeeper.server.DataTree.dumpWatches(DataTree.java:1240)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.NIOServerCnxn$WatchCommand.commandRun(NIOServerCnxn.java:722)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.NIOServerCnxn$CommandThread.run(NIOServerCnxn.java:496)
>>> > 2014-09-17,10:52:09,179 INFO org.apache.zookeeper.server.NIOServerCnxn:
>>> > [myid:0] Closed socket connection for client /10.20.201.234:53756
>>> which had
>>> > sessionid 0x34840357f664081
>>> > 2014-09-17,10:52:09,179 ERROR
>>> org.apache.zookeeper.server.NIOServerCnxn:
>>> > [myid:0] Error sending data synchronously
>>> > java.io.IOException: Broken pipe
>>> >         at sun.nio.ch.FileDispatcher.write0(Native Method)
>>> >         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
>>> >         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:69)
>>> >         at sun.nio.ch.IOUtil.write(IOUtil.java:40)
>>> >         at
>>> sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:336)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.NIOServerCnxn.sendBufferSync(NIOServerCnxn.java:138)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.checkFlush(NIOServerCnxn.java:453)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.write(NIOServerCnxn.java:474)
>>> >         at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:111)
>>> >         at java.io.BufferedWriter.flush(BufferedWriter.java:235)
>>> >         at java.io.PrintWriter.flush(PrintWriter.java:276)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.NIOServerCnxn.cleanupWriterSocket(NIOServerCnxn.java:424)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.NIOServerCnxn.access$000(NIOServerCnxn.java:60)
>>> >         at
>>> >
>>> org.apache.zookeeper.server.NIOServerCnxn$CommandThread.run(NIOServerCnxn.java:500)
>>>
>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message