kudu-user mailing list archives

From Lee King <yuyunliu...@gmail.com>
Subject The service queue is full; it has 400 items.. Retrying in the next heartbeat period.
Date Sat, 04 Nov 2017 02:47:25 GMT
Hi,
    Our Kudu cluster ran well for a long time, but writes have become slow
recently, and clients are also getting RPC timeouts. I checked the warning
logs and found a large number of errors like this:
W1104 10:25:16.833736 10271 consensus_peers.cc:365] T
149ffa58ac274c9ba8385ccfdc01ea14 P 59c768eb799243678ee7fa3f83801316 -> Peer
1c67a7e7ff8f4de494469766641fccd1 (cloud-sk-ds-08:7050): Couldn't send
request to peer 1c67a7e7ff8f4de494469766641fccd1 for tablet
149ffa58ac274c9ba8385ccfdc01ea14. Status: Timed out: UpdateConsensus RPC to
10.6.60.9:7050 timed out after 1.000s (SENT). Retrying in the next
heartbeat period. Already tried 5 times.
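
    To see whether these timeouts concentrate on one peer or are spread
across the cluster, a small script along the following lines could tally
them. This is only a minimal sketch, assuming each warning sits on a single
line in the log file and follows the format quoted above; the function name
count_timeouts_by_peer and the regular expression are my own illustration,
not part of Kudu.

```python
import re
from collections import Counter

# Assumed format (taken from the warning quoted above):
#   "T <tablet id> P <local peer id> -> Peer <remote peer id> (<host:port>): ...
#    UpdateConsensus RPC to <ip:port> timed out ..."
PEER_TIMEOUT = re.compile(
    r"T\s+(?P<tablet>[0-9a-f]+)\s+P\s+(?P<local>[0-9a-f]+)\s+->\s+Peer\s+"
    r"(?P<peer>[0-9a-f]+)\s+\((?P<addr>[^)]+)\)"
    r".*?UpdateConsensus RPC to\s+\S+\s+timed out"
)


def count_timeouts_by_peer(lines):
    """Count UpdateConsensus timeout warnings per remote peer address."""
    counts = Counter()
    for line in lines:
        match = PEER_TIMEOUT.search(line)
        if match:
            counts[match.group("addr")] += 1
    return counts
```

    Feeding it the lines of a tablet server's warning log and printing
counts.most_common() would show whether a single tablet server, such as
cloud-sk-ds-08:7050, accounts for most of the timeouts.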
    I changed the configuration to
rpc_service_queue_length=400, rpc_num_service_threads=40, but it
had no effect.
    Our cluster has 5 masters and 10 tablet servers, 3800 GB of data, and
800 tablets per tablet server. I checked one of the tablet server machines:
14 GB of memory free (128 GB in total), 4739 threads (max 32000), 28000 open
files (max 65536), CPU utilization about 30% (32 cores), and disk
utilization below 30%.
    Any suggestions? Thanks!
