hadoop-mapreduce-user mailing list archives

From Vimal Jain <vkj...@gmail.com>
Subject Re: High Full GC count for Region server
Date Thu, 24 Oct 2013 05:49:54 GMT
Hi Ted/Jean,
Could you please help with this?


On Tue, Oct 22, 2013 at 10:29 PM, Vimal Jain <vkjk89@gmail.com> wrote:

> Hi Ted,
> Yes, I checked the namenode and datanode logs and found the exceptions
> below in both:
>
> Name node :-
> java.io.IOException: File
> /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
> could only be replicated to 0 nodes, instead of 1
>
> java.io.IOException: Got blockReceived message from unregistered or dead
> node blk_-2949905629769882833_52274
>
> Data node :-
> 480000 millis timeout while waiting for channel to be ready for write. ch
> : java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
>  remote=/192.168.20.30:36188]
>
> ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(192.168.20.30:50010,
> storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075,
> ipcPort=50020):DataXceiver
>
> java.io.EOFException: while trying to read 39309 bytes
>
>
> On Tue, Oct 22, 2013 at 10:19 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> bq. java.io.IOException: File
>> /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
>> could only be replicated to 0 nodes, instead of 1
>>
>> Have you checked the Namenode / Datanode logs?
>> It looks like HDFS was not stable.
>>
>>
>> On Tue, Oct 22, 2013 at 9:01 AM, Vimal Jain <vkjk89@gmail.com> wrote:
>>
>> > Hi Jean,
>> > Thanks for your reply.
>> > I have 8 GB of memory in total, distributed as follows:
>> >
>> > Region server - 2 GB
>> > Master, Namenode, Datanode, Secondary Namenode, ZooKeeper - 1 GB
>> > OS - 1 GB
>> >
>> > Please let me know if you need more information.
>> >
>> >
>> > On Tue, Oct 22, 2013 at 8:15 PM, Jean-Marc Spaggiari <
>> > jean-marc@spaggiari.org> wrote:
>> >
>> > > Hi Vimal,
>> > >
>> > > What are your settings? Memory of the host, and memory allocated for
>> > > the different HBase services?
>> > >
>> > > Thanks,
>> > >
>> > > JM
>> > >
>> > >
>> > > 2013/10/22 Vimal Jain <vkjk89@gmail.com>
>> > >
>> > > > Hi,
>> > > > I am running HBase in pseudo-distributed mode (Hadoop version 1.1.2,
>> > > > HBase version 0.94.7).
>> > > > I am getting a few exceptions in both the Hadoop (namenode, datanode)
>> > > > logs and the HBase (region server) logs.
>> > > > When I searched for these exceptions on Google, I concluded that the
>> > > > problem is mainly due to a large number of full GCs in the region
>> > > > server process.
>> > > >
>> > > > I used jstat and found a total of 950 full GCs over a span of 4 days
>> > > > for the region server process. Is this OK?
>> > > >
>> > > > I am totally confused by the number of exceptions I am getting.
>> > > > I also get the exceptions below intermittently.
>> > > >
>> > > >
>> > > > Region server:-
>> > > >
>> > > > 2013-10-22 12:00:26,627 WARN org.apache.hadoop.ipc.HBaseServer:
>> > > > (responseTooSlow):
>> > > > {"processingtimems":15312,"call":"next(-6681408251916104762, 1000), rpc version=1, client version=29, methodsFingerPrint=-1368823753","client":"192.168.20.31:48270","starttimems":1382423411293,"queuetimems":0,"class":"HRegionServer","responsesize":4808556,"method":"next"}
>> > > > 2013-10-22 12:06:17,606 WARN org.apache.hadoop.ipc.HBaseServer:
>> > > > (operationTooSlow):
>> > > > {"processingtimems":14759,"client":"192.168.20.31:48247","timeRange":[0,9223372036854775807],"starttimems":1382423762845,"responsesize":61,"class":"HRegionServer","table":"event_data","cacheBlocks":true,"families":{"ginfo":["netGainPool"]},"row":"1629657","queuetimems":0,"method":"get","totalColumns":1,"maxVersions":1}
>> > > >
>> > > > 2013-10-18 10:37:45,008 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer
>> > > > Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
>> > > > /hbase/event_data/4c3765c51911d6c67037a983d205a010/.tmp/bfaf8df33d5b4068825e3664d3e4b2b0
>> > > > could only be replicated to 0 nodes, instead of 1
>> > > >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
>> > > >
>> > > > Name node :-
>> > > > java.io.IOException: File
>> > > > /hbase/event_data/433b61f2a4ebff8f2e4b89890508a3b7/.tmp/99797a61a8f7471cb6df8f7b95f18e9e
>> > > > could only be replicated to 0 nodes, instead of 1
>> > > >
>> > > > java.io.IOException: Got blockReceived message from unregistered or dead
>> > > > node blk_-2949905629769882833_52274
>> > > >
>> > > > Data node :-
>> > > > 480000 millis timeout while waiting for channel to be ready for write. ch :
>> > > > java.nio.channels.SocketChannel[connected local=/192.168.20.30:50010
>> > > > remote=/192.168.20.30:36188]
>> > > >
>> > > > ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>> > > > DatanodeRegistration(192.168.20.30:50010,
>> > > > storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075,
>> > > > ipcPort=50020):DataXceiver
>> > > > java.io.EOFException: while trying to read 39309 bytes
>> > > >
>> > > >
>> > > > --
>> > > > Thanks and Regards,
>> > > > Vimal Jain
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > Thanks and Regards,
>> > Vimal Jain
>> >
>>
>
>
>
> --
> Thanks and Regards,
> Vimal Jain
>



-- 
Thanks and Regards,
Vimal Jain
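
For scale, the figure quoted in the thread (950 full GCs over 4 days for the region server) can be turned into an average rate. The sketch below is illustrative arithmetic only: the function name is hypothetical, and it assumes the 950 is a cumulative full GC count read from jstat and that the 4 days are a full 96 hours.

```python
def full_gc_stats(full_gc_count, days):
    """Return (average seconds between full GCs, full GCs per hour).

    Hypothetical helper: both values are simple averages over the
    whole span and say nothing about how long each pause lasted.
    """
    if full_gc_count <= 0 or days <= 0:
        raise ValueError("full_gc_count and days must be positive")
    span_hours = days * 24
    interval_seconds = span_hours * 3600 / full_gc_count
    per_hour = full_gc_count / span_hours
    return interval_seconds, per_hour


# Figures taken from the thread: 950 full GCs in 4 days.
interval, per_hour = full_gc_stats(950, 4)
print(f"about one full GC every {interval / 60:.1f} minutes "
      f"({per_hour:.1f} per hour)")
```

Under these assumptions the average works out to roughly one full GC every six minutes. Whether that is a problem depends on how long each collection pauses the process, which the count alone does not show.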
