Subject: Re: HBase issues since upgrade from 0.92.4 to 0.94.6
From: Azuryy Yu
To: user@hbase.apache.org
Date: Fri, 12 Jul 2013 19:41:55 +0800

David,

java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)

For this error: generally the client is still asking for bytes from the
stream, but the server side has been shut down, so the cause may be a
network issue, a JVM crash, or something else. I don't think this is
related to the HBase upgrade.

On Fri, Jul 12, 2013 at 7:32 PM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> You might want to run memtest as well, just to be sure there is no memory
> issue. It should not be that, since it was working fine with 0.92.4, but
> checking costs nothing...
>
> Also, the last version of Java 6 is update 45. It might be worth a try if
> you are running 1.6.
>
> 2013/7/12 Asaf Mesika
>
> > You need to look for the JVM crash in the .out log file and see if it is
> > the native Hadoop .so code that is causing the problem. In our case we
> > downgraded from JVM 1.6.0_37 to _33 and it solved the issue.
> >
> > On Friday, July 12, 2013, David Koch wrote:
> >
> > > Hello,
> > >
> > > NOTE: I posted the same message in the Cloudera group.
> > >
> > > Since upgrading from CDH 4.0.1 (HBase 0.92.4) to 4.3.0 (HBase 0.94.6)
> > > we systematically experience problems with region servers crashing
> > > silently under workloads which used to pass without problems. More
> > > specifically, we run about 30 mapper jobs in parallel which read from
> > > HDFS and insert into HBase.
> > >
> > > region server log
> > > NOTE: no trace of the crash, but the server is down and shows up as
> > > such in Cloudera Manager.
> > >
> > > 2013-07-12 10:22:12,050 WARN org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: File hdfs://XXXXXXX:8020/hbase/.logs/XXXXXXX,60020,1373616547696-splitting/XXXXXXX%2C60020%2C1373616547696.1373617004286 might be still open, length is 0
> > > 2013-07-12 10:22:12,051 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recovering file hdfs://XXXXXXX:8020/hbase/.logs/XXXXXXX,60020,1373616547696-splitting/XXXXXXX%2C60020%2C1373616547696.1373617004286
> > > 2013-07-12 10:22:13,064 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Finished lease recover attempt for hdfs://XXXXXXX:8020/hbase/.logs/XXXXXXX,60020,1373616547696-splitting/XXXXXXX%2C60020%2C1373616547696.1373617004286
> > > 2013-07-12 10:22:14,819 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
> > > 2013-07-12 10:22:14,824 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
> > > ...
> > > 2013-07-12 10:22:14,850 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
> > > 2013-07-12 10:22:15,530 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
> > > < -- last log entry, region server is down here -- >
> > >
> > > datanode log, same machine
> > >
> > > 2013-07-12 10:22:04,811 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: XXXXXXX:50010:DataXceiver error processing WRITE_BLOCK operation src: /YYY.YY.YYY.YY:36024 dest: /XXX.XX.XXX.XX:50010
> > > java.io.IOException: Premature EOF from inputStream
> > >         at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
> > >         at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
> > >         at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
> > >         at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
> > >         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414)
> > >         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:635)
> > >         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:564)
> > >         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:103)
> > >         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:67)
> > >         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
> > >         at java.lang.Thread.run(Thread.java:724)
> > > < -- many repetitions of this -- >
> > >
> > > What could have caused this difference in stability?
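Replying inline here: the "Premature EOF from inputStream" in the datanode
trace above is the generic symptom of the peer closing the connection in
the middle of a packet. Any readFully-style call that expects N bytes
throws as soon as the stream ends early; if the region server JVM died,
its open write pipelines would end on the datanode side looking just like
this. A minimal sketch of the same failure mode with plain java.io (an
illustration only, not Hadoop's actual IOUtils code):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;

public class PrematureEof {
    public static void main(String[] args) throws Exception {
        // The receiver expects an 8-byte "packet", but the stream ends
        // after 3 bytes, as if the writer's JVM died mid-transfer.
        byte[] truncated = {1, 2, 3};
        DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(truncated));
        byte[] packet = new byte[8];
        try {
            in.readFully(packet); // needs 8 bytes, only 3 are available
            System.out.println("read ok");
        } catch (EOFException e) {
            // Hadoop's IOUtils.readFully reports this same condition as
            // "Premature EOF from inputStream".
            System.out.println("premature EOF");
        }
    }
}
```

So the datanode error by itself does not implicate HBase 0.94; it only
tells you the writer went away mid-block.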
> > >
> > > We did not change any configuration settings with respect to the
> > > previous CDH 4.0.1 setup. In particular, we left ulimit and
> > > dfs.datanode.max.xcievers at 32k. If need be, I can provide more
> > > complete log/configuration information.
> > >
> > > Thank you,
> > >
> > > /David
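A note on the xceiver setting mentioned above: in hdfs-site.xml it would
look like the sketch below (assuming "32k" means 32768; the property name
really is spelled "xcievers" in this Hadoop line):

```xml
<!-- hdfs-site.xml sketch: the setting mentioned in the message, left at
     32k. The name carries Hadoop's historical misspelling "xcievers". -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>32768</value>
</property>
```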