hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: export snapshot fail sometime due to LeaseExpiredException
Date Wed, 30 Apr 2014 20:25:09 GMT
Tianying:
Have you checked the audit log on the namenode for a deletion event
corresponding to the files involved in the LeaseExpiredException?
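
For example, something like this (a rough sketch; the log location below is
an assumption and the entry format is the stock hdfs-audit.log layout) would
show whether the namenode logged a delete for that path:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Sketch: scan an HDFS audit log for delete events touching one path.
public class AuditLogScan {
  public static void main(String[] args) throws IOException {
    String auditLog = "/var/log/hadoop-hdfs/hdfs-audit.log"; // assumed location
    String target = "/hbase/.archive/rich_pin_data_v1";      // path fragment
    BufferedReader r = new BufferedReader(new FileReader(auditLog));
    String line;
    while ((line = r.readLine()) != null) {
      // Stock audit entries look like: ... cmd=delete  src=/path  dst=null ...
      if (line.contains("cmd=delete") && line.contains(target)) {
        System.out.println(line);
      }
    }
    r.close();
  }
}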

Cheers


On Wed, Apr 30, 2014 at 10:44 AM, Tianying Chang <tychang@gmail.com> wrote:

> This time the re-run passed (although with many failed/retried tasks) with
> my throttle bandwidth set to 200M (although per iftop, it never got close
> to that number). Is there a way to increase the lease expiry time for a
> low throttle bandwidth on an individual export job?
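>
> (For reference, a minimal sketch of how such a throttled export can be
> driven; the snapshot name, target URI, and numbers below are placeholders,
> with -bandwidth being the HBASE-11083 flag in MB/s:)
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.snapshot.ExportSnapshot;
> import org.apache.hadoop.util.ToolRunner;
>
> // Sketch: ExportSnapshot implements Tool, so it can be launched from
> // code with the same flags as the command line.
> public class ExportWithThrottle {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = HBaseConfiguration.create();
>     int rc = ToolRunner.run(conf, new ExportSnapshot(), new String[] {
>         "-snapshot", "rich_pin_data_v1_snap",      // placeholder name
>         "-copy-to", "hdfs://backup-nn:8020/hbase", // placeholder target
>         "-mappers", "16",
>         "-bandwidth", "200"                        // MB/s cap (HBASE-11083)
>     });
>     System.exit(rc);
>   }
> }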
>
> Thanks
> Tian-Ying
>
>
>
> On Wed, Apr 30, 2014 at 10:17 AM, Tianying Chang <tychang@gmail.com>
> wrote:
>
> > Yes, I am using the bandwidth throttle feature. The export job for this
> > table actually succeeded on its first run. When I rerun it (for my
> > robustness testing) it never seems to pass. I am wondering if it has some
> > weird state (I did clean up the target cluster and even removed the
> > /hbase/.archive/rich_pin_data_v1 folder).
> >
> > It seems that even if I set the throttle value really large, it still
> > fails. And I think even after I replace the jar with the one without
> > throttling, the re-run still fails.
> >
> > Is there some way that I can increase the lease to be very large, to test
> > it out?
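> >
> > (From a quick look at the HDFS source, the lease limits appear to be
> > compile-time constants rather than configuration keys, so raising them
> > would mean patching and rebuilding. A sketch that just prints them,
> > assuming the constants sit in HdfsConstants as in stock Hadoop 2.x:)
> >
> > import org.apache.hadoop.hdfs.protocol.HdfsConstants;
> >
> > // Sketch: print the namenode lease limits in ms. In stock Hadoop 2.x
> > // the soft limit is 60s and the hard limit is 1h.
> > public class LeaseLimits {
> >   public static void main(String[] args) {
> >     System.out.println("soft: " + HdfsConstants.LEASE_SOFTLIMIT_PERIOD);
> >     System.out.println("hard: " + HdfsConstants.LEASE_HARDLIMIT_PERIOD);
> >   }
> > }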
> >
> >
> >
> > On Wed, Apr 30, 2014 at 10:02 AM, Matteo Bertozzi <
> theo.bertozzi@gmail.com
> > > wrote:
> >
> >> The file is the file being exported, so you are the one creating that
> >> file. Do you have the bandwidth throttle on?
> >>
> >> I'm thinking that the file is being written slowly, e.g. write(few
> >> bytes), wait, write(few bytes), and during the wait your lease expires.
> >> Something like that can happen if your MR job is stuck in some way (slow
> >> machine or similar) and is not writing within the lease timeout.
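> >>
> >> A minimal sketch of that write pattern (path and sizes are made up;
> >> note the DFS client normally renews the lease from a background thread,
> >> so a pause like this only hurts if the whole JVM stalls, e.g. long GC):
> >>
> >> import org.apache.hadoop.conf.Configuration;
> >> import org.apache.hadoop.fs.FSDataOutputStream;
> >> import org.apache.hadoop.fs.FileSystem;
> >> import org.apache.hadoop.fs.Path;
> >>
> >> // Sketch: tiny writes separated by long waits, as described above.
> >> public class TrickleWriter {
> >>   public static void main(String[] args) throws Exception {
> >>     FileSystem fs = FileSystem.get(new Configuration());
> >>     FSDataOutputStream out = fs.create(new Path("/tmp/trickle-test"));
> >>     for (int i = 0; i < 100; i++) {
> >>       out.write(new byte[64]); // write(few bytes)
> >>       Thread.sleep(60000L);    // wait
> >>     }
> >>     out.close();
> >>   }
> >> }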
> >>
> >> Matteo
> >>
> >>
> >>
> >> On Wed, Apr 30, 2014 at 9:53 AM, Tianying Chang <tychang@gmail.com>
> >> wrote:
> >>
> >> > We are using Hadoop 2.0.0-cdh4.2.0 and HBase 0.94.7. We also
> >> > backported several snapshot-related JIRAs, e.g. HBASE-10111 (verify
> >> > snapshot) and HBASE-11083 (bandwidth throttle in exportSnapshot).
> >> >
> >> > I found that when the LeaseExpiredException was first reported, the
> >> > file indeed was not there, and the map task retried. And I verified a
> >> > couple of minutes later that the HFile does exist under /.archive. But
> >> > the retried map task still complains with the same error that the file
> >> > does not exist...
> >> >
> >> > I will check the namenode log for the LeaseExpiredException.
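> >> >
> >> > (For the record, an existence check through the FileSystem API looks
> >> > roughly like this; the path is shortened here, the real file is the
> >> > one in the trace below:)
> >> >
> >> > import org.apache.hadoop.conf.Configuration;
> >> > import org.apache.hadoop.fs.FileSystem;
> >> > import org.apache.hadoop.fs.Path;
> >> >
> >> > // Sketch: ask the namenode whether the archived HFile path exists.
> >> > public class CheckArchived {
> >> >   public static void main(String[] args) throws Exception {
> >> >     FileSystem fs = FileSystem.get(new Configuration());
> >> >     Path p = new Path("/hbase/.archive/rich_pin_data_v1");
> >> >     System.out.println(p + " exists? " + fs.exists(p));
> >> >   }
> >> > }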
> >> >
> >> >
> >> > Thanks
> >> >
> >> > Tian-Ying
> >> >
> >> >
> >> > On Wed, Apr 30, 2014 at 9:33 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >> >
> >> > > Can you give us the hbase and hadoop releases you're using?
> >> > >
> >> > > Can you check the namenode log around the time the
> >> > > LeaseExpiredException was encountered?
> >> > >
> >> > > Cheers
> >> > >
> >> > >
> >> > > On Wed, Apr 30, 2014 at 9:20 AM, Tianying Chang <tychang@gmail.com>
> >> > wrote:
> >> > >
> >> > > > Hi,
> >> > > >
> >> > > > When I export a large table with 460+ regions, I see the
> >> > > > exportSnapshot job fail sometimes (not all the time). The error
> >> > > > from the map task is below, but I verified the file highlighted
> >> > > > below, and it does exist. Smaller tables always seem to pass. Any
> >> > > > idea? Is it because the table is too big and the job gets a
> >> > > > session timeout?
> >> > > >
> >> > > > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> >> > > > No lease on /hbase/.archive/rich_pin_data_v1/7713d5331180cb610834ba1c4ebbb9b3/d/eef3642f49244547bb6606d4d0f15f1f
> >> > > > File does not exist. Holder DFSClient_NONMAPREDUCE_279781617_1 does
> >> > > > not have any open files.
> >> > > >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2396)
> >> > > >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2387)
> >> > > >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2183)
> >> > > >         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:481)
> >> > > >         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
> >> > > >         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
> >> > > >         at org.apache.hadoop.ipc.ProtobufR
> >> > > >
> >> > > >
> >> > > >
> >> > > > Thanks
> >> > > >
> >> > > > Tian-Ying
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
