hbase-user mailing list archives

From Jack Levin <magn...@gmail.com>
Subject Re: Question about dead datanode
Date Thu, 13 Feb 2014 21:53:28 GMT
This might be related:

http://hadoop.6.n7.nabble.com/Question-on-opening-file-info-from-namenode-in-DFSClient-td6679.html

> In hbase, we open the file once and keep it open.  File is shared
> amongst all clients.
>

Does it mean it's permanently cached even if the datanode is dead?
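
For what it's worth, here is a minimal sketch of the behavior I suspect
(hypothetical names and simplified logic, not the actual DFSClient code):

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical sketch of the caching behavior I suspect (names invented,
    // logic simplified; not the real DFSClient):
    class BlockLocationCacheSketch {
        static class LocatedBlock {
            final long startOffset, length;
            final List<String> datanodes;  // may include hosts that have since died
            LocatedBlock(long startOffset, long length, List<String> datanodes) {
                this.startOffset = startOffset;
                this.length = length;
                this.datanodes = datanodes;
            }
        }

        // populated once when the file is opened, then reused for the
        // life of the stream
        private final List<LocatedBlock> locatedBlocks = new ArrayList<>();

        LocatedBlock getBlockAt(long offset) {
            // "search cached blocks first" -- a cache hit is returned as-is,
            // so nothing here ever notices that a cached datanode is dead
            for (LocatedBlock b : locatedBlocks) {
                if (offset >= b.startOffset && offset < b.startOffset + b.length) {
                    return b;
                }
            }
            // only a miss would go back to the namenode for fresh locations
            return null;
        }
    }

If a cache hit never re-checks the namenode, a location pointing at a dead
datanode would stay there for the life of the stream.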

-Jack


On Thu, Feb 13, 2014 at 1:41 PM, Jack Levin <magnito@gmail.com> wrote:

> As far as I can tell I am hitting this issue:
>
>
> http://grepcode.com/search/usages?type=method&id=repository.cloudera.com%24content%24repositories%24releases@com.cloudera.hadoop%24hadoop-core@0.20.2-320@org%24apache%24hadoop%24hdfs%24protocol@LocatedBlocks@findBlock%28long%29&k=u
>
>
>
> From DFSClient.java in 0.20.2-320 (lines 1581-1583, via the grepcode link
> above):
>
>     // search cached blocks first
>     int targetBlockIdx = locatedBlocks.findBlock(offset);
>     if (targetBlockIdx < 0) { // block is not cached
>
>
> Our RS DFSClient is asking for a block on a dead datanode because the block
> locations are cached in the DFSClient.  It seems that after a DN dies, the
> DFSClients in HBase 0.90.5 do not drop the cached references to those blocks.
> That seems like a problem.  It would be good if there were a way for that
> cache to expire, because our dead DN has been down since Sunday.
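>
> Something like a per-entry TTL on those cached locations is what I have in
> mind.  A rough sketch of the idea (hypothetical, not a patch against the
> real DFSClient; LocatedBlock here is just a stand-in type):
>
>     import java.util.Collections;
>     import java.util.List;
>
>     // Hypothetical TTL wrapper for the cached block locations (illustration
>     // only; LocatedBlock is a type parameter standing in for the HDFS class):
>     class ExpiringBlockCache<LocatedBlock> {
>         private static final long TTL_MS = 10 * 60 * 1000L;  // e.g. 10 minutes
>         private long fetchedAtMs = 0L;
>         private List<LocatedBlock> locatedBlocks;
>
>         List<LocatedBlock> get() {
>             long now = System.currentTimeMillis();
>             if (locatedBlocks == null || now - fetchedAtMs > TTL_MS) {
>                 // a refetch would drop entries still pointing at dead datanodes
>                 locatedBlocks = refetchFromNamenode();
>                 fetchedAtMs = now;
>             }
>             return locatedBlocks;
>         }
>
>         // stand-in for the real namenode RPC
>         private List<LocatedBlock> refetchFromNamenode() {
>             return Collections.emptyList();
>         }
>     }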
>
>
> -Jack
>
>
>
>
> On Thu, Feb 13, 2014 at 11:23 AM, Stack <stack@duboce.net> wrote:
>
>> The RS opens files and then keeps them open as long as the RS is alive.
>> We're failing the read of this replica and then succeeding in getting the
>> block elsewhere?  You get that exception every time?  What Hadoop version,
>> Jack?  You have short-circuit reads on?
>> St.Ack
>>
>>
>> On Thu, Feb 13, 2014 at 10:41 AM, Jack Levin <magnito@gmail.com> wrote:
>>
>> > I meant it's in the 'dead' list on the HDFS namenode page.  hadoop fsck /
>> > shows no issues.
>> >
>> >
>> > On Thu, Feb 13, 2014 at 10:38 AM, Jack Levin <magnito@gmail.com> wrote:
>> >
>> > >  Good morning --
>> > > I had a question: we had a datanode go down, and it's been down for a
>> > > few days, but HBase is still trying to talk to that dead datanode:
>> > >
>> > > 2014-02-13 08:57:23,073 WARN org.apache.hadoop.hdfs.DFSClient: Failed to
>> > > connect to /10.101.5.5:50010 for file
>> > > /hbase/img39/6388c3574c32c409e8387d3c4d10fcdb/att/2690638688138250544
>> > > for block 805865
>> > >
>> > > So the question is: why is the RS trying to talk to a dead datanode?
>> > > It's even on the 'dead' list in HDFS.
>> > >
>> > > Isn't the RS just an HDFS client?  Shouldn't it stop talking to an
>> > > offlined HDFS datanode that went down?  This caused a lot of issues in
>> > > our cluster.
>> > >
>> > > Thanks,
>> > > -Jack
>> > >
>> >
>>
>
>
