hbase-user mailing list archives

From Samuel Guo <guosi...@gmail.com>
Subject Re: HDFS unbalance issue. (HBase over HDFS)
Date Thu, 26 Mar 2009 06:57:09 GMT
After a file is deleted, HDFS does not reclaim the physical storage
immediately. It does so lazily, during garbage collection. When the
application deletes a file, the master (NameNode) removes the file's
metadata from *FSNamesystem* and logs the deletion immediately. The deleted
file's block information is then collected in the *invalidateBlocks* set of
each affected DataNodeDescriptor in the Namenode. During the heartbeats
between NN and DN, the NN scans that DN's invalidateBlocks set, picks the
blocks to be deleted on the DN, and sends a *DNA_INVALIDATE* BlockCommand
to the DN. After the DN receives the *DNA_INVALIDATE* BlockCommand, the
*BlockScanner* thread running on the DN scans for, finds, and deletes these
blocks.
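
As a rough illustration of the flow described above, here is a minimal,
self-contained Java sketch. It is not the Hadoop source: every class and
method name in it (LazyDeletionSketch, NameNodeSketch, DataNodeSketch,
deleteFile, heartbeat, sendHeartbeat) is made up for this example, and the
per-heartbeat limit on the number of blocks handed out is an assumption
added to show why deletions can lag behind a busy application.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of lazy block deletion. NOT the real Hadoop classes. */
public class LazyDeletionSketch {

    /** Hypothetical stand-in for the NameNode's bookkeeping. */
    static class NameNodeSketch {
        // file name -> (datanode name -> ids of the blocks stored there)
        private final Map<String, Map<String, List<Long>>> files = new HashMap<>();
        // datanode name -> blocks waiting to be invalidated on that node
        private final Map<String, TreeSet<Long>> invalidateBlocks = new HashMap<>();
        // assumed cap on how many blocks one heartbeat reply may carry
        private final int limitPerHeartbeat;

        NameNodeSketch(int limitPerHeartbeat) {
            this.limitPerHeartbeat = limitPerHeartbeat;
        }

        void addFile(String name, Map<String, List<Long>> locations) {
            files.put(name, locations);
        }

        /** Metadata is dropped at once; the blocks are only queued. */
        void deleteFile(String name) {
            Map<String, List<Long>> locations = files.remove(name);
            if (locations == null) {
                return;
            }
            for (Map.Entry<String, List<Long>> e : locations.entrySet()) {
                invalidateBlocks
                    .computeIfAbsent(e.getKey(), k -> new TreeSet<>())
                    .addAll(e.getValue());
            }
        }

        /** Heartbeat reply: the next batch of blocks this node should delete. */
        List<Long> heartbeat(String datanode) {
            List<Long> command = new ArrayList<>();
            TreeSet<Long> pending = invalidateBlocks.get(datanode);
            if (pending == null) {
                return command;
            }
            Iterator<Long> it = pending.iterator();
            while (it.hasNext() && command.size() < limitPerHeartbeat) {
                command.add(it.next());
                it.remove();
            }
            return command;
        }
    }

    /** Hypothetical stand-in for a DataNode and its local block files. */
    static class DataNodeSketch {
        final String name;
        final Set<Long> blocksOnDisk = new HashSet<>();

        DataNodeSketch(String name) {
            this.name = name;
        }

        /** Storage is reclaimed only when a heartbeat reply asks for it. */
        void sendHeartbeat(NameNodeSketch nn) {
            blocksOnDisk.removeAll(nn.heartbeat(name));
        }
    }

    public static void main(String[] args) {
        NameNodeSketch nn = new NameNodeSketch(2);
        DataNodeSketch dn = new DataNodeSketch("node1");
        List<Long> blocks = Arrays.asList(1L, 2L, 3L);
        dn.blocksOnDisk.addAll(blocks);
        nn.addFile("/example/file", Collections.singletonMap("node1", blocks));

        nn.deleteFile("/example/file");
        System.out.println("after delete:        " + dn.blocksOnDisk.size());
        dn.sendHeartbeat(nn);
        System.out.println("after 1st heartbeat: " + dn.blocksOnDisk.size());
        dn.sendHeartbeat(nn);
        System.out.println("after 2nd heartbeat: " + dn.blocksOnDisk.size());
    }
}
```

With the assumed limit of 2, the block count on the toy datanode goes 3, 1,
0: nothing is freed by the delete itself, and the backlog drains only as
heartbeats arrive.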

You can search for *DNA_INVALIDATE* in DataNode.java and NameNode.java to
find the garbage collection logic. Hope this helps.

On Thu, Mar 26, 2009 at 11:07 AM, schubert zhang <zsongbo@gmail.com> wrote:

> Thanks Andrew and Billy.
> I think the subject of this mail thread is not appropriate; it may not be a
> balance issue.
> The problem seems to be the block deletion scheduler in HDFS.
>
> Last night (timezone: +8), I slowed down my application, and this morning I
> found that almost all garbage blocks had been deleted.
> Here is the current number of blocks on each datanode:
> node1: 10651
> node2: 10477
> node3: 12185
> node4: 11607
> node5: 14000
>
> It seems fine.
> But I want to study the HDFS code and understand the policy for deleting
> blocks on datanodes. Can anyone in the hadoop community give me some
> advice?
>
> Schubert
>
> On Thu, Mar 26, 2009 at 7:55 AM, Andrew Purtell <apurtell@apache.org>
> wrote:
>
>
> >
> > > From: schubert zhang <zsongbo@gmail.com>
> > > From another point of view, I think HBase cannot control which node
> > > blocks are deleted on; it just deletes files, and HDFS deletes the
> > > blocks wherever they are located.
> >
> > Yes, that is exactly correct.
> >
> > Best regards,
> >
> >   - Andy
