hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: HDFS unbalance issue. (HBase over HDFS)
Date Wed, 01 Apr 2009 07:44:34 GMT
What was the fix?
Thanks,
St.Ack

On Wed, Apr 1, 2009 at 6:54 AM, zsongbo <zsongbo@gmail.com> wrote:

> Thanks stack. (I am Schubert.)
> Yes, I have found the fix in 0.20.
> And I just made a temporary fix based on branch-0.19 and it works fine.
>
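Neither the temporary branch-0.19 patch nor the eventual 0.20 change is shown in this thread. Purely as an illustration of the kind of fairness fix being discussed, here is a minimal, self-contained Java sketch (the class name is invented and this is not HDFS code; the method name computeInvalidateWork is reused from the thread only for readability) that keeps a FIFO of storage IDs with pending invalidations and rotates through it, instead of always taking the first key of a sorted map:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Iterator;
    import java.util.Map;
    import java.util.Set;

    // Toy model: serve datanodes for invalidation work in round-robin order
    // so that every node with pending blocks eventually gets its turn.
    public class RoundRobinInvalidation {

        // FIFO of storage IDs that still have blocks pending invalidation.
        private final Deque<String> pendingNodes = new ArrayDeque<>();
        private final Map<String, Set<Long>> invalidateSets = new HashMap<>();

        public void addInvalidatedBlock(String storageId, long blockId) {
            Set<Long> blocks = invalidateSets.computeIfAbsent(storageId, k -> new HashSet<>());
            if (blocks.isEmpty()) {
                pendingNodes.addLast(storageId);   // newly pending node joins the back
            }
            blocks.add(blockId);
        }

        /** Process up to nodesToProcess datanodes, cycling fairly through all of them. */
        public void computeInvalidateWork(int nodesToProcess, int blocksPerNode) {
            for (int i = 0; i < nodesToProcess && !pendingNodes.isEmpty(); i++) {
                String storageId = pendingNodes.pollFirst();
                Set<Long> blocks = invalidateSets.get(storageId);
                Iterator<Long> it = blocks.iterator();
                int sent = 0;
                while (it.hasNext() && sent < blocksPerNode) {
                    it.next();      // in real code: add to the invalidate command for this DN
                    it.remove();
                    sent++;
                }
                if (!blocks.isEmpty()) {
                    pendingNodes.addLast(storageId);  // still has work: requeue at the back
                }
            }
        }
    }

With this rotation, a node that still has pending blocks goes to the back of the queue after each pass, so no node is starved.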
> On Mon, Mar 30, 2009 at 7:16 PM, stack <stack@duboce.net> wrote:
>
> > Thanks for doing the digging, Schubert.  I agree, it's an ugly issue.
> > Another gentleman reported he'd tripped over the same thing in private
> > mail.  I'd suggest you file an issue against HADOOP HDFS and add the
> > below (I've opened HBASE-1296 so we can track it in our project).
> >
> > Good stuff,
> > St.Ack
> >
> > On Fri, Mar 27, 2009 at 9:50 AM, schubert zhang <zsongbo@gmail.com> wrote:
> >
> > > Sorry "But I found the namenode is fair to process the invalidating
> > > for each datanode."
> > > should be:
> > >
> > > "But I found the namenode is unfair to process the invalidating for
> > > each datanode."
> > >
> > >
> > > On Fri, Mar 27, 2009 at 3:49 PM, schubert zhang <zsongbo@gmail.com> wrote:
> > >
> > > > Thanks Samuel,
> > > > Your information is correct.
> > > > I have also read the code about garbage collection of invalidated
> > > > blocks.
> > > >
> > > > But I found the namenode is fair to process the invalidating for each
> > > > datanode.
> > > > In my cluster, there are 5 datanodes. The storage IDs are:
> > > >
> > > > node1: DS- 978762906-10.24.1.12-50010-1237686434530
> > > > node2: DS- 489086185-10.24.1.14-50010-1237686416330
> > > > node3: DS-1170985665-10.24.1.16-50010-1237686426395
> > > > node4: DS-1024388083-10.24.1.18-50010-1237686404482
> > > > node5: DS-2136798339-10.24.1.20-50010-1237686444430
> > > > I know the storage ID is generated by
> > > > org.apache.hadoop.hdfs.server.datanode.DataNode.setNewStorageID(...).
> > > >
> > > > In org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> > > >
> > > >   // Keeps a Collection for every named machine containing
> > > >   // blocks that have recently been invalidated and are thought to live
> > > >   // on the machine in question.
> > > >   // Mapping: StorageID -> ArrayList<Block>
> > > >   //
> > > >   private Map<String, Collection<Block>> recentInvalidateSets =
> > > >     new TreeMap<String, Collection<Block>>();
> > > >
> > > > In org.apache.hadoop.hdfs.server.namenode.FSNamesystem.ReplicationMonitor,
> > > > this thread runs at an interval of replicationRecheckInterval = 3000
> > > > milliseconds.
> > > >
> > > > In computeDatanodeWork(),
> > > > nodesToProcess = 2.
> > > >
> > > > Then in computeInvalidateWork(nodesToProcess),
> > > > the for loop will only execute 2 iterations.
> > > >
> > > > In each iteration, it goes into invalidateWorkForOneNode(), which
> > > > always picks the first node and invalidates blocks on that node:
> > > >   String firstNodeId = recentInvalidateSets.keySet().iterator().next();
> > > >
> > > > TreeMap is a sorted map, so the keySet is:
> > > > [1024388083-10.24.1.18-50010-1237686404482,
> > > > 1170985665-10.24.1.16-50010-1237686426395,
> > > > 2136798339-10.24.1.20-50010-1237686444430,
> > > > 489086185-10.24.1.14-50010-1237686416330,
> > > > 978762906-10.24.1.12-50010-1237686434530]
> > > >
> > > > So the sequence of nodes in recentInvalidateSets is:
> > > > [node4, node3, node5, node2, node1]
> > > >
> > > > So every time, invalidateWorkForOneNode() will process node4 first,
> > > > then node3, then node5, then node2, and finally node1.
> > > >
> > > > My application is an HBase write-heavy application, so there are many
> > > > blocks to invalidate on each datanode. But in every 3000-millisecond
> > > > pass, at most two datanodes are processed. Since node1 is the last one
> > > > in the TreeMap, its blocks never get a chance to be garbage collected.
> > > >
> > > > I think the HDFS namenode should fix this issue.
> > > >
> > > > Schubert
> > > >
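To make the starvation concrete, here is a small, self-contained Java simulation (not HDFS code; the class name, tick loop, and counters are invented for illustration) of the behaviour Schubert describes: blocks to invalidate keep arriving for all five datanodes, but each ReplicationMonitor pass serves only the first two keys of the sorted TreeMap, so the nodes that sort later are never reached:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Collection;
    import java.util.Map;
    import java.util.TreeMap;

    // Toy simulation of the first-key scheduling described above (not HDFS code).
    // Each "tick" stands for one ReplicationMonitor pass: new blocks to invalidate
    // arrive for all five datanodes, but only nodesToProcess = 2 nodes are served,
    // and always the first keys of the sorted TreeMap.
    public class InvalidateStarvationDemo {

        public static void main(String[] args) {
            Map<String, Collection<Long>> recentInvalidateSets = new TreeMap<>();
            // Storage IDs from the thread (without the "DS-" prefix); index i is node(i+1).
            String[] storageIds = {
                "978762906-10.24.1.12-50010-1237686434530",   // node1
                "489086185-10.24.1.14-50010-1237686416330",   // node2
                "1170985665-10.24.1.16-50010-1237686426395",  // node3
                "1024388083-10.24.1.18-50010-1237686404482",  // node4
                "2136798339-10.24.1.20-50010-1237686444430"   // node5
            };

            long blockId = 0;
            int nodesToProcess = 2;                 // as observed in computeDatanodeWork()
            int[] timesServed = new int[storageIds.length];

            for (int tick = 0; tick < 100; tick++) {
                // A write-heavy workload keeps queueing invalidated blocks for every node.
                for (String id : storageIds) {
                    recentInvalidateSets.computeIfAbsent(id, k -> new ArrayList<>()).add(blockId++);
                }
                // invalidateWorkForOneNode(): always take the first key of the sorted map.
                for (int i = 0; i < nodesToProcess && !recentInvalidateSets.isEmpty(); i++) {
                    String firstNodeId = recentInvalidateSets.keySet().iterator().next();
                    recentInvalidateSets.remove(firstNodeId);  // simplified: drop its whole queue
                    timesServed[Arrays.asList(storageIds).indexOf(firstNodeId)]++;
                }
            }
            for (int n = 0; n < storageIds.length; n++) {
                System.out.printf("node%d served %d times%n", n + 1, timesServed[n]);
            }
        }
    }

Running it prints that node4 and node3 are served on every pass while node5, node2, and node1 are never served, matching the imbalance reported above.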
> > > > On Thu, Mar 26, 2009 at 2:57 PM, Samuel Guo <guosijie@gmail.com> wrote:
> > > >
> > > >> After a file is deleted, HDFS does not immediately reclaim the
> > > >> available physical storage; it does so only lazily, during garbage
> > > >> collection. When a file is deleted by the application, the master
> > > >> removes the file's metadata from *FSNamesystem* and logs the deletion
> > > >> immediately. The file's deleted-block information is then collected in
> > > >> each DataNodeDescriptor's *invalidateBlocks* set in the Namenode.
> > > >> During the heartbeats between NN and DN, the NN scans the specified
> > > >> DN's DataNodeDescriptor's *invalidateBlocks* set, finds the blocks to
> > > >> be deleted on that DN, and sends a *DNA_INVALIDATE* BlockCommand to
> > > >> the DN. The *BlockScanner* thread running on the DN will then scan,
> > > >> find, and delete these blocks after the DN receives the
> > > >> *DNA_INVALIDATE* BlockCommand.
> > > >>
> > > >> You can search for *DNA_INVALIDATE* in DataNode.java and NameNode.java
> > > >> to find the logic of the garbage collection. Hope it will be helpful.
> > > >>
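As a reading aid, here is a compact sketch of that flow, using invented class and method names (LazyDeletionModel, deleteFile, heartbeat); it mirrors only the shape of the NN-side bookkeeping Samuel describes, not the actual HDFS classes:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Schematic of the lazy-deletion flow described above (not HDFS source code).
    // Deleting a file only queues its blocks per datanode; the block IDs are handed
    // out later, in response to that datanode's heartbeat, as an "invalidate" command.
    class LazyDeletionModel {

        // NN side: blocks pending invalidation, keyed by datanode storage ID
        // (the role played by DataNodeDescriptor.invalidateBlocks).
        private final Map<String, List<Long>> pendingInvalidates = new HashMap<>();

        /** Called when the application deletes a file: record, don't reclaim yet. */
        void deleteFile(Map<Long, List<String>> blockLocations) {
            for (Map.Entry<Long, List<String>> e : blockLocations.entrySet()) {
                for (String storageId : e.getValue()) {
                    pendingInvalidates
                        .computeIfAbsent(storageId, k -> new ArrayList<>())
                        .add(e.getKey());
                }
            }
        }

        /** Called on each heartbeat from a datanode: return the blocks it should delete. */
        List<Long> heartbeat(String storageId, int maxBlocks) {
            List<Long> queue = pendingInvalidates.getOrDefault(storageId, new ArrayList<>());
            int n = Math.min(maxBlocks, queue.size());
            List<Long> toDelete = new ArrayList<>(queue.subList(0, n));
            queue.subList(0, n).clear();   // physical deletion happens later on the DN
            return toDelete;               // stands in for the DNA_INVALIDATE BlockCommand
        }
    }

Deleting a file only queues block IDs per storage ID; the physical deletion happens later on each datanode, after it receives the list in response to a heartbeat.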
> > > >> On Thu, Mar 26, 2009 at 11:07 AM, schubert zhang <zsongbo@gmail.com> wrote:
> > > >>
> > > >> > Thanks Andrew and Billy.
> > > >> > I think the subject of this mail thread is not appropriate; it may
> > > >> > not be a balance issue.
> > > >> > The problem seems to be the block-deleting scheduler in HDFS.
> > > >> >
> > > >> > Last night (timezone: +8), I slowed down my application, and this
> > > >> > morning I found that almost all garbage blocks had been deleted.
> > > >> > Here is the current block count of each datanode:
> > > >> > node1: 10651
> > > >> > node2: 10477
> > > >> > node3: 12185
> > > >> > node4: 11607
> > > >> > node5: 14000
> > > >> >
> > > >> > It seems fine.
> > > >> > But I want to study the HDFS code and understand the policy for
> > > >> > deleting blocks on datanodes. Can anyone in the hadoop community
> > > >> > give me some advice?
> > > >> >
> > > >> > Schubert
> > > >> >
> > > >> > On Thu, Mar 26, 2009 at 7:55 AM, Andrew Purtell <apurtell@apache.org> wrote:
> > > >> >
> > > >> >
> > > >> > >
> > > >> > > > From: schubert zhang <zsongbo@gmail.com>
> > > >> > > > From another point of view, I think HBase cannot control on
> > > >> > > > which node blocks are deleted; it just deletes files, and
> > > >> > > > HDFS deletes the blocks wherever they are located.
> > > >> > >
> > > >> > > Yes, that is exactly correct.
> > > >> > >
> > > >> > > Best regards,
> > > >> > >
> > > >> > >   - Andy
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
