hbase-user mailing list archives

From schubert zhang <zson...@gmail.com>
Subject Re: HDFS unbalance issue. (HBase over HDFS)
Date Fri, 27 Mar 2009 07:50:41 GMT
Sorry "But I found the namenode is fair to process the invalidating for each
datanode."
should be:

"But I found the namenode is unfair to process the invalidating for each
datanode."


On Fri, Mar 27, 2009 at 3:49 PM, schubert zhang <zsongbo@gmail.com> wrote:

> Thanks Samuel,
> Your information is correct.
> I have also read the code for the garbage collection of invalidated blocks.
>
> But I found the namenode is fair to process the invalidating for each
> datanode.
> In my cluster, there are 5 datanode. The storage IDs are:
>
> node1: DS-978762906-10.24.1.12-50010-1237686434530
> node2: DS-489086185-10.24.1.14-50010-1237686416330
> node3: DS-1170985665-10.24.1.16-50010-1237686426395
> node4: DS-1024388083-10.24.1.18-50010-1237686404482
> node5: DS-2136798339-10.24.1.20-50010-1237686444430
> I know the storage ID is generated
> by org.apache.hadoop.hdfs.server.datanode.DataNode.setNewStorageID(...).
>
> In org.apache.hadoop.hdfs.server.namenode.FSNamesystem
>
>   // Keeps a Collection for every named machine containing
>   // blocks that have recently been invalidated and are thought to live
>   // on the machine in question.
>   // Mapping: StorageID -> ArrayList<Block>
>   //
>   private Map<String, Collection<Block>> recentInvalidateSets =
>     new TreeMap<String, Collection<Block>>();
>
> In org.apache.hadoop.hdfs.server.namenode.FSNamesystem.ReplicationMonitor
> This thread runs at an interval of replicationRecheckInterval = 3000 milliseconds.
>
> In computeDatanodeWork(),
> nodesToProcess = 2.
>
> Then, in computeInvalidateWork(nodesToProcess),
> the for loop will only execute 2 iterations.
>
> For each iteration, it goes into invalidateWorkForOneNode(), which always
> picks the first node in the map and invalidates blocks on that node:
>   String firstNodeId = recentInvalidateSets.keySet().iterator().next();
>
> TreeMap is a sorted map, so the keySet is:
> [1024388083-10.24.1.18-50010-1237686404482,
> 1170985665-10.24.1.16-50010-1237686426395,
> 2136798339-10.24.1.20-50010-1237686444430,
> 489086185-10.24.1.14-50010-1237686416330,
> 978762906-10.24.1.12-50010-1237686434530]
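>
> (Just to double-check this ordering, here is a throwaway plain-Java check --
> not Hadoop code -- using the full storage IDs with their "DS-" prefix; the
> relative order is the same:)
>
> import java.util.TreeMap;
>
> public class KeyOrder {
>   public static void main(String[] args) {
>     TreeMap<String, Integer> ids = new TreeMap<String, Integer>();
>     ids.put("DS-978762906-10.24.1.12-50010-1237686434530", 1);   // node1
>     ids.put("DS-489086185-10.24.1.14-50010-1237686416330", 2);   // node2
>     ids.put("DS-1170985665-10.24.1.16-50010-1237686426395", 3);  // node3
>     ids.put("DS-1024388083-10.24.1.18-50010-1237686404482", 4);  // node4
>     ids.put("DS-2136798339-10.24.1.20-50010-1237686444430", 5);  // node5
>     // Prints the keys in lexicographic order: node4, node3, node5, node2, node1
>     System.out.println(ids.keySet());
>   }
> }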
>
> So, the sequence of nodes in recentInvalidateSets is:
> [node4, node3, node5, node2, node1]
>
> So, every time in invalidateWorkForOneNode(), it will always process node4
> first, then node3, then node5, then node2, and finally node1.
>
> My application is an HBase write-heavy application.
> So there are many blocks that need to be invalidated on each datanode. But
> every 3000 milliseconds, at most two datanodes are processed. Since node1 is
> the last one in the TreeMap, it has no chance to be garbage collected.
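>
> The starvation is easy to reproduce outside Hadoop. Here is a small
> standalone simulation (my own sketch, not HDFS code; the per-call limit of
> 100 blocks and the refill rate are assumed numbers) of "at most 2 nodes per
> 3000 ms cycle, always the first key, new garbage arriving on every node":
>
> import java.util.ArrayList;
> import java.util.Collection;
> import java.util.Iterator;
> import java.util.TreeMap;
>
> public class StarvationDemo {
>   public static void main(String[] args) {
>     TreeMap<String, Collection<Long>> invalidateSets =
>         new TreeMap<String, Collection<Long>>();
>     String[] nodes = {
>         "DS-978762906-10.24.1.12-50010-1237686434530",   // node1
>         "DS-489086185-10.24.1.14-50010-1237686416330",   // node2
>         "DS-1170985665-10.24.1.16-50010-1237686426395",  // node3
>         "DS-1024388083-10.24.1.18-50010-1237686404482",  // node4
>         "DS-2136798339-10.24.1.20-50010-1237686444430"   // node5
>     };
>     int blockInvalidateLimit = 100;                // per-call cap (assumed)
>     for (int cycle = 0; cycle < 20; cycle++) {     // 20 "3000 ms" cycles
>       // Heavy write load: every cycle, every datanode gains more garbage.
>       for (String n : nodes) {
>         Collection<Long> set = invalidateSets.get(n);
>         if (set == null) {
>           set = new ArrayList<Long>();
>           invalidateSets.put(n, set);
>         }
>         for (long b = 0; b < 50; b++) {
>           set.add(b);
>         }
>       }
>       // nodesToProcess = 2: each cycle services at most two "first" keys.
>       for (int i = 0; i < 2 && !invalidateSets.isEmpty(); i++) {
>         String first = invalidateSets.keySet().iterator().next();
>         Collection<Long> set = invalidateSets.get(first);
>         Iterator<Long> it = set.iterator();
>         for (int k = 0; k < blockInvalidateLimit && it.hasNext(); k++) {
>           it.next();
>           it.remove();
>         }
>         if (set.isEmpty()) {
>           invalidateSets.remove(first);
>         }
>         System.out.println("cycle " + cycle + ": serviced " + first);
>       }
>     }
>     // With these rates only node4 and node3 are ever serviced; node5, node2
>     // and node1 just keep accumulating blocks and are never cleaned up.
>   }
> }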
>
> I think the HDFS namenode should fix this issue.
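>
> For example (just a sketch of one possible fairer policy, not a patch
> against the real code; it assumes recentInvalidateSets is declared as a
> TreeMap so higherKey() is available), the namenode could rotate through the
> datanodes instead of always restarting from the smallest key:
>
>   // Hypothetical round-robin replacement for "always take the first key".
>   private String lastServicedNodeId = null;
>
>   private String nextNodeToInvalidate(
>       TreeMap<String, Collection<Block>> recentInvalidateSets) {
>     if (recentInvalidateSets.isEmpty()) {
>       return null;
>     }
>     if (lastServicedNodeId != null) {
>       // Continue with the key after the node serviced last time.
>       String next = recentInvalidateSets.higherKey(lastServicedNodeId);
>       if (next != null) {
>         lastServicedNodeId = next;
>         return next;
>       }
>     }
>     // Wrap around to the beginning of the sorted map.
>     lastServicedNodeId = recentInvalidateSets.firstKey();
>     return lastServicedNodeId;
>   }
>
> invalidateWorkForOneNode() would then call nextNodeToInvalidate(...) instead
> of recentInvalidateSets.keySet().iterator().next().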
>
> Schubert
>
> On Thu, Mar 26, 2009 at 2:57 PM, Samuel Guo <guosijie@gmail.com> wrote:
>
>> After a file is deleted, HDFS does not immediately reclaim the available
>> physical storage; it does so only lazily, during garbage collection. When a
>> file is deleted by the application, the master removes the file's metadata
>> from *FSNamesystem* and logs the deletion immediately. The deleted file's
>> block information is then collected in each DataNodeDescriptor's
>> *invalidateBlocks* set on the Namenode. During the heartbeats between NN
>> and DN, the NN scans the specified DN's DataNodeDescriptor's
>> invalidateBlocks set, finds the blocks to be deleted on that DN, and sends
>> a *DNA_INVALIDATE* BlockCommand to the DN. The *BlockScanner* thread
>> running on the DN will scan, find and delete these blocks after the DN
>> receives the *DNA_INVALIDATE* BlockCommand.
>>
>> You can search for *DNA_INVALIDATE* in the DataNode.java and NameNode.java
>> files to find the logic of the garbage collection. Hope this helps.
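>>
>> As a rough illustration of that flow, here is a toy model (made-up names
>> and simplified data structures, not Hadoop's real classes; the real logic
>> is around DNA_INVALIDATE in NameNode.java and DataNode.java):
>>
>> import java.util.ArrayDeque;
>> import java.util.ArrayList;
>> import java.util.Arrays;
>> import java.util.Deque;
>> import java.util.HashMap;
>> import java.util.List;
>> import java.util.Map;
>>
>> // Toy model: delete file -> queue blocks per DN -> heartbeat -> DN deletes.
>> public class InvalidateFlowDemo {
>>   static final int BLOCK_INVALIDATE_LIMIT = 100;  // per-heartbeat cap (assumed)
>>
>>   // NN side: per-datanode queue of blocks waiting to be invalidated.
>>   static Map<String, Deque<Long>> invalidateBlocks =
>>       new HashMap<String, Deque<Long>>();
>>
>>   // 1. Deleting a file removes its metadata (not modeled here) and queues
>>   //    the file's blocks on every datanode that holds a replica.
>>   static void deleteFile(List<Long> fileBlocks, List<String> replicaNodes) {
>>     for (String dn : replicaNodes) {
>>       Deque<Long> q = invalidateBlocks.get(dn);
>>       if (q == null) {
>>         q = new ArrayDeque<Long>();
>>         invalidateBlocks.put(dn, q);
>>       }
>>       q.addAll(fileBlocks);
>>     }
>>   }
>>
>>   // 2. On a heartbeat, the NN hands back up to BLOCK_INVALIDATE_LIMIT
>>   //    queued blocks (the payload of a DNA_INVALIDATE-style command).
>>   static List<Long> heartbeat(String dn) {
>>     List<Long> toDelete = new ArrayList<Long>();
>>     Deque<Long> q = invalidateBlocks.get(dn);
>>     while (q != null && !q.isEmpty()
>>         && toDelete.size() < BLOCK_INVALIDATE_LIMIT) {
>>       toDelete.add(q.poll());
>>     }
>>     return toDelete;
>>   }
>>
>>   public static void main(String[] args) {
>>     deleteFile(Arrays.asList(1L, 2L, 3L), Arrays.asList("dn1", "dn2", "dn3"));
>>     // 3. Each DN deletes the listed block files from its local disks; only
>>     //    then is the physical space actually reclaimed.
>>     System.out.println("dn1 deletes " + heartbeat("dn1"));
>>     System.out.println("dn2 deletes " + heartbeat("dn2"));
>>   }
>> }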
>>
>> On Thu, Mar 26, 2009 at 11:07 AM, schubert zhang <zsongbo@gmail.com>
>> wrote:
>>
>> > Thanks Andrew and Billy.
>> > I think the subject of this mail thread is not quite right; it may not be
>> > a balance issue.
>> > The problem seems to be the block-deletion scheduler in HDFS.
>> >
>> > Last night (timezone: +8), I slowed down my application, and this morning
>> > I found that almost all garbage blocks had been deleted.
>> > Here is the current blocks number of each datanode:
>> > node1: 10651
>> > node2: 10477
>> > node3: 12185
>> > node4: 11607
>> > node5: 14000
>> >
>> > It seems fine.
>> > But I want to study the HDFS code and understand its policy for deleting
>> > blocks on datanodes. Could anyone in the Hadoop community give me some
>> > advice?
>> >
>> > Schubert
>> >
>> > On Thu, Mar 26, 2009 at 7:55 AM, Andrew Purtell <apurtell@apache.org>
>> > wrote:
>> >
>> >
>> > >
>> > > > From: schubert zhang <zsongbo@gmail.com>
>> > > > From another point of view, I think HBase cannot control which
>> > > > nodes blocks are deleted on; it just deletes files, and HDFS
>> > > > deletes the blocks wherever they are located.
>> > >
>> > > Yes, that is exactly correct.
>> > >
>> > > Best regards,
>> > >
>> > >   - Andy
>> > >
>> >
>>
>
>
