hadoop-hdfs-dev mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Clarifications on excludedNodeList in DFSClient
Date Fri, 30 Nov 2012 11:50:00 GMT
Hi Inder,

I didn't see a relevant JIRA on this yet, so I went ahead and filed
one at https://issues.apache.org/jira/browse/HDFS-4246, as it seems to
affect HBase WALs too (when their block sizes are configured large,
they form a scenario similar to yours), on small clusters or in
specific scenarios on large ones.

I think a timed-life cache for the excludeNodesList would be
preferable to a static one, but we can keep it optional (or default it
to infinite) so as not to change the existing behavior.
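To illustrate the idea, here is a minimal sketch of such a timed-life
exclude cache. This is a hypothetical standalone class, not the actual
DFSClient change proposed in HDFS-4246; the class name, expiry
parameter, and lazy-eviction strategy are all assumptions made for the
example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical timed-life cache of excluded datanodes: a node is
// excluded only for a configurable window, after which it becomes
// eligible for block placement again.
class ExcludedNodeCache {
    private final long expiryMillis;
    // Maps node identifier -> timestamp when it was excluded.
    private final Map<String, Long> excluded = new ConcurrentHashMap<>();

    ExcludedNodeCache(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    void exclude(String node) {
        excluded.put(node, System.currentTimeMillis());
    }

    // A node counts as excluded only while its entry is younger than
    // the expiry window; stale entries are dropped lazily on lookup.
    boolean isExcluded(String node) {
        Long added = excluded.get(node);
        if (added == null) {
            return false;
        }
        if (System.currentTimeMillis() - added > expiryMillis) {
            excluded.remove(node);
            return false;
        }
        return true;
    }
}
```

With an infinite expiry this degenerates to the current static
behavior, which is why defaulting it that way would be backward
compatible.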

On Tue, Nov 20, 2012 at 11:49 PM, Harsh J <harsh@cloudera.com> wrote:
> The excludeNode list is initialized for each output stream created
> under a DFSClient instance. That is, it is empty for every new
> FS.create() returned DFSOutputStream initially and is maintained
> separately for each file created under a common DFSClient.
> However, this could indeed be a problem for a long-running single-file
> client, which I assume is a continuously alive and hflush()-ing one.
> Can you search for and file a JIRA to address this, so any further
> discussion can be taken there? Please put up your thoughts there as well.
> On Mon, Nov 19, 2012 at 3:25 PM, Inder Pall <inder.pall@gmail.com> wrote:
>> Folks,
>> I was wondering if there is any mechanism/logic to move a node back from the
>> excludedNodeList to the live nodes so it can be tried for new block creation.
>> In the current DFSClient code I do not see this. The use-case: if the write
>> timeout is reduced and certain nodes get aggressively added to the
>> excludedNodeList, and the client caches the DFSClient, then the excluded
>> nodes are never tried again for the lifetime of the application caching the
>> DFSClient.
>> --
>> - Inder
>> "You are the average of the 5 people you spend the most time with"
> --
> Harsh J

Harsh J
