hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2092) Create a light inner conf class in DFSClient
Date Fri, 24 Jun 2011 06:08:47 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054247#comment-13054247 ]

Aaron T. Myers commented on HDFS-2092:
--------------------------------------

Note: I'm not necessarily opposed to this change, but please justify its usefulness. From
what I can tell so far, this patch seems to be optimizing something that's not actually an
issue.

bq. That was just a sample of measurement for a day. 

Sure, but what was it actually measuring? Increase in child heap size per task attempt? Increase
in heap size per TT? Something else?

bq. Also, Going forward, PIG 0.9 will store lots of meta data in the conf and also one can
embed the PIG script itself in the conf.

I don't know much about Pig, but that sounds like a bad idea on its part. Maybe I'm wrong
about that.

bq. This can potentially blow the TT.

Can it? I've seen users have a lot of different problems with Hadoop, but Task Trackers falling
over because of conf objects being too large isn't one I can recall.

bq. Since one can store anything in the job conf, we should be careful with the references
to this object - we should not hold for long duration.

At most these references will be held for the lifetime of a task attempt, right? So not so
long?

> Create a light inner conf class in DFSClient
> --------------------------------------------
>
>                 Key: HDFS-2092
>                 URL: https://issues.apache.org/jira/browse/HDFS-2092
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.23.0
>            Reporter: Bharath Mundlapudi
>            Assignee: Bharath Mundlapudi
>             Fix For: 0.23.0
>
>         Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch
>
>
> At present, DFSClient stores a reference to the configuration object. Since these configuration
> objects can be quite large at times, they can bloat processes that hold multiple DFSClient objects,
> such as the TaskTracker. This is an attempt to remove the reference to the conf object from DFSClient.
>
> This patch creates a light inner conf class and copies the required keys from the Configuration
> object.
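
For readers following the thread, the approach described in the issue summary could look roughly like the sketch below. It is not the code from the attached patches. To stay self-contained it uses a plain `Map<String, String>` as a stand-in for Hadoop's `Configuration`, and the class name `LightConfSketch`, the two keys copied, their defaults, and the `pig.script` entry are all illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a client that would otherwise hold the full configuration.
class LightConfSketch {

    /** Small immutable snapshot of the only settings this client reads. */
    static final class Conf {
        final int replication;
        final long blockSize;

        Conf(Map<String, String> fullConf) {
            // Copy the values out as primitives; no reference to fullConf is kept.
            // The keys and defaults here are assumptions chosen for illustration.
            replication = Integer.parseInt(fullConf.getOrDefault("dfs.replication", "3"));
            blockSize = Long.parseLong(fullConf.getOrDefault("dfs.blocksize", "67108864"));
        }
    }

    private final Conf conf;

    LightConfSketch(Map<String, String> fullConf) {
        this.conf = new Conf(fullConf);
    }

    Conf getConf() {
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put("dfs.replication", "2");
        // Hypothetical large entry, e.g. an embedded script stored in the job conf.
        jobConf.put("pig.script", "...large embedded script...");

        LightConfSketch client = new LightConfSketch(jobConf);
        jobConf = null; // the client does not keep the full map reachable

        System.out.println(client.getConf().replication + " " + client.getConf().blockSize);
    }
}
```

Because the inner class holds only primitive fields, the size of the original conf no longer affects how much memory each client instance pins. Whether that saving matters in practice is exactly the question raised in the comment above.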

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
