hadoop-mapreduce-user mailing list archives

From Demai Ni <nid...@gmail.com>
Subject Re: conf.get("dfs.data.dir") return null when hdfs-site.xml doesn't set it explicitly
Date Tue, 09 Sep 2014 00:02:06 GMT
Bhooshan,

Many thanks, I appreciate the help. I will also try the Cloudera mailing
list/community.

Demai

On Mon, Sep 8, 2014 at 4:58 PM, Bhooshan Mogal <bhooshan.mogal@gmail.com>
wrote:

> Hi Demai,
>
> conf = new Configuration()
>
> creates a new Configuration object that loads only the properties from
> core-default.xml and core-site.xml.
>
> This is a fresh configuration object, not the one that the daemons in
> the Hadoop cluster are using.
>
>
>
> I think what you are asking is whether you can get the Configuration
> object that a daemon in your live cluster (e.g. the datanode) is using. I
> am not sure whether the datanode or any other daemon in a Hadoop cluster
> exposes such an API.
>
> I would in fact be tempted to get this information from the configuration
> management daemon instead, in your case Cloudera Manager. But I am not
> sure whether CM exposes that API either. You could probably find out on
> the Cloudera mailing list.
>
>
> HTH,
> Bhooshan
>
>
> On Mon, Sep 8, 2014 at 3:52 PM, Demai Ni <nidmgg@gmail.com> wrote:
>
>> Hi Bhooshan,
>>
>> Thanks for your kind response. I ran the code on one of the data nodes of
>> my cluster, with only one Hadoop daemon running. I believe my Java client
>> code connects to the cluster correctly, as I am able to retrieve a
>> FileStatus, list files under a particular HDFS path, and so on.
>> However, you are right that the daemon process uses the hdfs-site.xml
>> under another folder for Cloudera:
>> /var/run/cloudera-scm-agent/process/90-hdfs-DATANODE/hdfs-site.xml.
>>
>> About "retrieving the info from a live cluster": I would like to get
>> information beyond what is in the configuration (.xml) files.
>> Since I am able to use
>> conf = new Configuration()
>> to connect to HDFS and perform other operations, shouldn't I be able to
>> retrieve the configuration variables as well?
>>
>> Thanks
>>
>> Demai
>>
>>
>> On Mon, Sep 8, 2014 at 2:40 PM, Bhooshan Mogal <bhooshan.mogal@gmail.com>
>> wrote:
>>
>>> Hi Demai,
>>>
>>> When you read a property from the conf object, it will only have a value
>>> if one of the resources loaded into the conf object defines that property.
>>>
>>> In your case, you created the conf object with new Configuration(), which
>>> loads core-default.xml and core-site.xml.
>>>
>>> Then you added the site files (hdfs-site.xml and core-site.xml) from
>>> specific locations. If none of these files defines dfs.data.dir, then you
>>> will get null. This is expected behavior.
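
The lookup rule described above can be sketched without Hadoop on the classpath. The class below is a hypothetical stand-in (LayeredConf is not a Hadoop class): it merges Hadoop-style <configuration> documents in order and returns null for any name that no merged document defines. The host name in the sample XML is made up, and the sketch ignores details of the real Configuration class such as final properties and variable expansion.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical stand-in for the lookup behaviour; not Hadoop's Configuration. */
class LayeredConf {
    private final Map<String, String> props = new LinkedHashMap<>();

    /** Merge one <configuration> document; later resources override earlier ones. */
    LayeredConf addResource(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                String name = childText(property, "name");
                String value = childText(property, "value");
                if (name != null && value != null) {
                    props.put(name, value);
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable configuration document", e);
        }
        return this;
    }

    /** Returns null when no merged resource defines the property. */
    String get(String name) {
        return props.get(name);
    }

    private static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Assumed sample content; the host name is illustrative only.
        String coreSite = "<configuration><property><name>fs.defaultFS</name>"
                + "<value>hdfs://namenode.example:8020</value></property></configuration>";
        String hdfsSite = "<configuration><property><name>dfs.datanode.data.dir</name>"
                + "<value>file:///dfs/dn</value></property></configuration>";

        LayeredConf conf = new LayeredConf().addResource(coreSite);
        System.out.println(conf.get("dfs.datanode.data.dir")); // not defined yet

        conf.addResource(hdfsSite);
        System.out.println(conf.get("dfs.datanode.data.dir")); // now defined
    }
}
```

In Hadoop 2.x, hdfs-default.xml and hdfs-site.xml are registered as default resources by the HdfsConfiguration class, so a plain new Configuration() does not see them until that class has been loaded.
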
>>>
>>> What do you mean by retrieving the info from a live cluster? Even for
>>> processes like datanode, namenode etc, the source of truth for these
>>> properties is hdfs-site.xml. It is loaded from a specific location when you
>>> start these services.
>>>
>>> Question: Where are you running the above code? Is it on a node which
>>> has other hadoop daemons as well?
>>>
>>> My guess is that the path you are referring to
>>> (/etc/hadoop/conf.cloudera.hdfs/core-site.xml) is not the path where
>>> these config properties are defined. Since this is a CDH cluster, you
>>> would probably be best served by asking on the CDH mailing list where
>>> the right path to these files is.
>>>
>>>
>>> HTH,
>>> Bhooshan
>>>
>>>
>>> On Mon, Sep 8, 2014 at 11:47 AM, Demai Ni <nidmgg@gmail.com> wrote:
>>>
>>>> Hi experts,
>>>>
>>>> I am trying to get the local filesystem directory of a data node. My
>>>> cluster uses CDH5.x (Hadoop 2.3) with the default configuration, so the
>>>> datanode directory is file:///dfs/dn. I didn't specify the value in
>>>> hdfs-site.xml.
>>>>
>>>> My code is something like:
>>>>
>>>> import org.apache.hadoop.conf.Configuration;
>>>> import org.apache.hadoop.fs.Path;
>>>>
>>>> Configuration conf = new Configuration();
>>>>
>>>> // tested both with and without the following two lines
>>>> conf.addResource(new Path("/etc/hadoop/conf.cloudera.hdfs/hdfs-site.xml"));
>>>> conf.addResource(new Path("/etc/hadoop/conf.cloudera.hdfs/core-site.xml"));
>>>>
>>>> // I also tried get("dfs.datanode.data.dir"), which also returns null
>>>> String dnDir = conf.get("dfs.data.dir");  // returns null
>>>>
>>>> It looks like get() only looks at the configuration files instead of
>>>> retrieving the info from the live cluster?
>>>>
>>>> Many thanks for your help in advance.
>>>>
>>>> Demai
>>>>
>>>
>>>
>>>
>>> --
>>> Bhooshan
>>>
>>
>>
>
>
> --
> Bhooshan
>
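
On the original goal of finding the DataNode's local directories: dfs.datanode.data.dir holds a comma-separated list of directories, and an entry can be written as a file:// URI such as the file:///dfs/dn mentioned in the original question. A minimal sketch, using only the JDK, of turning such a value into local paths follows. The class and method names are hypothetical, the second directory in the usage line is made up, and the value is assumed to have been read from the hdfs-site.xml that the DataNode process actually uses.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of Hadoop. */
class DataDirs {
    /** Split a dfs.datanode.data.dir value into local filesystem paths. */
    static List<String> toLocalPaths(String value) {
        List<String> paths = new ArrayList<>();
        if (value == null) {
            return paths; // property not defined in any loaded resource
        }
        for (String entry : value.split(",")) {
            String dir = entry.trim();
            if (dir.isEmpty()) {
                continue;
            }
            // Entries may be file:// URIs or plain paths.
            paths.add(dir.startsWith("file:") ? URI.create(dir).getPath() : dir);
        }
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(toLocalPaths("file:///dfs/dn, /data/2/dfs/dn"));
    }
}
```

A null value yields an empty list, which keeps the "property not defined" case from turning into a NullPointerException in the caller.
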
