incubator-whirr-user mailing list archives

From Benjamin Clark <...@daltonclark.com>
Subject hadoop config property override problem
Date Sat, 12 Mar 2011 03:56:58 GMT
Yes, thanks Tom, that works.

Many features and design changes in this branch are much appreciated, especially the config
property override in the whirr config file.

I found one problem, at least in the 0.4 branch.  If you override a property with a comma-separated
list, for example:

hadoop-common.io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec

then what actually shows up in core-site.xml is surrounded by brackets and has spaces between
the elements of the list, so

[org.apache.hadoop.io.compress.GzipCodec, org.apache.hadoop.io.compress.DefaultCodec, org.apache.hadoop.io.compress.BZip2Codec, com.hadoop.compression.lzo.LzoCodec, com.hadoop.compression.lzo.LzopCodec]

Hadoop does not strip the brackets or trim the spaces from that value, so you need to remove
them to get it to work.  It's easy enough to patch the file, deploy to the slaves and
restart, but I'm wondering if that's accounted for anywhere.  I scanned CHANGES.txt in trunk
and I don't see it.
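
A minimal sketch of that cleanup, in Python, for anyone patching the value by hand. The function name normalize_list_value is hypothetical (it is not part of Whirr or Hadoop), and the sketch only repairs the string; writing the result back into core-site.xml and redeploying is left out.

```python
def normalize_list_value(value: str) -> str:
    """Turn a bracketed, space-separated list like '[a, b, c]' into 'a,b,c'.

    Values that are already clean ('a,b,c') are returned unchanged.
    """
    text = value.strip()
    # Drop one pair of surrounding brackets, if present.
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    # Split on commas, trim whitespace (including line breaks), drop empties.
    items = [item.strip() for item in text.split(",")]
    return ",".join(item for item in items if item)


if __name__ == "__main__":
    broken = ("[org.apache.hadoop.io.compress.GzipCodec, "
              "org.apache.hadoop.io.compress.DefaultCodec]")
    print(normalize_list_value(broken))
```

The result is the same comma-separated form that was written in the whirr config file, which is the form the message above says Hadoop needs.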

--Ben


On Mar 10, 2011, at 5:41 PM, Tom White wrote:

> On Thu, Mar 10, 2011 at 1:19 PM, Benjamin Clark <ben@daltonclark.com> wrote:
>> Thank you both, Tom and Andrei.  Now that I know where the FAQ is, I hope to
>> bother you less with things that are documented!
>> 
>> I think I should be all set with customization, but I need to build.  BUILD.txt
>> says 'mvn clean install' or 'mvn package -Ppackage'.  I can do that, and mvn
>> reports success, but then I try to use whirr-cli-0.4.0-incubating.jar as the jar,
>> and the manifest has no main class (OK, I can supply 'org.apache.whirr.cli.Main'
>> if I need to), and in any case the jar is not a fat jar, as I see all the
>> publicly distributed versions are.  It looks in the poms as if you have the maven
>> assembly plugins trying to make a fat jar, but it doesn't seem to be doing it
>> for me.
> 
> Whirr no longer produces a shaded (fat) JAR as of 0.4.0 and trunk, so
> perhaps it is working. Try bin/whirr and it should list the roles for
> you.
> 
> Cheers
> Tom
> 
>> 
>> What am I doing wrong?
>> 
>> I'm doing this on a Mac like so:
>> 
>> $ ruby --version
>> ruby 1.8.7 (2010-08-16 patchlevel 302) [i686-darwin10]
>> $ mvn --version
>> Apache Maven 3.0.2 (r1056850; 2011-01-08 19:58:10-0500)
>> Java version: 1.6.0_24, vendor: Apple Inc.
>> Java home: /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
>> Default locale: en_US, platform encoding: MacRoman
>> OS name: "mac os x", version: "10.6.6", arch: "x86_64", family: "mac"
>> $ java -version
>> java version "1.6.0_24"
>> Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
>> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)
>> 
>> 
>> 
>> 
>> Now I think my only problem is that
>> On Mar 10, 2011, at 2:35 PM, Andrei Savu wrote:
>> 
>>> Starting with the upcoming 0.4.0 release Whirr is no longer using S3
>>> for storing the install and configure scripts. You can grab the
>>> scripts from:
>>> 
>>> ${WHIRR_HOME}/services/${SERVICE_NAME}/src/main/resources/functions/{install,configure}_SERVICE.sh
>>> 
>>> It's also easier to customize the scripts. You just need to place your
>>> version in ${WHIRR_HOME}/functions (I believe it should have the same
>>> name).
>>> 
>>> From the 0.4.0 FAQ: "If you want to change the scripts then you can
>>> place a modified copy of the
>>> scripts in a _functions_ directory in Whirr's installation directory. The
>>> original versions of the scripts can be found in _functions_ directories in the
>>> source trees."
>>> 
>>> -- Andrei Savu / andreisavu.ro
>>> 
>>> On Thu, Mar 10, 2011 at 9:27 PM, Benjamin Clark <ben@daltonclark.com> wrote:
>>>> So if we grab install_cdh_hadoop.sh from the source tree, put a customized
>>>> version in our own bucket, and set whirr.run-url-base to the root of that
>>>> bucket, it should work, even in 0.4.0 and after?
>>>> 
>>>> Based on the FAQ I tried a few of these to attempt to verify I was on the
>>>> right track:
>>>> 
>>>> wget http://whirr.s3.amazonaws.com/install_cdh_hadoop
>>>> wget http://whirr.s3.amazonaws.com/0.4/install_cdh_hadoop
>>>> wget http://whirr.s3.amazonaws.com/0.4.0/install_cdh_hadoop
>>>> wget http://whirr.s3.amazonaws.com/install_cdh_hadoop.sh
>>>> wget http://whirr.s3.amazonaws.com/0.4/install_cdh_hadoop.sh
>>>> wget http://whirr.s3.amazonaws.com/0.4.0/install_cdh_hadoop.sh
>>>> 
>>>> but all give 404s.
>>>> 
>>>> 
>>>> 
>>>> On Mar 10, 2011, at 12:01 AM, Tom White wrote:
>>>> 
>>>>> Sorry I missed this thread. On 0.4.0 and later
>>>>> 
>>>>> whirr.hadoop-install-runurl=cloudera/cdh/install
>>>>> whirr.hadoop-configure-runurl=cloudera/cdh/post-configure
>>>>> 
>>>>> changes to
>>>>> 
>>>>> whirr.hadoop-install-function=install_cdh_hadoop
>>>>> whirr.hadoop-configure-function=configure_cdh_hadoop
>>>>> 
>>>>> Cheers,
>>>>> Tom
>>>>> 
>>>>> On Wed, Mar 9, 2011 at 8:23 AM, Sebastian Schoenherr
>>>>> <sebastian.schoenherr@uibk.ac.at> wrote:
>>>>>> Hi Saptarshi,
>>>>>> I tried to execute my working whirr 0.3.0 configuration (identical to your
>>>>>> property file, using cloudera scripts) on branch-0.4 and the same issues
>>>>>> arose for me.  Unfortunately I'm not sure yet why it's not working with
>>>>>> branch-0.4. Is using branch-0.3 an option for you?
>>>>>> Any other guesses?
>>>>>> cheers,
>>>>>> sebastian
>>>>>> 
>>>>>> 
>>>>>> On 08.03.2011 05:41, Saptarshi Guha wrote:
>>>>>>> 
>>>>>>> once again, i've changed the secret identity ..
>>>>>>> 
>>>>>>> On Mon, Mar 7, 2011 at 8:41 PM, Saptarshi Guha
>>>>>>> <saptarshi@revolutionanalytics.com>  wrote:
>>>>>>>> 
>>>>>>>> Hello
>>>>>>>> 
>>>>>>>> No such luck on my end. This is my script file; you can test that the
>>>>>>>> scripts download. But when I log in, hadoop version is as shown below
>>>>>>>> (I pulled the latest git). Also my scripts (you can confirm if you
>>>>>>>> download) echo a small line to files in /tmp.
>>>>>>>> They are not being created.
>>>>>>>> 
>>>>>>>> Hadoop 0.20.2
>>>>>>>> Subversion
>>>>>>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20
>>>>>>>> -r 911707
>>>>>>>> Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010
>>>>>>>> 
>>>>>>>> whirr.cluster-name=revotesting2
>>>>>>>> whirr.service-name=hadoop
>>>>>>>> whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,2 hadoop-datanode+hadoop-tasktracker
>>>>>>>> whirr.provider=aws-ec2
>>>>>>>> whirr.identity= AKIAJH5JBSI5KJ7YZQ6A
>>>>>>>> whirr.credential= b/kqLJAHOdRA4L30n7Zt8Edz383B1ARtPI3wiyD6
>>>>>>>> whirr.location-id=us-east-1
>>>>>>>> whirr.hardware-id=c1.xlarge
>>>>>>>> whirr.run-url-base=http://ml.stat.purdue.edu/whirr-scripts/
>>>>>>>> whirr.hadoop-install-runurl=cloudera/cdh/install
>>>>>>>> whirr.hadoop-configure-runurl=cloudera/cdh/post-configure
>>>>>>>> 
>>>>>>>> ## Rightscales CentOS AMI
>>>>>>>> 
>>>>>>>> ##http://support.rightscale.com/18-Release_Notes/02-AMI/RightImages_Release_Notes
>>>>>>>> jclouds.ec2.ami-owners=411009282317
>>>>>>>> whirr.image-id=us-east-1/ami-ccb35ea5
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Mar 7, 2011 at 9:59 AM, Benjamin Clark<ben@daltonclark.com>
>>>>>>>>  wrote:
>>>>>>>>> 
>>>>>>>>> In my experience you need
>>>>>>>>> 
>>>>>>>>> whirr.run-url-base=http://name-of-my-bucket-with-customized-scripts/
>>>>>>>>> whirr.hadoop-install-runurl=cloudera/cdh/install
>>>>>>>>> 
>>>>>>>>> and then you *also* need to have a copy of sun/java/install in that same
>>>>>>>>> bucket.
>>>>>>>>> 
>>>>>>>>> And both of those scripts need to be public-readable.
>>>>>>>>> 
>>>>>>>>> So in the end you should be able to do
>>>>>>>>> curl http://name-of-my-bucket-with-customized-scripts.s3.amazonaws.com/cloudera/cdh/install
>>>>>>>>> 
>>>>>>>>> and
>>>>>>>>> curl http://name-of-my-bucket-with-customized-scripts.s3.amazonaws.com/sun/java/install
>>>>>>>>> 
>>>>>>>>> Even if you haven't customized sun/java/install, it needs to be there.
>>>>>>>>> 
>>>>>>>>> If you do all that, the scripts will run and you will have the versions
>>>>>>>>> you asked for.
>>>>>>>>> 
>>>>>>>>> `hadoop version` on the name node then says, in my case:
>>>>>>>>> 
>>>>>>>>> Hadoop 0.20.2-CDH3B4
>>>>>>>>> 
>>>>>>>>> On Mar 7, 2011, at 12:41 PM, Saptarshi Guha wrote:
>>>>>>>>> 
>>>>>>>>>> Hi,
>>>>>>>>>> Fixed the security slip up. Did the hadoop version thing and got this
>>>>>>>>>> 
>>>>>>>>>> [root@domU-12-31-39-0B-CC-41 ~]# hadoop version
>>>>>>>>>> Hadoop 0.20.2
>>>>>>>>>> Subversion
>>>>>>>>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20
>>>>>>>>>> -r 911707
>>>>>>>>>> Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010
>>>>>>>>>> 
>>>>>>>>>> So I guess it's not CDH.
>>>>>>>>>> 
>>>>>>>>>> Thanks
>>>>>>>>>> Saptarshi
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Mon, Mar 7, 2011 at 9:22 AM, Saptarshi Guha
>>>>>>>>>> <saptarshi@revolutionanalytics.com>  wrote:
>>>>>>>>>>> 
>>>>>>>>>>> dear me! thanks, will do right away.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Mar 7, 2011 at 1:46 AM, Sebastian Schoenherr
>>>>>>>>>>> <sebastian.schoenherr@uibk.ac.at>  wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hi Saptarshi,
>>>>>>>>>>>> Try to execute "hadoop version" on your namenode. If the output is
>>>>>>>>>>>> Hadoop 0.20.2-CDH3B4, the current cloudera distribution has been
>>>>>>>>>>>> installed.
>>>>>>>>>>>> Btw, I would recommend setting your current Access Key ID and Secret
>>>>>>>>>>>> Key inactive, since you posted them in your prop file.
>>>>>>>>>>>> cheers
>>>>>>>>>>>> sebastian
>>>>>>>>>>>> 
>>>>>>>>>>>> On 06.03.2011 06:54, Saptarshi Guha wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I did git clone of the latest whirr and copied cloudera scripts into
>>>>>>>>>>>>> the script directory (copied over from whirr-0.3-incubating).
>>>>>>>>>>>>> 
>>>>>>>>>>>>> My properties file is at the end of this email.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> However, I don't think the scripts are being run because the
>>>>>>>>>>>>> jobtracker is the default Apache hadoop jobtracker and not the
>>>>>>>>>>>>> cloudera jobtracker.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Have i missed something?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks in advance
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Saptarshi
>>>>>>>>>>>>> 
>>>>>>>>>>>>> ## Properties
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> whirr.cluster-name=revotesting
>>>>>>>>>>>>> whirr.service-name=hadoop
>>>>>>>>>>>>> whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,2 hadoop-datanode+hadoop-tasktracker
>>>>>>>>>>>>> whirr.provider=aws-ec2
>>>>>>>>>>>>> whirr.identity= AKIAI3FUFFXAPYLE7CJA
>>>>>>>>>>>>> whirr.credential= 2Yq3Ar2HSxK/hbwZHs6aN6yrh0yfGNSPTpVw3t2n
>>>>>>>>>>>>> whirr.location-id=us-east-1
>>>>>>>>>>>>> whirr.hardware-id=c1.xlarge
>>>>>>>>>>>>> 
>>>>>>>>>>>>> ## Rightscales CentOS AMI
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> http://support.rightscale.com/18-Release_Notes/02-AMI/RightImages_Release_Notes
>>>>>>>>>>>>> jclouds.ec2.ami-owners=411009282317
>>>>>>>>>>>>> whirr.image-id=us-east-1/ami-ccb35ea5
>>>>>>>>>>>>> 
>>>>>>>>>>>>> whirr.hadoop-install-runurl=cloudera/cdh/install
>>>>>>>>>>>>> whirr.hadoop-configure-runurl=cloudera/cdh/post-configure
>>>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 

