hadoop-common-user mailing list archives

From Edmund Kohlwey <ekohl...@gmail.com>
Subject Re: Time to build my own cluster - advice?
Date Thu, 05 Nov 2009 18:37:32 GMT
First of all, let me say I don't use EC2 - there are some people at my 
company who do, but I've been fortunate enough to use our internal dev 
cluster for all the work I've done, so this is total hearsay.

That having been said, the people I know who are using EC2 aren't 
leaving the cluster running when not in use - there are scripts from (I 
believe) Cloudera that can allocate and configure the right number of 
nodes on EC2 with whatever AMI you specify, and then tear them down when 
you're done.
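
To put rough numbers on the "tear it down when you're done" approach, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function names are made up, the node count, hourly rate and job lengths are hypothetical, and I'm assuming each run is billed per node in whole instance-hours (partial hours rounded up). Plug in the real prices for whatever instance type you use.

```python
import math


def always_on_cost(nodes, hourly_rate, hours_in_period):
    """Cost of keeping every node running for the whole period."""
    return nodes * hourly_rate * hours_in_period


def per_job_cost(nodes, hourly_rate, job_hours, jobs_in_period):
    """Cost of starting a cluster for each job and tearing it down afterwards.

    Assumes billing in whole instance-hours, so each job's duration is
    rounded up to the next full hour before multiplying.
    """
    billed_hours = math.ceil(job_hours)
    return nodes * hourly_rate * billed_hours * jobs_in_period


if __name__ == "__main__":
    # Hypothetical figures: 10 nodes at $0.10/hour over a 30-day month,
    # versus 20 jobs of 2.5 hours each in that same month.
    print("always on:", always_on_cost(10, 0.10, 24 * 30))
    print("per job:  ", per_job_cost(10, 0.10, 2.5, 20))
```

With those made-up figures the per-job cluster is far cheaper, but the gap closes as the cluster gets busier, so it's worth rerunning with your own job count.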

On 11/5/09 1:14 PM, Mark Kerzner wrote:
> Edmund,
> I wanted to install OpenOffice and connect to it from my Java code. I tried
> to replicate the complete install by copying it, but there must be something
> else there, because I can't connect on Amazon MapReduce, but I can on my own
> cluster.
> When you say cheaper, do you mean that keeping your own EC2 machines up and
> using them as a Hadoop cluster is in the end cheaper than starting a Hadoop
> cluster every time you want to run a job?
> Thank you,
> Mark
> On Thu, Nov 5, 2009 at 12:04 PM, Edmund Kohlwey <ekohlwey@gmail.com> wrote:
>> If all your dependencies are Java-based (like OpenOffice) you might try
>> using a dependency manager/build tool like Maven or Ant/Ivy to package the
>> dependencies in your jar. I'm not sure if any parts of OpenOffice are
>> available in a public repo as Maven artifacts or not, or if you want to get
>> into packaging artifacts for your build system, but it's something you might
>> try.
>> I think it's cheaper to just use EC2 anyway, so that might be a motivating
>> factor for you as well.
>>>> Hi,
>>>> so far I've been using Amazon MapReduce. However, my app uses a growing
>>>> number of Linux packages. I have been installing them on the fly, in the
>>>> Mapper.configure(), but with OpenOffice this is hard, and I don't get a
>>>> service connection even after local install.
>>>> Therefore, I'm looking for advice on creating my own Hadoop cluster
>>>> out of EC2 machines. Are there instructions? How hard is it? What are
>>>> the best practices?
>>>> Thank you,
>>>> Mark
