hadoop-common-user mailing list archives

From Igor Nikolic <i.niko...@tudelft.nl>
Subject Re: Is Hadoop the thing for us ?
Date Wed, 25 Jun 2008 13:21:00 GMT
Thank you for your comment; it confirmed my suspicions.

You framed the problem correctly. I will probably invest a bit of time
studying the framework anyway, to see whether a rewrite is worthwhile, since
we have hit scaling limitations in our agent scheduler framework. Our main
computational load is the massive amount of agent reasoning (think
JBoss Rules) and inter-agent communication (agents need to buy from and
sell to each other), so I am not sure it is possible at all to break
it down into small tasks. Especially if this has to happen across CPUs,
the latency is going to kill us.

Thanks
igor

John Martyniak wrote:
> I am new to Hadoop, so take this information with a grain of salt.
> But the power of Hadoop is breaking big problems down into small pieces and
> spreading them across many (thousands of) machines, in effect creating a
> massively parallel processing engine.
>
> But to benefit from that functionality, you must write your
> application for it, using the Hadoop frameworks.
>
> So if I understand your dilemma correctly, I do not think that Hadoop is
> for you, unless you want to rewrite your app to take advantage of it. And
> I suspect that, if you have access to a traditional cluster, that will be a
> better alternative for you.
>
> Hope that this helps some.
>
> -John
>
>
> On Wed, Jun 25, 2008 at 7:33 AM, Igor Nikolic <i.nikolic@tudelft.nl> wrote:
>
>   
>> Hello list
>>
>> We will be getting access to a cluster soon, and I was wondering whether
>> I should use Hadoop. Or am I better off with the usual batch
>> schedulers such as ProActive? I am not a CS/CE person, and from reading
>> the website I cannot get a sense of whether Hadoop is for me.
>>
>> A little background:
>> We have a relatively large agent-based simulation (20+ MB jar) that needs
>> to be swept across very large parameter spaces. Agents communicate only
>> within the simulation, so there is no interprocess communication. The
>> parameter vector has at most 20 elements, a single run may take 5-10
>> minutes on a normal desktop, and it might return a few MB of raw data.
>> We need 10K-100K runs, more if possible.
>>
>>
>>
>> Thanks for any advice; even a short yes/no is welcome.
>>
>> Greetings
>> Igor
>>
>> --
>> ir. Igor Nikolic
>> PhD Researcher
>> Section Energy & Industry
>> Faculty of Technology, Policy and Management
>> Delft University of Technology, The Netherlands
>>
>> Tel: +31152781135
>> Email: i.nikolic@tudelft.nl
>> Web: http://www.igornikolic.com
>> wiki server: http://wiki.tudelft.nl
>>
>>
>>     
>
>
>   


-- 
ir. Igor Nikolic
PhD Researcher
Section Energy & Industry
Faculty of Technology, Policy and Management
Delft University of Technology, The Netherlands

Tel: +31152781135
Email: i.nikolic@tudelft.nl
Web: http://www.igornikolic.com
wiki server: http://wiki.tudelft.nl
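
For readers sizing a similar workload: the sweep described in the original question (10K-100K runs of 5-10 minutes each, with no communication between runs) is a set of independent tasks, so its cost can be estimated with simple arithmetic. The Python sketch below is illustrative only and is not from the thread. `simulate` is a hypothetical stand-in for launching one simulation run, and the grid values and worker count are assumptions chosen for the example.

```python
import itertools
import math
from concurrent.futures import ThreadPoolExecutor


def estimate_wall_clock_hours(n_runs, minutes_per_run, n_cores):
    """Idealised wall-clock hours for independent runs spread evenly over cores.

    Ignores scheduling overhead, data transfer, and failed runs.
    """
    rounds = math.ceil(n_runs / n_cores)  # runs each core must process in turn
    return rounds * minutes_per_run / 60.0


def parameter_grid(levels_per_parameter):
    """Return an iterator over every combination of per-parameter values."""
    return itertools.product(*levels_per_parameter)


def simulate(params):
    """Hypothetical stand-in for one simulation run.

    A real sweep would start the simulation with this parameter vector and
    return its raw output; here we just return the sum of the parameters.
    """
    return sum(params)


def run_sweep(param_vectors, worker=simulate, max_workers=4):
    """Run every parameter vector independently; return (params, result) pairs."""
    vectors = list(param_vectors)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with vectors.
        results = list(pool.map(worker, vectors))
    return list(zip(vectors, results))


if __name__ == "__main__":
    # Upper end of the question: 100,000 runs at 10 minutes each.
    print(estimate_wall_clock_hours(100_000, 10, 1))    # single core
    print(estimate_wall_clock_hours(100_000, 10, 200))  # 200 cores (assumed)

    # Tiny two-parameter grid, purely for illustration.
    for params, result in run_sweep(parameter_grid([[0.1, 0.2], [1, 2, 3]])):
        print(params, result)
```

Under this idealised model, 100,000 runs at 10 minutes each come to about 16,700 core-hours, or roughly 83 hours of wall-clock time on 200 cores. Because no run depends on another, any scheduler that can hand out independent jobs fits this shape; the tightly coupled inter-agent communication mentioned in the reply is a different problem.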

