hadoop-common-user mailing list archives

From Maciej Trebacz <maciej.treb...@gmail.com>
Subject Re: Using Hadoop in non-typical large scale user-driven environment
Date Wed, 02 Dec 2009 23:03:18 GMT
Wow, so many answers in such a short time. Thank you all for your
insights on this!

So, to reiterate all the ideas and how I feel about them:
- William: HOD seems like a good idea. Of course we (I work with my
colleague from school) should respect the user's CPU time and not eat it
while the user is watching a movie or playing a game.
- Ed: Patry looks very interesting, because it already assumes that
the nodes will fail and is self-organising. I'll definitely take a
deeper look at it.
- Brian: of your recommendations, no. 1 (Condor) seems like a nice
fit for our needs. As for security, that is a matter to discuss; we have
some ideas for how to secure the network and check data integrity (see
below).
- Allen: With regard to data integrity, we thought of a system that
pushes the same data to at least two nodes and then compares the results.
If they are identical, we assume that the result is correct and store
it. If not, we send it again to someone else. Sure, this approach
involves a lot of replication (2x or even more), but it should
guarantee that the results are intact.
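
To make the integrity idea a bit more concrete, here is a rough Python
sketch of how it might look. It is only an illustration, not our actual
code: the names verified_result and run_on_node are made up, and the rule
"accept once any two nodes agree" is my assumption about what should
happen after a mismatch. It also asks the nodes one after another instead
of in parallel, just to keep the example short.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, Optional


def verified_result(
    work_unit: Hashable,
    nodes: Iterable[str],
    run_on_node: Callable[[str, Hashable], Hashable],
    required_matches: int = 2,
) -> Optional[Hashable]:
    """Return a result once `required_matches` nodes agree on it.

    `run_on_node` is a hypothetical callback that ships the work unit to
    one node and returns that node's answer. Returns None if the nodes
    run out before enough of them agree.
    """
    votes: Counter = Counter()
    for node in nodes:
        result = run_on_node(node, work_unit)
        votes[result] += 1
        # Assumption: any two identical answers are treated as correct.
        if votes[result] >= required_matches:
            return result
    # No agreement reached; the caller decides whether to retry later.
    return None
```

In the normal case only two nodes are asked, so the cost is the 2x
mentioned above; a third (or later) node is only involved when the first
answers differ.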

-- 
Best regards,
Maciej "mav" Trębacz
