hadoop-common-user mailing list archives

From 蔡超 <toppi...@gmail.com>
Subject Is there a similar case to learn
Date Tue, 02 Nov 2010 02:29:09 GMT

I'm a newbie to Hadoop. I want to use Hadoop to analyze 10 million records
(10 GB in total). The data may reside in a relational DB or in a series
of XML files (thousands of records per file). I have some questions.

1. How can I guarantee that each mapper has exclusive access to the DB?
2. How should I split the XML files? By overriding MultiFileInputFormat?
3. How can I transfer a bunch of resources (10M) to the slaves?
4. A reduce step is not necessary. Is this job still suitable for Hadoop?

I can't find a similar case in the built-in examples of the Hadoop release.
Sorry to interrupt.

Chao Cai
