hadoop-general mailing list archives

From 鞠大升 <dashen...@gmail.com>
Subject Re: Three questions about Hadoop
Date Wed, 06 Jan 2010 02:12:59 GMT
1. At the client side, one user's files are small; but at the server side,
one user's file is not stored as a separate file. Instead, content of the
same type is usually stored together, like a database. For example, the web
pages crawled from the Internet are small pages, but they are stored
together as a large web-page data warehouse.
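
The idea above can be sketched in plain Python (this is not the Hadoop API; it is only a conceptual illustration of packing many small key/value records into one large container file, similar in spirit to a Hadoop SequenceFile):

```python
import struct

def pack(records):
    """Concatenate (key, value) byte pairs into one large blob,
    each part prefixed with its 4-byte big-endian length."""
    blob = bytearray()
    for key, value in records:
        for part in (key, value):
            blob += struct.pack(">I", len(part)) + part
    return bytes(blob)

def unpack(blob):
    """Stream the (key, value) pairs back out of the blob."""
    offset = 0
    while offset < len(blob):
        parts = []
        for _ in range(2):
            (length,) = struct.unpack_from(">I", blob, offset)
            offset += 4
            parts.append(blob[offset:offset + length])
            offset += length
        yield tuple(parts)

# Many small "files" (crawled pages) become one large file on the server.
pages = [(b"http://a.example/", b"<html>A</html>"),
         (b"http://b.example/", b"<html>B</html>")]
blob = pack(pages)
assert list(unpack(blob)) == pages
```

In real deployments the same grouping is done with container formats such as SequenceFile or Hadoop Archives, so the NameNode tracks a few large files instead of millions of small ones.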

 2. "Write-once, read-many times" is a typical characteristic of a data
warehouse. The web pages crawled from the Internet are written to the Hadoop
data warehouse once; after that, the data is read many times by different
applications for many kinds of analysis. Not all of your data has to be
"write-once, read-many times".
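
A minimal sketch of that access pattern (assumed filenames, plain Python): the crawler writes the corpus once, then several independent analyses each read it without ever modifying it.

```python
import os
import tempfile

pages = ["<html>A</html>", "<html>B</html>", "<html>ABC</html>"]

# Written once, by the crawler.
path = os.path.join(tempfile.mkdtemp(), "warehouse.txt")
with open(path, "w") as f:
    f.write("\n".join(pages))

# Read many times, by different applications.
def count_pages(p):
    """Analysis 1: how many pages are in the warehouse?"""
    with open(p) as f:
        return len(f.read().split("\n"))

def total_bytes(p):
    """Analysis 2: total size of the corpus."""
    with open(p) as f:
        return len(f.read())

assert count_pages(path) == 3
assert total_bytes(path) > 0
```

Because no reader mutates the data, there is no need for the locking or update machinery a read-write filesystem would require, which is one reason HDFS can stay simple.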

 3. I do not know.

+86 13810875910

------------------ Original ------------------
  From:  "qin.wang"<qin.wang@i-soft.com.cn>;
 Date:  Tue, Jan 5, 2010 05:42 PM
 To:  "general"<general@hadoop.apache.org>;

 Subject:  Three questions about Hadoop

Hi team,

While doing some research on Hadoop, I have several high-level questions;
any comments from you would be a great help:

1. Hadoop assumes files are big, but take Google as an example: the results
returned to users seem to be small files. How should I understand the "big
files", and what is their content, for example?

2. Why are the files write-once and read-many times?

3. How do I install other software on Hadoop? Are there any special
requirements for the software? Does it need to support the Map/Reduce model
before it can be installed?

Your help would be very much appreciated.

王 琴  Annie.Wang

Zip code: 200 233
Tel:      +86 21 5497 8666-8004
Fax:     +86 21 5497 7986
Mobile:  +86 137 6108 8369
