Date: Wed, 6 Jan 2010 10:12:59 +0800
Message-ID: <1e45bf881001051812h3907c6c0w52aeae27197c1c92@mail.gmail.com>
Subject: Re: Three questions about Hadoop
From: 鞠大升
To: general@hadoop.apache.org

1. On the client side, each user's files are small. On the server side, however, one user's file is not stored as a separate file; content of the same type is usually stored together, as in a database. For example, web pages crawled from the Internet are small, but they are combined into a large web page data warehouse.

2. "Write-once and read-many times" is a typical characteristic of a data warehouse. Web pages crawled from the Internet are written to the Hadoop data warehouse once, and the data is then read many times by different applications for analysis. Not all of your data is "write-once and read-many times".

3. I do not know.

-----------------------------------
dashengju
+86 13810875910
dashengju@gmail.com

------------------ Original ------------------
From: "qin.wang"
Date: Tue, Jan 5, 2010 05:42 PM
To: "general"
Subject: Three questions about Hadoop

Hi team,

While researching Hadoop, I have several high-level questions. Any comments would be a great help:

1. Hadoop assumes that files are big, but taking Google as an example, the results returned to a user seem to be small files. So how should I understand "big files"? And what is the file content, for example?

2. Why are the files write-once and read-many times?

3. How do I install other software on Hadoop? Are there any special requirements for the software? Does it need to support the Map/Reduce module before it can be installed?

I would greatly appreciate your help.

Wang Qin (Annie.Wang)
Building 7, 6th Floor, No. 418 Guilin Road, Xuhui District, Shanghai
Zip code: 200 233
Tel: +86 21 5497 8666-8004
Fax: +86 21 5497 7986
Mobile: +86 137 6108 8369
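
Points 1 and 2 of the reply (many small pages combined into one large store that is written once and read many times) can be sketched roughly as below. This is only an illustration under stated assumptions: it uses plain Python with an in-memory buffer standing in for a large file, it does not use any Hadoop API, and the names `pack_records`, `read_records`, and the example URLs are hypothetical.

```python
import io
import struct

HEADER = ">II"  # two big-endian unsigned ints: key length, body length


def pack_records(pages, out):
    """Append each (key, body) pair to `out` as one length-prefixed record."""
    for key, body in pages:
        k = key.encode("utf-8")
        b = body.encode("utf-8")
        out.write(struct.pack(HEADER, len(k), len(b)))
        out.write(k)
        out.write(b)


def read_records(inp):
    """Yield (key, body) pairs from a packed stream, in the order written."""
    header_size = struct.calcsize(HEADER)
    while True:
        header = inp.read(header_size)
        if len(header) < header_size:
            return  # end of stream
        klen, blen = struct.unpack(HEADER, header)
        key = inp.read(klen).decode("utf-8")
        body = inp.read(blen).decode("utf-8")
        yield key, body


# Hypothetical small crawled pages (assumption: made-up data for illustration).
pages = [
    ("http://example.com/a", "<html>page a</html>"),
    ("http://example.com/b", "<html>page b</html>"),
    ("http://example.com/c", "<html>page c</html>"),
]

# Write once: all small pages go into a single large container.
warehouse = io.BytesIO()
pack_records(pages, warehouse)

# Read many times: two independent "applications" scan the same data.
warehouse.seek(0)
urls = [key for key, _ in read_records(warehouse)]

warehouse.seek(0)
total_chars = sum(len(body) for _, body in read_records(warehouse))
```

The container is never modified after `pack_records` returns; each later pass only seeks to the start and scans it sequentially, which is the access pattern the reply describes for a crawled-page warehouse.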