From: Andy
To: common-user, "shravan.mahankali"
Date: Tue, 7 Jul 2009 20:20:43 +0800 (CST)
Message-ID: <12503059.482371246969243425.JavaMail.coremail@bj126app18.126.com>
In-Reply-To: <20090706122627.84454816021@nike.apache.org>
References: <20090706122627.84454816021@nike.apache.org>
Mailing-List: contact common-user-help@hadoop.apache.org; run by ezmlm
Reply-To: common-user@hadoop.apache.org
Subject: Re: how to use hadoop in real life?
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_128729_23890333.1246969243424"

------=_Part_128729_23890333.1246969243424
Content-Type: text/plain; charset=gbk
Content-Transfer-Encoding: quoted-printable

Hi,

Before you resubmit your program, please make sure your output path is "valid". What counts as valid depends on your OutputFormat. In the case of FileOutputFormat, it means the output directory must not already exist, so try deleting it first.

There are various ways to launch a Hadoop program. Try looking at some "classic" Hadoop programs, such as Nutch, a search engine built entirely on Hadoop, and see how many ways a Hadoop program can be deployed and run.

To my knowledge, you can submit your Hadoop program from anywhere, as long as you can reach the "master" machine over the network.

What kind of report?

It is not a good idea to replace your database with Hadoop storage.
Hadoop is designed for high I/O utilization when dealing with large-scale data, so it is probably not appropriate to treat Hadoop as a distributed database. BTW, the HBase project may be useful for you.

This is simple: with the HDFS APIs you can perform any file operation on HDFS remotely.

Java is a powerful language for most Hadoop programmers. Again, try to get familiar with some Hadoop projects written in Java, and you will find it convenient for what you want to do.

Best wishes,
Song

On 2009-07-06 20:25:49, "Shravan Mahankali" wrote:
>Hi Group,
>
>Finally I have written a sample Mapred program, submitted this job to Hadoop
>and got the expected results. Thanks to all of you!
>
>Now I don't have an idea of how to use Hadoop in real life (am sorry if am
>asking wrong question at wrong time.! (So, am right ;-))) :
>
>1) If I re-submit my job, Hadoop responds with an error message saying:
>org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory
>hdfs://localhost:9000/user/root/impressions_output already exists
>
>2) How to automatically execute Hadoop jobs? let's say I have set a cron job
>which runs various Hadoop jobs at specified times. Is this the way we do in
>Hadoop world?
>
>3) Can I submit jobs to Hadoop from a different machine/ network/ domain?
>
>4) I would like to generate reports from the data collected in the Hadoop.
>How can I do that?
>
>5) Am thinking of replacing data in my database with Hadoop and query Hadoop
>for various information. Is this correct?
>
>6) How can I access analyzed data in Hadoop from external world, external
>program?
>
>NOTE: I would like to use Java for any of above implementations.
>
>Thanks in advance,
>
>Shravan Kumar. M
>
>Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
>
>-----------------------------
>
>This email and any files transmitted with it are confidential and intended
>solely for the use of the individual or entity to whom they are addressed.
>If you have received this email in error please notify the system
>administrator -
>netopshelpdesk@catalytic.com
>
------=_Part_128729_23890333.1246969243424--
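
As a rough illustration of the first point in the reply (the output directory must not exist when a FileOutputFormat job starts), here is a minimal sketch of clearing a stale output directory before a rerun. It is a stand-in built on assumptions, not code from this thread: it uses only java.nio.file against the local filesystem so that it runs without a cluster, and the names OutputDirCleaner and clearOutputDir are made up for the example. On HDFS the same check-then-delete would go through org.apache.hadoop.fs.FileSystem, using exists(path) and delete(path, true).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: a local-filesystem stand-in for the
// "output directory must not already exist" rule.
class OutputDirCleaner {

    // Deletes dir and everything under it.
    // Returns false if dir did not exist, true if it was removed.
    static boolean clearOutputDir(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder())
                        .collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Simulate output left behind by a previous run.
        Path out = Files.createTempDirectory("impressions_output");
        Files.createFile(out.resolve("part-00000"));

        System.out.println("cleared: " + clearOutputDir(out));
        System.out.println("still exists: " + Files.exists(out));
    }
}
```

Deleting is only appropriate when the previous results are no longer needed; writing each run to a fresh output path is another option.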