From: Pedro Guedes
Date: Wed, 18 Apr 2007 19:23:13 +0100
To: hadoop-user@lucene.apache.org
Message-ID: <46266211.9080504@criticalsoftware.com>
Subject: Re: Serializing code to nodes: no can do?
I keep talking to myself... I hope it doesn't annoy you too much!

We have thought of a solution to our problem in which we build a new .job file according to our crawl configuration and then pass it to Hadoop for execution... Is there somewhere I can look for a specification of the .job format?

Thanks again,

Pedro

I wrote:
> Hi hadoopers,
>
> I'm working on an enterprise search engine that runs on a Hadoop
> cluster but is controlled from the outside. I have managed to implement
> a simple crawler much like Nutch's...
> Now I have a new system requirement: the crawl process must be
> configurable from outside Hadoop. This means I should be able to add
> steps to the crawling process that the cluster would execute without
> knowing beforehand what they are... Since serializing code is not
> possible, is there another way to achieve the same effect?
>
> Using Writable means the implementations need to be present on each
> node so they can read the object data from HDFS... but then I just get
> the same object back, not a new implementation, right?
>
> Any thoughts will be appreciated,
>
> Pedro
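As far as I can tell, a Nutch-style .job file is not a special format at all: it is an ordinary JAR archive that bundles the compiled classes, a lib/ directory of dependency jars, and the configuration files, which `bin/hadoop jar` then unpacks on the cluster. A minimal sketch of assembling such an archive with the standard `java.util.jar` API follows; the entry name `job.xml` and the class name `JobPackager` are illustrative assumptions, not a published specification.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Sketch: build a minimal .job archive (just a JAR with a manifest and a
// configuration entry). Entry names are assumptions for illustration.
public class JobPackager {

    public static File buildJob(File out, String configXml) throws IOException {
        Manifest mf = new Manifest();
        // Without this attribute the manifest is written out empty.
        mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        try (JarOutputStream jar = new JarOutputStream(new FileOutputStream(out), mf)) {
            // Carry the serialized crawl configuration alongside the classes.
            jar.putNextEntry(new JarEntry("job.xml"));
            jar.write(configXml.getBytes("UTF-8"));
            jar.closeEntry();
            // A real job file would also add the compiled .class files and a
            // lib/ directory containing dependency jars here.
        }
        return out;
    }

    public static boolean containsEntry(File jarFile, String name) throws IOException {
        try (JarFile jf = new JarFile(jarFile)) {
            return jf.getJarEntry(name) != null;
        }
    }
}
```

Building the archive this way (or with the `jar` command-line tool) and submitting it with `bin/hadoop jar` would let the crawl configuration vary per run without recompiling the cluster-side code.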
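To illustrate the Writable limitation discussed in the quoted mail: the pattern moves field data, never bytecode, so the class must already exist on every node's classpath. Below is a self-contained sketch of that pattern using only `java.io` (no Hadoop dependency); the class name `CrawlStep` and its fields are hypothetical examples, not part of any real API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch of the Writable serialization pattern: write()/readFields() copy
// field *data* into an already-loaded class. Deserialization can never
// deliver a new implementation, only repopulate a known one.
public class CrawlStep {
    private String url = "";
    private int depth;

    public CrawlStep() {}                       // Writables need a no-arg constructor
    public CrawlStep(String url, int depth) { this.url = url; this.depth = depth; }

    public void write(DataOutput out) throws IOException {
        out.writeUTF(url);
        out.writeInt(depth);
    }

    public void readFields(DataInput in) throws IOException {
        url = in.readUTF();                     // fills fields of an existing object
        depth = in.readInt();
    }

    public String getUrl() { return url; }
    public int getDepth() { return depth; }

    // Round-trip helper: serialize one instance, rehydrate another.
    public static CrawlStep roundTrip(CrawlStep step) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        step.write(new DataOutputStream(buf));
        CrawlStep copy = new CrawlStep();       // the receiver must already know this class
        copy.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        return copy;
    }
}
```

The round trip reproduces the field values exactly, which is the point: the data travels, but the behaviour is fixed at the moment the class was deployed to the node.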