Subject: RE: Hadoop + Lucene integration: possible? how?
Date: Mon, 15 Jan 2007 12:29:19 -0800
From: "Igor Bolotin"

Actually there is a patch available for creating Lucene indexes directly on DFS. See here:
http://issues.apache.org/jira/browse/LUCENE-532

But as Andrzej mentioned - the performance of searching is not stellar.

Regards,
Igor Bolotin

-----Original Message-----
From: Andrzej Bialecki [mailto:ab@getopt.org]
Sent: Monday, January 15, 2007 4:54 AM
To: hadoop-user@lucene.apache.org
Subject: Re: Hadoop + Lucene integration: possible? how?

maarten@sherpa-consulting.be wrote:
> I'm new to lucene and Hadoop but what I can't seem to find in the
> docs, internet... is how (and if possible?) to use Hadoop as the
> underlying FS for Lucene?
>
> Could anyone explain me how these can be tied together? Some small
> code/configuration example would be nice :-)

It's possible to use Hadoop DFS to host a read-only Lucene index and use it for searching (Nutch has an implementation of FSDirectory for this purpose), but the performance is not stellar ...

Currently it's not (yet) possible to use HDFS for creating Lucene indexes, a minor change to Lucene index format would be required.

-- 
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com