From: Mark Kerzner <mark.kerzner@shmsoft.com>
To: user@hadoop.apache.org
Date: Wed, 10 Oct 2012 21:26:47 -0500
Subject: Re: Hadoop/Lucene + Solr architecture suggestions?
In-Reply-To: <14678817.328.1349921743999.JavaMail.lancenorskog@Lance-Norskogs-MacBook-Pro.local>

That is very interesting, Lance, thank you.

Mark

On Wed, Oct 10, 2012 at 9:15 PM, Lance Norskog <goksron@gmail.com> wrote:
> In the LucidWorks Big Data product, we handle this with a reducer that
> sends documents to a SolrCloud cluster. This way the index files are not
> managed by Hadoop.
>
> ----- Original Message -----
> | From: "Ted Dunning" <tdunning@maprtech.com>
> | To: user@hadoop.apache.org
> | Cc: "Hadoop User" <user@hadoop.apache.org>
> | Sent: Wednesday, October 10, 2012 7:58:57 AM
> | Subject: Re: Hadoop/Lucene + Solr architecture suggestions?
> |
> | I prefer to create indexes in the reducer personally.
> |
> | Also you can avoid the copies if you use an advanced hadoop-derived
> | distro. Email me off list for details.
> |
> | Sent from my iPhone
> |
> | On Oct 9, 2012, at 7:47 PM, Mark Kerzner <mark.kerzner@shmsoft.com>
> | wrote:
> |
> | > Hi,
> | >
> | > if I create a Lucene index in each mapper, locally, then copy them
> | > under /jobid/mapid1, /jobid/mapid2, and then in the reducers
> | > copy them to some Solr machine (perhaps even merging), does such
> | > an architecture make sense, to create a searchable index with
> | > Hadoop?
> | >
> | > Are there links for similar architectures and questions?
> | >
> | > Thank you. Sincerely,
> | > Mark
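
For readers skimming the thread, the two data flows discussed above can be contrasted with a toy sketch. This is only an illustration under loud assumptions: it uses plain Python dictionaries in place of Lucene indexes, and the names `build_partial_index`, `merge_indexes`, `search`, `forward_in_batches`, and the `send_batch` callback are all hypothetical and do not come from Hadoop, Lucene, Solr, or any product mentioned here.

```python
from collections import defaultdict


def build_partial_index(docs):
    """Map side (index-per-mapper idea): build a small inverted index
    for one input split. `docs` is an iterable of (doc_id, text) pairs.
    A dict of sets stands in for a real Lucene index (assumption)."""
    index = defaultdict(set)
    for doc_id, text in docs:
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def merge_indexes(partials):
    """Reduce side: merge the partial indexes into one searchable index."""
    merged = defaultdict(set)
    for partial in partials:
        for term, doc_ids in partial.items():
            merged[term] |= doc_ids
    return merged


def search(index, term):
    """Return the sorted ids of documents containing `term`."""
    return sorted(index.get(term.lower(), ()))


def forward_in_batches(docs, send_batch, batch_size=2):
    """Alternative flow (reducer forwards documents to a search cluster):
    no index files are built here; documents are handed to `send_batch`,
    a hypothetical callback standing in for a search-cluster client.
    Returns the number of documents forwarded."""
    batch, sent = [], 0
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            send_batch(batch)
            sent += len(batch)
            batch = []
    if batch:  # flush the final, possibly short, batch
        send_batch(batch)
        sent += len(batch)
    return sent
```

In the first flow the job owns the index files and must copy and merge them; in the second the job only moves documents and leaves index management to the search cluster, which is the trade-off raised in the replies above.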