lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Lamprecht" <clampre...@gmail.com>
Subject Re: Distributed Lucene.. - clustering as a requirement
Date Thu, 06 Apr 2006 19:55:01 GMT
What about using lucene just for searching (i.e., no stored fields
except maybe one "ID" primary key field), and using an RDBMS for
storing the actual "documents"?  This way you're using lucene for what
lucene is best at, and using the database for what it's good at.  At
least up to a point -- RDBMSs have their limits too.  OR maybe if you
have a huge dataset, you might want to check out Nutch.

On 4/6/06, Dmitry Goldenberg <dmitry.goldenberg@weblayers.com> wrote:
> I firmly believe that clustering support should be a part of Lucene.  We've tried implementing
it ourselves and so far have been unsuccessful.  We tried storing Lucene indices in a database
that is the back-end repository for our app in a clustered environment and could not overcome
the indexing exceptions in our custom Directory implementation.
>
> I think it'd be perfect if some of the Lucene gurus were to implement an RDBMS-backed
Directory and post it (in addition to the sleepycat.db package that's currently in contrib).
 The nitty-gritties of dealing with Lucene indexing structures at a single byte level are
just way too much trouble to deal with for application integrator like myself.
>
> - Dmitry
>
> ________________________________
>
> From: Samuru Jackson [mailto:samurujackson@googlemail.com]
> Sent: Mon 3/6/2006 10:05 AM
> To: java-user@lucene.apache.org
> Subject: Re: Distributed Lucene..
>
>
>
> Do you plan to release some kind of a commerical product including an API?
>
> I ask because I'm evaluating different technologies for a prototype
> which is part of my diploma thesis.
>
> The problem is that I have to deal with real huge data amounts and one
> machine is simply not enough to handle those amounts of data.
>
> Lucene seems to be a good choice but it won't scale up for real big
> data amounts. So I thought about expanding the indexes over several
> machines in junks so that it fits into the memory of those machines.
>
> One machine should collect the results and calculate some kind of
> score out of the delivered hits from the machines.
>
> As I'm not familiar with the concrete mechanisms of Lucene this is
> just a naive thought, but I think that such a clustering mechanism
> could become a killer app.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message