lucene-java-user mailing list archives

From "Nader S. Henein" <...@bayt.net>
Subject RE: commercial websites powered by Lucene?
Date Tue, 24 Jun 2003 08:30:08 GMT
I handle updates and inserts the same way: first I delete the document
from the index and then I insert it (better safe than sorry). I batch my
updates/inserts every twenty minutes. I would use a shorter interval,
but since I have to sync the XML files created from the DB to three
machines (I maintain three separate Lucene indices on my three separate
web servers) it takes a little longer. You have to batch your changes
because updating the index takes time, as opposed to deletes, which I
batch every two minutes. You won't have a problem updating the index and
searching at the same time, because Lucene updates the index in a
separate set of files and then, when it's done, overwrites the old
version. I've had to provide for backups and for things like server
crashes mid-indexing, but I was using Oracle Intermedia before and
Lucene BLOWS IT AWAY.
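
A minimal sketch of the delete-then-insert batching described above.
It is an illustration only: it is written in Python, a plain dict
stands in for the Lucene index, and the class and method names
(BatchedIndex, queue_upsert, queue_delete, flush_deletes,
flush_upserts) are hypothetical, not part of Lucene or of the Bayt.com
code. The two flush methods would be called by whatever scheduler runs
the two-minute and twenty-minute jobs.

```python
from dataclasses import dataclass, field


@dataclass
class BatchedIndex:
    """Toy stand-in for a search index: maps document id -> text."""
    docs: dict = field(default_factory=dict)
    pending_upserts: dict = field(default_factory=dict)
    pending_deletes: set = field(default_factory=set)

    def queue_upsert(self, doc_id, text):
        # Inserts and updates share one path; the latest text wins.
        self.pending_deletes.discard(doc_id)
        self.pending_upserts[doc_id] = text

    def queue_delete(self, doc_id):
        # A delete cancels any upsert still waiting for the same document.
        self.pending_upserts.pop(doc_id, None)
        self.pending_deletes.add(doc_id)

    def flush_deletes(self):
        """Cheap pass, meant to run often (every two minutes in the message)."""
        for doc_id in self.pending_deletes:
            self.docs.pop(doc_id, None)
        count = len(self.pending_deletes)
        self.pending_deletes.clear()
        return count

    def flush_upserts(self):
        """Costlier pass, meant to run less often (every twenty minutes)."""
        for doc_id, text in self.pending_upserts.items():
            self.docs.pop(doc_id, None)  # delete first, whether or not it exists
            self.docs[doc_id] = text     # then insert the fresh copy
        count = len(self.pending_upserts)
        self.pending_upserts.clear()
        return count
```

Queuing two versions of the same CV and then flushing leaves only the
latest version in the index, and a document deleted before its upsert
is flushed never appears at all.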

-----Original Message-----
From: news [mailto:news@main.gmane.org] On Behalf Of Chris Miller
Sent: Tuesday, June 24, 2003 12:06 PM
To: lucene-user@jakarta.apache.org
Subject: Re: commercial websites powered by Lucene?


Hi Nader,

I was wondering if you'd mind me asking you a couple of questions about
your implementation?

The main thing I'm interested in is how you handle updates to Lucene's
index. I'd imagine you have a fairly high turnover of CVs and jobs, so
index updates must place a reasonable load on the CPU/disk. Do you keep
CVs and jobs in the same index or two different ones? And what is the
process you use to update the index(es) - do you batch-process updates
or do you handle them in real-time as changes are made?

Any insight you can offer would be much appreciated as I'm about to
implement something similar and am a little unsure of the best approach
to take. We need to be able to handle indexing about 60,000
documents/day, while allowing (many) searches to continue operating
alongside.

Thanks!
Chris

"Nader S. Henein" <nsh@bayt.net> wrote in message
news:001401c32b38$32aa2440$d501a8c0@naderit...
> We use Lucene at http://www.bayt.com. We're basically an online
> recruitment site, and so far we have around 500,000 CVs and
> documents indexed, with results that stump Oracle Intermedia.
>
> Nader Henein
> Senior Web Dev
>
> Bayt.com
>
> -----Original Message-----
> From: John_Chun@platts.com [mailto:John_Chun@platts.com]
> Sent: Wednesday, June 04, 2003 6:09 PM
> To: lucene-user@jakarta.apache.org
> Subject: commercial websites powered by Lucene?
>
>
>
> Hello All,
>
> I've been trying to find examples of large commercial websites that 
> use Lucene to power their search.  Having such examples would make 
> Lucene an easy sell to management.
>
> Does anyone know of any good examples?  The bigger the better, and the
> more the better.
>
> TIA,
> -John
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org



