lucene-java-user mailing list archives

From "Nader S. Henein" <...@bayt.net>
Subject RE: commercial websites powered by Lucene?
Date Tue, 24 Jun 2003 10:20:38 GMT
Because I've set up Lucene as a webapp with a centralized Init file and
setup properties file, I do my sanity check in the Init: if the server
crashes mid-indexing, I have to delete the lock files, optimize, and
re-index the files that were being indexed when the crash occurred.
There was a long discussion about this back in August; search for
"Crash / Recovery Scenario" in the lucene-dev archived discussions. That
should answer all your questions.
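
For illustration only, the "delete the lock files" step of such a start-up
check could be sketched roughly like this. It uses plain java.nio rather
than any Lucene API; the class name StaleLockCleaner and the assumption
that lock files are the ones ending in ".lock" are hypothetical, so adjust
both to your own setup.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical start-up helper; not part of Lucene. */
final class StaleLockCleaner {
    private StaleLockCleaner() {}

    /**
     * Deletes every regular file in lockDir whose name ends in ".lock"
     * (an assumed naming convention) and returns the names removed.
     * Only safe to call when no indexing process can still be running.
     */
    static List<String> removeStaleLocks(Path lockDir) throws IOException {
        // Collect first, then delete, so the listing is not changed mid-scan.
        List<Path> candidates = new ArrayList<>();
        try (DirectoryStream<Path> locks = Files.newDirectoryStream(lockDir, "*.lock")) {
            for (Path lock : locks) {
                if (Files.isRegularFile(lock)) {
                    candidates.add(lock);
                }
            }
        }
        List<String> removed = new ArrayList<>();
        for (Path lock : candidates) {
            Files.delete(lock);
            removed.add(lock.getFileName().toString());
        }
        return removed;
    }
}
```

Calling StaleLockCleaner.removeStaleLocks(indexDir) from the Init would
then come before the optimize and the re-index of whatever batch was in
flight; those two steps depend on your own indexing code and are not
shown here.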

Nader Henein

-----Original Message-----
From: Gareth Griffiths [mailto:Gareth.Griffiths@bridgeheadsoftware.com] 
Sent: Tuesday, June 24, 2003 1:11 PM
To: Lucene Users List; nsh@bayt.net
Subject: Re: commercial websites powered by Lucene?


Nader,
You say you have to cope with server crashes mid-indexing. I think I'm
seeing lots of garbage files created by a server crash mid merge/optimise
while Lucene is creating a new index. Did you write code specifically to
handle this, or is there something more automated? (I was thinking of
writing a sanity check to run before start-up that looks in 'segments'
and 'deletable' and gets rid of any files in the catalog directory that
are not referenced.)

Did you do something similar or have I missed something...
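
To make the idea concrete, here is a rough sketch of the kind of check I
mean. It does not parse 'segments' or 'deletable' itself: the set of live
segment names is passed in, and how to obtain it is left open. The class
name OrphanFileSweeper, and the assumption that every index file is named
after its segment up to the first dot (e.g. "_1.fdt"), are mine and should
be verified against the Lucene version in use.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical pre-start-up check; not part of Lucene. */
final class OrphanFileSweeper {
    private OrphanFileSweeper() {}

    /**
     * Returns the sorted names of files in indexDir that belong to no live
     * segment. 'segments', 'deletable' and "*.lock" files are always skipped.
     * Assumes a file's segment name is the part before its first dot.
     */
    static List<String> findOrphans(Path indexDir, Set<String> liveSegments)
            throws IOException {
        List<String> orphans = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                if (!Files.isRegularFile(file)
                        || name.equals("segments")
                        || name.equals("deletable")
                        || name.endsWith(".lock")) {
                    continue;
                }
                int dot = name.indexOf('.');
                String segment = dot < 0 ? name : name.substring(0, dot);
                if (!liveSegments.contains(segment)) {
                    orphans.add(name);
                }
            }
        }
        Collections.sort(orphans);
        return orphans;
    }
}
```

The sketch only reports candidates instead of deleting them, so the list
can be logged and reviewed before anything is removed from the catalog
directory.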

TIA

Gareth


----- Original Message -----
From: "Nader S. Henein" <nsh@bayt.net>
To: "'Lucene Users List'" <lucene-user@jakarta.apache.org>
Sent: Tuesday, June 24, 2003 9:30 AM
Subject: RE: commercial websites powered by Lucene?


> I handle updates and inserts the same way: first I delete the document
> from the index and then I insert it (better safe than sorry). I batch
> my updates/inserts every twenty minutes. I would do it at shorter
> intervals, but since I have to sync the XML files created from the DB
> to three machines (I maintain three separate Lucene indices on my
> three separate web-servers) it takes a little longer. You have to batch
> your changes because updating the index takes time, as opposed to
> deletes, which I batch every two minutes. You won't have a problem
> updating the index and searching at the same time, because Lucene
> updates the index on a separate set of files and then, when it's done,
> it overwrites the old version. I've had to provide for backups and
> things like server crashes mid-indexing, but I was using Oracle
> Intermedia before and Lucene BLOWS IT AWAY.
>
> -----Original Message-----
> From: news [mailto:news@main.gmane.org] On Behalf Of Chris Miller
> Sent: Tuesday, June 24, 2003 12:06 PM
> To: lucene-user@jakarta.apache.org
> Subject: Re: commercial websites powered by Lucene?
>
>
> Hi Nader,
>
> I was wondering if you'd mind me asking you a couple of questions 
> about your implementation?
>
> The main thing I'm interested in is how you handle updates to Lucene's
> index. I'd imagine you have a fairly high turnover of CVs and jobs, so
> index updates must place a reasonable load on the CPU/disk. Do you
> keep CVs and jobs in the same index or two different ones? And what is
> the process you use to update the index(es) - do you batch-process
> updates or do you handle them in real-time as changes are made?
>
> Any insight you can offer would be much appreciated as I'm about to 
> implement something similar and am a little unsure of the best 
> approach to take. We need to be able to handle indexing about 60,000 
> documents/day, while allowing (many) searches to continue operating 
> alongside.
>
> Thanks!
> Chris
>
> "Nader S. Henein" <nsh@bayt.net> wrote in message 
> news:001401c32b38$32aa2440$d501a8c0@naderit...
> > We use Lucene http://www.bayt.com , we're basically an on-line 
> > Recruitment site and up until now we've got around 500 000 CVs and 
> > documents indexed with results that stump Oracle Intermedia.
> >
> > Nader Henein
> > Senior Web Dev
> >
> > Bayt.com
> >
> > -----Original Message-----
> > From: John_Chun@platts.com [mailto:John_Chun@platts.com]
> > Sent: Wednesday, June 04, 2003 6:09 PM
> > To: lucene-user@jakarta.apache.org
> > Subject: commercial websites powered by Lucene?
> >
> >
> >
> > Hello All,
> >
> > I've been trying to find examples of large commercial websites that 
> > use Lucene to power their search.  Having such examples would make 
> > Lucene an easy sell to management
> >
> > Does anyone know of any good examples?  The bigger the better, and
> > the more the better.
> >
> > TIA,
> > -John
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-user-help@jakarta.apache.org