lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "sam s" <tamputa...@hotmail.com>
Subject RE: Context-based suggestions with spell check
Date Fri, 21 Nov 2003 19:58:54 GMT
You are right, just for search it won’t take much time. I have relatively 
small index.

Almost all documents change daily. So one way I am thinking of updating 
index is creating new index set in a temp directory by other process. When 
time comes to update index then renaming in-use dir to archive dir and 
renaming temp dir to in-use dir. I am not sure whether this is a good idea, 
what do you say?

I have overall 70000 docs with around 20 fields where 3 fields contain more 
text. I have indexed and tokenized 11 fields. Average size of data I put in 
a document is 2 kb.

Thanks

>From: Dan Quaroni <dquaroni@OPENRATINGS.com>
>Reply-To: "Lucene Users List" <lucene-user@jakarta.apache.org>
>To: 'Lucene Users List' <lucene-user@jakarta.apache.org>
>Subject: RE: Context-based suggestions with spell check
>Date: Fri, 21 Nov 2003 09:45:13 -0500
>
>There are some important questions to ask before you discount performing 
>all
>of those searches.  What are your performance requirements?  How big is 
>your
>index?  How are you deploying it?
>
>-----Original Message-----
>From: sam s [mailto:tamputampu@hotmail.com]
>Sent: Thursday, November 20, 2003 8:15 PM
>To: lucene-user@jakarta.apache.org
>Subject: RE: Context-based suggestions with spell check
>
>
>Levenshtein is again word based like spell check. I found Jazzy quite handy
>to some level for spell check. I am not worrying much about the spell check
>part. What I want to do is show user right spell check suggestion (when
>spell check returns multiple suggestions) based on other words he/she
>entered for the search.
>Once again going to same example
>User enters: inted motherboard
>spell check returns 3 suggestions for inted
>inter
>intel
>intek
>
>In this situation since I have both words intel and motherboard in one
>document of my search collection I should able to show user something like
>Did you mean: intel motherboard?
>
>One simplest way to achieve this is do search 3 times for all three
>suggestions with word motherboard and show user suggestion for which search
>got more hits. Problem with this is number of iterations involved. If there
>are suggestions on two words user entered, there will be all kinds of
>combinations and those many iterations. So I dont want to go this way.
>
>I haven't studied in detail how lucene does indexing and search on it. I
>don't know whether that will help.
>
>Has anybody come across this problem? Or I must be missing something..
>
>Again, I apologize if you guys think this is not right post for lucene user
>list.
>
>Thanks,
>Abhay
>
>
> >From: "sam s" <tamputampu@hotmail.com>
> >Reply-To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> >To: lucene-user@jakarta.apache.org
> >Subject: RE: Context-based suggestions with spell check
> >Date: Fri, 21 Nov 2003 00:35:13 +0000
> >
> >I actually thought of using search for right combination of suggestions 
>but
>
> >I feared of performance degrade. I'll look at levenshtein.
> >
> >Thanks
> >
> >>From: Dan Quaroni <dquaroni@OPENRATINGS.com>
> >>Reply-To: Lucene Users List <lucene-user@jakarta.apache.org>
> >>To: 'sam s ' <tamputampu@hotmail.com>
> >>Subject: RE: Context-based suggestions with spell check
> >>Date: Thu, 20 Nov 2003 19:22:51 -0500
> >>
> >>  I would also suggest 'intend' as a possible correction.
> >>
> >>There are a decent number of algorithms out there for distance between 
>to
> >>words.  Check out levenshtein for that.
> >>
> >>In terms of context based corrections, you could do a search for the 
>word
> >>combined with the word in front of it and the word behind it.
> >>
> >>"I just bought an inted motherboard"
> >>
> >>Then you do a search for "an inter", "an intel", etc and "inter
> >>motherboard", "intel motherboard", etc and count the number of hits you
> >>get
> >>for each one and rank your suggestions accordingly.
> >>
> >>
> >>-----Original Message-----
> >>From: sam s
> >>To: lucene-user@jakarta.apache.org
> >>Sent: 11/20/03 7:07 PM
> >>Subject: Context-based suggestions with spell check
> >>
> >>Hi,
> >>
> >>I am thinking to give spell check functionality to the search. I am
> >>trying
> >>to achieve two things to complement search.
> >>
> >>1. Spell check where dictionary will be composed of all text I am
> >>creating
> >>search index. This looks simple with some spell check implementation.
> >>
> >>2. The problem I am facing is how do I suggest right suggestion to a
> >>wrong
> >>word accompanied with other word. For example when user enters search
> >>term
> >>'inted' spell check returns suggestions inter, intel and intek. Now
> >>problem
> >>is when user searches 'inted motherboard' how do I decide that user is
> >>searching for 'intel motherboard'? Where there are some items contain
> >>text
> >>'intel motherboard'. How do I make context-based suggestions? Does
> >>anybody
> >>any simple algorithm for this.
> >>I know this is not related to lucene but thought may get some help from
> >>community. Suggestions are appreciated.
> >>
> >>Thanks in advance,
> >>Sam
> >>
> >>_________________________________________________________________
> >>Tired of spam? Get advanced junk mail protection with MSN 8.
> >>http://join.msn.com/?page=features/junkmail
> >>
> >>
> >>---------------------------------------------------------------------
> >>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> >>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >>
> >
> >_________________________________________________________________
> >Add photos to your e-mail with MSN 8. Get 2 months FREE*.
> >http://join.msn.com/?page=features/featuredemail
> >
> >
> >---------------------------------------------------------------------
> >To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> >For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >
>
>_________________________________________________________________
>STOP MORE SPAM with the new MSN 8 and get 2 months FREE*
>http://join.msn.com/?page=features/junkmail
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>

_________________________________________________________________
Add photos to your e-mail with MSN 8. Get 2 months FREE*. 
http://join.msn.com/?page=features/featuredemail


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message