lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "patrick o'leary" <pj...@pjaol.com>
Subject Re: A Challenge!: Combining 2 searches into a single resultset?
Date Fri, 17 Apr 2009 15:32:19 GMT
Why not put the keywords into the same document as another field? and search
both fields
at once, you can then use lucene syntax to give a boosting to the keyword
fields.
e.g.
body:A good game last night by the redskins
keyword: redskins

Query= body:(game OR redskins) keyword:(game OR redskins)^10

And adjust the boosting until you're happy.
Check out for querying multiple fields
http://wiki.apache.org/lucene-java/LuceneFAQ#head-300f0756fdaa71f522c96a868351f716573f2d77

You might even want to consider Solr and it's dismax search component
http://wiki.apache.org/solr/DisMaxRequestHandler
to make it easier




On Fri, Apr 17, 2009 at 11:19 AM, theDude_2 <aornstein@webmd.net> wrote:

>
> I appreciate your response, and read the wiki article concerning the
> Federated search
> and
>
> I'm not sure that my project falls into the "Federated Search" bucket...
>
> What I've done is created 2 indexes created with the same documents.
> One index, contains the full documents - great for pure relevancy search
> The second index: contains all of the same documents, but a small subset of
> each documents contents - only allowing words to be indexed that we deem as
> "good words" -
>
> (for example) if this was a football article database
> Index 1: would index 100% of the article about the Redskins and the New
> York
> Giants
> Index 2: would index the same article by only the "good words" in the
> document like Redskins, Giants, Quarterback, Linebacker, etc.
>
> What I'm trying to do, if it's even possible! is run the search on both
> indexes containing references to the same article, and multiple the scores
> together to get a final score that would represent something like a
> "relative AND good word" score....
>
> Figuring that if a user searches on "Who is the Quarterback for the Giants"
> this will get the user an article that is both related to the query, and
> deemed "important" to the query...
>
> I will look further into federated search and related items, but I think
> that lucene probably wont be able to help me with this, am I right?
>
>
>
>
>
>
>
>
> ------------
>
> pjaol wrote:
> >
> > I'd start by doing some research on the question rather than asking for a
> > solution..
> > What your asking for can be considered 'Federated Search'
> > http://en.wikipedia.org/wiki/Federated_search
> >
> > And it can be conceived in as many ways as you have document types. Any
> > answer will probably end up
> > customized and weighted by your document silo value, usually companies
> > weight those by business rules
> > rather than head down the path of federated search, as it's just quicker
> > and
> > cheaper, and you can accomplish more.
> > e.g
> > Medication = score *2  (as higher advertising incentives)
> > Diseases = score
> > Books = score * 0.75  ( thousands of books, which nobody buys etc..)
> >
> > You might also want to try consolidating your data into 1 schema, and
> > consider layering or collapsing results
> > based on type.
> >
> > P
> >
> > On Fri, Apr 17, 2009 at 10:39 AM, theDude_2 <aornstein@webmd.net> wrote:
> >
> >>
> >> (bump) - any thoughts?
> >> ----
> >>
> >>
> >>
> >> theDude_2 wrote:
> >> >
> >> > hi!
> >> >
> >> > I am trying to do something a little unique...
> >> >
> >> > I have a 90k text documents that I am trying to search
> >> > Search A: indexes and searches the documents using regular relevancy
> >> > search
> >> > Search B: indexes and searches the documents using a smaller subset of
> >> > "key" words that I have chosen
> >> >
> >> > This gives me 2 seperate scores: Score A, and Score B...
> >> >
> >> > I am trying to show the top 10 results of the scores combined so....
> >> >
> >> > FinalScoretextDoc = (scoreA_of_td1 * 0.5) * (scoreB_of_td1 * 0.5)
> >> >
> >> > While it seems straightforward, I do not want to calculate the scores
> >> of
> >> > all the documents outside of lucene.  How can I integrate this better
> >> into
> >> > the lucene search engine?  Is this possible to do by any simple means?
> >> >
> >> > Thanks guys + gals!
> >> >
> >> >
> >>
> >> --
> >> View this message in context:
> >>
> http://www.nabble.com/A-Challenge%21%3A-Combining-2-searches-into-a-single-resultset--tp23085506p23098961.html
> >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/A-Challenge%21%3A-Combining-2-searches-into-a-single-resultset--tp23085506p23099744.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message