lucene-java-user mailing list archives

From theDude_2 <aornst...@webmd.net>
Subject Re: A Challenge!: Combining 2 searches into a single resultset?
Date Fri, 17 Apr 2009 15:19:12 GMT

I appreciate your response, and I read the wiki article on federated
search.
I'm not sure that my project falls into the "Federated Search" bucket...

What I've done is create 2 indexes from the same documents.
The first index contains the full documents - great for pure relevancy search.
The second index contains all of the same documents, but only a small subset
of each document's contents - only words that we deem "good words" are
indexed.

For example, if this were a football article database:
Index 1: would index 100% of the article about the Redskins and the New York
Giants
Index 2: would index the same article, but only the "good words" in the
document, like Redskins, Giants, Quarterback, Linebacker, etc.

What I'm trying to do, if it's even possible, is run the search on both
indexes, which contain references to the same articles, and multiply the two
scores for each article to get a final score that would represent something
like a "relevant AND good word" score.

The idea is that if a user searches on "Who is the Quarterback for the Giants",
this will return an article that is both relevant to the query and
deemed "important" to the query.
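
A minimal sketch of the score combination described above, in plain Java with no Lucene calls. It assumes each index has already been searched separately and that the hits from each search have been collected into a map from a shared article ID to that index's score; the class name CombinedScores, the method topCombined, and the sample IDs and scores are all hypothetical. Note that the 0.5 factors in the earlier formula only scale every product by the same constant (0.25), so they do not change the ranking and are left out here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: merges per-article scores from two separate searches.
class CombinedScores {

    /** An article ID paired with its combined (multiplied) score. */
    static final class Hit {
        final String articleId;
        final float score;

        Hit(String articleId, float score) {
            this.articleId = articleId;
            this.score = score;
        }
    }

    /**
     * Multiplies score A by score B for every article found by both searches
     * and returns the n highest products, best first.
     */
    static List<Hit> topCombined(Map<String, Float> scoreA,
                                 Map<String, Float> scoreB,
                                 int n) {
        List<Hit> hits = new ArrayList<>();
        for (Map.Entry<String, Float> entry : scoreA.entrySet()) {
            Float b = scoreB.get(entry.getKey());
            if (b == null) {
                // Assumption: an article missing from either result set is
                // treated as having a zero product, so it is dropped.
                continue;
            }
            hits.add(new Hit(entry.getKey(), entry.getValue() * b));
        }
        // Sort by combined score, highest first.
        hits.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(hits.subList(0, Math.min(n, hits.size())));
    }

    public static void main(String[] args) {
        // Made-up scores for illustration only.
        Map<String, Float> fullText = new HashMap<>();
        fullText.put("giants-qb-article", 2.0f);
        fullText.put("redskins-recap", 4.0f);
        fullText.put("weather-report", 1.0f);

        Map<String, Float> goodWords = new HashMap<>();
        goodWords.put("giants-qb-article", 3.0f);
        goodWords.put("redskins-recap", 0.5f);

        for (Hit hit : topCombined(fullText, goodWords, 10)) {
            System.out.println(hit.articleId + " " + hit.score);
        }
    }
}
```

Because the product is zero whenever an article is missing from either result set, only the intersection of the two hit lists needs to be considered. This still computes the final score outside of Lucene, which is exactly the part I would like to avoid.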

I will look further into federated search and related items, but I think
that Lucene probably won't be able to help me with this. Am I right?

------------

pjaol wrote:
> 
> I'd start by doing some research on the question rather than asking for a
> solution..
> What you're asking for can be considered 'Federated Search'
> http://en.wikipedia.org/wiki/Federated_search
> 
> And it can be conceived in as many ways as you have document types. Any
> answer will probably end up
> customized and weighted by your document silo value, usually companies
> weight those by business rules
> rather than head down the path of federated search, as it's just quicker
> and cheaper, and you can accomplish more.
> e.g.
> Medication = score *2  (as higher advertising incentives)
> Diseases = score
> Books = score * 0.75  ( thousands of books, which nobody buys etc..)
> 
> You might also want to try consolidating your data into 1 schema, and
> consider layering or collapsing results
> based on type.
> 
> P
> 
> On Fri, Apr 17, 2009 at 10:39 AM, theDude_2 <aornstein@webmd.net> wrote:
> 
>>
>> (bump) - any thoughts?
>> ----
>>
>>
>>
>> theDude_2 wrote:
>> >
>> > hi!
>> >
>> > I am trying to do something a little unique...
>> >
>> > I have 90k text documents that I am trying to search
>> > Search A: indexes and searches the documents using regular relevancy
>> > search
>> > Search B: indexes and searches the documents using a smaller subset of
>> > "key" words that I have chosen
>> >
>> > This gives me 2 separate scores: Score A, and Score B...
>> >
>> > I am trying to show the top 10 results of the scores combined so....
>> >
>> > FinalScoretextDoc = (scoreA_of_td1 * 0.5) * (scoreB_of_td1 * 0.5)
>> >
>> > While it seems straightforward, I do not want to calculate the scores
>> > of all the documents outside of lucene.  How can I integrate this
>> > better into the lucene search engine?  Is this possible to do by any
>> > simple means?
>> >
>> > Thanks guys + gals!
>> >
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/A-Challenge%21%3A-Combining-2-searches-into-a-single-resultset--tp23085506p23098961.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/A-Challenge%21%3A-Combining-2-searches-into-a-single-resultset--tp23085506p23099744.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

