lucene-general mailing list archives

From ywlee522 <ywlee...@gmail.com>
Subject Re: How to structure lucene query?
Date Sun, 07 Jun 2009 18:48:03 GMT


Thanks for the comments. Apologies for not providing details earlier.

Users in my system generate reports of some type every day. Each Lucene
document has 4 fields: user name, report create_dt, report type, and report
text.  For example, an analyst may write a report on the telco market today
and a report on mobile phones tomorrow.

The query is: of the users who have one or more reports containing "ABC",
find the users who also have one or more reports containing "XYZ".

A user may have "ABC" in one report and "XYZ" in another, i.e., not in the
same report. This still matches the query.

I first tried this as two searches: the first searches for "ABC" and collects
user names (iterating over all results), and the second searches for "XYZ"
among the users found by the first.  But this seems very inefficient, and I
am not sure it is the right use of Lucene.
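
The two-search approach above amounts to intersecting two sets of user names.
A minimal sketch of that logic follows, using a plain in-memory list as a
stand-in for the index rather than the Lucene API. The class, field, and
method names (TwoPassUserSearch, Report, usersWithKeyword, usersWithBoth) and
the sample data are hypothetical, and substring matching stands in for a real
term query.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for the index; this is not the Lucene API.
final class TwoPassUserSearch {

    // One entry per report, mirroring the four fields described in the message.
    static final class Report {
        final String user;
        final String createDt;
        final String type;
        final String text;

        Report(String user, String createDt, String type, String text) {
            this.user = user;
            this.createDt = createDt;
            this.type = type;
            this.text = text;
        }
    }

    // One search pass: distinct users with at least one report containing the keyword.
    static Set<String> usersWithKeyword(List<Report> reports, String keyword) {
        Set<String> users = new LinkedHashSet<>();
        for (Report r : reports) {
            if (r.text.contains(keyword)) {
                users.add(r.user);
            }
        }
        return users;
    }

    // Users who match both keywords, possibly in different reports.
    static Set<String> usersWithBoth(List<Report> reports, String first, String second) {
        Set<String> result = usersWithKeyword(reports, first);
        result.retainAll(usersWithKeyword(reports, second));
        return result;
    }

    public static void main(String[] args) {
        List<Report> reports = new ArrayList<>();
        reports.add(new Report("alice", "2009-06-01", "telco", "ABC market overview"));
        reports.add(new Report("alice", "2009-06-02", "mobile", "XYZ handset review"));
        reports.add(new Report("bob", "2009-06-01", "telco", "ABC only"));
        reports.add(new Report("carol", "2009-06-03", "mobile", "ABC and XYZ together"));

        // alice matches across two reports, carol within one, bob not at all.
        System.out.println(usersWithBoth(reports, "ABC", "XYZ")); // prints [alice, carol]
    }
}
```

The cost concern in the message corresponds to usersWithKeyword visiting every
hit in order to read its user name.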

If I put all reports of a user into a single Lucene document, then the query
is equivalent to finding all documents containing both "ABC" and "XYZ".  But
then I will lose the report create_dt field, which is another parameter in
the query.
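
To illustrate why the per-report date matters, here is a small sketch that
keeps one entry per report and applies a date window before collecting users.
It is again an in-memory stand-in, not the Lucene API. All names are
hypothetical, and applying the same window to both keywords is an assumption,
since the message does not say how the date parameter combines with the
keywords.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the index; this is not the Lucene API.
final class DateAwareUserSearch {

    static final class Report {
        final String user;
        final LocalDate createDt;
        final String text;

        Report(String user, String createDt, String text) {
            this.user = user;
            this.createDt = LocalDate.parse(createDt);
            this.text = text;
        }
    }

    // Distinct users with a report containing the keyword inside [from, to].
    static Set<String> usersWithKeyword(List<Report> reports, String keyword,
                                        LocalDate from, LocalDate to) {
        Set<String> users = new TreeSet<>();
        for (Report r : reports) {
            boolean inWindow = !r.createDt.isBefore(from) && !r.createDt.isAfter(to);
            if (inWindow && r.text.contains(keyword)) {
                users.add(r.user);
            }
        }
        return users;
    }

    // Assumption: the same date window applies to both keywords.
    static Set<String> usersWithBothInWindow(List<Report> reports, String first, String second,
                                             LocalDate from, LocalDate to) {
        Set<String> result = usersWithKeyword(reports, first, from, to);
        result.retainAll(usersWithKeyword(reports, second, from, to));
        return result;
    }

    public static void main(String[] args) {
        List<Report> reports = new ArrayList<>();
        reports.add(new Report("alice", "2009-06-01", "ABC market overview"));
        reports.add(new Report("alice", "2009-06-02", "XYZ handset review"));
        reports.add(new Report("bob", "2009-06-01", "ABC market overview"));
        reports.add(new Report("bob", "2009-05-01", "XYZ handset review")); // outside the window

        LocalDate from = LocalDate.parse("2009-06-01");
        LocalDate to = LocalDate.parse("2009-06-07");
        System.out.println(usersWithBothInWindow(reports, "ABC", "XYZ", from, to)); // prints [alice]
    }
}
```

If bob's two reports were merged into one document, the date of the "XYZ"
report would no longer be available and bob would match as well.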








Simon Willnauer wrote:
> 
> could you please give us more details of your query or an example that
> might help to understand what you are trying to do. I had the same
> impression as Ted though.
> 
> simon
> 
> On Sun, Jun 7, 2009 at 4:28 PM, ywlee522<ywlee522@gmail.com> wrote:
>>
>> Thanks for the tip.  But, no, it is not the same as finding documents
>> with both "ABC" and "XYZ", as they can appear in separate documents of
>> the same user.
>>
>>
>>
>>
>> Ted Dunning wrote:
>>>
>>> It is the same as finding documents with both "ABC" and "XYZ" except
>>> that
>>> you need to run over the results yourself and collect the user names.
>>>
>>> Lucene doesn't have a fancy query language so you can't magically do any
>>> group-by or count(distinct) tricks.
>>>
>>> On Sat, Jun 6, 2009 at 8:59 AM, ywlee522 <ywlee522@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> A document has three fields: username, date, and document text. A user
>>>> can have multiple documents.
>>>>
>>>> The query is:
>>>>
>>>> Of the users who have one or more documents with keyword "ABC", find
>>>> users who also have one or more documents with keyword "XYZ".
>>>>
>>>> This isn't finding documents with both "ABC" and "XYZ".   How can this
>>>> be
>>>> done in lucene query? THANK YOU
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/How-to-structure-lucene-query--tp23902784p23902784.html
>>>> Sent from the Lucene - General mailing list archive at Nabble.com.
>>>>
>>>>
>>>
>>>
>>> --
>>> Ted Dunning, CTO
>>> DeepDyve
>>>
>>> 111 West Evelyn Ave. Ste. 202
>>> Sunnyvale, CA 94086
>>> http://www.deepdyve.com
>>> 858-414-0013 (m)
>>> 408-773-0220 (fax)
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/How-to-structure-lucene-query--tp23902784p23911598.html
>> Sent from the Lucene - General mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/How-to-structure-lucene-query--tp23902784p23914028.html
Sent from the Lucene - General mailing list archive at Nabble.com.

