lucene-solr-user mailing list archives

From Uri Boness <ubon...@gmail.com>
Subject Re: Field Collapsing (was Re: Schema for group/child entity setup)
Date Thu, 10 Sep 2009 14:31:06 GMT
All work and progress on this patch is done under the JIRA issue: 
https://issues.apache.org/jira/browse/SOLR-236


R. Tan wrote:
>> The patch which will be committed soon will add this functionality.
>>     
>
>
> Where can I follow the progress of this patch?
>
>
> On Mon, Sep 7, 2009 at 3:38 PM, Uri Boness <uboness@gmail.com> wrote:
>
>   
>>> Great. Nice site and very similar to my requirements.
>>>
>>>       
>> Thanks.
>>
>>> So, right now, you get all field values by default?
>>
>> Right now, no field values are returned for the collapsed documents. The
>> patch which will be committed soon will add this functionality.
>>
>>
>> R. Tan wrote:
>>
>>     
>>> Great. Nice site and very similar to my requirements.
>>>
>>>
>>>
>>>       
>>>> There's work on the patch that is being done now which will enable you to
>>>> ask for specific field values of the collapsed documents using a dedicated
>>>> request parameter.
>>>>
>>>>
>>>>         
>>> So, right now, you get all field values by default?
>>>
>>>
>>> On Sun, Sep 6, 2009 at 3:58 AM, Uri Boness <uboness@gmail.com> wrote:
>>>
>>>
>>>
>>>       
>>>> You can check out http://www.ilocal.nl. If you search for a bank in
>>>> Amsterdam then you'll see that a lot of the results are collapsed. For
>>>> this we used an older version of this patch (which works on 1.3) but a
>>>> lot has changed since then. We're currently using this patch on another
>>>> project, but it's not live yet.
>>>>
>>>>
>>>> Uri
>>>>
>>>> R. Tan wrote:
>>>>
>>>>
>>>>
>>>>         
>>>>> Thanks Uri. Your personal suggestion is appreciated and I think I'll
>>>>> follow your advice. We're still early in development and 1.4 would be a
>>>>> good choice. I hope I can get field collapsing to work with my
>>>>> requirements. Do you know of any live site using field collapsing already?
>>>>>
>>>>> On Sat, Sep 5, 2009 at 5:57 PM, Uri Boness <uboness@gmail.com> wrote:
>>>>>
>>>>>> There's work on the patch that is being done now which will enable you
>>>>>> to ask for specific field values of the collapsed documents using a
>>>>>> dedicated request parameter. This work is not committed yet to the
>>>>>> latest patch, but will be very soon. There is of course a drawback to
>>>>>> that as well: the set of collapsed documents can be very large
>>>>>> (depending on your data, of course), in which case the returned result,
>>>>>> which includes the field values, can be rather large, and that will
>>>>>> impact performance. This is why this feature will be enabled only if
>>>>>> you specify this extra parameter. By default, no field values will be
>>>>>> returned.
>>>>>>
>>>>>> AFAIK, the latest patch should work fine with the latest build. Martijn
>>>>>> (who is the main maintainer of this patch) tries to keep it up to date
>>>>>> with the latest builds. But I guess the safest way is to work with the
>>>>>> nightly build of the same date as the latest patch (though I would give
>>>>>> it a try first with the latest build).
>>>>>>
>>>>>> BTW, it's not an official suggestion from the Solr development team,
>>>>>> but if you ask me, if you have to choose now whether to use 1.3 or
>>>>>> 1.4-dev, I would go for the latter. 1.4 is supposed to be released in
>>>>>> the upcoming week or two and it brings loads of bug fixes, enhancements
>>>>>> and extra functionality. But again, this is my personal suggestion.
>>>>>>
>>>>>>
>>>>>> cheers,
>>>>>> Uri
>>>>>>
>>>>>> R. Tan wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>             
>>>>>>> Okay. Thanks for giving some insight into how it works in general.
>>>>>>> Without trying it myself, are the field values for the collapsed ones
>>>>>>> also part of the results data?
>>>>>>> What is the latest build that is safe to use in a production
>>>>>>> environment? I'd probably go for that and use field collapsing.
>>>>>>>
>>>>>>> Thank you very much.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness <uboness@gmail.com> wrote:
>>>>>>>
>>>>>>>> The collapsed documents are represented by one "master" document which
>>>>>>>> can be part of the normal search result (the doc list), so pagination
>>>>>>>> just works as expected, meaning it takes only the returned documents
>>>>>>>> into account (ignoring the collapsed ones). As for the scoring, the
>>>>>>>> "master" document is actually the document with the highest score in
>>>>>>>> the collapsed group.
>>>>>>>>
>>>>>>>> As for Solr 1.3 compatibility... well... it's very hard to tell. All
>>>>>>>> the latest patches are certainly *not* 1.3 compatible (I think they
>>>>>>>> also depend on some changes in Lucene which are not available for Solr
>>>>>>>> 1.3). I guess you'll have to try some of the old patches, but I'm not
>>>>>>>> sure about their stability.
>>>>>>>>
>>>>>>>> cheers,
>>>>>>>> Uri
>>>>>>>>
>>>>>>>>
>>>>>>>> R. Tan wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                 
>>>>>>>>> Thanks Uri. How do paging and scoring work when using field
>>>>>>>>> collapsing?
>>>>>>>>> What patch works with 1.3? Is it production ready?
>>>>>>>>>
>>>>>>>>> R
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness <uboness@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>> The development on this patch is quite active. It works well for a
>>>>>>>>>> single Solr instance, but distributed search (i.e. shards) is not yet
>>>>>>>>>> supported. Using this patch you can group search results based on a
>>>>>>>>>> specific field. There are two flavors of field collapsing: adjacent
>>>>>>>>>> and non-adjacent. The former collapses only documents which happen to
>>>>>>>>>> be located next to each other in the otherwise-non-collapsed results
>>>>>>>>>> set. The latter (the non-adjacent one) collapses all documents with
>>>>>>>>>> the same field value (regardless of their position in the
>>>>>>>>>> otherwise-non-collapsed results set). Note that the non-adjacent
>>>>>>>>>> flavor performs better than the adjacent one. There's currently a
>>>>>>>>>> discussion about extending this support so that, in addition to
>>>>>>>>>> collapsing the documents, extra information will be returned for the
>>>>>>>>>> collapsed documents (see the discussion on the issue page).
>>>>>>>>>>
>>>>>>>>>> Uri
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> R. Tan wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>>>> I think this is what I'm looking for. What is the status of this
>>>>>>>>>>> patch?
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 3, 2009 at 12:00 PM, R. Tan <tanrihaed58@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>                       
>>>>>>>>>>>> Hi Solrers,
>>>>>>>>>>>> I would like to get your opinion on how to best approach a search
>>>>>>>>>>>> requirement that I have. The scenario is I have a set of business
>>>>>>>>>>>> listings that may be grouped into one parent business (such as
>>>>>>>>>>>> 7-eleven having several locations). On the results page, I only
>>>>>>>>>>>> want 7-eleven to show up once but also show how many locations
>>>>>>>>>>>> matched the query (facet filtered by state, for example) and maybe
>>>>>>>>>>>> a preview of some of the locations.
>>>>>>>>>>>>
>>>>>>>>>>>> Searching for the business name is straightforward but showing the
>>>>>>>>>>>> locations within a result is quite tricky. I can do the opposite,
>>>>>>>>>>>> searching for the locations and faceting on business names, but it
>>>>>>>>>>>> will still basically be the same thing and repeat results with the
>>>>>>>>>>>> same business name.
>>>>>>>>>>>>
>>>>>>>>>>>> Any advice?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> R
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>                         
>>>>>>>>>>>
>>>>>>>>>>>                       
>>>       
>
>   
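
Editorial illustration (not part of the original messages): the thread describes two collapsing flavors, a "master" document chosen as the highest-scoring member of each group, and pagination that counts only the returned master documents. The sketch below shows that idea in plain Python on an in-memory result list. It is not the SOLR-236 implementation and uses no Solr API. The function names, the "business" and "score" keys, and the sample data are all hypothetical, and the input is assumed to be already sorted by descending score.

```python
from itertools import groupby


def collapse_adjacent(docs, field):
    """Collapse only runs of neighbouring documents that share a field value."""
    collapsed = []
    for _, run in groupby(docs, key=lambda d: d[field]):
        run = list(run)
        # The "master" is the highest-scoring document of the run.
        master = max(run, key=lambda d: d["score"])
        collapsed.append({"master": master, "collapsed_count": len(run) - 1})
    return collapsed


def collapse_non_adjacent(docs, field):
    """Collapse all documents that share a field value, wherever they rank."""
    groups = {}  # dicts keep insertion order, so groups stay in first-seen order
    for doc in docs:
        groups.setdefault(doc[field], []).append(doc)
    return [
        {"master": max(g, key=lambda d: d["score"]), "collapsed_count": len(g) - 1}
        for g in groups.values()
    ]


def page(collapsed, start, rows):
    """Paginate over master documents only; collapsed ones are not counted."""
    return collapsed[start:start + rows]


# Hypothetical result list, already sorted by descending score.
results = [
    {"id": 1, "business": "7-eleven", "score": 9.0},
    {"id": 2, "business": "7-eleven", "score": 8.5},
    {"id": 3, "business": "acme", "score": 8.0},
    {"id": 4, "business": "7-eleven", "score": 7.0},
]

adjacent = collapse_adjacent(results, "business")
non_adjacent = collapse_non_adjacent(results, "business")

for group in page(non_adjacent, 0, 10):
    print(group["master"]["id"], group["collapsed_count"])
```

With this sample data, adjacent collapsing leaves document 4 as a separate result because "acme" sits between it and the other 7-eleven listings, while non-adjacent collapsing folds all three 7-eleven listings into a single master.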
