lucene-solr-user mailing list archives

From Chris Fauerbach <chris.fauerb...@gmail.com>
Subject Re: Faceting on multivalued field
Date Mon, 04 Apr 2011 02:17:25 GMT
Wouldn't you want to extract your original data format from the index and then 'count' the
comments for each post?
I don't think facets are appropriate.
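
A minimal sketch of that client-side count, assuming the response is the usual Solr XML with one <doc> per post, shaped like the result quoted below. The SAMPLE string and the comments_per_post helper are made up for illustration; they are not part of Solr.

```python
import xml.etree.ElementTree as ET

# Hypothetical response fragment, modelled on the result quoted in this thread.
SAMPLE = """<response><result name="response" numFound="1" start="0">
<doc>
  <str name="post_id">46</str>
  <str name="post_text">Hello World</str>
  <arr name="comment_id"><str>9</str><str>10</str></arr>
</doc>
</result></response>"""


def comments_per_post(xml_text):
    """Map each post_id to the number of values in its comment_id field."""
    counts = {}
    for doc in ET.fromstring(xml_text).iter("doc"):
        post_id = doc.find("str[@name='post_id']").text
        comment_ids = doc.find("arr[@name='comment_id']")
        # A post without comments has no comment_id array at all.
        counts[post_id] = 0 if comment_ids is None else len(comment_ids)
    return counts


print(comments_per_post(SAMPLE))  # {'46': 2}
```

This only covers the posts on the page you fetched, so for the whole corpus you would have to page through every document.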

On Apr 3, 2011, at 22:10, Kaushik Chakraborty <kaychaks@gmail.com> wrote:

> Ok. My expectation was that since "comment_post_id" is a MultiValued field,
> it would appear multiple times (i.e. once for each comment), and so
> faceting on that field would give me a count matching the number of times
> comment_post_id appears.
> 
> My requirement is to get the total for every document, i.e. the number of
> comments per post across the whole corpus. To explain it more clearly, I'm
> getting a result XML something like this:
> 
> <str name="post_id">46</str>
> <str name="post_text">Hello World</str>
> <str name="person_id">20</str>
> <arr name="comment_id">
>    <str>9</str>
>    <str>10</str>
> </arr>
> <arr name="comment_person_id">
>   <str>19</str>
>   <str>2</str>
> </arr>
> <arr name="comment_post_id">
>  <str>46</str>
>  <str>46</str>
> </arr>
> <arr name="comment_text">
>   <str>Hello - from World</str>
>   <str>Hi</str>
> </arr>
> 
> <lst name="facet_fields">
>  <lst name="comment_post_id">
>     *<int name="46">1</int>*
> 
> I need the count to be 2 as the post 46 has 2 comments.
> 
> What other approach can I take?
> 
> Thanks,
> Kaushik
> 
> 
> On Mon, Apr 4, 2011 at 4:29 AM, Erick Erickson <erickerickson@gmail.com> wrote:
> 
>> Hmmm, I think you're misunderstanding faceting. It's counting the
>> number of documents that have a particular value. So if you're
>> faceting on "comment_post_id", there is one and only one document
>> with that value (assuming that the comment_post_ids are unique).
>> Which is what's being reported.... This will be quite expensive on a
>> large corpus, BTW.
>> 
>> Is your task to show the totals for *every* document in your corpus or
>> just the ones in a display page? Because if the latter, your app could
>> just count up the number of elements in the XML returned for the
>> multiValued comments field.
>> 
>> If that's not relevant, could you explain a bit more why you need this
>> count?
>> 
>> Best
>> Erick
>> 
>> On Sun, Apr 3, 2011 at 2:31 PM, Kaushik Chakraborty <kaychaks@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> My index contains a root entity "Post" and a child entity "Comments". Each
>>> post can have multiple comments. data-config.xml:
>>> 
>>> <document>
>>>   <entity name="posts" transformer="TemplateTransformer"
>>>           dataSource="jdbc" query="">
>>>     <field column="post_id" />
>>>     <field column="post_text" />
>>>     <field column="person_id" />
>>>     <entity name="comments" dataSource="jdbc"
>>>             query="select * from comments where post_id = ${posts.post_id}" >
>>>       <field column="comment_id" />
>>>       <field column="comment_text" />
>>>       <field column="comment_person_id" />
>>>       <field column="comment_post_id" />
>>>     </entity>
>>>   </entity>
>>> </document>
>>> 
>>> The schema defines all columns of the "comment" entity as MultiValued
>>> fields, and all fields are indexed and stored. My requirement is to count
>>> the number of comments for each post. The approach I'm taking is to query
>>> on "*:*" and facet the result on "comment_post_id" so that it gives the
>>> number of comments for that post.
>>> 
>>> But I'm getting an incorrect result: if a post has 2 comments, the
>>> multivalued fields are populated correctly, but the facet count comes back
>>> as 1 (for that post_id). What else do I need to do?
>>> 
>>> 
>>> Thanks,
>>> Kaushik
>>> 
>> 
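
To make the difference discussed in this thread concrete, here is a small sketch in plain Python, with no Solr involved. The docs list and both helper functions are hypothetical stand-ins: facet_counts mimics a field facet by counting each document once per distinct value, while value_counts counts every occurrence, which is what the original question expected.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the index: one document per post.
docs = [
    {"post_id": "46", "comment_post_id": ["46", "46"]},
    {"post_id": "47", "comment_post_id": ["47"]},
]


def facet_counts(docs, field):
    """Count documents per distinct value, the way a field facet does."""
    counts = Counter()
    for doc in docs:
        # set() collapses repeats, so a document counts once per value.
        counts.update(set(doc.get(field, [])))
    return counts


def value_counts(docs, field):
    """Count every occurrence of a value, repeats included."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.get(field, []))
    return counts


print(facet_counts(docs, "comment_post_id")["46"])  # 1
print(value_counts(docs, "comment_post_id")["46"])  # 2
```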
