manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-235) item description element not indexed
Date Thu, 04 Aug 2011 00:07:27 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079134#comment-13079134 ]

Karl Wright commented on CONNECTORS-235:
----------------------------------------

Ok, another mystery solved.  The RSS chromed data mode of "None" was not properly tried because
of the inadvertent database switch, and I found that recrawling vs. crawling fresh generated
incorrect version information.  I've fixed that problem but I can't check it in because it
causes the following error against a plain-vanilla Solr installation:

ERROR: [http://www.onemansjazz.ca/content/view/330/50/] multiple values encountered for non
multiValued field description: [Jazz radio show from Winnipeg on CKUW 95.9 FM, hosted by Maurice
Hogue., I have created a Listener Survey and if you have the time to complete it, that would
be terrific. I&#39;m trying to do an evaluation of One Man&#39;s Jazz as well as considering
some new options that have arisen. Your feedback would be most appreciate.This survey is in
two parts and is a total of twenty parts, most of them just require a click of your mouse.
Click here (http://www.surveymonkey.com/s/C3DZ3JK) for Part One, and here (http://www.surveymonkey.com/s/C38FVH8)
for Part Two. Thanks again for your input. ]

I'm not sure why Solr is interpreting this long field as multivalued, but clearly it would
be much better if I used a metadata name that wasn't "description", since Solr's example configuration
has dibs on that.  I'll experiment and post further.
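
One plausible reading of the error above is that two separate values reached the same single-valued field: the page's own description metadata and the RSS item description. The sketch below is a toy model of that collision, not Solr's actual code. The names `add_field`, `build_doc`, `MULTIVALUED`, and the alternative field name `rss_description` are all hypothetical and chosen only for illustration.

```python
# Toy model of a single-valued index field; this is NOT Solr's implementation.
# Hypothetical schema: only "category" is allowed to hold several values.
MULTIVALUED = {"category"}


def add_field(doc, name, value):
    """Add a value to doc, rejecting a second value for a single-valued field."""
    values = doc.setdefault(name, [])
    if values and name not in MULTIVALUED:
        raise ValueError(
            f"multiple values encountered for non multiValued field {name}"
        )
    values.append(value)


def build_doc(item_field_name):
    """Build a document from two sources that may share a field name."""
    doc = {}
    # Assumed first source: the description metadata of the fetched page.
    add_field(doc, "description", "Jazz radio show from Winnipeg on CKUW 95.9 FM")
    # Assumed second source: the RSS item description sent as metadata.
    add_field(doc, item_field_name, "I have created a Listener Survey ...")
    return doc
```

Under these assumptions, `build_doc("description")` raises an error resembling the one quoted, while `build_doc("rss_description")` succeeds because the two values land in different fields.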


> item description element not indexed
> ------------------------------------
>
>                 Key: CONNECTORS-235
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-235
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: RSS connector
>    Affects Versions: ManifoldCF 0.2
>            Reporter: Kate McGonigal
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 0.3
>
>
> The RSS feed's *item* description is not written to any field in the Solr index. 
> I have a typical RSS feed with the general structure:
> <rss>
>     <channel>
>         <title></title>
>         <link></link>
>         <description></description>
>         <item>
>             <title></title>
>             <link></link>
>             <pubDate></pubDate>
>             <description> *** the description I do want *** </description>
>             <author></author>
>             <category></category>
>         </item>
>     </channel>
> </rss>
> Example:
> For the RSS feed: http://www.onemansjazz.ca/component/option,com_rss/feed,RSS2.0/no_html,1/
> the rss/channel/item/description field is not indexed into Solr.
> Example notes:
>   - what does get written to the Solr "description" field is the description metadata
> from the website, i.e. "Jazz radio show from Winnipeg on CKUW 95.9 FM, hosted by Maurice Hogue."
> in this case.
>   - on the "Dechromed Content" tab of the job, "No dechromed content" is selected. I'm
> not sure if that is relevant.
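
The distinction the report draws between the channel-level and item-level description elements can be sketched with Python's standard `xml.etree.ElementTree`. This is only an illustration of the feed structure quoted above, not how the RSS connector parses feeds. The `SAMPLE` feed, its placeholder values, and the function `item_descriptions` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed following the structure quoted in the report.
SAMPLE = """<rss>
  <channel>
    <title>feed title</title>
    <description>channel description</description>
    <item>
      <title>item title</title>
      <link>http://example.com/1</link>
      <description>the description I do want</description>
    </item>
  </channel>
</rss>"""


def item_descriptions(rss_text):
    """Map each item's <link> to its own <description>, ignoring the channel's."""
    root = ET.fromstring(rss_text)
    result = {}
    # "./channel/item" selects only item elements, so the channel-level
    # <description> is never read here.
    for item in root.findall("./channel/item"):
        link = item.findtext("link", default="")
        result[link] = item.findtext("description", default="")
    return result
```

Calling `item_descriptions(SAMPLE)` returns a dictionary keyed by item link, with the rss/channel/item/description text as the value.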

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
