lucene-solr-user mailing list archives

From Fergus McMenemie <fer...@twig.me.uk>
Subject Re: DIH: Limited xpath syntax unable to parse all xml elements
Date Thu, 02 Jul 2009 18:40:22 GMT
>Shalin Shekhar Mangar wrote:
>> On Thu, Jul 2, 2009 at 11:08 PM, Mark Miller <markrmiller@gmail.com> wrote:
>>
>>   
>>> It looks like DIH implements its own subset of the Xpath spec.
>>>     
>>
>>
>> Right, DIH has a streaming implementation supporting a subset of XPath only.
>> The supported things are in the wiki examples.
>>
>>
>>   
>>> I don't see any tests with multiple matching sub nodes, so perhaps DIH
>>> Xpath does not properly support that and just selects the last matching
>>> node?
>>>     
>>
>>
>> It selects all matching nodes. But if the field is not multi-valued, it will
>> store only the last value. I guess this is what is happening here.
>>
>>   
>So do you think it should match them all and add the concatenated text 
>as one field?
>
>That would be more Xpath like I think, and less arbitrary than just 
>choosing the last one.

Only when the field in schema.xml is not multiValued. If the field is
multiValued it should still behave as at present?

Also... what went wrong with the suggested:-
    <field column="body" xpath="/book/body/chapter" flatten="true"/>
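
To make the options being discussed concrete, here is a rough Python sketch.
It is not DIH code and does not use any Solr API: the sample XML, the helper
names chapter_texts and field_value, and the single-space separator used for
concatenation are all assumptions made for illustration. It collects the
flattened text of every /book/body/chapter node, then shows what ends up in
the field when it is multiValued, when it is single-valued and keeps the
last match, and when the matches are concatenated instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; not taken from this thread.
SAMPLE = """<book>
  <body>
    <chapter>One <b>bold</b> start.</chapter>
    <chapter>Two.</chapter>
  </body>
</book>"""


def chapter_texts(xml_text):
    """Return the flattened text of every /book/body/chapter node."""
    root = ET.fromstring(xml_text)  # root element is <book>
    # itertext() walks child elements too, which is roughly what
    # flatten="true" is meant to achieve for nested markup.
    return ["".join(node.itertext()) for node in root.findall("./body/chapter")]


def field_value(values, multi_valued, concatenate=False):
    """Model what is stored for one field given all matching values."""
    if not values:
        return None
    if multi_valued:
        return list(values)      # keep every match
    if concatenate:
        return " ".join(values)  # proposed: join all matches (separator assumed)
    return values[-1]            # behaviour described above: last match wins


values = chapter_texts(SAMPLE)
print(field_value(values, multi_valued=True))
print(field_value(values, multi_valued=False))
print(field_value(values, multi_valued=False, concatenate=True))
```

The point of the sketch is only that the matching step is the same in all
three cases; the difference lies in how the list of matches is reduced to
what the field can hold.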

Regards, Fergus.
