lucene-solr-user mailing list archives

From "Nair, Manas" <Manas.N...@mtvnmix.com>
Subject RE: XML data in solr field
Date Wed, 17 Mar 2010 06:04:38 GMT
Thank you, Tommy. But the real problem here is that the XML is dynamic: the element names
differ from document to document, which means a lot of field names would have to be added
to the schema if I were to index those XML nodes separately.
Is it possible to have nested indexing (XML within XML) in Solr without the overhead of adding
all those inner XML nodes as actual fields in the Solr schema?
 
Manas
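
One possible way to avoid declaring every element name is Solr's dynamic fields. The sketch below is only an illustration, assuming the schema declares a dynamicField pattern such as "*_s"; that pattern, the "_s" suffix, and the helper name to_dynamic_fields are assumptions and do not come from this thread. It maps each element name to a field name that would match such a pattern:

```python
import xml.etree.ElementTree as ET

# Assumption: the Solr schema declares a dynamicField pattern "*_s",
# so any field name ending in "_s" is accepted without its own entry.
DYNAMIC_SUFFIX = "_s"


def to_dynamic_fields(raw_xml, suffix=DYNAMIC_SUFFIX):
    """Map each child element's "value" attribute to a suffixed field name."""
    fields = {}
    for child in ET.fromstring(raw_xml):
        value = child.get("value")
        if value is not None:
            # e.g. <Venue value="..."/> becomes the field "Venue_s"
            fields[child.tag + suffix] = value
    return fields


sample = (
    '<root>'
    '<Venue value="Radio City Music Hall" />'
    '<Address value="New-York, USA" />'
    '</root>'
)
fields = to_dynamic_fields(sample)
print(fields)
```

With a mapping like this, new element names only need to fit the suffix convention; the schema itself would not have to change for each new document type.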

________________________________

From: Tommy Chheng [mailto:tommy.chheng@gmail.com]
Sent: Tue 3/16/2010 5:05 PM
To: solr-user@lucene.apache.org
Subject: Re: XML data in solr field




  Do you have the option of just importing each xml node as a
field/value when you add the document?

That'll let you do the search easily. If you need to store the raw XML,
you can use an extra field.
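
A minimal sketch of that idea, assuming Python's standard xml.etree.ElementTree and the sample XML from the original question. The helper name xml_to_fields is hypothetical, and sending the resulting document to Solr is left out:

```python
import xml.etree.ElementTree as ET

# Sample XML taken from the original question in this thread.
RAW_XML = """<root>
<Venue value="Radio City Music Hall" />
<Link value="http://bit.ly/Rndab" />
<LinkText value="En savoir +" />
<Address value="New-York, USA" />
</root>"""


def xml_to_fields(raw_xml, raw_field="inputxml"):
    """Flatten each child element into a field/value pair.

    The raw XML string is kept in an extra field (raw_field) so it can
    still be stored alongside the searchable values.
    """
    doc = {raw_field: raw_xml}
    for child in ET.fromstring(raw_xml):
        value = child.get("value")
        if value is not None:
            doc[child.tag] = value
    return doc


doc = xml_to_fields(RAW_XML)
print(doc["Venue"])
```

A search on the Venue field would then match the plain value rather than the full tag, while inputxml still holds the original markup.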

Tommy Chheng
Programmer and UC Irvine Graduate Student
Twitter @tommychheng
http://tommy.chheng.com


On 3/16/10 12:59 PM, Nair, Manas wrote:
> Hello Experts,
>
> I need help with this issue. I am unsure whether this scenario is possible.
> I have a field in my Solr document named <inputxml>, the value of which is an XML
> string as below. This XML structure is within the inputxml field value. I need help searching
> this XML structure, i.e. if I search for Venue, I should get "Radio City Music Hall" as the
> result and not the complete tag like <Venue value="Radio City Music Hall" />. Is this
> supported in Solr? If it is, how can it be implemented?
>
> <root>
> <Venue value="Radio City Music Hall" />
> <Link value="http://bit.ly/Rndab" />
> <LinkText value="En savoir +" />
> <Address value="New-York, USA" />
> </root>
>
> Any help is appreciated. I do not need the tag name in the result; instead, I need the
> tag value.
>
> Thanks in advance,
> Manas Nair
>


