lucene-solr-user mailing list archives

From "Shalin Shekhar Mangar" <shalinman...@gmail.com>
Subject Re: Scoped searches in XML documents
Date Mon, 22 Dec 2008 13:34:32 GMT
I don't know much about Solr Cell but if you can see each node's content in
different fields in Solr then you should be able to query it too.
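
To make that concrete, here is a rough sketch in Python (standard library only) of what "each node's content in a different field" could look like for the sample document quoted below. It assumes, hypothetically, that your Solr schema has one field per leaf tag name (inner_node1, inner_node2, sometag); the helper names leaf_fields and scoped_query are made up for illustration and are not part of Solr or Solr Cell.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Sample document from the original question (one <node1> shown).
SAMPLE = """<test>
  <node1>
    <inner_node1>XYZ</inner_node1>
    <inner_node2>ABC</inner_node2>
    <sometag>PPPP</sometag>
  </node1>
</test>"""


def leaf_fields(xml_text):
    """Map each leaf element's tag name to the text values found in it.

    Assumption: the Solr schema defines a field for every leaf tag name.
    """
    fields = {}
    for elem in ET.fromstring(xml_text).iter():
        # A leaf element has no children and carries non-blank text.
        if len(elem) == 0 and elem.text and elem.text.strip():
            fields.setdefault(elem.tag, []).append(elem.text.strip())
    return fields


def scoped_query(field, term):
    """Build the URL-encoded parameters for a fielded query (field:term)."""
    return urlencode({"q": f"{field}:{term}"})


fields = leaf_fields(SAMPLE)
query = scoped_query("inner_node2", "ABC")
```

If the values really do end up in per-tag fields like this, the "scoped" search for ABC inside inner_node2 is just the ordinary fielded query inner_node2:ABC, which is what the query string above encodes for the select handler.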

On Mon, Dec 22, 2008 at 6:59 PM, Jana, Kumar Raja <kjana@ptc.com> wrote:

> Hi Shalin,
>
> Thanks for the quick response. I've found my mistake. It was actually a
> silly setting in my application before sending the documents to
> Solr-Cell which was stripping off the xml tags. I was able to index the
> document with the xml tags. Sorry for being so hasty.
>
> So the only question left is, will I be able to perform scoped searches
> using Solr? Is this already implemented in Solr or is there a
> workaround?
>
> Thanks
> Kumar
>
>
> -----Original Message-----
> From: Shalin Shekhar Mangar [mailto:shalinmangar@gmail.com]
> Sent: Monday, December 22, 2008 6:27 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Scoped searches in XML documents
>
> If your XML documents are of a fixed schema, you may want to look at
> DataImportHandler with XPathEntityProcessor
>
> http://wiki.apache.org/solr/DataImportHandler
>
> On Mon, Dec 22, 2008 at 5:49 PM, Jana, Kumar Raja <kjana@ptc.com> wrote:
>
> > Hi,
> >
> >
> >
> > I want to perform scoped searches in XML documents using Solr. I am
> > using Solr-Cell to index my document files. I've noticed that when I
> > index an xml file to Solr (via Solr-Cell) the field tags get stripped
> > off and only the values are sent to Solr.
> >
> > i.e. Say I have an XML document which contains the following data:
> >
> > <test>
> >    <node1>
> >        <inner_node1>XYZ</inner_node1>
> >        <inner_node2>ABC</inner_node2>
> >        <sometag>PPPP</sometag>
> >    </node1>
> >    <node1>
> >        ....
> >    </node1>
> > </test>
> >
> >
> >
> > When I index this xml file, only the field values (XYZ, ABC and PPPP)
> > seem to go to Solr and the tag elements are stripped off. (Probing a
> > bit more into the cause suggests that this is what Apache Tika
> > does.)
> >
> >
> >
> > Is there any setting or feature which would enable me to preserve the
> > field/tag information and hence allow me to perform scoped searches
> > using Solr?
> >
> >
> >
> > Just to clear any confusion by the term "scoped search":
> >
> > By scoped search I mean that, once the above xml document is indexed,
> > I would be able to find all occurrences of ABC within the
> > <inner_node2> XML tag.
> >
> >
> >
> >
> >
> > -Kumar
> >
> >
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



-- 
Regards,
Shalin Shekhar Mangar.
