lucene-general mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Lucene as xml store
Date Fri, 22 Jul 2005 11:55:30 GMT

On Jul 22, 2005, at 4:37 AM, Namrata Kumari wrote:
> - Well, the application I want to develop is more about storing xml files,
> each of them having a different structure, and then performing searches on
> them that in turn can depend on the structure of the xml doc and the user's
> requirements.

That's still a pretty generic requirement.  What type of queries?  XPath?

> - Moreover, I did not exactly understand how I can store the xml
> document. I mean, I went through the javadoc and could not figure out the
> APIs that could be used for this purpose. Can you guide me in this?

Look at the various types of fields.  There is a "stored" attribute
on Field: when it is set, Lucene keeps the field's original value in the
index so it can be retrieved along with search hits.
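
To make the idea of fields concrete, here is a minimal sketch that uses only
the JDK XML parser, not Lucene itself.  It flattens the direct children of the
root element into named values and keeps the whole document under one extra
name.  The class name, the "rawXml" name, and the sample element names are
all hypothetical; in a real indexer each entry would become a Lucene field,
with the raw XML being the one you mark as stored.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: turns one XML document into flat name/value pairs.
class XmlToFields {

    static Map<String, String> toFields(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Map<String, String> fields = new LinkedHashMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Only direct child elements are used; deeper nesting is
                // collapsed into the child's text content.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    fields.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            // Assumption: the original document is kept verbatim so it can be
            // handed back with a search hit (the "stored" value).
            fields.put("rawXml", xml);
            return fields;
        } catch (Exception e) {
            throw new RuntimeException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<poem><title>Example</title><year>1850</year></poem>";
        for (Map.Entry<String, String> e : toFields(xml).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Since your files all have different structures, the hard part is deciding
which elements are worth turning into fields; the sketch simply takes every
top-level child.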

> - But the biggest question is: Is Lucene a good option [which I now doubt on
> the basis of what I have read till now :-(]

It really all depends.  I built a search engine for the Rossetti  
Archive (http://www.rossettiarchive.org/rose/) which indexes XML  
files like this:

     http://www.rossettiarchive.org/docs/1-1847.s244.raw.xml

XPath queries are not possible into the XML, but that is also not a  
use case for the system.  Highly structured queries such as this one  
are supported because the indexing process extracted detailed  
information from the XML files:

     http://www.rossettiarchive.org/rose/?query=%2Bgenre%3Asonnet+%2B%28author%3Arossetti+OR+author%3Adgr%29+%2Byear%3A%5B1850+TO+1870%5D
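
The query parameter in that URL is URL-encoded, which hides the structure.
A small sketch using the JDK's java.net.URLDecoder shows the query string
underneath; the class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper that decodes the "query" parameter of the URL above.
class RoseQueryDecoder {

    // The parameter value copied from the URL above as one unbroken string.
    static final String ENCODED =
            "%2Bgenre%3Asonnet+%2B%28author%3Arossetti+OR+author%3Adgr%29"
            + "+%2Byear%3A%5B1850+TO+1870%5D";

    static String decode(String encoded) {
        try {
            // '+' decodes to a space and %XX escapes to their characters.
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints: +genre:sonnet +(author:rossetti OR author:dgr) +year:[1850 TO 1870]
        System.out.println(decode(ENCODED));
    }
}
```

Decoded, the query has three required clauses: a genre field, an author field
matching either of two values, and a range on a year field.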

I still do not have a clear-cut understanding of your needs, so I am
still not sure whether Lucene is suitable.  Certainly for full-text
searches it is a fine choice, but the structured queries are a
different story.

     Erik


>
> Regards,
> Namrata
>
>
> -----Original Message-----
> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
> Sent: Friday, July 22, 2005 2:11 PM
> To: general@lucene.apache.org
> Subject: Re: Lucene as xml store
>
>
> On Jul 22, 2005, at 1:07 AM, Namrata Kumari wrote:
>
>
>>
>> hi,
>>
>> I am a beginner to lucene, so kindly excuse me if the questions
>> mentioned are a bit naive.
>> - Can I use lucene as an xml store + search engine?
>> - What I understood is that if we want to perform search on an xml doc,
>> we need to parse the xml document, form indexes, and perform search on the
>> basis of fields.
>> - So, does this mean that even if we use lucene as an xml store (IF WE
>> CAN!!), we need to parse it to form indexes?
>>
>
> Lucene is a search engine and only deals with text (Strings, essentially).
> Lucene is also a flat document space, and querying hierarchical
> structure is not what it was designed for, but it can be done to a limited
> degree depending on how the data is indexed.
>
> Yes, Lucene can store text as well as make it searchable - so you could
> store an XML document in it as well.
>
> You have not provided any information on the types of queries you need to
> support or what the user experience will be like.  There are many ways to
> use Lucene, and whether it is a suitable solution for your
> application depends on that information.   Tell us more about what
> you're wanting to do and we can guide you further.
>
>
>> Please reply to this as soon as possible
>>
>
> That's what they all say!   :)   No need to say such a thing - if you
> have well-articulated questions that are straightforward enough to answer,
> you'll get responses quickly here.
>
>      Erik
>

