lucene-dev mailing list archives

From "none none" <kor...@lycos.com>
Subject Re: Multiple fields in XML
Date Tue, 04 Nov 2003 16:16:43 GMT
I maintain the XML hierarchy; most of the files are big documents, 3-5 MB.
I don't know how to use XPath with Lucene...
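
One way to picture the requirements in the original question is sketched below. It is only an illustration under stated assumptions, not Lucene code and not how weblucene works: it uses plain Python with the standard library, and the helper names `flatten` and `search` and the field names `doc_id` and `page_id` are hypothetical. The idea is to turn each <page> into one flat record that carries a copy of its parent <doc> fields, which is roughly the shape a flat index of fielded documents would need.

```python
import re
import xml.etree.ElementTree as ET

# The two sample documents from the original question.
DOCS = [
    """<doc><id>1</id><author>Myself</author>
         <page><id>1</id><body> Wherever goes here 1 , one</body></page>
         <page><id>2</id><body> Wherever goes here 2, two </body></page>
       </doc>""",
    """<doc><id>2</id><author>Somebody</author><private>Y</private>
         <path>C:/docs/test.txt</path>
         <page><id>4</id><body> Wherever goes here, four </body></page>
         <page><id>5</id><body> Wherever goes here, five </body></page>
       </doc>""",
]


def tokens(text):
    """Lowercase word tokens; a stand-in for a real analyzer."""
    return set(re.findall(r"\w+", text.lower()))


def flatten(xml_text):
    """Return one flat record per <page>, copying the parent <doc> fields."""
    root = ET.fromstring(xml_text)
    parent = {child.tag: (child.text or "").strip()
              for child in root if child.tag != "page"}
    records = []
    for page in root.findall("./page"):  # ElementTree's limited XPath subset
        rec = dict(parent)
        rec["doc_id"] = rec.pop("id")    # hypothetical name, avoids the id clash
        rec["page_id"] = (page.findtext("id") or "").strip()
        rec["page"] = (page.findtext("body") or "").strip()
        records.append(rec)
    return records


def search(records, **criteria):
    """Return records where every term of every field criterion matches (AND)."""
    return [rec for rec in records
            if all(tokens(terms) <= tokens(rec.get(field, ""))
                   for field, terms in criteria.items())]


records = [rec for xml_text in DOCS for rec in flatten(xml_text)]

# 1) fields the user can search
fields = sorted({key for rec in records for key in rec})

# 2) 'wherever five' -> ids of the matching docs
docs_for_query = {r["doc_id"] for r in search(records, page="wherever five")}

# 3) pages inside doc 2 that contain both words
pages_in_doc = [r["page_id"]
                for r in search(records, doc_id="2", page="wherever five")]

# 4) page:wherever and private:Y
private_docs = {r["doc_id"] for r in search(records, page="wherever", private="Y")}

print(fields, docs_for_query, pages_in_doc, private_docs)
```

The trade-off in this sketch is that matching happens per page, so a query such as 'wherever five' only hits when both words occur on the same page. If the words may be spread across pages of one document, a second record per <doc> holding all page bodies would be needed as well.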

---
KorfuT

--------- Original Message ---------

DATE: Tue, 4 Nov 2003 09:11:35 
From: Erik Hatcher <erik@ehatchersolutions.com>
To: "Lucene Developers List" <lucene-dev@jakarta.apache.org>
Cc: 

>Do you maintain the XML hierarchy with subelements and sub-subelements,  
>etc?
>
>What about XPath querying?  That would be sweet!
>
>	Erik
>
>On Tuesday, November 4, 2003, at 05:26  AM, Che Dong wrote:
>
>> I have a solution for XML indexing (even RSS):
>> http://sourceforge.net/projects/weblucene/
>>
>>
>> Che, Dong
>> ----- Original Message -----
>> From: "none none" <korfut@lycos.com>
>> To: <lucene-dev@jakarta.apache.org>
>> Sent: Tuesday, November 04, 2003 3:15 PM
>> Subject: Multiple fields in XML
>>
>>
>>> Hi all,
>>> I need some help/ideas.
>>> What I would like to do is index XML files and dynamically search
>>> against each field. To be more clear,
>>> I have 2 documents to index:
>>> <doc>
>>>  <id>1</id>
>>>  <author>Myself</author>
>>>  <page>
>>>   <id>1</id>
>>>   <body> Wherever goes here 1 , one</body>
>>>  </page>
>>>  <page>
>>>   <id>2</id>
>>>   <body> Wherever goes here 2, two </body>
>>>  </page>
>>> </doc>
>>> and:
>>>
>>> <doc>
>>>  <id>2</id>
>>>  <author>Somebody</author>
>>>  <private>Y</private>
>>>  <path>C:/docs/test.txt</path>
>>>  <page>
>>>   <id>4</id>
>>>   <body> Wherever goes here, four </body>
>>>  </page>
>>>  <page>
>>>   <id>5</id>
>>>   <body> Wherever goes here, five </body>
>>>  </page>
>>> </doc>
>>>
>>> Now, what I need to do is:
>>> 1) Show a list of the fields the user can search, e.g.:
>>> id, page, author, private, path.
>>> 2) If the user searches for 'wherever five', I want to return the
>>> doc with id=2 as the result.
>>> 3) Once I get a document (by id), e.g. doc id=2, I want to be able to
>>> get a list of all the page ids that contain the words 'wherever' and
>>> 'five' in the body (e.g. page 5 for doc_id=2).
>>> 4) The user should be able to search for: page:wherever and private:Y
>>> and get doc=2.
>>> Is there a way to implement something like that? Has anybody done this before?
>>> Any help is appreciated.
>>> Thank you.
>>>
>>> ---
>>> KorfuT
>>>
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>>> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>>>




