lucene-general mailing list archives

From Lmhelp <lea.mass...@ign.fr>
Subject Lucene - Retrieve extracted/indexed tokens for further analysis
Date Mon, 28 Jun 2010 14:18:38 GMT

Hi,

Thank you for reading my post.

Here is what I would like to do.

I have an XML file with the following structure:
------------------------------
<root_element>
    <page>
        <title>[...]</title>
        <text>[...]</text>
    </page>
    [...]
    <page>
        <title>[...]</title>
        <text>[...]</text>
    </page>
</root_element>
------------------------------

I would like to:
- have Lucene extract the tokens for each "text" element
- get these tokens back, one list per element, for further analysis.
 
     --------------------------------------
     - "text" element 1 => list of tokens 1
     - "text" element 2 => list of tokens 2
       [...]
     - "text" element n => list of tokens n
     --------------------------------------
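
To make the desired result concrete, here is a minimal sketch in plain Java of the mapping I have in mind. It uses only the JDK's XML parser, and the `tokenize` method is a hypothetical placeholder (lowercase, then split on anything that is not a letter or digit) standing in for whatever analysis Lucene would perform. It is not Lucene's API; the class and method names are my own assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TextTokens {

    // Hypothetical placeholder for Lucene's analysis step (an assumption, not Lucene code):
    // lowercases the text and splits it on runs of non-letter, non-digit characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Returns one token list per "text" element, in document order.
    static List<List<String>> tokensPerText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList texts = doc.getElementsByTagName("text");
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < texts.getLength(); i++) {
            result.add(tokenize(texts.item(i).getTextContent()));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Sample input with the same structure as the file described above.
        String xml = "<root_element>"
                + "<page><title>A</title><text>Hello, Lucene world!</text></page>"
                + "<page><title>B</title><text>Second page text.</text></page>"
                + "</root_element>";
        List<List<String>> all = tokensPerText(xml);
        for (int i = 0; i < all.size(); i++) {
            System.out.println("\"text\" element " + (i + 1) + " => " + all.get(i));
        }
    }
}
```

What I am looking for is the Lucene equivalent of `tokenize`, so that the tokens come from Lucene's own analysis rather than from a naive split.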

Is it possible to do such a thing?
If so, could you point me in the right direction?

Thanks and all the best,
--
Lmhelp


-- 
View this message in context: http://lucene.472066.n3.nabble.com/Lucene-Retrieve-extracted-indexed-tokens-for-further-analysis-tp927910p927910.html
Sent from the Lucene - General mailing list archive at Nabble.com.