lucene-java-user mailing list archives

From "Dan Scrima" <dscr...@verilogue.com>
Subject Lucene searching across documents
Date Wed, 08 Apr 2009 13:32:35 GMT
I have a requirement involving a directory full of XML files.
I wrote a parser that reads these files and indexes all of the XML
attributes and properties into Lucene documents. An example of one of
these documents is below. I'm parsing sentences into words and tagging
the sentences based on certain criteria.

My issue is finding out whether Lucene can handle cross-document
searching. The example below is indexed as a single document, and there
will be multiple sentences before, after, and throughout an entire
transcript. Is it possible to say, "I want a result where one line marked
as Symptom is 5 lines away from another line marked as Brand"? In
essence, I'm trying to search across multiple Lucene documents.
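
To make the requirement concrete, here is a rough plain-Java sketch of the check I have in mind. It does not use Lucene at all; it only spells out the matching rule. The names LineProximity, TaggedLine and findPairs are made up for illustration, and it assumes "5 lines away" means "within 5 lines" measured on the line id attribute, which may not be the only reasonable reading.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; none of these names come from Lucene. */
class LineProximity {

    /** One transcript line: its id attribute and the tag types attached to it. */
    static final class TaggedLine {
        final int id;
        final Set<String> tagTypes;

        TaggedLine(int id, String... tagTypes) {
            this.id = id;
            this.tagTypes = new HashSet<>(Arrays.asList(tagTypes));
        }
    }

    /**
     * Returns {firstId, secondId} pairs where a line tagged firstType and a
     * line tagged secondType are at most maxDistance line ids apart.
     * Assumption: "N lines away" is read as "within N lines".
     */
    static List<int[]> findPairs(List<TaggedLine> lines, String firstType,
                                 String secondType, int maxDistance) {
        List<int[]> pairs = new ArrayList<>();
        for (TaggedLine a : lines) {
            if (!a.tagTypes.contains(firstType)) {
                continue;
            }
            for (TaggedLine b : lines) {
                // A line carrying both tags pairs with itself (distance 0).
                if (b.tagTypes.contains(secondType)
                        && Math.abs(a.id - b.id) <= maxDistance) {
                    pairs.add(new int[] {a.id, b.id});
                }
            }
        }
        return pairs;
    }
}
```

For example, with line 1 tagged Symptom, line 4 tagged Brand and line 12 tagged Brand, a maximum distance of 5 would return only the pair (1, 4). What I'm after is a way to get the same answer out of the index, where each line is its own document.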

 

Any thoughts or literature out there?

 

<transcript>
    <line id="1">
        <tag id="10" type="Symptom" />
        <tag id="12" type="Brand" />
        <word>
            <token>Coughing</token>
            <part-of-speech>SBJ</part-of-speech>
        </word>
        <word>
            <token>is</token>
            <part-of-speech>VB</part-of-speech>
        </word>
        <word>
            <token>caused</token>
            <part-of-speech>NP</part-of-speech>
        </word>
        <word>
            <token>by</token>
            <part-of-speech>PP</part-of-speech>
        </word>
        <word>
            <token>Mucinex</token>
            <part-of-speech>PDC</part-of-speech>
        </word>
    </line>
</transcript>

Thanks so much!

