lucene-java-user mailing list archives

From Karl Øie <k...@gan.no>
Subject Re: Lucene Search Result with Line Numbers?
Date Mon, 11 Apr 2005 13:54:22 GMT
Yes, the biggest drawback is text spanning lines:

L1 - it was the best of times,
L2 - it was the worst of times

A phrase search for "it was the best of times, it was the worst of
times" (with quotes) will return no hits, because no single Lucene
document contains the whole phrase.

I would be interested in an alternative approach here, because I have
encountered this problem myself. A possible solution would be to keep
both a full-text index (one Lucene document per source file) and a
line-text index (one Lucene document per line). The query is run
against the full-text index, and each hit returned is then compared
against the line-text index to find its exact line number.
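
To make the idea concrete, here is a minimal sketch of the lookup step
in plain Java, without Lucene. It is only an illustration under stated
assumptions: the class LineLocator and its methods locate and
withContext are hypothetical names, and a case-insensitive substring
match stands in for both the full-text query and the line-text index.
The joined text plays the role of the full-text document, and a table
of line start offsets maps a hit back to its line numbers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: not part of Lucene, names chosen for illustration.
final class LineLocator {

    // Lowercase and collapse whitespace so a phrase can match across line breaks.
    static String normalize(String s) {
        return s.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    // Returns {firstLine, lastLine} (1-based) of the first match, or null if none.
    static int[] locate(List<String> lines, String phrase) {
        StringBuilder full = new StringBuilder();   // stands in for the full-text document
        int[] lineStart = new int[lines.size()];    // offset in 'full' where each line begins
        for (int i = 0; i < lines.size(); i++) {
            String norm = normalize(lines.get(i));
            if (full.length() > 0 && !norm.isEmpty()) full.append(' ');
            lineStart[i] = full.length();
            full.append(norm);
        }
        String needle = normalize(phrase);
        if (needle.isEmpty()) return null;
        int start = full.indexOf(needle);           // plain substring match, an assumption
        if (start < 0) return null;
        int end = start + needle.length() - 1;
        return new int[] { lineOf(lineStart, start), lineOf(lineStart, end) };
    }

    // The last line whose start offset is <= offset contains that offset.
    private static int lineOf(int[] lineStart, int offset) {
        int line = 1;
        for (int i = 0; i < lineStart.length; i++) {
            if (lineStart[i] <= offset) line = i + 1;
        }
        return line;
    }

    // Returns the hit lines plus n lines before and after, prefixed with line numbers.
    static List<String> withContext(List<String> lines, int[] span, int n) {
        int from = Math.max(1, span[0] - n);
        int to = Math.min(lines.size(), span[1] + n);
        List<String> out = new ArrayList<>();
        for (int ln = from; ln <= to; ln++) out.add(ln + " " + lines.get(ln - 1));
        return out;
    }

    public static void main(String[] args) {
        List<String> text = Arrays.asList(
                "it was the best of times,",
                "it was the worst of times");
        int[] span = locate(text,
                "it was the best of times, it was the worst of times");
        System.out.println("lines " + span[0] + "-" + span[1]);
    }
}
```

Because the match is made against the joined text, the phrase that
spans both lines is found, and the offset table reports which lines it
covers. The same line range can be passed to withContext to get the n
lines before and after a hit. A real setup would replace the substring
match with the Lucene query and would need to agree with the analyzer
on tokenization, which this sketch ignores.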

Mvh Karl Øie

On 11. apr. 2005, at 15.46, cerberus yao wrote:

> But "crash.java" is physically just a single document.
> Is there any drawback if we treat each line in "crash.java" as a
> document?
>
> Another question:
>   If we need to present the search result as the hit lines plus n
> lines before and after, how can I do this if each line is stored
> as a separate document?
>   for example:
>
>  1. contents in crash.java are:
>       public class crash {
>           public static void main(String[] args) {
>           }
>       }
>  2. query "main"
>  3. search result= the hit line +1 line and -1 line
>      1 public class crash {
>      2    public static void main(String[] args) {
>      3   }
>
> On Apr 11, 2005 8:28 PM, Karl Øie <karl@gan.no> wrote:
>> Most indexing creates one Lucene document for each source document.
>> What you would need is to create a Lucene document for each line.
>>
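>> // Assumes reader is a BufferedReader over src_doc and index_writer
>> // is an open IndexWriter.
>> String src_doc = "crash.java";
>> int line_number = 0; // 0-based line counter
>> String line;
>> while ((line = reader.readLine()) != null) { // null means end of file
>>         Document ld = new Document();
>>         // stored, indexed, untokenized: source file name and line number
>>         ld.add(new Field("id", src_doc, true, true, false));
>>         ld.add(new Field("line", "" + line_number, true, true, false));
>>         // not stored, indexed, tokenized: the searchable line content
>>         ld.add(new Field("text", line, false, true, true));
>>         index_writer.addDocument(ld);
>>         line_number++;
>> }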
>>
>> This will create a small Lucene document for each line. On search, you
>> will find documents based on the content of the line, with the line
>> number available as a field. The reason syntax highlighting works
>> without creating a Lucene document for each line is that it bases its
>> result on groups of occurrences of text, not on line numbers.
>>
>> Mvh Karl Øie
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
- Somewhere, out there on the Net, is an HD full of lame quotes

