lucene-java-user mailing list archives

From Ian Lea
Subject Re: Possible location of word inside the file.
Date Thu, 04 Jul 2013 08:34:09 GMT
Sounds like you're indexing each log file as one Lucene document.
The obvious answer is to index each line in each log file as a separate
doc.  Searches would then match lines in files and you can display
those lines, summarizing counts per file if you want that.
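
A minimal sketch of the per-line split, in plain Java with no Lucene
dependency.  The class names LogLine and LogLineSplitter are made up for
illustration; each record would become one Lucene document, with the
file name and line number stored alongside the line text.

```java
import java.util.ArrayList;
import java.util.List;

/** One indexable unit: a single log line plus where it came from. */
final class LogLine {
    final String fileName; // would be indexed as the "logfilename" field
    final int lineNo;      // 1-based; would be indexed as the "lineno" field
    final String text;     // would be indexed as the searchable text

    LogLine(String fileName, int lineNo, String text) {
        this.fileName = fileName;
        this.lineNo = lineNo;
        this.text = text;
    }
}

final class LogLineSplitter {
    /** Splits the content of one log file into per-line records. */
    static List<LogLine> split(String fileName, String content) {
        List<LogLine> lines = new ArrayList<>();
        if (content.isEmpty()) {
            return lines;
        }
        int lineNo = 0;
        // "\\R" matches any line terminator (\n, \r\n, \r); needs Java 8+.
        // Blank lines in the middle are kept so line numbers stay accurate.
        for (String text : content.split("\\R")) {
            lineNo++;
            lines.add(new LogLine(fileName, lineNo, text));
        }
        return lines;
    }

    public static void main(String[] args) {
        String content = "starting up\n\nERROR disk full\n";
        for (LogLine l : split("logabc.txt", content)) {
            // In a real indexer, this is where one document per line
            // would be built and added to the index.
            System.out.println(l.fileName + ":" + l.lineNo + ": " + l.text);
        }
    }
}
```

For 5 MB files you would probably stream the file with a BufferedReader
rather than hold it in one String; the String version just keeps the
sketch short.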

If you want to be able to show surrounding lines, index the line
number and the file name.  So if you got a hit on line 12345 of file
logabc.txt you could execute a second search with
logfilename:logabc.txt AND lineno:[12340 TO 12350] to get 5 lines
either side.  Use a NumericField (IntField in Lucene 4.x) and
NumericRangeQuery for lineno if you are concerned about performance;
a range over plain text terms also compares them as strings, so the
line numbers would need zero-padding to sort correctly.  See recent
thread on this list for more on that.
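
As a rough illustration of the second search, the helper below builds
the query string for the lines around a hit.  ContextQuery and around()
are hypothetical names; the field names are the ones used above.  It
only assembles text, it assumes the file name contains no query-syntax
characters that need escaping, and it assumes lineno is indexed in a way
that makes the range behave numerically.

```java
final class ContextQuery {
    /**
     * Builds a query string matching the lines within `context` lines
     * of `hitLine` in `fileName`. The lower bound is clamped to 1
     * because line numbers are assumed to be 1-based.
     */
    static String around(String fileName, int hitLine, int context) {
        int from = Math.max(1, hitLine - context);
        int to = hitLine + context;
        return "logfilename:" + fileName
                + " AND lineno:[" + from + " TO " + to + "]";
    }

    public static void main(String[] args) {
        // prints: logfilename:logabc.txt AND lineno:[12340 TO 12350]
        System.out.println(around("logabc.txt", 12345, 5));
    }
}
```

With numeric fields you would more likely build the range with
NumericRangeQuery from the same two bounds instead of parsing a string.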


On Thu, Jul 4, 2013 at 8:10 AM, Ankit Murarka wrote:
> Dear Team,
> I have a potential use case. I have a large number of log
> files archived in a particular directory. The administrator
> would like to view certain information which may or may not be present in
> any of the files inside the directory.
> Using Lucene, I was able to find out whether the specific word he is searching
> for is present in the files and in which files it is present.
> But is it possible to find the location of that word inside the file? Each
> file is about 5 MB, and it does not really make sense to parse the file to find
> the location of a word that is present.
> Can Lucene help in this regard, or at least give a close approximation of its
> location in the file? I would like to show at least 256 KB of data from
> the point where the word is present in the file.
> I have Googled a lot but to no avail.
> --
> Regards
> Ankit
> "Peace is found not in what surrounds us, but in what we hold within."