lucene-java-user mailing list archives

From Doron Cohen <>
Subject Re: Index Rows as Documents? Help me design a solution
Date Tue, 25 Jul 2006 21:23:18 GMT
A few comments:

> (from first posting in this thread)
> The indexing was taking much more than minutes for a 1 MB log file. ...
> I would expect to be able to index at least a of GB of logs within 1 or 2

1-2 minutes per GB works out to 30-60 GB/hour, which is a lot for a single
machine/JVM. At least, I have not seen Lucene index this fast.
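
For reference, the conversion behind the 30-60 GB/hour figure can be sketched
as below. This is only the arithmetic, not a benchmark; the class and method
names (IndexingThroughput, gbPerHour) are made up for illustration.

```java
public class IndexingThroughput {

    /**
     * Converts "minutes needed to index one GB" into "GB indexed per hour".
     * Hypothetical helper, used only to illustrate the arithmetic.
     */
    static double gbPerHour(double minutesPerGb) {
        if (minutesPerGb <= 0) {
            throw new IllegalArgumentException("minutesPerGb must be positive");
        }
        // One hour has 60 minutes, so the hourly rate is 60 / (minutes per GB).
        return 60.0 / minutesPerGb;
    }

    public static void main(String[] args) {
        System.out.println(gbPerHour(1.0)); // 1 minute per GB  -> 60.0
        System.out.println(gbPerHour(2.0)); // 2 minutes per GB -> 30.0
    }
}
```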

> doc.add(new Field("msisdn", columns[0], Field.Store.YES,
> doc.add(new Field("messageid", columns[2], Field.Store.YES,

Is it really required to analyze the text for these fields - "msisdn" , "

> doc.add(new Field("line", line, Field.Store.YES, Field.Index.NO));

This stores the original text of every input line that is indexed, which is
quite an overhead.

- Doron
