lucene-java-user mailing list archives

From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: Indexation takes a lot of time :(
Date Fri, 15 Apr 2011 10:10:55 GMT
On Wed, Apr 6, 2011 at 11:50 AM, findbestopensource
<findbestopensource@gmail.com> wrote:
> Hello Daniel,
>
> The code seems to be fine. I think you are measuring the time for the entire
> program, which may also read the data from an external source and build the
> array list. Measure the time for indexing only.
>
> Regards
> Aditya
> www.findbestopensource.com
>
>
>
> On Wed, Apr 6, 2011 at 2:38 PM, ZYWALEWSKI, DANIEL (DANIEL) <
> daniel.zywalewski@alcatel-lucent.com> wrote:
>
>> Hello Champions!
>>
>> I have a problem with indexing, or rather with how long it takes. The elements
>> to index are represented by my own class, DocumentToIndex, which consists of
>> fields (each field is a fieldName and a fieldValue). All DocumentToIndex
>> objects are kept in an ArrayList. When I start indexing, I first open an
>> IndexWriter. Then, for each field of a DocumentToIndex, I take its name and
>> value, create a Lucene Field, and add it to the Lucene Document. Once the
>> Document is complete, I add it to the index. After processing all documents,
>> I close the IndexWriter. All of this can be represented by the code:
>>
>>
>> // Open the writer once, before the loop (create = false: append to the index).
>> indexWriter = new IndexWriter(indexDirectory, indexAnalyzer, false,
>>     IndexWriter.MaxFieldLength.UNLIMITED);
>>
>> for (DocumentToIndex documentToIndex : objectsToIndex) {
>>   Document indexedDocument = new Document();
>>   // Copy every name/value pair into a stored, analyzed Lucene Field.
>>   for (int i = 0; i < documentToIndex.getDocumentSize(); i++) {
>>     indexedDocument.add(new Field(
>>         documentToIndex.getDocumentField(i).getName(),
>>         documentToIndex.getDocumentField(i).getValue(),
>>         Field.Store.YES,
>>         Field.Index.ANALYZED));
>>   }
>>   indexWriter.addDocument(indexedDocument);
>> }
>>
>> // Close the writer once, after all documents have been added.
>> indexWriter.close();
>>
>> My problem is that indexing takes a long time. For example, indexing 28310
>> DocumentToIndex objects takes about 15 minutes. Am I missing something, or
>> is this normal? Maybe this code is not well optimized? I would be really
>> grateful for any hints and tips.
>>
>> Thanks in advance,
>> D

I think your code is fine; you really need to look into what you are
measuring. In an ideal situation you can index up to 60k documents
per second with Lucene, provided your hardware is fast and you can get the
data out of your database quickly enough. Look here

http://blog.jteam.nl/2011/04/01/gimme-all-resources-you-have-i-can-use-them/

or here

http://blog.mikemccandless.com/2010/09/lucenes-indexing-is-fast.html

for some insights!

Simon
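
Both replies suggest timing the indexing step separately from the step that loads the data. A minimal sketch of that measurement follows, using only the Java standard library. The names PhaseTiming, measure, and docsPerSecond are hypothetical and not part of Lucene. The indexer argument stands in for whatever wraps indexWriter.addDocument(...) in the original loop, and the loader stands in for the code that fills the ArrayList. For scale, the figures in this thread (28310 documents in about 15 minutes) work out to roughly 31 documents per second for the whole run.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper: times the load phase and the index phase separately.
class PhaseTiming {

    // Wall-clock time spent in each phase, in nanoseconds.
    static final class Result {
        final long loadNanos;
        final long indexNanos;
        final int documentCount;

        Result(long loadNanos, long indexNanos, int documentCount) {
            this.loadNanos = loadNanos;
            this.indexNanos = indexNanos;
            this.documentCount = documentCount;
        }
    }

    // loader: assumed to fetch and prepare all documents (e.g. from a database).
    // indexer: assumed to add one document to the index.
    static <T> Result measure(Supplier<List<T>> loader, Consumer<T> indexer) {
        long start = System.nanoTime();
        List<T> documents = loader.get();
        long loaded = System.nanoTime();
        for (T document : documents) {
            indexer.accept(document);
        }
        long indexed = System.nanoTime();
        return new Result(loaded - start, indexed - loaded, documents.size());
    }

    // Throughput in documents per second; yields Infinity if elapsedNanos is 0.
    static double docsPerSecond(int documentCount, long elapsedNanos) {
        return documentCount / (elapsedNanos / 1_000_000_000.0);
    }
}
```

Comparing loadNanos with indexNanos shows which phase dominates. If most of the time is spent in the loader, the slowdown lies in data retrieval rather than in Lucene. Note that closing the IndexWriter can also take noticeable time, so it may be worth timing that call as well.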


