lucene-general mailing list archives

From Johannes Lerch <lerch.johan...@googlemail.com>
Subject Performance problems on retrieving fields
Date Thu, 09 Sep 2010 08:01:30 GMT
Hi,

I am working on a search for stacktraces. To do this I implemented my own
Query, Weight and Scorer. I store the exception, the method and the frames as
fields in the index and can pick relevant documents by matching those fields
against my query stacktrace (using IndexReader.termDocs()). I implemented my
own scoring, which is calculated pairwise for stacktraces (the one from the
query and each of the relevant documents). For this scoring I calculate a
similarity between the two traces by checking which frames exist in both
and whether they appear in the same order. This works much like diff on text
or source code. My problem is that I need all frames contained in both
stacktraces, so I have to retrieve all frame fields of the stored stacktraces.
For now I do this with:
// Load only the frame fields of the stored document; skip everything else.
Document document = reader.document(doc, new FieldSelector() {
    @Override
    public FieldSelectorResult accept(String fieldName) {
        if (Indexer.FIELD_FRAMES.equals(fieldName)) {
            return FieldSelectorResult.LAZY_LOAD; // value is read on first access
        }
        return FieldSelectorResult.NO_LOAD;
    }
});
// All stored values of the frames field.
Fieldable[] fieldables = document.getFieldables(Indexer.FIELD_FRAMES);

But this call degrades performance to a level that is not acceptable for me
(more than 10 times slower with 100000 stacktraces in the index). So my
question is: are there other ways to get stored fields, or do you have
ideas for workarounds? Would it be better to store all stacktraces in a
database and retrieve them from there? If so, how do I get the docId of the
stacktraces I wrote to the index?
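
As a rough illustration of the diff-like frame comparison described above, here is a minimal sketch. It is not the actual scorer: the class name StackTraceSimilarity, the use of a longest common subsequence, and the normalization 2 * LCS / (|a| + |b|) are all assumptions made for this example. It only shows why every frame of both traces has to be available at scoring time.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Lucene): order-aware similarity of two frame lists. */
final class StackTraceSimilarity {

    /** Length of the longest common subsequence of frames, i.e. shared frames in matching order. */
    static int lcsLength(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.size()][b.size()];
    }

    /** Assumed normalization to [0, 1]; two empty traces score 0 (an arbitrary choice). */
    static double similarity(List<String> a, List<String> b) {
        int total = a.size() + b.size();
        return total == 0 ? 0.0 : 2.0 * lcsLength(a, b) / total;
    }
}
```

With such a helper, the frames of the query trace and the frames read from the stored fields of a candidate document would each be passed as a list, for example StackTraceSimilarity.similarity(queryFrames, storedFrames).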

Regards,
Johannes
