lucene-java-user mailing list archives

From "Rose, Stuart J" <stuart.r...@pnnl.gov>
Subject retrieved doc field values being cached?
Date Fri, 24 Feb 2012 21:18:52 GMT

Lucene (I am using 3.5) seems to be caching stored field values for documents after they have
been retrieved, and I am hoping someone can provide more information on how and where exactly
the field values are stored.

The table below lists the time (in milliseconds) taken to retrieve a single stored value from
each document in the set of documents matching a particular query. Results are shown for three
queries (A, B, and C), each submitted multiple times. The first time each query is submitted,
retrieving its matching documents' values takes considerably longer than on any later
submission.

Run  Query     nDocs    time (ms)
 1)  search A    489         1342
 2)  search A    489          811
 3)  search B  47038        76658
 4)  search B  47038         1062
 5)  search C   5256        22741
 6)  search C   5256          578
 7)  search A    489          515
 8)  search A    489          514
 9)  search B  47038         1000
10)  search B  47038          967
11)  search C   5256          563
12)  search C   5256          562
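
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a timing
harness. It is not the code that produced the numbers above. The class and method names
(RetrievalTimer, Timing, time, timeRepeated) are hypothetical, and the workload is a placeholder
that only counts to 489. In a real Lucene 3.x test, the placeholder would presumably run the
query and call IndexSearcher.doc(int) for each hit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Times repeated runs of a retrieval task, one row per run (hypothetical helper). */
class RetrievalTimer {

    /** One row of the timing table: which query, how many documents, how long. */
    static final class Timing {
        final String label;
        final int nDocs;
        final long millis;

        Timing(String label, int nDocs, long millis) {
            this.label = label;
            this.nDocs = nDocs;
            this.millis = millis;
        }

        @Override
        public String toString() {
            return String.format("%-10s nDocs = %-7d time = %d", label, nDocs, millis);
        }
    }

    /**
     * Runs the task once and records its wall-clock time in milliseconds.
     * The task returns the number of documents whose value it retrieved.
     */
    static Timing time(String label, Callable<Integer> retrieval) throws Exception {
        long start = System.nanoTime();
        int nDocs = retrieval.call();
        long millis = (System.nanoTime() - start) / 1_000_000L;
        return new Timing(label, nDocs, millis);
    }

    /** Runs the same task several times in a row so first and later runs can be compared. */
    static List<Timing> timeRepeated(String label, int runs, Callable<Integer> retrieval)
            throws Exception {
        List<Timing> rows = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            rows.add(time(label, retrieval));
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload (an assumption): stands in for "run the query,
        // then load one stored field per hit". It only counts to 489.
        Callable<Integer> searchA = () -> {
            int n = 0;
            for (int docId = 0; docId < 489; docId++) {
                n++;
            }
            return n;
        };
        for (Timing row : timeRepeated("search A", 2, searchA)) {
            System.out.println(row);
        }
    }
}
```

Timing each submission separately, rather than averaging, is what makes the gap between the
first run and later runs visible.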


Whatever is being cached remains available across separate processes, so presumably it resides
somewhere in the file system (and/or virtual memory). I have also seen the same behavior when
retrieving TermFreqVector information.

Any additional insight is appreciated!

Thanks,
Stuart


__________________________________________________
Stuart Rose
Senior Research Engineer
Pacific Northwest National Laboratory

