incubator-blur-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Blur Wiki] Update of "DataStructureDevelopment" by AaronMcCurry
Date Sun, 21 Oct 2012 19:57:56 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Blur Wiki" for change notification.

The "DataStructureDevelopment" page has been changed by AaronMcCurry:
http://wiki.apache.org/blur/DataStructureDevelopment?action=diff&rev1=3&rev2=4

  
  === User query requirements ===
  
+ ==== Sorting ====
+ 
- User queries for the most part are short lived and require minimal amounts of heap space
the big except is sorting.  These queries require the ordering field(s) to be loaded into
memory.  Many improvements have been made in Lucene 4 when it comes to field caching, but
the default implementation loads the entire field contents into the heap.  In addition to
the on heap version, Lucene offers a separate implementation that will read the field contents
from files (Directory API), this should be the implementation that Blur will use to perform
sorting.
+ User queries are for the most part short-lived and require minimal heap space;
the big exception is sorting.  These queries require the ordering field(s) to be loaded into
memory.  Many improvements have been made in Lucene 4 when it comes to field caching, but
the default implementation loads the entire field contents into the heap.  In addition to
the on-heap version, Lucene offers a separate implementation that reads the field contents
from files (Directory API); this should be the implementation that Blur uses to perform
sorting.
  
  NOTE: These features in Lucene 4.0 are called Column Stride Fields.
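
The difference between the on-heap and file-based approaches to sorting can be illustrated with a minimal sketch in plain Java. It does not use Lucene or Blur APIs: the class FileBackedLongColumn, its fixed 8-byte-per-document layout, and the method names are all assumptions made for illustration. The point it shows is that only the values of the matching documents are read from the file, so heap use grows with the number of hits rather than with the size of the field.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; this is not Lucene or Blur code.
// One 8-byte sort value per document, stored at offset docId * 8.
final class FileBackedLongColumn implements AutoCloseable {
    private final FileChannel channel;

    private FileBackedLongColumn(FileChannel channel) {
        this.channel = channel;
    }

    // Writes the column to disk, one long per document id.
    static void write(Path file, long[] valuesByDocId) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(valuesByDocId.length * Long.BYTES);
        for (long v : valuesByDocId) {
            buf.putLong(v);
        }
        buf.flip();
        try (FileChannel out = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (buf.hasRemaining()) {
                out.write(buf);
            }
        }
    }

    static FileBackedLongColumn open(Path file) throws IOException {
        return new FileBackedLongColumn(FileChannel.open(file, StandardOpenOption.READ));
    }

    // Positional read of a single value; the rest of the column stays on disk.
    long get(int docId) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
        long start = (long) docId * Long.BYTES;
        while (buf.hasRemaining()) {
            int n = channel.read(buf, start + buf.position());
            if (n < 0) {
                throw new IOException("doc id out of range: " + docId);
            }
        }
        buf.flip();
        return buf.getLong();
    }

    // Returns the hit doc ids ordered by ascending column value.
    // Only the values for the hits are held in memory.
    int[] sortHits(int[] hits) throws IOException {
        long[][] pairs = new long[hits.length][];
        for (int i = 0; i < hits.length; i++) {
            pairs[i] = new long[] { get(hits[i]), hits[i] };
        }
        Arrays.sort(pairs, Comparator.comparingLong((long[] p) -> p[0]));
        int[] sorted = new int[hits.length];
        for (int i = 0; i < hits.length; i++) {
            sorted[i] = (int) pairs[i][1];
        }
        return sorted;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

A real implementation would read through the Directory API and would likely buffer or memory-map reads instead of issuing one positional read per document; the sketch leaves that out to keep the trade-off visible.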
  
+ ==== Filtering ====
+ 
+ The next largest memory consumer for user queries is filter caching.  For the most part
this is accomplished through weakly referenced bit sets that represent the filter the user
requested.  A file based solution has not yet been implemented, but should be.
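
As a rough sketch of what "weakly referenced bit sets" means here, the following plain-Java class caches one bit set per filter behind a WeakReference. It is not taken from Blur or Lucene: the class name WeakFilterCache, the use of a String as the filter key, and the Supplier used to recompute a filter are all assumptions for illustration. A cached bit set stays available while some query still holds it, and the garbage collector is free to reclaim it once nothing else does, at which point the next request recomputes it.

```java
import java.lang.ref.WeakReference;
import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical illustration only; this is not Lucene or Blur code.
final class WeakFilterCache {
    // Filter key -> weak reference to the bit set of matching doc ids.
    private final Map<String, WeakReference<BitSet>> cache = new ConcurrentHashMap<>();

    // Returns the cached bit set for the filter, recomputing it if it was
    // never cached or has since been garbage collected.
    BitSet get(String filterKey, Supplier<BitSet> compute) {
        WeakReference<BitSet> ref = cache.get(filterKey);
        BitSet bits = (ref == null) ? null : ref.get();
        if (bits == null) {
            bits = compute.get();
            cache.put(filterKey, new WeakReference<>(bits));
        }
        return bits;
    }
}
```

The bit sets themselves still live on the heap while they are referenced, which is why a file-based alternative is worth having: weak references bound how long the memory is held, not how much is needed at once.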
+ 
