incubator-blur-dev mailing list archives

From "Ravikumar (JIRA)" <>
Subject [jira] [Updated] (BLUR-290) NRT Updates using RAMDirectory & Swap
Date Sat, 16 Nov 2013 17:43:21 GMT


Ravikumar updated BLUR-290:


The last patch handled the write part, to quickly determine the doc-id vs. row-id mapping.

This patch handles the reader/scoring part.

The internally uses a SortingMultiReader class 
to open readers for searching. This class internally opens a CompressingRowReader and adds
to RowReaderCache. This cache is on CoreClosedListener, which takes care of removing obsolete

The Scoring part is contained in PrimeDocCache and RowDocsCollector.

The PrimeDocCache will simply load the BitSet from CompressingRowReader, based on real-time

The RowDocsCollector gathers rows across segments and returns a globally score-sorted TopDocs.
I have left out "Super" scoring, as I do not know how to do it correctly. It has some hairy
logic I don't understand.

Depending on the number of rows matched, this Collector will take up a lot of memory, unlike
a PriorityQueue. We need to hold on to all rows until the last row is fully examined.
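
To make the memory trade-off concrete, here is a minimal sketch in plain Java. It is not the RowDocsCollector from the patch and uses no Lucene classes. The class name RowCollectSketch, the method topRows, and the rule that a row's score is the sum of its doc scores are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and the scoring rule are illustrative assumptions.
final class RowCollectSketch {

    // segments: for each segment, a list of (rowId, docScore) hits.
    // Returns up to n (rowId, rowScore) entries, highest score first.
    static List<Map.Entry<String, Float>> topRows(
            List<List<Map.Entry<String, Float>>> segments, int n) {
        // Every matched row stays in this map until all segments are done,
        // so memory grows with the number of matched rows, not with n.
        Map<String, Float> rowScores = new HashMap<>();
        for (List<Map.Entry<String, Float>> segment : segments) {
            for (Map.Entry<String, Float> hit : segment) {
                // Assumption: a row's score is the sum of its doc scores.
                rowScores.merge(hit.getKey(), hit.getValue(), Float::sum);
            }
        }
        // Only now are row scores final, so sorting happens at the very end.
        List<Map.Entry<String, Float>> rows = new ArrayList<>(rowScores.entrySet());
        rows.sort((a, b) -> {
            int byScore = Float.compare(b.getValue(), a.getValue());
            // Tie-break on row id to keep the order deterministic.
            return byScore != 0 ? byScore : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(rows.subList(0, Math.min(n, rows.size())));
    }
}
```

For example, if segment 1 yields r1=1.0 and r2=2.5, and segment 2 yields r1=2.0 and r3=0.5, then r2 leads after segment 1 but r1 (3.0) wins overall. A bounded PriorityQueue that had already discarded low-scoring rows could not recover from that, which is why all rows are held.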

Please do go through this when you find time, and see if it fits in with the existing Blur logic.

There is still the problem of plugging in RAMDirs here, which I shall probably attempt to
solve down the line. 

> NRT Updates using RAMDirectory & Swap
> -------------------------------------
>                 Key: BLUR-290
>                 URL:
>             Project: Apache Blur
>          Issue Type: New Feature
>    Affects Versions: experimental-dev
>            Reporter: Ravikumar
>         Attachments:
> We have been discussing how to handle humongous rows in Blur (BLUR-220). Explore the
> idea of using a RAMDirectory at the front, backed by a persistent index.

This message was sent by Atlassian JIRA
