incubator-blur-dev mailing list archives

From "Ravikumar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (BLUR-290) NRT Updates using RAMDirectory & Swap
Date Sat, 16 Nov 2013 17:43:21 GMT

     [ https://issues.apache.org/jira/browse/BLUR-290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravikumar updated BLUR-290:
---------------------------

    Attachment: BlurFieldsConsumer.java
                CompressingRowWriter.java
                CompressingRowReader.java
                RowReaderCache.java
                SortingMultiReader.java
                PrimeDocCache.java
                RowDocsCollector.java

The last patch handled the write part, which quickly determines the doc-id to row-id mapping.

This patch handles the reader/scoring part.

BlurRealTimeManager.java internally uses the SortingMultiReader class to open readers for
searching. SortingMultiReader in turn opens a CompressingRowReader and adds it to
RowReaderCache. The cache is hooked to a CoreClosedListener, which takes care of removing
obsolete readers.
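
As a rough illustration of the eviction idea (not the attached RowReaderCache), the sketch
below caches one reader per segment core key and hands back a listener that removes the
entry when the core closes. The ReaderCacheSketch class, its CloseListener interface, and
the method names are hypothetical stand-ins and are not Lucene or Blur APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a cache whose entries are evicted by a close callback.
final class ReaderCacheSketch<R> {

    /** Hypothetical callback standing in for a segment-core close notification. */
    interface CloseListener {
        void onClose(Object coreKey);
    }

    private final Map<Object, R> byCore = new ConcurrentHashMap<>();

    /** Caches a reader under its core key and returns the listener that evicts it. */
    CloseListener put(Object coreKey, R reader) {
        byCore.put(coreKey, reader);
        return key -> byCore.remove(key);
    }

    /** Returns the cached reader for the core key, or null if none is cached. */
    R get(Object coreKey) {
        return byCore.get(coreKey);
    }

    int size() {
        return byCore.size();
    }
}
```

In a real integration the returned listener would be registered with the segment reader, so
that an obsolete reader drops out of the cache without any explicit cleanup pass.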

The scoring part is contained in PrimeDocCache and RowDocsCollector.

PrimeDocCache simply loads the BitSet from CompressingRowReader, depending on the real-time
flag.
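
To show what such a BitSet can be used for, here is a minimal sketch that is not taken from
the attached PrimeDocCache. It assumes that each row's documents are contiguous within a
segment and that the set bits mark the first (prime) document of each row. The
PrimeDocSketch class and its methods are hypothetical.

```java
import java.util.BitSet;

// Hypothetical sketch: locating a row's document range from a prime-doc BitSet.
final class PrimeDocSketch {

    /** Returns the prime doc id at or before docId, or -1 if there is none. */
    static int rowStart(BitSet primeDocs, int docId) {
        return primeDocs.previousSetBit(docId);
    }

    /** Returns the exclusive end of the row containing docId (the next prime doc, or maxDoc). */
    static int rowEnd(BitSet primeDocs, int docId, int maxDoc) {
        int next = primeDocs.nextSetBit(docId + 1);
        return next == -1 ? maxDoc : next;
    }
}
```

Under these assumptions, a segment with prime docs at 0, 3 and 7 and ten documents in total
places document 5 in the row that spans documents 3 to 6.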

RowDocsCollector gathers rows across segments and returns a globally score-sorted TopDocs.
I have left out "Super" scoring, as I do not know how to do it correctly. It has some hairy
logic that I don't understand.

Depending on the number of rows matched, this Collector can take up a lot of memory, unlike
a bounded PriorityQueue. We need to hold on to all rows until the last row is fully examined.
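
The memory trade-off above can be illustrated with a minimal sketch. It is not the attached
RowDocsCollector: the Hit class, the topRows method, and the choice to sum document scores
into a row score are all assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: row-level top-N that must keep every matched row until the end.
final class RowCollectorSketch {

    /** Hypothetical hit: the row a matching document belongs to, plus that document's score. */
    static final class Hit {
        final String rowId;
        final float score;

        Hit(String rowId, float score) {
            this.rowId = rowId;
            this.score = score;
        }
    }

    /**
     * Accumulates a score per row (summing is an assumption), then sorts rows by score.
     * The map grows with the number of distinct rows matched, because a row's final score
     * is unknown until all of its documents have been seen.
     */
    static List<Map.Entry<String, Float>> topRows(List<Hit> hits, int n) {
        Map<String, Float> rowScores = new HashMap<>();
        for (Hit h : hits) {
            rowScores.merge(h.rowId, h.score, Float::sum);
        }
        List<Map.Entry<String, Float>> rows = new ArrayList<>(rowScores.entrySet());
        rows.sort((a, b) -> Float.compare(b.getValue(), a.getValue()));
        return new ArrayList<>(rows.subList(0, Math.min(n, rows.size())));
    }
}
```

A document-level collector can discard a hit as soon as it falls out of a size-N queue. Here
no row can be discarded early, since a later document may still raise its score.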

Please go through this when you find time and see whether it fits in with the existing Blur logic.

There is still the problem of plugging in RAMDirs here, which I will probably attempt to
solve later.

> NRT Updates using RAMDirectory & Swap
> -------------------------------------
>
>                 Key: BLUR-290
>                 URL: https://issues.apache.org/jira/browse/BLUR-290
>             Project: Apache Blur
>          Issue Type: New Feature
>    Affects Versions: experimental-dev
>            Reporter: Ravikumar
>         Attachments: BlurFieldsConsumer.java, BlurFieldsConsumer.java, BlurFlushingIndexWriter.java,
BlurIndexTracker.java, BlurPostingsConsumer.java, BlurPostingsFormat.java, BlurRealTimeIndex.java,
BlurRealTimeIndexWriter.java, BlurRealTimeManager.java, BlurRealTimeManagerReopenThread.java,
BlurRowCodec.java, BlurTermsConsumer.java, CompressingRowIndexReader.java, CompressingRowIndexWriter.java,
CompressingRowReader.java, CompressingRowReader.java, CompressingRowWriter.java, CompressingRowWriter.java,
GrowableByteArrayDataOutput.java, PrimeDocCache.java, RealTimeTransactionRecorder.java, RowCache.java,
RowDocsCollector.java, RowReaderCache.java, SlabAllocator.java, SlabRAMDirectory.java, SlabRAMFile.java,
SlabRAMInputStream.java, SlabRAMOutputStream.java, SortingMultiReader.java, SortingMultiReader.java,
TestCompressingRowWriter.java
>
>
> We have been discussing how to handle humongous rows in Blur (BLUR-220). Explore the
> idea of using a RAMDirectory at the front, backed by a persistent index.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
