incubator-blur-dev mailing list archives

From "Ravikumar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BLUR-290) NRT Updates using RAMDirectory & Swap
Date Wed, 08 Jan 2014 08:50:51 GMT

    [ https://issues.apache.org/jira/browse/BLUR-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865211#comment-13865211 ]

Ravikumar commented on BLUR-290:
--------------------------------

Thanks Aaron for the findings.

I think this is cool stuff and it sounds very interesting. Commits within milliseconds are awesome.

"In Blur now there's no need for the WAL because everything is committed to disk". 

I do not understand this part. Do you mean that FastHdfsKeyValueDirectory points to the
local dir and every operation takes place there, with an implicit commit?

I have also observed a few things on which I wanted to get your opinion:

1. In TransactionRecorder, we have a synchronized block around writer.commit(). For realtime
indexes, this could be detrimental, right?

2. commit() is supposed to be a very slow, asynchronous operation, as documented in the Lucene
API [tens of seconds is also acceptable!]. So improving commit time may turn out to be
too much work for too little gain.

3. NRTCachingDirectory is meant exactly for frequent NRT reopen calls, as in our case.
Newly created files stay in RAM and are synced to disk only during commit() calls [which are
async anyway], so HDFS metadata calls and FileStatus calls are fully avoided.
It would be interesting to see whether you can wrap the HDFSDir in an NRTCachingDir and
benchmark it against the experimental KeyValueDir.
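
To illustrate the idea in point 3, here is a toy sketch of a write-back cache in front of a
slow store. It is not Lucene's NRTCachingDirectory and not Blur code: CountingStore,
WriteBackCache and WriteBackCacheDemo are hypothetical names made up for this example, and
the sketch ignores the size limits (maxMergeSizeMB, maxCachedMB) that the real
NRTCachingDirectory uses to decide which files to cache. It only shows why many small NRT
flushes need not touch the backing store until commit time.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a slow remote store such as HDFS; counts every write call.
class CountingStore {
    final Map<String, byte[]> files = new HashMap<>();
    int writeCalls = 0;

    void write(String name, byte[] data) {
        writeCalls++;
        files.put(name, data.clone());
    }
}

// Toy write-back cache: newly created files stay in RAM until sync() is called.
class WriteBackCache {
    private final CountingStore backing;
    private final Map<String, byte[]> ram = new HashMap<>();

    WriteBackCache(CountingStore backing) {
        this.backing = backing;
    }

    // Corresponds to a small NRT flush; touches RAM only.
    void createFile(String name, byte[] data) {
        ram.put(name, data.clone());
    }

    // A reader (an NRT reopen) sees cached files first, then the backing store.
    // Returns null if the file exists in neither place.
    byte[] read(String name) {
        byte[] cached = ram.get(name);
        return cached != null ? cached : backing.files.get(name);
    }

    // Corresponds to commit time: push cached files to the store and evict them.
    void sync() {
        for (Map.Entry<String, byte[]> e : ram.entrySet()) {
            backing.write(e.getKey(), e.getValue());
        }
        ram.clear();
    }

    int cachedFileCount() {
        return ram.size();
    }
}

class WriteBackCacheDemo {
    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        WriteBackCache dir = new WriteBackCache(store);
        for (int i = 0; i < 100; i++) {
            dir.createFile("_seg" + i, new byte[] {(byte) i});
        }
        System.out.println("store writes before commit: " + store.writeCalls);
        dir.sync();
        System.out.println("store writes after commit: " + store.writeCalls);
    }
}
```

In Lucene itself the wrapping would go through the existing constructor
NRTCachingDirectory(Directory delegate, double maxMergeSizeMB, double maxCachedMB); which
thresholds suit an HDFS-backed delegate is something the benchmark would have to show.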

> NRT Updates using RAMDirectory & Swap
> -------------------------------------
>
>                 Key: BLUR-290
>                 URL: https://issues.apache.org/jira/browse/BLUR-290
>             Project: Apache Blur
>          Issue Type: New Feature
>    Affects Versions: experimental-dev
>            Reporter: Ravikumar
>         Attachments: BlurFieldsConsumer.java, BlurFieldsConsumer.java, BlurFieldsConsumer.java,
BlurFlushingIndexWriter.java, BlurIndexTracker.java, BlurPostingsConsumer.java, BlurPostingsConsumer.java,
BlurPostingsFormat.java, BlurPostingsFormat.java, BlurRealTimeIndex.java, BlurRealTimeIndex.java,
BlurRealTimeIndexTest.java, BlurRealTimeIndexWriter.java, BlurRealTimeManager.java, BlurRealTimeManagerReopenThread.java,
BlurRowCodec.java, BlurRowCodec.java, BlurSegmentInfoFormat.java, BlurSegmentInfoWriter.java,
BlurTermsConsumer.java, BlurTermsConsumer.java, CompressingRowIndexReader.java, CompressingRowIndexWriter.java,
CompressingRowReader.java, CompressingRowReader.java, CompressingRowReader.java, CompressingRowWriter.java,
CompressingRowWriter.java, CompressingRowWriter.java, GrowableByteArrayDataOutput.java, PrimeDocCache.java,
RealTimeTransactionRecorder.java, RealTimeTransactionRecorder.java, RowCache.java, RowDocsCollector.java,
RowDocsCollector.java, RowReaderCache.java, RowReaderCache.java, SlabAllocator.java, SlabRAMDirectory.java,
SlabRAMFile.java, SlabRAMInputStream.java, SlabRAMOutputStream.java, SortingMultiReader.java,
SortingMultiReader.java, TestCompressingRowWriter.java, TestCompressingRowWriter.java
>
>
> We have been discussing how to handle humongous rows in Blur (BLUR-220). Explore the
idea of using a RAMDirectory at the front, backed by a persistent index.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
