lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Rutherglen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2312) Search on IndexWriter's RAM Buffer
Date Tue, 16 Mar 2010 05:50:27 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845721#action_12845721
] 

Jason Rutherglen commented on LUCENE-2312:
------------------------------------------

Just to clarify, I think Mike's referring to ParallelArray?

http://gee.cs.oswego.edu/dl/jsr166/dist/extra166ydocs/extra166y/P
arallelArray.html

There's AtomicIntegerArray:
http://www.melclub.net/java/_atomic_integer_array_8java_source.html 
which underneath uses the sun.Unsafe class for volatile array
access. Could this be reused for an AtomicByteArray class (why
isn't there one of these already?).

A quick and easy way to solve this is to use a read write lock
on the byte pool? Remember when we'd sync on each read bytes
call to the underlying random access file in FSDirectory (eg,
now we're using NIOFSDir which can be a good concurrent
throughput improvement). Lets try the RW lock and examine the
results? I guess the issue is we're not writing in blocks of
bytes, we're actually writing byte by byte and need to read byte
by byte concurrently? This sounds like a fairy typical thing to
do? 


> Search on IndexWriter's RAM Buffer
> ----------------------------------
>
>                 Key: LUCENE-2312
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2312
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>    Affects Versions: 3.0.1
>            Reporter: Jason Rutherglen
>            Assignee: Michael Busch
>             Fix For: 3.1
>
>
> In order to offer user's near realtime search, without incurring
> an indexing performance penalty, we can implement search on
> IndexWriter's RAM buffer. This is the buffer that is filled in
> RAM as documents are indexed. Currently the RAM buffer is
> flushed to the underlying directory (usually disk) before being
> made searchable. 
> Todays Lucene based NRT systems must incur the cost of merging
> segments, which can slow indexing. 
> Michael Busch has good suggestions regarding how to handle deletes using max doc ids.
 
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841923&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841923
> The area that isn't fully fleshed out is the terms dictionary,
> which needs to be sorted prior to queries executing. Currently
> IW implements a specialized hash table. Michael B has a
> suggestion here: 
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841915&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841915

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message