Message-ID: <526864004.1241461890514.JavaMail.jira@brutus>
Date: Mon, 4 May 2009 11:31:30 -0700 (PDT)
From: "Jason Rutherglen (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-1313) Realtime Search
In-Reply-To: <1125794672.1214154225042.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705675#action_12705675 ]

Jason Rutherglen commented on LUCENE-1313:
------------------------------------------

{quote}I don't like how "deep" the dichotomy of "RAMDir vs FSDir" {quote}

Agreed, it's a bit awkward but I don't see another way to do this. The good thing is if IW has written some .fdt files to the main dir (via FSD), IW crashes, then IW is created again, IFD automatically deletes the extraneous .fdt (and other extension) files.

{quote}Why can't we push FSD down to all these places (IFD, SegmentInfo/s, etc.)?{quote}

{quote}Could we simply make the single CMS instance smart enough to realize that a single RAM merge is allowed to proceed regardless of the thread limit?{quote}

Hmm... I think for benchmarking it would be good to allow options as we simply don't know. In the latest patch a ram mergescheduler can be set to the IndexWriter.

{quote}have to fix FSD to understand CFX must go to the dir too{quote}

I think this is fixed in the patch, where compound files are not created in RAM.

{quote} You're saying we should have IW create the ramdir by default after getReader is called and remove the IW ramdir constructor? Right. This should be "under the hood".{quote}

Ok, this will require some reworking of the patch.

{quote}OK, though I'd like to simply always use FSD, even if primary & secondary are the same dir. {quote}

How will always using FSD work? Doesn't it assume writing to two different directories?

{quote}this ram size should be used not only for deciding when it's time to merge to a disk segment, but also when it's time for DW to flush a new segment{quote}

In the new patch this is fixed.

{quote}So if budget is 32 MB, and net RAM used (segments + DW) is say 22, we have a 10 MB "budget", so we are allowed to select merges that total to < 10 MB.{quote}

One issue is the ram buffer flush doubles the ram used (because the segment is flushed as is to the RAM dir). You're saying roughly estimate the ram size used on the result of a merge and have the merge policy take this into account?
This makes sense, otherwise we will consistently (if temporarily) exceed the ram buffer size. The algorithm is fairly simple? Find segments whose total sizes are lower than whatever we have left of the max ram buffer size?

I have new code, but will rework it a bit to include this discussion.

> Realtime Search
> ---------------
>
>                 Key: LUCENE-1313
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1313
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1313.jar, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch
>
>
> Realtime search with transactional semantics.
> Possible future directions:
>   * Optimistic concurrency
>   * Replication
> Encoding each transaction into a set of bytes by writing to a RAMDirectory enables replication. It is difficult to replicate using other methods because while the document may easily be serialized, the analyzer cannot.
> I think this issue can hold realtime benchmarks which include indexing and searching concurrently.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
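
For reference, the budget-based selection discussed in the comment above (pick RAM segments whose total size stays below what is left of the max RAM buffer) could be sketched roughly as follows. This is only an illustration, not code from any LUCENE-1313 patch: the class name RamMergeBudget, the method selectWithinBudget, and the smallest-first greedy order are all assumptions, and the merged segment's size is estimated simply as the sum of its inputs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these names exist in Lucene.
class RamMergeBudget {

    /**
     * Picks RAM segment sizes (in bytes) for one merge so that their total
     * stays strictly below the remaining budget (maxRamBytes - usedRamBytes).
     * Assumption: the merged result is estimated as the sum of its inputs.
     */
    static List<Long> selectWithinBudget(List<Long> segmentSizes,
                                         long maxRamBytes,
                                         long usedRamBytes) {
        long remaining = maxRamBytes - usedRamBytes;

        // Assumed heuristic: consider the smallest segments first.
        List<Long> sorted = new ArrayList<Long>(segmentSizes);
        Collections.sort(sorted);

        List<Long> selected = new ArrayList<Long>();
        long total = 0;
        for (long size : sorted) {
            if (total + size >= remaining) {
                break; // adding this segment would reach or exceed the budget
            }
            selected.add(size);
            total += size;
        }

        // A merge of fewer than two segments is pointless, so select nothing.
        if (selected.size() < 2) {
            return Collections.<Long>emptyList();
        }
        return selected;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Budget of 32 MB with 22 MB in use leaves 10 MB, as in the quoted example.
        List<Long> sizes = Arrays.asList(6 * mb, 2 * mb, 5 * mb, 3 * mb);
        System.out.println(selectWithinBudget(sizes, 32 * mb, 22 * mb));
    }
}
```

A real merge policy would also have to account for the DW buffer growing while the merge runs; the sketch ignores that and treats the used RAM as a fixed number.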