From: "Jason Rutherglen (JIRA)"
To: java-dev@lucene.apache.org
Date: Fri, 1 May 2009 11:39:30 -0700 (PDT)
Subject: [jira] Updated: (LUCENE-1313) Realtime Search
Message-ID: <1787456491.1241203170504.JavaMail.jira@brutus>
In-Reply-To: <1125794672.1214154225042.JavaMail.jira@brutus>

[ https://issues.apache.org/jira/browse/LUCENE-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Rutherglen updated LUCENE-1313:
-------------------------------------
    Attachment:
LUCENE-1313.patch

* IndexFileDeleter now takes the ram directory into account (not doing so caused files to not be found when using NRT with the FSD).
* FSD (FileSwitchDirectory) is included and writes fdx, fdt, tvx, tvf, tvd extension files to the primary directory (which is the same as IW.directory). LUCENE-1618 needs to be updated with these changes (or we simply include it in this patch, as the LUCENE-1618 patch is only a couple of files).
* Removed DocumentsWriter.ramOverLimit
* I think we need to give the option of a ram mergescheduler because the user may not want the ram merging and disk merging to compete for threads. I'm thinking of the use case where NRT is a priority: one may then allocate more threads to the ram CMS and fewer to the disk CMS. This also gives us the option of trying out more parameters when performing benchmarks of NRT.
* We may want to default the ram mergepolicy to not use compound files, as they're not useful when using a ram dir?
* Because FSD uses IW.directory, FSD will list files that originated both from FSD and from IW.directory. We may want to keep track of which files are supposed to be in FSD (from the underlying primary dir) and which are not?

{quote}If NRT is never used, the behavior of IW should be unchanged (which is not the case w/ this patch I think). RAMDir should be created the first time a flush is done due to NRT creation.{quote}
In the patch, if a ramdir is not passed in, the behavior of IW remains the same as it is today. You're saying we should have IW create the ramdir by default after getReader is called and remove the IW ramdir constructor? What if the user has an alternative ramdir implementation they want to use?

{quote}StoredFieldsWriter & TermVectorsTermsWriter now writes to IndexWriter.getFlushDirectory(), which is confusing because that method returns the RAMDir if set? Shouldn't this be the opposite? (Ie it should flush to IndexWriter.getDirectory()?
Or we should change getFlushDirectory to NOT return the ramdir?){quote}
The attached patch uses FileSwitchDirectory, where these files are written to the primary directory (IW.directory). So getFlushDirectory is ok?

{quote}Why did you need to add synchronized to some of the SegmentInfo files methods? (What breaks if you undo that?). The contract here is IW protects access to SegmentInfo/s{quote}
SegmentInfo.files was being cleared while sizeInBytes was called, which resulted in an NPE. The alternative is to sync on IW in IW.size(SegmentInfos), which seems a bit extreme just to obtain the size of a segment info?

{quote}The MergePolicy needs some smarts when it's dealing w/ RAM. EG it should not do a merge of more than XXX% of total RAM usage (should flush to the real directory instead){quote}
Isn't this handled well enough in updatePendingMerges, or is there more that needs to be done?

> Realtime Search
> ---------------
>
>                 Key: LUCENE-1313
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1313
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1313.jar, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, LUCENE-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch, lucene-1313.patch
>
>
> Realtime search with transactional semantics.
> Possible future directions:
>   * Optimistic concurrency
>   * Replication
> Encoding each transaction into a set of bytes by writing to a RAMDirectory enables replication. It is difficult to replicate using other methods because while the document may easily be serialized, the analyzer cannot.
> I think this issue can hold realtime benchmarks which include indexing and searching concurrently.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.