From: Karl Wettin
To: java-user@lucene.apache.org
Subject: Re: instantiated contrib
Date: Thu, 26 Aug 2010 20:13:20 +0200

My mail client died while sending this mail. Sorry for any duplicates.

It is strange that it should take 20 seconds to gather fields; this is the only thing that really surprises me. I'd expect it to be instant compared to RAMDirectory. It is hard to say from the information you provided. Did you perhaps lazy-load field values from your RAMDirectory and not retrieve them, or something like that?

Why your queries are slow is also hard to say; there can be many reasons. 70k documents can be quite a lot for II (InstantiatedIndex) if they contain enough text.

Here are a few questions that may or may not be helpful:

What is the content of the documents? Do they contain a lot of the same text? Or are they all rather unique?

The major thing that makes II faster than RAMDirectory is that it does not have to deserialize values from the bytestream.
As the index grows, binary searching for documents containing a given term will start to consume more time than deserializing the index.

What speed do you see if you only load 10% (7k)?

Did you see the graphics in the package-level javadocs?

http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/store/instantiated/package-summary.html


        karl

On 26 Aug 2010, at 09:24, Li Li wrote:

> I have about 70k document, the total indexed size is about 15MB(the
> orginal text files' size).
> dir=new RAMDirectory();
> IndexWriter write=new IndexWriter(dir,...;
> for(loop){
>     writer.addDocument(doc);
> }
> writer.optimize();
> writer.close();
>
> IndexReader ir=IndexReader.open(dir,true);
> InstantiatedIndex ii=new InstantiatedIndex(ir);
> InstantiatedIndexReader iir=new InstantiatedIndexReader(ii);
> is=new IndexSearcher(ir);
> is2=new IndexSearcher(iir);
>
> I calculate the time by:
> long searchStart=System.nanoTime();
> TopDocs docs=is.search(bQuery,Integer.MAX_VALUE);
> long searchEnd=System.nanoTime();
>
> I searched 10,000 documents and the time of RAMDirectory
> and instantiated
> the time used is time1: 21s(21812978000 ns) time2:
> 20s(20713817000 ns)
> I also calulate the time including get field value:
> total1: 23852ms total2: 22610ms
> it seems instantiated is not much faster than
> RAMDirectory. Is there any thing wrong I used? my max memory is 4GB
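
For the 10% versus 100% comparison asked about above, a small timing harness along these lines might help. This is only a sketch and it does not use Lucene at all: the class `SearchTimer`, its methods, and the sorted `int[]` standing in for a list of document ids are hypothetical names invented for illustration. It shows the general shape of the measurement (warm up first, then average `System.nanoTime()` over many runs) rather than how InstantiatedIndex actually stores or searches its data.

```java
import java.util.Arrays;
import java.util.function.IntSupplier;

public class SearchTimer {

    // Counts how many probe values occur in the sorted array, using binary search.
    public static int countHits(int[] sortedDocs, int[] probes) {
        int hits = 0;
        for (int p : probes) {
            if (Arrays.binarySearch(sortedDocs, p) >= 0) {
                hits++;
            }
        }
        return hits;
    }

    // Runs the workload `warmup` times untimed, then returns the mean
    // elapsed nanoseconds per run over `runs` timed runs.
    public static long averageNanos(IntSupplier workload, int warmup, int runs) {
        long sink = 0;
        for (int i = 0; i < warmup; i++) {
            sink += workload.getAsInt();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += workload.getAsInt();
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated result so the JIT is less likely to drop the work.
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink);
        }
        return elapsed / runs;
    }

    // Builds 0, 2, 4, ... as a hypothetical stand-in for a sorted list of document ids.
    public static int[] evenIds(int n) {
        int[] ids = new int[n];
        for (int i = 0; i < n; i++) {
            ids[i] = 2 * i;
        }
        return ids;
    }

    public static void main(String[] args) {
        int[] probes = evenIds(1_000);
        // Compare the 10% size (7k) against the full size (70k).
        for (int size : new int[] {7_000, 70_000}) {
            int[] docs = evenIds(size);
            long ns = averageNanos(() -> countHits(docs, probes), 1_000, 10_000);
            System.out.println(size + " ids: " + ns + " ns per batch of "
                    + probes.length + " lookups");
        }
    }
}
```

To measure the real thing, the lambda passed to `averageNanos` would wrap the actual search call against each searcher, once with the 7k index and once with the full 70k index.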