From: Li Li
To: java-user@lucene.apache.org
Date: Wed, 7 Jul 2010 14:30:05 +0800
Subject: Re: How to manage resource out of index?

thank you.

2010/7/7 Rebecca Watson:
> hi li,
>
> i looked at doing something similar - where we only index the text
> but retrieve search results / highlight from files -- we ended up giving
> up because of the amount of customisation required in solr -- mainly
> because we wanted the distributed search functionality in solr (which
> meant making sure the original file ended up on the same file system,
> i.e. machine, too!).
>
> we ended up just storing the main text field too even though there was a
> bit of text -- in the end solr/lucene can handle the index size fine and
> disk space is cheaper than man-hours to customise solr/lucene to work
> in this way!
>
> that was our conclusion anyway and it works fine -- we also have
> separate index / search server(s) so we don't care about merge time
> either -- and as i said above - we use the distributed search so don't tend
> to need to merge very large indexes anyway.
> when your system grows / you go into production you'll probably split
> the indexes too, to use solr's distributed search func. (for the sake of
> query speed).
>
> hope that helps,
>
> bec :)
>
> On 7 July 2010 14:07, Li Li wrote:
>> I used to store the full text in the lucene index.
>> But I found it's very slow when merging the index, because merging 2
>> segments copies the fdt files into a new one. So I want to only index
>> the full text. But when searching I need the full text for applications
>> such as highlighting and viewing the full text. I can store the full text
>> by pair in a database and load it into memory. And when I search in
>> lucene (or solr), I retrieve the url of the doc first, then use the url
>> to get the full text. But when they are stored separately, they are hard
>> to manage. They may not be consistent with each other. Does lucene or
>> solr provide any method to ease this problem? Or does anyone have some
>> experience with this problem?
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
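
The two-step lookup described in the quoted question (search returns a url, then the url is used to fetch the full text) can be sketched roughly as below. This is only an illustration under assumptions: the class ExternalTextStore and all of its methods are hypothetical, both the "index" and the "database" are plain in-memory maps, and nothing here is Lucene or Solr API. The one point it tries to show is write ordering: store the text before making the url searchable, and remove the url from the index before dropping the text, so that a search hit should not point at a missing text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical sketch: the search index keeps only the url, and the full
 * text lives in a separate store keyed by that url. Both are in-memory
 * stand-ins; this is not Lucene or Solr API.
 */
final class ExternalTextStore {
    // Stand-in for the database: url -> full text.
    private final Map<String, String> textByUrl = new HashMap<>();
    // Stand-in for the index: term -> urls of documents containing it.
    private final Map<String, List<String>> urlsByTerm = new HashMap<>();

    /** Add or replace a document: drop stale index entries, store the text, then index it. */
    public void add(String url, String fullText) {
        urlsByTerm.values().forEach(urls -> urls.remove(url));
        textByUrl.put(url, fullText);
        for (String term : fullText.toLowerCase().split("\\W+")) {
            if (term.isEmpty()) {
                continue; // split can yield a leading empty token
            }
            List<String> urls = urlsByTerm.computeIfAbsent(term, t -> new ArrayList<>());
            if (!urls.contains(url)) {
                urls.add(url);
            }
        }
    }

    /** Delete in the reverse order: make the url unsearchable first, then drop the text. */
    public void delete(String url) {
        urlsByTerm.values().forEach(urls -> urls.remove(url));
        textByUrl.remove(url);
    }

    /** Step one of a query: the index returns urls only. */
    public List<String> search(String term) {
        return new ArrayList<>(urlsByTerm.getOrDefault(term.toLowerCase(), List.of()));
    }

    /** Step two: load the full text for a hit; empty if the two stores have drifted apart. */
    public Optional<String> fullText(String url) {
        return Optional.ofNullable(textByUrl.get(url));
    }
}
```

A caller would run search("merge") and then fullText(url) on each hit, treating an empty result as a sign that the index and the store are out of sync. With a real index and a real database the two writes are not atomic, so a crash between them can still leave them inconsistent; the ordering only narrows the window.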