Message-ID: <6e3ae6310710162002g280374f7n422a718dbb5571f7@mail.gmail.com>
Date: Tue, 16 Oct 2007 20:02:39 -0700
From: "Chris Lu"
To: java-user@lucene.apache.org
Subject: Re: use lucene as datastore?
In-Reply-To: <13246220.post@talk.nabble.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_46975_7854058.1192590159808"

------=_Part_46975_7854058.1192590159808
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

I have no experience with this, but two points come to mind:

1) You can use a compressed field to store the text.
2) Use the hash code of the path as the key.

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes

On 10/16/07, argh wrote:
>
> Hi,
>
> I'm adding Lucene to an existing project where a daemon monitors a
> frequently updated file system tree containing lots of expensive-to-parse
> files for changes in order to keep cached metadata up to date about each
> file. (File writes unfortunately cannot be routed to allow for more
> efficient change detection.)
>
> Metadata is currently stored in a mirror directory tree as individual
> files that are a trivial XML serialization of the same data that will
> soon be indexed by Lucene.
>
> I'm thus curious about the possibility of eliminating the XML files
> altogether and just using Lucene to store the metadata. It seems like it
> could be a big win on the complexity front. My main concern lies with the
> time and space efficiency of switching from implicit filename lookups to
> the search-based model of "find the one document with the path field
> containing /some/really/long/pathname".
>
> This seems like a really common type of problem, but my searching didn't
> turn up anything useful. Pointers? Thoughts?
>
> Thanks...
>
> -rg
>
> --
> View this message in context:
> http://www.nabble.com/use-lucene-as-datastore--tf4637962.html#a13246220
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
------=_Part_46975_7854058.1192590159808--
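
A minimal sketch of the two suggestions in the reply (store the text compressed, and key each record by the hash code of its path), under stated assumptions. It deliberately does not call Lucene: a `HashMap` stands in for the index and `java.util.zip` stands in for a compressed stored field. The names `PathKeyedStore`, `put`, and `get` are hypothetical, not part of any library. Since `String.hashCode()` is only 32 bits wide, two different paths can share a key, so the sketch also keeps the full path beside the compressed value and compares it on lookup.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical stand-in for an index: hash-of-path key -> compressed metadata. */
final class PathKeyedStore {

    /** One stored record: the full path (to resolve hash collisions) plus compressed text. */
    private static final class Entry {
        final String path;
        final byte[] compressed;

        Entry(String path, byte[] compressed) {
            this.path = path;
            this.compressed = compressed;
        }
    }

    // Key is the 32-bit hash code of the path; the list holds colliding paths.
    private final Map<Integer, List<Entry>> byHash = new HashMap<>();

    /** Deflate-compresses the UTF-8 bytes of the given text. */
    static byte[] compress(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Reverses compress(); rejects truncated or corrupt input. */
    static String decompress(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && !inflater.finished()) {
                    throw new IllegalArgumentException("truncated compressed data");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Stores or replaces the metadata for a path. */
    void put(String path, String metadata) {
        List<Entry> bucket = byHash.computeIfAbsent(path.hashCode(), k -> new ArrayList<>());
        bucket.removeIf(e -> e.path.equals(path));
        bucket.add(new Entry(path, compress(metadata)));
    }

    /** Returns the metadata for a path, or null if the path is unknown. */
    String get(String path) {
        List<Entry> bucket = byHash.get(path.hashCode());
        if (bucket == null) {
            return null;
        }
        for (Entry e : bucket) {
            if (e.path.equals(path)) { // confirm the path; the hash alone is not unique
                return decompress(e.compressed);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        PathKeyedStore store = new PathKeyedStore();
        store.put("/some/really/long/pathname", "<metadata><size>42</size></metadata>");
        System.out.println(store.get("/some/really/long/pathname"));
    }
}
```

In a real index the same idea would presumably mean an untokenized key field holding the hash, with the full path stored alongside it for the final comparison; whether the shorter key actually beats indexing the path itself untokenized is something to measure rather than assume.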