Subject: Re: Lucene as a primary datastore
From: Jacob Rhoden
Date: Thu, 21 Jan 2010 09:09:26 +1100
To: java-user@lucene.apache.org
In-Reply-To: <856ac15f1001200230g6a95ea0dp50c1502a4c6ac575@mail.gmail.com>

In the same way that you should take regular exports/dumps of your
MySQL databases, you could use the same strategy with Lucene. As long as you have code that exports your data daily, and code that can rebuild your index from that export, the most you will lose in the event of a problem is up to 24 hours of data, yes?

The whole concept of using Lucene as the data store has also been on my mind, simply because I have some systems where the Lucene index is simply a copy of all of the MySQL data. It makes me wonder why I even bother with the MySQL part (:

On 20/01/2010, at 9:30 PM, Chris Harris wrote:

> I don't do a lot of work with straight Lucene right now, but I do use
> Solr, and from time to time the Lucene index inside my master Solr
> server gets corrupted; in particular, some of the Lucene segment files
> that are still in use somehow get deleted, resulting in Lucene
> throwing FileNotFoundExceptions. Once this happens, I have to either
> rebuild the whole index, or else run the Lucene CheckIndex tool in
> "fix" mode, which renders the index operable again, but at the expense
> of throwing away some of the data. This happens rarely, and I haven't
> been able to diagnose it yet. In the meantime, though, I find it
> somewhat reassuring to know that my source data is in a SQL table.
>
> I don't know that this experience is relevant to you; my problem could
> come from a variety of sources outside Lucene, including a potential
> bug in Solr, and user error on my part. All the same, perhaps it would
> be worth searching the mailing list archives for FileNotFound, to see
> what else comes up?
>
> On Tue, Jan 19, 2010 at 7:58 PM, Guido Bartolucci wrote:
>> I know that the primary use case for Lucene is as an index of data
>> that can be reconstructed (e.g., from a relational database or from
>> spidering your corporate intranet).
>>
>> But, I'm curious if anyone uses Lucene as their primary datastore for
>> their gold data. Is it good enough?
>>
>> Would anyone consider (or do people already) store data in Lucene
>> that, if it was lost, would destroy their business? And no, I'm not
>> suggesting that you don't back up this data, I'm just curious if there
>> are problems with using Lucene in this way. Are there subtle
>> corruptions that might show up in Lucene that wouldn't show up in
>> Oracle or MySQL?
>>
>> I'm considering using Lucene in this way but I haven't been able to
>> find any documentation describing this use case. Are there any studies
>> of Lucene vs MySQL running for N years comparing the corruptions and
>> recovery times?
>>
>> Am I just ignorant and scared of Lucene and too trusting of Oracle and MySQL?
>>
>> Thanks.
>>
>> -guido.
>>
>> (BTW, I did find a similar question asked back in 2007 in the archives
>> but it doesn't really answer my question)
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org

Kind regards,
Jacob Rhoden
____________________________________
Information Technology Services,
The University of Melbourne
Email: jrhoden@unimelb.edu.au
Phone: +61 3 8344 2884
Mobile: +61 4 1095 7575