From: ChristophD
To: java-user@lucene.apache.org
Subject: Re: Using Luke on a Lucene Index in a Database
Date: Tue, 9 Jun 2009 05:47:21 -0700 (PDT)
Message-ID: <23942165.post@talk.nabble.com>
In-Reply-To: <23630467.post@talk.nabble.com>
References:
 <23611846.post@talk.nabble.com>
 <6f4104d80905190146m1bd39d76k9cbc5fad0ea2a4fc@mail.gmail.com>
 <23613338.post@talk.nabble.com>
 <359a92830905190716l5c073d31j8f13777985bbcf4a@mail.gmail.com>
 <23630467.post@talk.nabble.com>

Following a request to share my experience with this issue, I am posting the
most important functions of the program. Every DB record maps directly to one
file. The one function I did not include is getDataSource(), which acquires a
JDBC DataSource for your database.

cheers,
Christoph


import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

// Buffer size used by copyLarge (same value as in commons IO).
private static final int DEFAULT_BUFFER_SIZE = 1024 * 4;

private void run() throws Exception {
    DataSource ds = getDataSource();
    File toDir = new File("outputDir");
    // Explicit check instead of assert: assertions are disabled by default.
    if (!toDir.isDirectory() && !toDir.mkdirs()) {
        throw new IOException("cannot create directory: " + toDir);
    }
    copyToFilesystem(ds, toDir);
}

// Writes every record of the index table to a file named after NAME_.
// "log" is the logger of the enclosing class (not shown).
public void copyToFilesystem(DataSource ds, File toDir)
        throws SQLException, IOException {
    // try-with-resources closes everything even if a copy fails.
    try (Connection conn = ds.getConnection();
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt
                 .executeQuery("select NAME_, VALUE_ from XXXX.XXXX")) {
        while (rs.next()) {
            String name = rs.getString("NAME_");
            log.info("filename: '" + name + "'");
            File file = new File(toDir, name);
            // createNewFile() returns false if the file already exists.
            if (!file.createNewFile()) {
                throw new IOException("file already exists: " + file);
            }
            try (InputStream inStream = rs.getBinaryStream("VALUE_");
                 FileOutputStream outStream = new FileOutputStream(file)) {
                copyLarge(inStream, outStream);
            }
        }
        // Nothing was modified, so end the transaction without committing.
        // rollback() is only valid when auto-commit is off.
        if (!conn.getAutoCommit()) {
            conn.rollback();
        }
    }
}

/**
 * Taken from commons IO
 *
 * @see http://svn.apache.org/viewvc/commons/proper/io/trunk/src/java/org/apache/commons/io/IOUtils.java?revision=736890
 */
public static long copyLarge(InputStream input, OutputStream output)
        throws IOException {
    byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
    long count = 0;
    int n = 0;
    // read() returns -1 at end of stream.
    while (-1 != (n = input.read(buffer))) {
        output.write(buffer, 0, n);
        count += n;
    }
    return count;
}


ChristophD wrote:
>
> Ok, so let me clear it up.
>
> Lucene offers different types of Directories
> (org.apache.lucene.store.Directory) into which it stores the index data.
> Most people probably use the FSDirectory implementation which writes the
> index data as files into the filesystem. However, we use the DbDirectory
> implementation which writes into a specified relational database.
>
> Now, I was really surprised to see that Luke only offers to open an index
> that was written to the filesystem. I had expected to be able to supply a
> jdbc url.
>
> The way Lucene writes the index into the DB is really a direct projection
> of the FS version. For every file it creates a record in the DB which has
> a name and a BLOB that contains the file's data.
>
> So what I did was, I wrote a small program that reads the index from the
> DB and writes it back as files. These files I could open with Luke without
> problems.
>
> so long,
> Christoph
>
>
> Erick Erickson wrote:
>>
>> Well, you haven't really provided much in the way of details. For instance,
>> what does it mean that your Lucene index is
>> "stored in a database"? Did you store it as a BLOB? Your
>> problem statement is very hard to understand, please explain
>> in more detail. Pretend you don't know a thing about your
>> app (as in, you're just a random reader of this list) and imagine
>> you were trying to understand well enough to offer useful
>> responses...
>>
>> Best
>> Erick
>>
>> On Tue, May 19, 2009 at 6:12 AM, ChristophD wrote:
>>
>>>
>>> This isn't really addressing my problem. I already have a running search
>>> system and just want to analyze it.
>>>
>>> cheers,
>>> Christoph
>>>
>>>
>>>
>>> amin1977 wrote:
>>> >
>>> > Are you using an object relational mapping tool like Hibernate? If you
>>> > are, you could use Hibernate Search to index your persistent entities
>>> > and then use Luke to inspect the indexes. There may be other ways of
>>> > doing it, I guess. Just a thought.
>>> >
>>> >
>>> > Cheers
>>> > Amin
>>> >
>>> > On Tue, May 19, 2009 at 9:23 AM, ChristophD wrote:
>>> >
>>> >>
>>> >> Hello,
>>> >>
>>> >> I would like to use Luke to connect to an existing Lucene Index which is
>>> >> stored in a Database.
>>> >> However, Luke seems only to be able to read file based indexes.
>>> >>
>>> >> What are my options to analyze the DB index?
>>> >>
>>> >> thx,
>>> >> Christoph
>>> >> --
>>> >> View this message in context:
>>> >> http://www.nabble.com/Using-Luke-on-a-Lucene-Index-in-a-Database-tp23611846p23611846.html
>>> >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>> >>
>>> >>
>>> >> ---------------------------------------------------------------------
>>> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> >> For additional commands, e-mail: java-user-help@lucene.apache.org
>>> >>
>>> >>
>>> >
>>> >
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Using-Luke-on-a-Lucene-Index-in-a-Database-tp23611846p23613338.html
>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>
>

--
View this message in context: http://www.nabble.com/Using-Luke-on-a-Lucene-Index-in-a-Database-tp23611846p23942165.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
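
The record-to-file mapping described in this thread (one record per index
file, consisting of a name and a BLOB) can be tried out without a database.
The following is only a minimal sketch under stated assumptions: a
Map<String, byte[]> stands in for the NAME_/VALUE_ rows, and the class name
RecordExportSketch, the method exportRecords and the 4096-byte buffer are
hypothetical names and values chosen for illustration, not part of the
program posted above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Map;

// Hypothetical stand-in for the DB export: each map entry plays the role of
// one record (key = NAME_, value = the BLOB bytes of VALUE_).
class RecordExportSketch {

    // Writes one file per record into toDir and returns the number of files
    // written. Fails if a target file already exists.
    static int exportRecords(Map<String, byte[]> records, Path toDir)
            throws IOException {
        Files.createDirectories(toDir);
        int written = 0;
        for (Map.Entry<String, byte[]> record : records.entrySet()) {
            Path target = toDir.resolve(record.getKey());
            // CREATE_NEW throws FileAlreadyExistsException instead of
            // silently overwriting an existing file.
            try (InputStream in = new ByteArrayInputStream(record.getValue());
                 OutputStream out = Files.newOutputStream(target,
                         StandardOpenOption.CREATE_NEW,
                         StandardOpenOption.WRITE)) {
                byte[] buffer = new byte[4096];
                int n;
                // Same copy loop as copyLarge: read() returns -1 at the end.
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            }
            written++;
        }
        return written;
    }
}
```

Calling exportRecords with a map of two entries and an empty target directory
should leave two files there whose names and contents match the entries; with
a real database, the map would be replaced by the ResultSet loop.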