lucene-java-user mailing list archives

From "Marcelo Ochoa" <marcelo.oc...@gmail.com>
Subject Re: Oracle and Lucene Integration
Date Thu, 23 Nov 2006 12:22:51 GMT
Otis:
  I am new to the Lucene API and to search technologies :)

                  doc.add(new Field("rowid", rowid, Field.Store.YES,
                          Field.Index.UN_TOKENIZED));
                  Done!!.

  Also, the Oracle ROWID format has a portion that could be used as the
document id of the Lucene document. This would simplify the delete
operation, for example, because with the rowid we could call
reader.deleteDocument(idFromRowIDValue).
http://download-east.oracle.com/docs/cd/B19306_01/server.102/b14220/datatype.htm#sthref3899
  But I don't know how to add documents with a specific id.
  Can somebody help me by showing a code snippet that adds a document
using a predefined ID?
  Rowid numbers start at 0 and are assigned sequentially.
  Best regards, Marcelo.
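
  To make the rowid idea above concrete, here is a minimal sketch that
splits a rowid string into its numeric parts. It assumes the 18-character
extended rowid layout (6 characters for the data object number, 3 for the
relative file number, 6 for the block number, 3 for the row number, each
in base 64 with the alphabet A-Z, a-z, 0-9, +, /). The class and method
names are hypothetical and not part of any Oracle or Lucene API. Note that
the row number is counted within a block, so on its own it would not be
unique across a table.

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only.
class RowidSketch {
    // Base-64 alphabet assumed for extended rowids: A-Z, a-z, 0-9, +, /
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Decodes one base-64 component of a rowid into a number.
    static long decode(String part) {
        long value = 0;
        for (int i = 0; i < part.length(); i++) {
            int digit = ALPHABET.indexOf(part.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException(
                    "Not a rowid character: " + part.charAt(i));
            }
            value = value * 64 + digit;
        }
        return value;
    }

    // Returns {dataObjectNumber, relativeFileNumber, blockNumber, rowNumber}.
    static long[] split(String rowid) {
        if (rowid.length() != 18) {
            throw new IllegalArgumentException(
                "Expected an 18-character extended rowid");
        }
        return new long[] {
            decode(rowid.substring(0, 6)),
            decode(rowid.substring(6, 9)),
            decode(rowid.substring(9, 15)),
            decode(rowid.substring(15, 18))
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("AAAAaoAATAAABrXAAA")));
    }
}
```

  Whether any of these numbers can really be imposed as a Lucene document
id is exactly the open question in this message; the sketch only shows how
the parts could be extracted.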

On 11/23/06, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
> Wow, very cool, even though I don't use Oracle anywhere at the moment.
> You probably don't want that rowid field tokenized, by the way.
>
> Otis
>
> ----- Original Message ----
> From: Marcelo Ochoa <marcelo.ochoa@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Wednesday, November 22, 2006 8:44:58 AM
> Subject: Re: Oracle and Lucene Integration
>
> Hi Mark:
> > Very interesting.
> >
> > So how does this solution manage mapping Oracle primary keys to and from
> > Lucene doc ids?
>   I am storing the rowid value as a Document field; here is a code snippet:
>                 Document doc = new Document();
>                 doc.add(new Field("rowid", rowid, Field.Store.YES,
>                           Field.Index.TOKENIZED));
>                 Object value = rs.getObject(2);
>                 String valueStr = null;
>                 if (value != null) { // Sanity checks
>                   if (value instanceof CLOB)
>                     valueStr = ((CLOB) value).getSubString(1,
>                         (int) ((CLOB) value).length());
>                   else if (value instanceof XMLType)
>                     valueStr = ((XMLType) value).extract("//text()", "")
>                         .getStringVal();
>                   else
>                     valueStr = value.toString();
>                   doc.add(new Field(col, valueStr, Field.Store.NO,
>                       Field.Index.TOKENIZED));
>                   writer.addDocument(doc);
>                 }
>
>   So when I am querying I can get the rowid back using:
>                 if (iterator.hasNext()) {
>                     // append rowid to collection
>                     Hit hit = (Hit) iterator.next();
>                     try {
>                         rid =  hit.get("rowid");
>                         score = hit.getScore();
>                     } catch (IOException e) {
>                         e.printStackTrace();
>                         throw new SQLException(e.getMessage());
>                     }
>                     rlist[i] = new String(rid);
>                     slist.put(rid,new Float(score));
>                     idx++;
>                 } else {.............
>   and passing the rowid to the Oracle execution plan, which collects
> rowids in batches of 2000.
> >
> > >> Another benefit of using the Data Cartridge API is that if the
> > >>table T1 has insert, update or delete row operations, a corresponding
> > >>Java method will be called to automatically update the Lucene index.
> >
> > I suspect the tricky bit is optimizing the opening/closing of Lucene
> > IndexReaders/Writers, especially in the event of large batches of
> > database updates.
> > Does this API pass the transactional info which would help organize the
> > batching of the Lucene reader.delete and writer.add calls?
>   Well, I think that Oracle Text uses a queue to store large batches,
> because it uses a ctx_sys.sync procedure to update the index ;)
>   We could build the same solution using Oracle AQ.
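
  A rough sketch of the queue idea above, in plain Java with no Oracle AQ
or Lucene calls: row changes are collected per rowid, and a flush hands
over all deletes first and all adds second, so that the reader.delete
calls and the writer.add calls can each run in a single pass. The names
IndexChangeQueue, Op and Sink are hypothetical; a real Sink would wrap an
IndexReader and an IndexWriter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a queue of pending index changes.
class IndexChangeQueue {
    enum Op { INSERT, UPDATE, DELETE }

    // Hypothetical callback that applies one batch to the index.
    interface Sink {
        void delete(List<String> rowids);
        void add(List<String> rowids);
    }

    // Last operation seen for each rowid, in arrival order.
    private final Map<String, Op> lastOp = new LinkedHashMap<String, Op>();
    // Rowids whose old document must be removed from the index.
    private final Set<String> toDelete = new LinkedHashSet<String>();

    void enqueue(Op op, String rowid) {
        if (op != Op.INSERT) {
            toDelete.add(rowid); // updates and deletes both drop the old doc
        }
        lastOp.put(rowid, op);
    }

    // Applies all pending deletes, then all pending adds, and empties the queue.
    void flush(Sink sink) {
        List<String> deletes = new ArrayList<String>(toDelete);
        List<String> adds = new ArrayList<String>();
        for (Map.Entry<String, Op> e : lastOp.entrySet()) {
            if (e.getValue() != Op.DELETE) {
                adds.add(e.getKey()); // row still exists, so (re)index it
            }
        }
        toDelete.clear();
        lastOp.clear();
        sink.delete(deletes);
        sink.add(adds);
    }

    public static void main(String[] args) {
        IndexChangeQueue queue = new IndexChangeQueue();
        queue.enqueue(Op.INSERT, "rowid-1");
        queue.enqueue(Op.UPDATE, "rowid-2");
        queue.enqueue(Op.DELETE, "rowid-3");
        queue.flush(new Sink() {
            public void delete(List<String> rowids) {
                System.out.println("delete " + rowids);
            }
            public void add(List<String> rowids) {
                System.out.println("add " + rowids);
            }
        });
    }
}
```

  This ignores transaction boundaries and rollbacks, which is the part
Mark is asking about; it only illustrates the delete-then-add batching.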
> >
> > Cheers
> > Mark
>  Best regards, Marcelo.
> --
> Marcelo F. Ochoa
> http://marcelo.ochoa.googlepages.com/home
> ______________
> Do you Know DBPrism? Look @ DB Prism's Web Site
> http://www.dbprism.com.ar/index.html
> More info?
> Chapter 17 of the book "Programming the Oracle Database using Java &
> Web Services"
> http://www.amazon.com/gp/product/1555583296/
> Chapter 21 of the book "Professional XML Databases" - Wrox Press
> http://www.amazon.com/gp/product/1861003587/
> Chapter 8 of the book "Oracle & Open Source" - O'Reilly
> http://www.oreilly.com/catalog/oracleopen/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org



