lucene-solr-user mailing list archives

From "Chandan khatua" <chand...@nrifintech.com>
Subject Can not index raw binary data stored in Database in BLOB format.
Date Mon, 24 Feb 2014 07:21:10 GMT
Hi,

 

We have raw binary data (not Word, Excel, XML, or similar files) stored in a
BLOB column in the database.

We are trying to index it using TikaEntityProcessor, but nothing seems to get
indexed.

The same configuration works when XML/Word/Excel files are stored in the
BLOB field.

Below is our data-config.xml:

 

<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
  <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@//a.a.a.a:a/d11gr21" user="abc" password="abc"
              convertType="true"/>
  <dataSource name="dastream" type="FieldStreamDataSource" />
  <document>
    <entity name="messages" pk=" PK" transformer='DateFormatTransformer'
            query="select * from table1"
            dataSource="db">
      <field column=" PK" name="id" />
      <field column="last_modified" dateTimeFormat="YYYY-MM-DD HH24:MI:SS" locale="en" />
      <entity name="message"
              dataSource="dastream"
              processor="TikaEntityProcessor"
              url="message"
              dataField="messages.MESSAGE"
              format="text">
        <field column="text" name="mxMsg" blob="true" />
      </entity>
    </entity>
  </document>
</dataConfig>
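
A side note on the date field, separate from the BLOB question: DateFormatTransformer parses with java.text.SimpleDateFormat, so an Oracle-style pattern such as "YYYY-MM-DD HH24:MI:SS" would be written roughly as in the sketch below. It assumes that last_modified reaches the transformer as a string in that layout, which the config above does not confirm.

```xml
<!-- Sketch only: the same field, with the pattern in SimpleDateFormat syntax. -->
<!-- Assumption: last_modified arrives as a string shaped like "2014-02-24 07:21:10" (illustrative value). -->
<field column="last_modified" dateTimeFormat="yyyy-MM-dd HH:mm:ss" locale="en" />
```

In SimpleDateFormat, lowercase "yyyy" is the calendar year, "dd" the day of the month, "HH" the 24-hour hour, and "mm" the minutes. Uppercase "MM" is the month.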

 

Please suggest the changes required to index the binary data.

 

Thanking you,

 

-Chandan

