lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "DataImportHandler" by Don Lelel
Date Wed, 21 Sep 2011 16:21:45 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "DataImportHandler" page has been changed by Don Lelel:
http://wiki.apache.org/solr/DataImportHandler?action=diff&rev1=292&rev2=293

    . {{{
     <dataSource type="FileDataSource" encoding="UTF-8"/>
  }}}
+  * If you don't get the expected data imported from a database, there are a few things to check:

+ 
+ 1. Chaining transformers is a bit tricky. Some transformers read their input from the column
named by the "sourceColName" attribute but write the transformed data to the column named by
the "column" attribute, so the next transformer in the chain, if it also reads from "sourceColName",
still acts on the same untransformed data! To avoid this, it's better to rename the columns
in your SQL using "AS" and not use "sourceColName" at all:
+   . {{{
+ <entity name="transaction"
+  transformer="ClobTransformer, RegexTransformer"
+  query="SELECT CO_TRANSACTION_ID as TID_COMMON, CO_FROM_SERVICE_DT as FROM_SERVICE_DT, CO_TO_SERVICE_DT
as TO_SERVICE_DT, CO_PATIENT_LAST_NM as PATIENT_LAST_NM, CO_TOTAL_CLAIM_CHARGE_AMT as TOTAL_CLAIM_CHARGE_AMT
FROM TABLE(pkg_solr_import.cb_get_transactions('${document.DOCUMENT_ID}'))"
+ 			>
+ <field column="TID_COMMON" splitBy="#" clob="true"/>
+ <field column="FROM_SERVICE_DT" splitBy="#" clob="true"/>
+ <field column="TO_SERVICE_DT" splitBy="#" clob="true"/>
+ <field column="PATIENT_LAST_NM" splitBy="#" clob="true"/>
+ <field column="TOTAL_CLAIM_CHARGE_AMT" splitBy="#" clob="true"/>
+ 
+ </entity>
+ }}}
+ 
+ A common symptom of chaining transformers together with "sourceColName" is values such as
oracle.sql.CLOB@aed3a5 showing up in your imported data.
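+ 
+ As a hypothetical sketch of a configuration that can produce this symptom (the use of "sourceColName" here is an assumption for illustration, and the query is shortened from the example above): ClobTransformer reads the CLOB from CO_TRANSACTION_ID and writes a string to TID_COMMON, but RegexTransformer then reads CO_TRANSACTION_ID again, which still holds the CLOB object:
+   . {{{
+ <entity name="transaction"
+  transformer="ClobTransformer, RegexTransformer"
+  query="SELECT CO_TRANSACTION_ID FROM TABLE(pkg_solr_import.cb_get_transactions('${document.DOCUMENT_ID}'))">
+ <!-- sourceColName makes both transformers read the raw CO_TRANSACTION_ID column -->
+ <field column="TID_COMMON" sourceColName="CO_TRANSACTION_ID" splitBy="#" clob="true"/>
+ </entity>
+ }}}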
+ 
+ 2. Pay attention to case sensitivity in the column names! I'd recommend using only upper
case. If you specify field column="FROM_SERVICE_Dt" but the query names the column FROM_SERVICE_DT,
you won't see any error, but you won't get any data in that field either!
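+ 
+ A minimal sketch of the mismatch, assuming the query returns a column named FROM_SERVICE_DT as in the example above:
+   . {{{
+ <!-- silently imports nothing: "Dt" does not match the "DT" returned by the query -->
+ <field column="FROM_SERVICE_Dt" splitBy="#" clob="true"/>
+ <!-- matches the query's column name exactly -->
+ <field column="FROM_SERVICE_DT" splitBy="#" clob="true"/>
+ }}}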
+ 
  
  ----
  CategorySolrRequestHandler
