lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Trivial Update of "TikaEntityProcessor" by KojiSekiguchi
Date Mon, 11 Apr 2011 13:35:24 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "TikaEntityProcessor" page has been changed by KojiSekiguchi.
The comment on this change is: correct URLs. Tika is TLP.
http://wiki.apache.org/solr/TikaEntityProcessor?action=diff&rev1=6&rev2=7

--------------------------------------------------

   * parser : (optional) Default is org.apache.tika.parser.!AutoDetectParser . Provide the FQN
of a class that implements org.apache.tika.parser.Parser
  
  ==== fields ====
- Each field may have an optional attribute meta="true". Which means this field is to be obtained
from the !MetaData of the document. The column value is used as the key on metadata. Checkout
the list of available keys from here [[http://svn.apache.org/viewvc/lucene/tika/trunk/tika-core/src/main/java/org/apache/tika/metadata/DublinCore.java?revision=801678&view=markup
| DublinCore]] , [[http://svn.apache.org/viewvc/lucene/tika/trunk/tika-core/src/main/java/org/apache/tika/metadata/MSOffice.java?revision=801678&view=markup
|MSOffice]]
+ Each field may have an optional attribute meta="true". Which means this field is to be obtained
from the !MetaData of the document. The column value is used as the key on metadata. Checkout
the list of available keys from here [[http://svn.apache.org/viewvc/tika/trunk/tika-core/src/main/java/org/apache/tika/metadata/DublinCore.java?revision=801678&view=markup
| DublinCore]] , [[http://svn.apache.org/viewvc/tika/trunk/tika-core/src/main/java/org/apache/tika/metadata/MSOffice.java?revision=801678&view=markup
|MSOffice]]
  
  === DataSource ===
  use any !DataSource of type !DataSource<!InputStream>. The inbuilt ones are

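For readers who want to see the meta="true" attribute from the fields section in context, here is a minimal sketch of a data-config.xml fragment. The entity name, the file path in url, and the Solr field names given in name are illustrative assumptions, not part of the change above; Author and title are assumed to be keys that Tika reports for the parsed document.

```xml
<dataConfig>
  <!-- A binary data source that hands an InputStream to the entity processor -->
  <dataSource type="BinFileDataSource"/>
  <document>
    <!-- url is a hypothetical path; point it at a real document -->
    <entity name="tika-example" processor="TikaEntityProcessor"
            url="/path/to/example.pdf" format="text">
      <!-- meta="true": the column value is used as the key into the Tika metadata -->
      <field column="Author" name="author" meta="true"/>
      <field column="title" name="title" meta="true"/>
      <!-- no meta attribute: this field comes from the extracted body text -->
      <field column="text" name="text"/>
    </entity>
  </document>
</dataConfig>
```

Fields without meta="true" are filled from the parsed content rather than the metadata, which is why the text field omits the attribute in this sketch.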