lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "ExtractingRequestHandler" by Drzraf
Date Wed, 23 May 2012 14:04:00 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "ExtractingRequestHandler" page has been changed by Drzraf:
http://wiki.apache.org/solr/ExtractingRequestHandler?action=diff&rev1=72&rev2=73

  = Metadata =
  As has been implied up to now, Tika produces Metadata about the document.  Metadata often
contains things like the author of the file or the number of pages, etc.  The Metadata produced
depends on the type of document submitted.  For instance, PDFs have different metadata from
Word docs.
  
- In addition to Tika's metadata, Solr adds the following metadata (defined in !ExtractingMetadataConstants):
+ In addition to Tika's metadata, Solr adds the following metadata (defined in [[https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/extraction/src/java/org/apache/solr/handler/extraction/ExtractingMetadataConstants.java|ExtractingMetadataConstants]]):
  
   * "stream_name" - The name of the !ContentStream as uploaded to Solr.  Depending on how
the file is uploaded, this may or may not be set.
   * "stream_source_info" - Any source info about the stream.  See !ContentStream.
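
As a rough illustration of how the Solr-added "stream_name" metadata might be put to use, the sketch below builds an extract request URL that maps it to an index field with the handler's fmap.<source>=<target> parameter. The base URL, the document id, the target field name source_name_s and the ignored_ prefix are assumptions made for this example, not values taken from the wiki page; the request is only constructed, not sent.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, name_field="source_name_s"):
    """Build a /update/extract URL that maps "stream_name" to an index field.

    base_url, doc_id and name_field are hypothetical values for illustration.
    """
    params = {
        # Supply the unique key as a literal value.
        "literal.id": doc_id,
        # Rename the Solr-added "stream_name" metadata to a field of our choosing.
        "fmap.stream_name": name_field,
        # Prefix any metadata that has no matching schema field (assumed prefix).
        "uprefix": "ignored_",
        "commit": "true",
    }
    return base_url.rstrip("/") + "/update/extract?" + urlencode(params)


url = build_extract_url("http://localhost:8983/solr", "doc1")
print(url)
```

Whether "stream_name" is actually populated still depends on how the file is uploaded, as noted in the list above, so the mapped field may be empty for some requests.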
