lucene-solr-user mailing list archives

From javaxmlsoapdev <>
Subject Where to put ExternalRequestHandler and Tika jars
Date Wed, 25 Nov 2009 16:16:45 GMT

My SOLR_HOME = /home/solr_1_4_0/apache-solr-1.4.0/example/solr/conf in

POI, PDFBox, Tika and related jars are under

When I try to index files using the SolrJ API as follows, I don't see the
content of the file being indexed. It only indexes the file size (bytes) and
file type into the "content" field. See the schema definition below as well.
ContentStreamUpdateRequest up = new
up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);

schema.xml has the following:
 <field name="issueKey" type="slong" indexed="true" stored="true"
required="true" /> 
 <field name="content" type="text" indexed="true" stored="true"


And solrconfig.xml has:
<requestHandler name="/update/extract"
    <lst name="defaults">
      <str name="map.content">content</str>
      <str name="defaultField">content</str>

The Luke response is below. It shows the correct count (7) of indexed
documents but none of the file content in the index. In the Tomcat logs I
don't see any errors. Unless I am overlooking something, I don't see
anything missing in the setup. Can anyone advise? Do I need to include the
Tika jars in Tomcat's deployed solr/lib or under /example/lib in

  <?xml version="1.0" encoding="UTF-8" ?> 
- <response>
- <lst name="responseHeader">
  <int name="status">0</int> 
  <int name="QTime">28</int> 
- <lst name="index">
  <int name="numDocs">7</int> 
  <int name="maxDoc">7</int> 
  <int name="numTerms">25</int> 
  <long name="version">1259164190261</long> 
  <bool name="optimized">false</bool> 
  <bool name="current">true</bool> 
  <bool name="hasDeletions">false</bool> 

  <date name="lastModified">2009-11-25T15:50:03Z</date> 
- <lst name="fields">
- <lst name="content">
  <str name="type">text</str> 
  <str name="schema">ITSM----------</str> 
  <str name="index">ITS----------</str> 
  <int name="docs">7</int> 
  <int name="distinct">18</int> 
- <lst name="topTerms">
  <int name="text">3</int> 
  <int name="applic">3</int> 
  <int name="msword">3</int> 
  <int name="applicationmsword">3</int> 
  <int name="plain">2</int> 
  <int name="textplain">2</int> 
  <int name="70144">1</int> 
  <int name="453">1</int> 
  <int name="2370">1</int> 
  <int name="html">1</int> 
- <lst name="histogram">
  <int name="1">12</int> 
  <int name="2">2</int> 
  <int name="4">4</int> 
- <lst name="issueKey">
  <str name="type">slong</str> 
  <str name="schema">I-S----O-----l</str> 
  <str name="index">I-S----O-----</str> 
  <int name="docs">7</int> 
  <int name="distinct">7</int> 
- <lst name="topTerms">
  <int name="1">1</int> 
  <int name="2">1</int> 
  <int name="3">1</int> 
  <int name="4">1</int> 
  <int name="5">1</int> 
  <int name="6">1</int> 
  <int name="0">1</int> 
- <lst name="histogram">
  <int name="1">7</int> 
- <lst name="info">
- <lst name="key">
  <str name="I">Indexed</str> 
  <str name="T">Tokenized</str> 
  <str name="S">Stored</str> 
  <str name="M">Multivalued</str> 
  <str name="V">TermVector Stored</str> 
  <str name="o">Store Offset With TermVector</str> 
  <str name="p">Store Position With TermVector</str> 
  <str name="O">Omit Norms</str> 
  <str name="L">Lazy</str> 
  <str name="B">Binary</str> 
  <str name="C">Compressed</str> 
  <str name="f">Sort Missing First</str> 
  <str name="l">Sort Missing Last</str> 
  <str name="NOTE">Document Frequency (df) is not updated when a document is
marked for deletion. df values include deleted documents.</str> 
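
For anyone reading the flag strings in the Luke output above (for example "ITSM----------" for the "content" field), here is a minimal sketch of how they can be decoded using the key list in the response. The class name LukeFlags and the method decode are hypothetical names made up for this illustration; they are not part of Solr or SolrJ. The sketch assumes that a '-' in the string simply means the flag is not set.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Solr or SolrJ.
public class LukeFlags {
    // Flag letters copied from the "key" list in the Luke response.
    private static final Map<Character, String> KEY = new LinkedHashMap<>();
    static {
        KEY.put('I', "Indexed");
        KEY.put('T', "Tokenized");
        KEY.put('S', "Stored");
        KEY.put('M', "Multivalued");
        KEY.put('V', "TermVector Stored");
        KEY.put('o', "Store Offset With TermVector");
        KEY.put('p', "Store Position With TermVector");
        KEY.put('O', "Omit Norms");
        KEY.put('L', "Lazy");
        KEY.put('B', "Binary");
        KEY.put('C', "Compressed");
        KEY.put('f', "Sort Missing First");
        KEY.put('l', "Sort Missing Last");
    }

    // Returns the descriptions of the flags set in a Luke flag string.
    // Assumption: '-' marks an unset position and is skipped.
    public static List<String> decode(String flags) {
        List<String> out = new ArrayList<>();
        for (char c : flags.toCharArray()) {
            if (c == '-') {
                continue;
            }
            String name = KEY.get(c);
            out.add(name != null ? name : "Unknown(" + c + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        // Flag strings taken from the Luke response above.
        System.out.println("content schema:  " + decode("ITSM----------"));
        System.out.println("content index:   " + decode("ITS----------"));
        System.out.println("issueKey schema: " + decode("I-S----O-----l"));
    }
}
```

Read this way, the "content" field is indexed, tokenized and stored, so the field definition itself does not appear to be what keeps the file text out of the index.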