lucene-solr-user mailing list archives

From Furkan KAMACI <furkankam...@gmail.com>
Subject Content Field Includes Some Other Fields By Tika
Date Sat, 27 Apr 2013 17:24:16 GMT
I have these fields in my schema.xml:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="content" type="text_general" indexed="true" stored="true" multiValued="false" />
<field name="_version_" type="long" indexed="true" stored="true" />
<dynamicField name="*" type="ignored" multiValued="true" />

and this request handler in my solrconfig.xml:

<requestHandler name="/update/extract" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="fmap.content">content</str>
    <!--<str name="uprefix">ignored_</str>-->
  </lst>
</requestHandler>

However, when I index some PDF files and search them with Solr, the content field starts with:

stream_content_type text/plain stream_size 959 Content-Encoding ISO-8859-1

and only after that does the real content of the file appear. Why do I see
these values? I do not copy any field into the content field, and I thought
any field not defined in the schema would be ignored. How can I remove them?
