lucene-solr-user mailing list archives

From Charlie Hubbard <charlie.hubb...@gmail.com>
Subject Solrj Tika/Cell not using defaultField
Date Sun, 14 Jun 2015 17:08:37 GMT
I'm having trouble getting Solr to honor the defaultField parameter when I
send a document to Solr Cell (Tika).  Here is the POST request I'm sending
using SolrJ:

POST /solr/collection1/update/extract?extractOnly=true&defaultField=text&wt=javabin&version=2 HTTP/1.1
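
To make it clear which parameters actually go over the wire, here is a minimal sketch that rebuilds that request path using only the JDK. The class and method names (ExtractUrlSketch, buildExtractPath, encode) are hypothetical and are not part of SolrJ; the collection name and parameter values are the ones from the request above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of SolrJ: rebuilds the request path shown above.
public class ExtractUrlSketch {

    // URL-encode a single key or value as UTF-8.
    public static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Append the parameters, in insertion order, to the extract handler path.
    public static String buildExtractPath(String collection, Map<String, String> params) {
        StringBuilder sb = new StringBuilder("/solr/" + collection + "/update/extract");
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sep).append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            sep = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in the order of the original request.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("extractOnly", "true");
        params.put("defaultField", "text");
        params.put("wt", "javabin");
        params.put("version", "2");
        System.out.println(buildExtractPath("collection1", params));
    }
}
```

So defaultField=text is being sent as an ordinary query parameter alongside extractOnly=true.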

When I get the response back, the NamedList contains the extracted content,
but it's under the names null and null_metadata, respectively.  I've seen it
return the defaultField I give it before, but for some reason it no longer
does.  I've even tried configuring the ExtractingRequestHandler like so:

    <requestHandler name="/update/extract"
                    startup="lazy"
                    class="solr.extraction.ExtractingRequestHandler">
        <lst name="defaults">
            <str name="defaultField">text</str>
            <!--<str name="lowernames">true</str>-->
            <!--<str name="uprefix">ignored_</str>-->

            <!-- capture link hrefs but ignore div attributes -->
            <str name="captureAttr">true</str>
            <str name="fmap.content">text</str>
            <str name="fmap.a">links</str>
            <str name="fmap.div">ignored_</str>
        </lst>
        <!--<str name="tika.config">tika.config</str>-->
    </requestHandler>

But even that doesn't get picked up.  Here is the SolrJ code I use to set
the parameters:

    public SolrRequest toSolrExtractRequest() throws IOException {
        ContentStreamUpdateRequest req =
                new ContentStreamUpdateRequest("/update/extract");
        req.addFile(getLocation(), null);

        req.setParam(EXTRACT_ONLY, "true");
        req.setParam(DEFAULT_FIELD, "text");

        return req;
    }

So why is this not working?

Charlie
