manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1503) UpdateProcessor SolrCloud and ManifoldCF
Date Wed, 25 Apr 2018 11:36:00 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16452086#comment-16452086 ]

Karl Wright commented on CONNECTORS-1503:
-----------------------------------------

[~shinichiro abe], have you confirmed that if you set the MCF Solr Connection not to use the
extracting update handler, and you set an argument "processor=<something>", it properly obeys
the processor argument?

I ask because the code in HttpPoster that builds the SolrInputDocument does not distinguish
between fields and arguments: it uses addField() for both, e.g.:

{code}
      if (contentAttributeName != null)
      {
        // Copy the content into a string.  This is a bad thing to do, but we have no choice
        // given SolrJ architecture at this time.
        // We enforce a size limit upstream.
        Reader r = new InputStreamReader(is, Consts.UTF_8);
        StringBuilder sb = new StringBuilder((int)length);
        char[] buffer = new char[65536];
        while (true)
        {
          int amt = r.read(buffer,0,buffer.length);
          if (amt == -1)
            break;
          sb.append(buffer,0,amt);
        }
        outputDoc.addField( contentAttributeName, sb.toString() );
      }
...
      // Write the arguments
      for ( String name : arguments.keySet() )
      {
        List<String> values = arguments.get( name );
        outputDoc.addField( name, values );
      }
...
{code}

I am pretty sure that fields and arguments would need to be handled differently, no?
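
To illustrate the distinction, here is a minimal sketch of what I mean. It is not the HttpPoster code and it does not call SolrJ; it uses plain Java collections only. The class name ArgumentRouting, the method buildUpdateUrl, the host and collection in the URL, and the processor value "myProcessor" are all hypothetical. The idea is that arguments end up as URL request parameters, while only real document fields are placed in the document:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ArgumentRouting {

    // Percent-encode one query-string component as UTF-8.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Append every argument value as a URL parameter (hypothetical helper).
    static String buildUpdateUrl(String baseUrl, Map<String, List<String>> arguments) {
        StringBuilder sb = new StringBuilder(baseUrl);
        char sep = baseUrl.indexOf('?') < 0 ? '?' : '&';
        for (Map.Entry<String, List<String>> entry : arguments.entrySet()) {
            for (String value : entry.getValue()) {
                sb.append(sep).append(encode(entry.getKey()))
                  .append('=').append(encode(value));
                sep = '&';
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Arguments configured on the connection (assumed example value).
        Map<String, List<String>> arguments = new LinkedHashMap<>();
        arguments.put("processor", Arrays.asList("myProcessor"));

        // Document fields are kept in a separate structure; no argument is copied in.
        Map<String, Object> documentFields = new LinkedHashMap<>();
        documentFields.put("id", "doc-1");
        documentFields.put("content", "extracted text");

        System.out.println(buildUpdateUrl("http://localhost:8983/solr/collection1/update", arguments));
        System.out.println(documentFields.keySet());
    }
}
```

With this split, "processor" never appears as a document field, so Solr would have no reason to reject it as an unknown field. In the real connector the arguments would presumably need to be attached to the update request itself rather than to outputDoc.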


> UpdateProcessor SolrCloud and ManifoldCF
> ----------------------------------------
>
>                 Key: CONNECTORS-1503
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1503
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Solr 6.x component
>    Affects Versions: ManifoldCF 2.9.1
>         Environment: SolrCloud 6.6
> ManifoldCF 2.9.1
>            Reporter: Maxence SAUNIER
>            Assignee: Shinichiro Abe
>            Priority: Major
>         Attachments: 20170421-1740.png, jira_update_processor.png, manifoldcf_arguments_uniqFields.png, manifoldcf_output_conf.zip
>
>
> Hello,
> [Link to Apache mail archive|http://mail-archives.apache.org/mod_mbox/manifoldcf-user/201804.mbox/%3C079e01d3d7da%24807b8f60%248172ae20%24%40citya.com%3E]
> When we use the Arguments option in ManifoldCF for SolrCloud, ManifoldCF adds the arguments
> to the body of the POST request rather than to the URL parameters. As a result, it is not
> possible to add a (pre)processor or a post-processor via the URL.
> [SolrConfig updateRequestProcessorChain|https://lucene.apache.org/solr/guide/6_6/config-api.html#ConfigAPI-Whatabout_updateRequestProcessorChain_]
> [call UpdateRequestProcessors|https://lucene.apache.org/solr/guide/6_6/update-request-processors.html#UpdateRequestProcessors-Processor_Post-ProcessorRequestParameters]
> [Conf image|https://image.ibb.co/cZC8bn/jira_update_processor.png]
> Solr response:
> org.apache.solr.common.SolrException: ERROR: [doc=file://///srvics01/ways_holding/gestion_ged/gerance/3573/201102081135_ENVOIDEVISPP.doc] unknown field 'processor'



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
