manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <>
Subject [jira] [Commented] (CONNECTORS-981) Solr Connector - classic Solrj SolrInputDocument support
Date Tue, 24 Jun 2014 23:53:25 GMT


Karl Wright commented on CONNECTORS-981:

If you look at UpdateRequest, it appears that the only reason it always goes through strings
is that doing so is simply more convenient for the coder:

  public Collection<ContentStream> getContentStreams() throws IOException {
    return ClientUtils.toContentStreams(getXML(), ClientUtils.TEXT_XML);
  }

  public String getXML() throws IOException {
    StringWriter writer = new StringWriter();
    writeXML( writer );
    writer.flush();

    // If action is COMMIT or OPTIMIZE, it is sent with params
    String xml = writer.toString();
    // System.out.println( "SEND:"+xml );
    return (xml.length() > 0) ? xml : null;
  }

Turning a Writer-based method into a Reader-based one is where the inconvenience lies.  But
we could easily override this one method, and reimplement writeXML(), to get the behavior
we need.  In fact, I just did this for JSON in the Amazon connector.
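
To illustrate the Writer-to-Reader inconvenience, here is a minimal sketch in plain Java
(no SolrJ dependency) of one way to expose Writer-based output through a Reader without
first building the whole document as a String.  The names WriterToReaderSketch, XmlSource,
asReader and drain are hypothetical and exist only for this illustration; this is not the
approach taken in the Amazon connector or in SolrJ, just one possible technique (a pipe fed
by a background thread).

```java
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;
import java.io.Writer;

final class WriterToReaderSketch {

  /** Hypothetical stand-in for a Writer-based serializer such as writeXML(Writer). */
  interface XmlSource {
    void writeXML(Writer writer) throws IOException;
  }

  /**
   * Returns a Reader that yields whatever the source writes. A background
   * thread feeds a pipe, so the full document is never held in one String.
   */
  static Reader asReader(XmlSource source) throws IOException {
    PipedWriter pipeIn = new PipedWriter();
    PipedReader pipeOut = new PipedReader(pipeIn, 8192);
    Thread producer = new Thread(() -> {
      try {
        source.writeXML(pipeIn);
      } catch (IOException e) {
        // Assumption: this sketch drops the error, so the reader would see a
        // truncated document. Real code would have to surface the failure.
      } finally {
        try {
          pipeIn.close(); // signals end-of-stream to the reading side
        } catch (IOException ignored) {
          // nothing useful to do here
        }
      }
    }, "xml-producer");
    producer.setDaemon(true);
    producer.start();
    return pipeOut;
  }

  /** Reads everything from the Reader; used here only to demonstrate the result. */
  static String drain(Reader reader) throws IOException {
    StringBuilder sb = new StringBuilder();
    char[] buf = new char[1024];
    int n;
    while ((n = reader.read(buf)) != -1) {
      sb.append(buf, 0, n);
    }
    reader.close();
    return sb.toString();
  }

  public static void main(String[] args) throws IOException {
    Reader reader = asReader(w -> w.write("<add><doc><field name=\"id\">1</field></doc></add>"));
    System.out.println(drain(reader));
  }
}
```

The extra thread and the error-propagation problem are exactly the kind of overhead that
makes the string-based shortcut attractive for small requests.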

Let me think about this and get back to you.

> Solr Connector - classic Solrj SolrInputDocument support
> --------------------------------------------------------
>                 Key: CONNECTORS-981
>                 URL:
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Lucene/SOLR connector
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Alessandro Benedetti
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.7
>         Attachments: CONNECTORS-981.patch
> The Solr connector, in line with the development of the Tika connector processor, should
> be able to operate in two ways:
> 1) as usual
> 2) using the classic Solrj SolrInputDocument approach with already extracted metadata
> To allow the choice, a flag will be added to the mapping tab in the UI (as it relates
> to how the fields will be processed).

This message was sent by Atlassian JIRA
