manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <>
Subject [jira] [Commented] (CONNECTORS-981) Solr Connector - classic Solrj SolrInputDocument support
Date Tue, 24 Jun 2014 14:30:25 GMT


Karl Wright commented on CONNECTORS-981:

Hi Alessandro,

One of the principal ways we make ManifoldCF robust is to make sure that memory usage is
"bounded".  That is, the crawler cannot use more than a set amount of memory no matter what
the inputs are.  See:, chapter 6,
section 6.3.5.

Solr can, of course, make a different decision as a project.  We choose to enforce a limit.
I suppose you could limit the maximum number of bytes sent to Solr to, say, 64K.  But
I suspect people would not like that.
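
For illustration only, a cap like that could be sketched in Java as below. This is not ManifoldCF or SolrJ code: the class name BoundedReader, the method readBounded, and the constant MAX_BYTES are all hypothetical, and silently truncating at the cap is just one possible policy (rejecting the document would be another).

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of ManifoldCF or SolrJ.
class BoundedReader {
    // Assumed cap taken from the "64K" figure mentioned above.
    static final int MAX_BYTES = 64 * 1024;

    // Reads at most maxBytes from the stream. Memory use is fixed by
    // maxBytes, regardless of how large the underlying document is.
    static byte[] readBounded(InputStream in, int maxBytes) throws IOException {
        byte[] buffer = new byte[maxBytes];
        int total = 0;
        while (total < maxBytes) {
            int n = in.read(buffer, total, maxBytes - total);
            if (n < 0) {
                break; // end of stream reached before hitting the cap
            }
            total += n;
        }
        // Anything beyond the cap is simply never read (truncation policy).
        return Arrays.copyOf(buffer, total);
    }
}
```

With a 200,000-byte input, readBounded(in, MAX_BYTES) returns only the first 65,536 bytes, which is exactly the kind of truncation users are likely to object to.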

> Solr Connector - classic Solrj SolrInputDocument support
> --------------------------------------------------------
>                 Key: CONNECTORS-981
>                 URL:
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Lucene/SOLR connector
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Alessandro Benedetti
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.7
>         Attachments: CONNECTORS-981.patch
> The solr connector, according with the development of the Tika Connector processor, should
> be able to operate in 2 ways:
> 1) as usual
> 2) using the classic Solrj SolrInputDocument approach with already extracted metadata
> To allow the choice a flag will be added in the UI in the mapping tab (as it's related
> with how the fields will be processed)

This message was sent by Atlassian JIRA
