manifoldcf-dev mailing list archives

From Antonio David Perez Morales <ape...@zaizi.com>
Subject Re: Solr Extracting request handler
Date Mon, 16 Jun 2014 14:51:52 GMT
Hi Matteo

ManifoldCF already handles the extraction, but the only way to send both
binary content and document metadata to Solr is through the /update/extract
handler: the metadata is sent as query parameters and the binary content is
sent in the body of the request, which lets Solr use Tika to obtain the raw
text to be stored in Solr.
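
To illustrate the shape of such a request, here is a minimal Python sketch. It
is not ManifoldCF's own connector code; the helper name build_extract_request,
the core URL, and the field names are assumptions made up for the example. It
relies only on the literal.<field> query parameters that Solr's extracting
request handler accepts for metadata, and it builds the request without
sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_extract_request(solr_core_url, doc_id, metadata, content,
                          content_type="application/octet-stream"):
    """Build (but do not send) a POST to Solr's /update/extract handler.

    Hypothetical helper for illustration only.
    Metadata goes into the query string as literal.<field>=value;
    the binary document goes into the request body for Tika to parse.
    """
    params = [("literal.id", doc_id)]
    for field, value in metadata.items():
        params.append((f"literal.{field}", value))
    url = f"{solr_core_url.rstrip('/')}/update/extract?{urlencode(params)}"
    return Request(url, data=content,
                   headers={"Content-Type": content_type}, method="POST")


# Assumed core URL and field names; adjust to the actual Solr schema.
req = build_extract_request(
    "http://localhost:8983/solr/collection1",
    "doc-1",
    {"author": "Matteo"},
    b"<doc>hello</doc>",
    content_type="application/xml",
)
```

Passing req to urllib.request.urlopen would actually submit the document to a
running Solr instance.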

Regards


On Mon, Jun 16, 2014 at 4:35 PM, Matteo Grolla <m.grolla@sourcesense.com>
wrote:

> Hi, during my first indexing run I noticed that ManifoldCF uses Solr's
> extracting request handler to extract the content of an XML file.
> For performance reasons it would be better if ManifoldCF handled the
> extraction, letting Solr act purely as the search engine.
> Is this because of the connector design, the framework design, or is it
> simply still to be done?
>
> --
> Matteo Grolla
> Sourcesense - making sense of Open Source
> http://www.sourcesense.com
>
>
