manifoldcf-user mailing list archives

From Nikita Ahuja <nik...@smartshore.nl>
Subject Re: Fetching output Elastic Search data in pipelines
Date Thu, 15 Mar 2018 09:04:02 GMT
Hi Karl,

There is still a problem with the same mapper attachment in the Elastic
connector, even when the box is checked. The same error still occurs.





Please suggest a way out.

Thanks and Regards,
Nikita



On Wed, Mar 7, 2018 at 6:32 PM, Karl Wright <daddywri@gmail.com> wrote:

> Hi Nikita,
>
> You have not selected the "use mapper attachment" checkbox in the
> configuration for the ES output connector.  But you are using it in Elastic
> Search.  The ES output connector will not convert binary to base64 unless
> you check that box.
>
> Karl
>
>
> On Wed, Mar 7, 2018 at 6:18 AM, Nikita Ahuja <nikita@smartshore.nl> wrote:
>
>> Hi Karl,
>>
>>
>> This is not only for SharePoint; it is the same for FileShare, SharePoint,
>> and the Web crawler.
>>
>> For the Elastic Search output, the following parameters are defined.
>>
>>
>>
>>
>> In the Simple History tab, the following errors appear.
>>
>>
>>
>> A server exception like this is returned every time a document is sent
>> for indexing:
>>
>>
>> *Server exception:
>> {"error":{"root_cause":[{"type":"exception","reason":"java.lang.IllegalArgumentException:
>> java.lang.IllegalArgumentException: Illegal base64 character
>> 3f","header":{"processor_type":"attachment"}}],"type":"exception","reason":"java.lang.IllegalArgumentException:
>> java.lang.IllegalArgumentException: Illegal base64 character
>> 3f","caused_by":{"type":"illegal_argument_exception","reason":"java.lang.IllegalArgumentException:
>> Illegal base64 character
>> 3f","caused_by":{"type":"illegal_argument_exception","reason":"Illegal
>> base64 character
>> 3f"}},"header":{"processor_type":"attachment"}},"status":500} *
>>
>>
>>
>> But if we don't define any value in the pipeline tab, the document goes
>> directly into the index, so there seems to be a problem with the code. I
>> need to use different pipelines in the same index, for example "web" for
>> the website and "file" for FileShare.
>>
>>
>> Thanks and Regards,
>> Nikita
>>
>>
>>
>>
>>
>>
>> On Wed, Mar 7, 2018 at 2:45 PM, Karl Wright <daddywri@gmail.com> wrote:
>>
>>> Hi Nikita,
>>>
>>> The downstream pipeline for a connector determines which mime types are
>>> indexed and which are rejected.  If you look in the Simple History report
>>> for one of the rejected SharePoint documents, there should be information
>>> recorded about why it was rejected.  If there are no non-image documents at
>>> all described from SharePoint, then the issue would have to be how the
>>> SharePoint repository connection in the job is specified.
>>>
>>> Thanks,
>>> Karl
>>>
>>>
>>> On Wed, Mar 7, 2018 at 2:29 AM, Nikita Ahuja <nikita@smartshore.nl>
>>> wrote:
>>>
>>>> Hi Karl,
>>>>
>>>>
>>>> I am trying to ingest data from a website and SharePoint into the Elastic
>>>> Search output, using different pipelines in the same index.
>>>>
>>>> But ManifoldCF is not able to ingest all the data. It only puts the
>>>> image files present in the source into the Elastic Search output.
>>>>
>>>> Is there anything which is being missed?
>>>>
>>>>
>>>> Please guide me to a solution.
>>>>
>>>> Thanks and Regards,
>>>> Nikita
>>>>
>>>
>>>
>>
>
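
For readers who hit the same "Illegal base64 character 3f" error: 0x3f is
the "?" character, which is not part of the base64 alphabet, so the
attachment processor was most likely handed content that had not been
base64-encoded. The Python sketch below only illustrates that distinction
and is not ManifoldCF or Elasticsearch code. The field name "data" and the
helper names are assumptions made for this example.

```python
import base64
import json


def build_attachment_doc(raw: bytes) -> str:
    """Return a JSON body whose "data" field (assumed name) is base64 text."""
    return json.dumps({"data": base64.b64encode(raw).decode("ascii")})


def is_valid_base64(text: str) -> bool:
    """Strictly check whether text contains only valid base64."""
    try:
        base64.b64decode(text, validate=True)
        return True
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False


# Hypothetical document content standing in for a crawled file.
raw = b"%PDF-1.4 sample binary payload \xff\xfe"
body = build_attachment_doc(raw)

# The encoded field round-trips back to the original bytes.
encoded_ok = is_valid_base64(json.loads(body)["data"])

# Plain text containing "?" (hex 3f) is rejected by a strict decoder.
plain_text_ok = is_valid_base64("Is this base64?")
question_mark_hex = format(ord("?"), "x")
```

If the value sent to the attachment pipeline fails a check like
is_valid_base64, the encoding step on the sending side is the first place
to look.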
