manifoldcf-user mailing list archives

From Nikita Ahuja <nik...@smartshore.nl>
Subject Re: Generic Output Connection
Date Fri, 23 Feb 2018 11:09:40 GMT
Thanks Karl,

I just want to work on a connector that can access data from a particular
site using REST API calls.

Is there any other option in ManifoldCF that can be used for this?


Thanks and Regards,
Nikita


On Fri, Feb 23, 2018 at 4:26 PM, Karl Wright <daddywri@gmail.com> wrote:

> Nikita,
>
> The Generic connectors work by requesting XML from a target URL for all
> the major connector functions, which include seeding, getting versions, and
> getting documents.  The XML they receive must be well formed and parseable.
>
> I do not know very much more about the generic connectors; I have
> advocated removing them from our suite of connectors in the past but been
> overruled.  If you need to get them to work for you, you are on your own.
>
> Thanks,
> Karl
>
>
> On Fri, Feb 23, 2018 at 4:20 AM, Nikita Ahuja <nikita@smartshore.nl>
> wrote:
>
>> Karl,
>>
>> Thanks for the reply.
>> But there is some confusion here.
>> Shouldn't the details be given in the repository path of the
>> generic repository connector?
>>
>>
>> [image: Inline image 1]
>>
>>
>> Or should the XML be passed in the input?
>>
>>
>> Thanks and Regards,
>> Nikita
>>
>>
>> On Thu, Feb 22, 2018 at 6:41 PM, Karl Wright <daddywri@gmail.com> wrote:
>>
>>> The generic connector, as I understand it, communicates via XML.  This,
>>> to me, means that your seed XML is badly formed and cannot be parsed:
>>>
>>> Caused by: org.apache.manifoldcf.core.interfaces.ManifoldCFException:
>>> addSeedDocuments error: Content is not allowed in prolog.
>>>
>>> Karl
>>>
>>> On Thu, Feb 22, 2018 at 7:06 AM, Nikita Ahuja <nikita@smartshore.nl>
>>> wrote:
>>>
>>>> Hi Karl,
>>>>
>>>>
>>>> Whenever I run the generic API repository connector the job does not
>>>> start,
>>>>
>>>> [image: Inline image 2]
>>>>
>>>>
>>>>
>>>> And in the log file I am getting this exception for parsing:
>>>> [image: Inline image 1]
>>>>
>>>>
>>>>
>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException
>>>> java.lang.RuntimeException: Unhandled exception of type: org.apache.manifoldcf.core.interfaces.ManifoldCFException
>>>>     at org.apache.manifoldcf.crawler.connectors.generic.GenericConnector$ExecuteSeedingThread.finishUp(GenericConnector.java:1121) ~[?:?]
>>>>     at org.apache.manifoldcf.crawler.connectors.generic.GenericConnector.addSeedDocuments(GenericConnector.java:249) ~[?:?]
>>>>     at org.apache.manifoldcf.crawler.system.StartupThread.run(StartupThread.java:154) [mcf-pull-agent.jar:?]
>>>> Caused by: org.apache.manifoldcf.core.interfaces.ManifoldCFException: addSeedDocuments error: Content is not allowed in prolog.
>>>>     at org.apache.manifoldcf.crawler.connectors.generic.GenericConnector$ExecuteSeedingThread.run(GenericConnector.java:1149) ~[?:?]
>>>> Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
>>>>     at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>>     at java.xml/javax.xml.parsers.SAXParser.parse(Unknown Source) ~[?:?]
>>>>     at org.apache.manifoldcf.crawler.connectors.generic.GenericConnector$ExecuteSeedingThread.run(GenericConnector.java:1143) ~[?:?]
>>>>
>>>>
>>>>
>>>>
>>>> Please suggest a solution for this.
>>>>
>>>>
>>>> Thanks and Regards,
>>>> Nikita
>>>>
>>>>
>>>> On Thu, Feb 22, 2018 at 5:34 PM, Karl Wright <daddywri@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Nikita,
>>>>> The picture is too small for me to read.
>>>>>
>>>>> Thanks,
>>>>> Karl
>>>>>
>>>>>
>>>>> On Thu, Feb 22, 2018 at 4:18 AM, Nikita Ahuja <nikita@smartshore.nl>
>>>>> wrote:
>>>>>
>>>>>> Hi Karl,
>>>>>>
>>>>>>
>>>>>> Whenever I run the generic API repository connector the job does not
>>>>>> start,
>>>>>>
>>>>>> [image: Inline image 2]
>>>>>>
>>>>>>
>>>>>>
>>>>>> And in the log file I am getting this exception for parsing:
>>>>>> [image: Inline image 1]
>>>>>>
>>>>>>
>>>>>>
>>>>>> Please suggest a solution for this.
>>>>>>
>>>>>>
>>>>>> Thanks and Regards,
>>>>>> Nikita
>>>>>>
>>>>>> On Mon, Feb 19, 2018 at 2:45 PM, Karl Wright <daddywri@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Nikita,
>>>>>>>
>>>>>>> If you want to develop connectors, I recommend reading the book
>>>>>>> "ManifoldCF In Action". It's online, free:
>>>>>>>
>>>>>>> https://github.com/DaddyWri/manifoldcfinaction/tree/master/pdfs
>>>>>>>
>>>>>>> Karl
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 19, 2018 at 3:39 AM, Nikita Ahuja <nikita@smartshore.nl>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Karl,
>>>>>>>>
>>>>>>>>
>>>>>>>> But what steps should be followed, then, to work with API calls
>>>>>>>> and fetch the data using ManifoldCF?
>>>>>>>>
>>>>>>>> Is there any need to create a new custom connector? If so, then
>>>>>>>> please share the steps or the flow followed in creating
>>>>>>>> connectors.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks and Regards,
>>>>>>>> Nikita
>>>>>>>>
>>>>>>>> On Thu, Feb 15, 2018 at 4:41 PM, Karl Wright <daddywri@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Nikita,
>>>>>>>>>
>>>>>>>>> I do not understand your question.
>>>>>>>>>
>>>>>>>>> The Generic Connector was written by a committer who has since
>>>>>>>>> become unavailable, and nobody here knows how it is supposed to work.  All
>>>>>>>>> that we have is the code and the documentation.
>>>>>>>>>
>>>>>>>>> Karl
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Feb 15, 2018 at 5:58 AM, Nikita Ahuja <
>>>>>>>>> nikita@smartshore.nl> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Karl,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I am trying to connect through an API and fetch the data behind
>>>>>>>>>> it, but there are many issues while creating the connector; also, the
>>>>>>>>>> connector never stops running. Could you please provide an example for the
>>>>>>>>>> Generic API connection?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
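
For anyone hitting the same "Content is not allowed in prolog" error quoted above: the SAX parser raises it when something other than XML appears before the root element, which is what you would expect if the target URL answers with JSON or plain text instead of XML. The following is only a rough sketch for checking a response body outside ManifoldCF. The class name, the method name, and both sample bodies are made up for illustration; the sample XML is not the Generic connector's actual seeding schema, and the check uses the JDK's built-in SAX parser rather than the connector's own code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of ManifoldCF.
public class SeedXmlCheck {

    /**
     * Returns null when the body parses as well-formed XML,
     * otherwise the parser's error message.
     */
    public static String wellFormednessError(String body) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler()); // DefaultHandler rethrows fatal errors
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder bodies: a JSON reply (as a typical REST API might send)
        // and an arbitrary well-formed XML document.
        String jsonBody = "{\"seeds\": []}";
        String xmlBody = "<items><item id=\"doc-1\"/></items>";

        System.out.println("JSON body error: " + wellFormednessError(jsonBody));
        System.out.println("XML body error:  " + wellFormednessError(xmlBody));
    }
}
```

If the body returned by your endpoint fails this kind of check, the endpoint would need to return well-formed XML (or sit behind something that converts its output) before the Generic connector can parse it.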
