manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Generic Output Connection
Date Fri, 23 Feb 2018 10:56:10 GMT
Nikita,

The Generic connectors work by requesting XML from a target URL for all the
major connector functions, which include seeding, getting versions, and
getting documents.  The XML they receive must be well formed and parseable.
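
One way to check the "well formed and parseable" requirement before pointing a job at an endpoint is to fetch a response by hand and try to parse it. The sketch below is only illustrative: it uses Python's standard-library parser rather than the Xerces parser that ManifoldCF uses, so the error text will differ from "Content is not allowed in prolog", and the element names in the samples are hypothetical, not the schema the Generic connector expects.

```python
import xml.etree.ElementTree as ET


def check_well_formed(body: bytes):
    """Return (True, root_tag) if body parses as XML, else (False, error text)."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError as exc:
        # Raised for anything that is not well-formed XML, including
        # stray text before the root element (the "prolog" case).
        return False, str(exc)
    return True, root.tag


# Hypothetical response bodies; none of these names come from ManifoldCF.
samples = {
    "xml": b'<seeds><seed id="1"/></seeds>',
    "json": b'{"items": []}',
    "text before root": b"OK\n<seeds/>",
}

if __name__ == "__main__":
    for label, body in samples.items():
        ok, detail = check_well_formed(body)
        print(f"{label}: {'well formed' if ok else 'NOT well formed'} ({detail})")
```

A JSON body, or any text ahead of the root element, fails this check. That is the same class of problem Xerces reports as content in the prolog.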

I do not know very much more about the generic connectors; I have advocated
removing them from our suite of connectors in the past but been overruled.
If you need to get them to work for you, you are on your own.

Thanks,
Karl


On Fri, Feb 23, 2018 at 4:20 AM, Nikita Ahuja <nikita@smartshore.nl> wrote:

> Karl,
>
> Thanks for the reply.
> But there is some confusion here.
> Shouldn't the details be specified in the repository path of the
> generic repository connector?
>
>
> [image: Inline image 1]
>
>
> Or should XML be passed as the input?
>
>
> Thanks and Regards,
> Nikita
>
>
> On Thu, Feb 22, 2018 at 6:41 PM, Karl Wright <daddywri@gmail.com> wrote:
>
>> The generic connector, as I understand it, communicates via XML.  This,
>> to me, means that your seed XML is badly formed and cannot be parsed:
>>
>> Caused by: org.apache.manifoldcf.core.interfaces.ManifoldCFException:
>> addSeedDocuments error: Content is not allowed in prolog.
>>
>> Karl
>>
>> On Thu, Feb 22, 2018 at 7:06 AM, Nikita Ahuja <nikita@smartshore.nl>
>> wrote:
>>
>>> Hi Karl,
>>>
>>>
>>> Whenever I run the generic API repository connector the job does not
>>> start,
>>>
>>> [image: Inline image 2]
>>>
>>>
>>>
>>> And in the log file I am getting this parsing exception:
>>> [image: Inline image 1]
>>>
>>>
>>>
>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException
>>> java.lang.RuntimeException: Unhandled exception of type: org.apache.manifoldcf.core.interfaces.ManifoldCFException
>>>   at org.apache.manifoldcf.crawler.connectors.generic.GenericConnector$ExecuteSeedingThread.finishUp(GenericConnector.java:1121) ~[?:?]
>>>   at org.apache.manifoldcf.crawler.connectors.generic.GenericConnector.addSeedDocuments(GenericConnector.java:249) ~[?:?]
>>>   at org.apache.manifoldcf.crawler.system.StartupThread.run(StartupThread.java:154) [mcf-pull-agent.jar:?]
>>> Caused by: org.apache.manifoldcf.core.interfaces.ManifoldCFException: addSeedDocuments error: Content is not allowed in prolog.
>>>   at org.apache.manifoldcf.crawler.connectors.generic.GenericConnector$ExecuteSeedingThread.run(GenericConnector.java:1149) ~[?:?]
>>> Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
>>>   at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
>>>   at java.xml/javax.xml.parsers.SAXParser.parse(Unknown Source) ~[?:?]
>>>   at org.apache.manifoldcf.crawler.connectors.generic.GenericConnector$ExecuteSeedingThread.run(GenericConnector.java:1143) ~[?:?]
>>>
>>>
>>>
>>>
>>> Please suggest a solution for this.
>>>
>>>
>>> Thanks and Regards,
>>> Nikita
>>>
>>>
>>> On Thu, Feb 22, 2018 at 5:34 PM, Karl Wright <daddywri@gmail.com> wrote:
>>>
>>>> Hi Nikita,
>>>> The picture is too small for me to read.
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>>
>>>> On Thu, Feb 22, 2018 at 4:18 AM, Nikita Ahuja <nikita@smartshore.nl>
>>>> wrote:
>>>>
>>>>> Hi Karl,
>>>>>
>>>>>
>>>>> Whenever I run the generic API repository connector the job does not
>>>>> start,
>>>>>
>>>>> [image: Inline image 2]
>>>>>
>>>>>
>>>>>
>>>>> And in the log file I am getting this parsing exception:
>>>>> [image: Inline image 1]
>>>>>
>>>>>
>>>>>
>>>>> Please suggest a solution for this.
>>>>>
>>>>>
>>>>> Thanks and Regards,
>>>>> Nikita
>>>>>
>>>>> On Mon, Feb 19, 2018 at 2:45 PM, Karl Wright <daddywri@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Nikita,
>>>>>>
>>>>>> If you want to develop connectors, I recommend reading the book
>>>>>> "ManifoldCF In Action". It's online, free:
>>>>>>
>>>>>> https://github.com/DaddyWri/manifoldcfinaction/tree/master/pdfs
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 19, 2018 at 3:39 AM, Nikita Ahuja <nikita@smartshore.nl>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Karl,
>>>>>>>
>>>>>>>
>>>>>>> But what steps should be followed, then, to work with API calls
>>>>>>> and fetch the data using ManifoldCF?
>>>>>>>
>>>>>>> Is there any need to create a new custom connector? If so, then
>>>>>>> please share the steps or the flow followed in creating
>>>>>>> connectors.
>>>>>>>
>>>>>>>
>>>>>>> Thanks and Regards,
>>>>>>> Nikita
>>>>>>>
>>>>>>> On Thu, Feb 15, 2018 at 4:41 PM, Karl Wright <daddywri@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Nikita,
>>>>>>>>
>>>>>>>> I do not understand your question.
>>>>>>>>
>>>>>>>> The Generic Connector was written by a committer who has since
>>>>>>>> become unavailable, and nobody here knows how it is supposed to
>>>>>>>> work.  All that we have is the code and the documentation.
>>>>>>>>
>>>>>>>> Karl
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Feb 15, 2018 at 5:58 AM, Nikita Ahuja <nikita@smartshore.nl>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Karl,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I am trying to connect through an API and fetch the data inside
>>>>>>>>> it, but there are many issues while creating the connector; also,
>>>>>>>>> the connector never stops running. Will you please provide an
>>>>>>>>> example for the Generic API connection?
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
