manifoldcf-user mailing list archives

From "Lamp, Ed" <Ed.L...@arcadis.com>
Subject RE: Generic Repository and Solr
Date Mon, 04 Jan 2016 13:53:10 GMT
Thanks.

The target application is a homegrown ColdFusion-based app, so we have a little Java knowledge
but not a ton.  Is there a connector that you would recommend as a starting point?

Thanks

Ed

From: Karl Wright [mailto:daddywri@gmail.com]
Sent: Monday, January 4, 2016 2:54 AM
To: user@manifoldcf.apache.org
Subject: Re: Generic Repository and Solr

Hi Ed,

Yes, it sounds like you have an issue on the Solr side.

>>>>>>
I do have a question about the seed request.  My understanding is that when the generic seed
endpoint is called, the target system should return all item IDs, and subsequent requests then
pass a date/time to get only new/updated items.  For my target system, that initial request
would return 200K+ items, which may be slow to produce or run into XML generation issues.
Is there a better way of handling this scenario?
<<<<<<

Please bear in mind that the generic connector is what it is; it was designed for relatively
straightforward implementations and is not a replacement for developing your own connector
for more technically challenging situations.  For the situation you describe, the connector
itself will not have a problem with large XML because it parses that XML as a stream.  Generation
on your side could do the same thing -- that is, generate the seed document dynamically and
stream it out.  But if you're going to go that far you might as well develop your own connector,
unless you're working in a non-Java world.
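
For illustration, here is a minimal sketch of the "generate the seed document dynamically and stream it out" idea, written in Java with the standard StAX writer. It assumes the seed response is a `<seeds>` element containing one `<seed id="..."/>` per item; please verify that shape against the Generic connector documentation. The class name `SeedStreamer`, the method `writeSeeds`, and the flush interval are hypothetical and not part of ManifoldCF.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.Iterator;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper, not part of ManifoldCF.
public class SeedStreamer {

    // Writes one <seed id="..."/> element per ID as the IDs are pulled
    // from the iterator, so the full document is never held in memory.
    // The element and attribute names are an assumption about the seed format.
    public static void writeSeeds(Iterator<String> ids, Writer out) throws XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartDocument();
        xml.writeStartElement("seeds");
        int written = 0;
        while (ids.hasNext()) {
            xml.writeEmptyElement("seed");
            xml.writeAttribute("id", ids.next());
            // Push buffered output to the client periodically (interval is arbitrary).
            if (++written % 1000 == 0) {
                xml.flush();
            }
        }
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.flush();
        xml.close(); // closes the XML writer only, not the underlying Writer
    }

    public static void main(String[] args) throws XMLStreamException {
        // In a real endpoint, 'out' would be the HTTP response writer and the
        // iterator would wrap a database cursor rather than an in-memory list.
        StringWriter out = new StringWriter();
        writeSeeds(List.of("doc-1", "doc-2").iterator(), out);
        System.out.println(out);
    }
}
```

The same pattern should carry over to a non-Java application: write the opening tag, emit one element per row while reading the result set, then write the closing tag.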

Karl


On Mon, Jan 4, 2016 at 1:14 AM, Lamp, Ed <Ed.Lamp@arcadis.com> wrote:
Thanks for clarifying the continuous mode behavior; I had figured it would keep starting and stopping.

Job status:    it shows 5 documents, 0 active and 5 processed
Simple History:  it just shows the job starting and stopping

I looked through the logs again and I do see HTTP POSTs to Solr, and they get a response status
of 0, which indicates success.  I will look into the Solr side of things to see why the documents
aren’t in the index.

I do have a question about the seed request.  My understanding is that when the generic seed
endpoint is called, the target system should return all item IDs, and subsequent requests then
pass a date/time to get only new/updated items.  For my target system, that initial request
would return 200K+ items, which may be slow to produce or run into XML generation issues.
Is there a better way of handling this scenario?
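
To make the seeding behavior described above concrete, here is a rough sketch of the selection logic on the target-system side, in Java. It assumes only what is stated above: the first request carries no date/time and gets every ID, and later requests carry a date/time and get only items changed since then. The `Item` type, the `seedIds` method, and the use of a null `since` to mean "first crawl" are hypothetical; how the date/time actually arrives in the request is not shown here.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of seed selection; not part of ManifoldCF.
public class IncrementalSeeds {

    // Assumed item shape: an ID plus a last-modified timestamp.
    public static final class Item {
        public final String id;
        public final Instant modified;

        public Item(String id, Instant modified) {
            this.id = id;
            this.modified = modified;
        }
    }

    // Returns every ID when 'since' is null (initial crawl); otherwise
    // only the IDs of items modified at or after 'since'.
    public static List<String> seedIds(List<Item> items, Instant since) {
        List<String> ids = new ArrayList<>();
        for (Item item : items) {
            if (since == null || !item.modified.isBefore(since)) {
                ids.add(item.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
            new Item("a", Instant.parse("2015-12-01T00:00:00Z")),
            new Item("b", Instant.parse("2016-01-03T00:00:00Z")));
        System.out.println(seedIds(items, null));
        System.out.println(seedIds(items, Instant.parse("2016-01-01T00:00:00Z")));
    }
}
```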

Thanks

Ed


From: Karl Wright [mailto:daddywri@gmail.com]
Sent: Sunday, January 3, 2016 2:45 AM
To: user@manifoldcf.apache.org
Subject: Re: Generic Repository and Solr

Hi Ed,

The job running continuously when in continuous mode is what is supposed to happen.  And the
job completing when not in continuous mode argues that it is working, after a fashion, but
the documents are all being rejected by the Solr connector.  This can happen if the Solr connection
is configured to reject the mime type(s) that your documents have, for example.

First question: do you see non-zero document counts on the job status page?
If so, then second question: have you looked at the Simple History report to figure out why
documents aren't being indexed?

Please have a look and let me know what you find.

Thanks,
Karl


On Sat, Jan 2, 2016 at 9:17 PM, Lamp, Ed <Ed.Lamp@arcadis.com> wrote:
Hi

I am trying to connect a generic repository to a Solr output.  The job runs, I see the proper
requests in my application (connected via the generic connector), and I don’t see any
errors in the ManifoldCF logs.  The job completes but does not send anything to Solr.  If I
run it in continuous mode, it stays stuck on running.  I have tested the Solr connection with
a test file system connector, so I believe that part is OK.  I am not implementing any security
pieces yet.
Help ☺

Thanks
Ed


Edward Lamp II | Sr. Management Consultant | ed.lamp@arcadis.com
Arcadis | Arcadis U.S., Inc.
14025 Riveredge Drive, Suite 600 Tampa FL | 33637 | USA
T. +1 813 353 5809 | M. +1 914 602 5251
Connect with us! www.arcadis.com <http://www.arcadis.com/> | LinkedIn <https://www.linkedin.com/company/arcadis-north-america?trk=biz-companies-cym>
| Twitter <http://www.twitter.com/arcadis_us> | Facebook <https://www.facebook.com/ArcadisNorthAmerica>


This e-mail and any files transmitted with it are the property of Arcadis. All rights, including
without limitation copyright, are reserved. This e-mail contains information which may be
confidential and may also be privileged. It is for the exclusive use of the intended recipient(s).
If you are not the intended recipient(s) please note that any form of distribution, copying
or use of this communication or the information in it is strictly prohibited and may be unlawful.
If you have received this communication in error please return it to the sender and then delete
the e-mail and destroy any copies of it. Whilst reasonable precautions have been taken to
ensure no software viruses are present in our emails we cannot guarantee that this e-mail
or any attachment is virus-free or has not been intercepted or changed. Any opinions or other
information in this e-mail that do not relate to the official business of Arcadis are neither
given nor endorsed by it.