manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Apache Manifold 2.10
Date Sat, 01 Dec 2018 20:05:36 GMT
Another thing: it's quite important to guarantee a working setup here;
otherwise you're just wasting everyone's time.  So, please base your
installation on the multiprocess-zk-example.  Start off by running the
example as is, on a small test crawl.  Once you know how it works, move on
to changing only what you have to -- namely, the database properties in the
global properties file, so that they point to your MySQL instance.  Try that
on a small test case as well (crawl some files, for instance) before trying
it on your large case.  Every step of the way should work, and if it
doesn't, figure out why not before you move on to the next step.
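
One way to confirm that only the database properties changed between the
stock example and a modified copy is to diff the two properties files.  The
sketch below is a minimal illustration and not part of ManifoldCF.  It
assumes the file consists of <property name="..." value="..."/> elements,
and the property names and values in it are hypothetical placeholders, not
real ManifoldCF keys.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Return {name: value} for every <property name=... value=...> element."""
    root = ET.fromstring(xml_text)
    return {p.get("name"): p.get("value") for p in root.iter("property")}


def changed_properties(original_xml, modified_xml):
    """Return {name: (old, new)} for properties that differ between two files.

    A property missing from one side is reported with None on that side.
    """
    old = load_properties(original_xml)
    new = load_properties(modified_xml)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# Hypothetical file contents; the names below are placeholders only.
ORIGINAL = """<configuration>
  <property name="example.database.hostname" value="localhost"/>
  <property name="example.lock.connectstring" value="localhost:2181"/>
</configuration>"""

MODIFIED = """<configuration>
  <property name="example.database.hostname" value="mysql.example.internal"/>
  <property name="example.lock.connectstring" value="localhost:2181"/>
</configuration>"""

for name, (old, new) in changed_properties(ORIGINAL, MODIFIED).items():
    print(f"{name}: {old!r} -> {new!r}")
```

If anything other than the database settings shows up in the report, revert
it before running the small test crawl, so that a failure can be attributed
to a single change.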

Thanks,
Karl


On Sat, Dec 1, 2018 at 2:59 PM Karl Wright <daddywri@gmail.com> wrote:

> Zookeeper does not require a locking directory.  It is a process that
> synchronizes other processes, and they connect to it by port.
>
> Karl
>
>
> On Sat, Dec 1, 2018 at 2:55 PM krishna agrawal <krish.agwl@gmail.com>
> wrote:
>
>> Thanks for the information.
>> If we use Zookeeper, how can we make sure all our ManifoldCF processes use
>> the same locking directory?  Can that be done at the configuration level
>> during installation?
>>
>> thanks,
>> Krishna A
>>
>> On Sat, Dec 1, 2018 at 1:39 PM Karl Wright <daddywri@gmail.com> wrote:
>>
>> > That error is the result of the database not managing transactions
>> > properly.  It can occur if the locking system is not set up properly, or
>> > if you are using multiple agents processes and each process does not have
>> > its own ID.  We have also seen it reported before simply because MySQL
>> > seems to have bugs, and sometimes writes are delayed or don't go through.
>> >
>> > My recommendation would be to:
>> > (1) Use Zookeeper, not file locking
>> > (2) Make sure all your ManifoldCF processes use the SAME locking directory
>> > or Zookeeper instance
>> > (3) If you are using multiple agents processes, be certain that each such
>> > process gets its own ID (as is done in the examples).
>> >
>> > Karl
>> >
>> >
>> > On Sat, Dec 1, 2018 at 11:43 AM krishna agrawal <krish.agwl@gmail.com>
>> > wrote:
>> >
>> > > Thanks Karl,
>> > >
>> > > I will take a look at it
>> > >
>> > > But this error keeps being thrown in the ManifoldCF log:
>> > >
>> > > ERROR 2018-12-01T11:13:26,297 (Job reset thread) - Exception tossed: Unexpected job status encountered: 33
>> > > org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected job status encountered: 33
>> > > at org.apache.manifoldcf.crawler.jobs.Jobs.returnJobToActive(Jobs.java:2145) ~[mcf-pull-agent.jar:?]
>> > > at org.apache.manifoldcf.crawler.jobs.JobManager.resetJobs(JobManager.java:8449) ~[mcf-pull-agent.jar:?]
>> > > at org.apache.manifoldcf.crawler.system.JobResetThread.run(JobResetThread.java:77) [mcf-pull-agent.jar:?]
>> > >
>> > > Thanks,
>> > > Krishna A
>> > >
>> > >
>> > > On Fri, Nov 30, 2018 at 7:00 PM Karl Wright <daddywri@gmail.com> wrote:
>> > >
>> > > > Hi Krishna,
>> > > >
>> > > > First of all I suggest that you *not* use multiprocess-file-example, and
>> > > > instead use multiprocess-zk-example.
>> > > >
>> > > > Your symptoms suggest many possibilities.  But if you move to Zookeeper we
>> > > > will be able to eliminate dangling file locks as a complication.  So please
>> > > > do that first.
>> > > >
>> > > > Karl
>> > > >
>> > > >
>> > > > On Fri, Nov 30, 2018 at 6:29 PM krishna agrawal <krish.agwl@gmail.com> wrote:
>> > > >
>> > > > > Yes, in our local setup we used the simple example, but on the server we
>> > > > > used multiprocess-file-example.  Are you suggesting we upgrade from 2.10
>> > > > > to 2.11?
>> > > > >
>> > > > > We are using a MySQL database.
>> > > > >
>> > > > > Most of the time I see that nothing is running, yet it still says the job
>> > > > > is running and that I have to wait for it to complete.
>> > > > >
>> > > > > Restarting does not help either.
>> > > > >
>> > > > > Any other solution would be greatly appreciated.
>> > > > >
>> > > > > Thanks,
>> > > > > Krishna A
>> > > > >
>> > > > > On Fri, Nov 30, 2018 at 10:50 AM Karl Wright <daddywri@gmail.com> wrote:
>> > > > >
>> > > > > > It also may be useful to start with the simple example, which is not
>> > > > > > multiprocess, and get familiar with using ManifoldCF that way, before you
>> > > > > > try to go to a more complicated setup.
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Karl
>> > > > > >
>> > > > > >
>> > > > > > On Fri, Nov 30, 2018 at 9:46 AM Karl Wright <daddywri@gmail.com> wrote:
>> > > > > >
>> > > > > > > "simplified multi-process"?  There is no such example.
>> > > > > > >
>> > > > > > > These are the examples available.  Which one are you using?
>> > > > > > >
>> > > > > > > 11/15/2018  03:40 AM    <DIR>          example
>> > > > > > > 11/15/2018  03:40 AM    <DIR>          example-proprietary
>> > > > > > > 11/15/2018  03:40 AM    <DIR>          multiprocess-file-example
>> > > > > > > 11/15/2018  03:40 AM    <DIR>          multiprocess-file-example-proprietary
>> > > > > > > 11/15/2018  03:40 AM    <DIR>          multiprocess-zk-example
>> > > > > > > 11/15/2018  03:40 AM    <DIR>          multiprocess-zk-example-proprietary
>> > > > > > >
>> > > > > > > Cleaning locks makes no sense unless you are using the multiprocess-file
>> > > > > > > setup.  This is deprecated, by the way, in favor of the Zookeeper setup.
>> > > > > > >
>> > > > > > > As for the buttons, please read:
>> > > > > > >
>> > > > > > > https://manifoldcf.apache.org/release/release-2.11/en_US/end-user-documentation.html#outputs
>> > > > > > >
>> > > > > > > The buttons in question are "Reindex all..." and "Remove all..."
>> > > > > > >
>> > > > > > > Karl
>> > > > > > >
>> > > > > > >
>> > > > > > > On Fri, Nov 30, 2018 at 9:36 AM krishna agrawal <krish.agwl@gmail.com> wrote:
>> > > > > > >
>> > > > > > >> We have deployed ManifoldCF using
>> > > > > > >>
>> > > > > > >>    - Simplified multi-process model
>> > > > > > >>
>> > > > > > >> We did try the lock clean-up shell script, but that also did not work.
>> > > > > > >>
>> > > > > > >> I don't have a "forget all documents" button on the output connection.
>> > > > > > >>
>> > > > > > >> [image: image.png]
>> > > > > > >>
>> > > > > > >> On Thu, Nov 29, 2018 at 6:52 PM Karl Wright <daddywri@gmail.com> wrote:
>> > > > > > >>
>> > > > > > >>> Hi Krishna,
>> > > > > > >>>
>> > > > > > >>> Please give us some background as to how you've deployed ManifoldCF.  Are
>> > > > > > >>> you using one of the examples?  If so, which one?
>> > > > > > >>>
>> > > > > > >>> The detailed answer to your question is: the job must delete all documents
>> > > > > > >>> it indexed before it can be deleted.  That is the typical way jobs work.
>> > > > > > >>> Thus, if you shut down the target of your output connection, you may be
>> > > > > > >>> blocked in deleting your job.
>> > > > > > >>>
>> > > > > > >>> At that point, you can either (a) restart the target of your output
>> > > > > > >>> connection, or (b) go to the "view" page for the output connection and
>> > > > > > >>> click both of the "forget all documents" buttons on it.  (b) is not
>> > > > > > >>> recommended unless you really want to start over fresh on your output
>> > > > > > >>> index.
>> > > > > > >>>
>> > > > > > >>> Thanks,
>> > > > > > >>> Karl
>> > > > > > >>>
>> > > > > > >>>
>> > > > > > >>> On Thu, Nov 29, 2018 at 3:21 PM krishna agrawal <krish.agwl@gmail.com> wrote:
>> > > > > > >>>
>> > > > > > >>> > Hi, we are facing an issue where the action button is not available.
>> > > > > > >>> >
>> > > > > > >>> > [image: image.png]
>> > > > > > >>> >
>> > > > > > >>> > I have stopped the agent process, but I am still not able to remove the
>> > > > > > >>> > job.  It says:
>> > > > > > >>> >
>> > > > > > >>> > Job 1542835910915 is busy; you must wait and/or shut it down before
>> > > > > > >>> > deleting it
>> > > > > > >>> >
>> > > > > > >>> > But there is no job running, and I have been seeing this message for the
>> > > > > > >>> > past 3 days.
>> > > > > > >>> >
>> > > > > > >>> > Shouldn't there be some way to forcefully restart and stop the running
>> > > > > > >>> > process?  Is there any way to clear this?
>> > > > > > >>> >
>> > > > > > >>> >
>> > > > > > >>> > Any help in this matter will be appreciated.
>> > > > > > >>> >
>> > > > > > >>> > Thanks,
>> > > > > > >>> > Krishna A
>> > > > > > >>> >
