manifoldcf-dev mailing list archives

From "Mario Bisonti (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1554) Job stuck during crawl documents on folder
Date Wed, 07 Nov 2018 10:27:01 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16677994#comment-16677994 ]

Mario Bisonti commented on CONNECTORS-1554:
-------------------------------------------

Hello Karl.

Great news!

I migrated my MCF configuration to /opt/manifoldcf/multiprocess-zk-example-proprietary/
After configuring it, I manually started:

sudo -u tomcat /opt/manifoldcf/multiprocess-zk-example-proprietary/runzookeeper.sh

and

sudo -u tomcat /opt/manifoldcf/multiprocess-zk-example-proprietary/start-agents.sh

At first I got a Java heap memory error, so I raised the memory for ManifoldCF from 512m to
2048m:

sudo nano /opt/manifoldcf/multiprocess-zk-example-proprietary/options.env.unix
-Xms2048m

-Xmx2048m
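
The same change can be rehearsed on a scratch file before touching the real one. This is a minimal sketch, assuming the options file lists one JVM option per line; the temporary file and the variable names are made up for illustration:

```shell
# Sketch only: operates on a throwaway copy, not on the real options.env.unix.
# Assumption: the file lists one JVM option per line.
opts_file="$(mktemp)"
printf '%s\n' '-Xms512m' '-Xmx512m' > "$opts_file"

# Rewrite the initial (-Xms) and maximum (-Xmx) heap sizes to 2048m.
sed -e 's/^-Xms.*/-Xms2048m/' -e 's/^-Xmx.*/-Xmx2048m/' "$opts_file" > "$opts_file.new"

# Collect the heap settings from the rewritten copy and print them.
heap_settings="$(grep -E '^-Xm[sx]' "$opts_file.new")"
echo "$heap_settings"

# Clean up the scratch files.
rm -f "$opts_file" "$opts_file.new"
```

Setting -Xms and -Xmx to the same value makes the JVM reserve the full heap at startup instead of growing it later.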

The scan has now been running for two hours and no longer seems to hang!

The file-system synchronization was probably the problem, as you said, and ZooKeeper seems
(fingers crossed :-) ) to work very well!

I would also like to know: how can I make the agent and ZooKeeper start together with Tomcat,
instead of starting them manually?

As I understand it, to start the agent I need to add it to /etc/systemd/system/tomcat.service
by appending:
-Dorg.apache.manifoldcf.agents.AgentRun    ?

So, my /etc/systemd/system/tomcat.service  would become:

[Unit]
Description=Apache Tomcat Web Application Container
After=network.target

[Service]
Type=forking

Environment=JAVA_HOME=/usr/lib/jvm/java-1.11.0-openjdk-amd64
Environment=CATALINA_PID=/opt/tomcat/temp/tomcat.pid
Environment=CATALINA_HOME=/opt/tomcat
Environment=CATALINA_BASE=/opt/tomcat
Environment='CATALINA_OPTS=-Xms512M -Xmx1024M -server -XX:+UseParallelGC -Dorg.apache.manifoldcf.configfile=/opt/manifoldcf/multiprocess-zk-example-proprietary/properties.xml' -Dorg.apache.manifoldcf.agents.AgentRun
Environment='JAVA_OPTS=-Djava.awt.headless=true -Djava.security.egd=file:/dev/./urandom'

ExecStart=/opt/tomcat/bin/startup.sh
ExecStop=/opt/tomcat/bin/shutdown.sh

User=tomcat
Group=tomcat
UMask=0007
RestartSec=10
Restart=always

[Install]
WantedBy=multi-user.target

 
Is this right?
If so, how can I also start ZooKeeper automatically?
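
One possible arrangement, as an untested sketch: because a -D flag only sets a Java system property and does not launch a class, ZooKeeper and the agents process could each get their own systemd unit instead of being added to the Tomcat unit. The unit file names below are made up for illustration. The sketch assumes that both scripts stay in the foreground and expect to be run from their own directory; this has not been checked against the scripts.

```ini
# /etc/systemd/system/manifoldcf-zookeeper.service  (hypothetical unit name)
[Unit]
Description=ZooKeeper for ManifoldCF (sketch)
After=network.target

[Service]
# Assumption: runzookeeper.sh stays in the foreground, so Type=simple fits.
Type=simple
User=tomcat
Group=tomcat
# Assumption: the script expects to be run from its own directory.
WorkingDirectory=/opt/manifoldcf/multiprocess-zk-example-proprietary
Environment=JAVA_HOME=/usr/lib/jvm/java-1.11.0-openjdk-amd64
ExecStart=/opt/manifoldcf/multiprocess-zk-example-proprietary/runzookeeper.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/manifoldcf-agents.service  (hypothetical unit name)
[Unit]
Description=ManifoldCF agents process (sketch)
# Start only after the ZooKeeper unit above has been started.
Requires=manifoldcf-zookeeper.service
After=manifoldcf-zookeeper.service

[Service]
# Assumption: start-agents.sh also stays in the foreground.
Type=simple
User=tomcat
Group=tomcat
WorkingDirectory=/opt/manifoldcf/multiprocess-zk-example-proprietary
Environment=JAVA_HOME=/usr/lib/jvm/java-1.11.0-openjdk-amd64
ExecStart=/opt/manifoldcf/multiprocess-zk-example-proprietary/start-agents.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

With units like these, the existing tomcat.service would keep its current CATALINA_OPTS and could be ordered after ZooKeeper with an extra After= line in its [Unit] section.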


Thanks a lot for your great help, Karl!
Best regards.
Mario

> Job stuck during crawl documents on folder
> ------------------------------------------
>
>                 Key: CONNECTORS-1554
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1554
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Active Directory authority, File system connector, Tika extractor
>    Affects Versions: ManifoldCF 2.11
>         Environment: Ubuntu Server 18.04
> ManifoldCF 2.11
> Solr 7.5.0
> Tika Server 1.19.1
>            Reporter: Mario Bisonti
>            Assignee: Karl Wright
>            Priority: Major
>             Fix For: ManifoldCF 2.11
>
>         Attachments: SimpleHistory.png, manifoldcf.log
>
>
> Hello.
> When I start a job that indexes a Windows share, it gets stuck after about 15 minutes.
>  
> I see errors in manifoldcf.log, as you can see in the attachment.
>  
> I attached "Simple History" with the last documents crawled.
> Thanks a lot.
> Mario
> [^manifoldcf.log]!SimpleHistory.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
