manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Scaling in MCF
Date Thu, 20 Nov 2014 14:41:35 GMT
Hi Aeham,

(1) The query "SELECT COUNT(t2.x) AS doccount FROM (SELECT 'x' AS x FROM
jobqueue WHERE
jobid=$1 AND  (status=$2 OR status=$3 OR status=$4 OR status=$5 OR
status=$6 OR status=$7) LIMIT 500001) t2" comes from the UI whenever it
refreshes job status. It is slow because PostgreSQL has to do a sequential
scan to count rows. The LIMIT caps the count at 500001, however, so the
query will not get any slower once a job has more documents than that.
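
To illustrate why the LIMIT bounds the work, here is a minimal Java sketch
of a capped count. It is not MCF code: the class CappedCount and the method
countUpTo are hypothetical names used only for this example. The loop stops
as soon as the cap is reached, which is the effect the LIMIT has on the
subquery. Note that when fewer rows match than the cap, every row is still
visited.

```java
import java.util.List;
import java.util.function.Predicate;

public class CappedCount {
    // Counts rows that satisfy the predicate, but stops scanning as soon as
    // the count reaches 'cap' (analogous to LIMIT in the subquery).
    static <T> long countUpTo(Iterable<T> rows, Predicate<T> matches, long cap) {
        long n = 0;
        for (T row : rows) {
            if (matches.test(row) && ++n >= cap) {
                break; // cap reached: no need to look at the remaining rows
            }
        }
        return n;
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(1, 2, 3, 4, 5, 6);
        // Three even numbers exist, but the cap of 2 ends the scan early.
        System.out.println(countUpTo(rows, x -> x % 2 == 0, 2));
        // With a cap larger than the number of matches, all rows are visited.
        System.out.println(countUpTo(rows, x -> x % 2 == 0, 10));
    }
}
```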

(2) I'll look into the deadlock and get back to you.  In general, occasional
deadlocks are expected, and we usually handle them with a backoff-and-retry
approach.  It looks like that is not implemented for this particular case,
though.
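
For readers unfamiliar with the pattern, a generic backoff-and-retry wrapper
might look like the sketch below. This is an illustration only and not how
MCF implements it: the class DeadlockRetry, the method withRetry, and its
parameters are hypothetical. The one fact it relies on is that PostgreSQL
reports a deadlock with SQLSTATE 40P01 (deadlock_detected). The work passed
in is assumed to run a whole transaction and roll it back on failure, so
that repeating it is safe.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

public class DeadlockRetry {
    // SQLSTATE that PostgreSQL uses for "deadlock detected".
    static final String DEADLOCK_SQLSTATE = "40P01";

    // Walks the cause chain looking for a SQLException with the deadlock state.
    static boolean isDeadlock(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLException
                    && DEADLOCK_SQLSTATE.equals(((SQLException) c).getSQLState())) {
                return true;
            }
        }
        return false;
    }

    // Runs 'work', retrying only on deadlock, up to maxAttempts in total.
    // Assumption: 'work' executes one complete transaction each time it is called.
    static <T> T withRetry(Callable<T> work, int maxAttempts, long baseDelayMs)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return work.call();
            } catch (Exception e) {
                if (!isDeadlock(e) || attempt >= maxAttempts) {
                    throw e; // not a deadlock, or out of attempts
                }
                // Exponential backoff with random jitter, so that the competing
                // transactions are unlikely to collide again at the same moment.
                long ceiling = baseDelayMs << Math.min(attempt - 1, 10);
                Thread.sleep(ThreadLocalRandom.current().nextLong(ceiling + 1));
            }
        }
    }
}
```

A caller would wrap the failing update, for example
withRetry(() -> runUpdateInTransaction(), 5, 50), where
runUpdateInTransaction is again a placeholder name.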

Karl




On Thu, Nov 20, 2014 at 9:24 AM, Aeham Abushwashi <
aeham.abushwashi@exonar.com> wrote:

> Hi Karl,
>
> A couple of initial observations from a fresh install (7 jobs, 4 nodes in
> a single cluster, fewer than 1M jobqueue records):
>
> 1. When a job is started or stopped, a particular SQL query, which I hadn't
> noticed in previous versions, pops up again and again and seems to take a
> few minutes each time (judging by the query_start column in the
> pg_stat_activity view):
>
> SELECT COUNT(t2.x) AS doccount FROM (SELECT 'x' AS x FROM jobqueue WHERE
> jobid=$1 AND  (status=$2 OR status=$3 OR status=$4 OR status=$5 OR
> status=$6 OR status=$7) LIMIT 500001) t2
>
> The query continues to be re-executed after the job is marked as inactive.
>
> The closest match to this query that I could find in the code is the one
> fired by JobManager#getRunningJobs, but the number of terms in the WHERE
> clause is different.
>
>
> 2. As I was stopping and restarting a bunch of jobs concurrently, SQL
> deadlocks ensued and were reported on 3 of the 4 MCF nodes in the cluster.
> All of the exceptions reference the method JobQueue#clearDocPriorities.
> Here are snippets of the log files from the 4 nodes:
>
> **NODE #1**
>
>  INFO 2014-11-19 17:04:25,851 (Job notification thread) - Found job
> 1416410450171 in need of notification
>  INFO 2014-11-19 17:06:08,438 (qtp720239731-20) - Manually aborting job
> 1416411618209
>  INFO 2014-11-19 17:06:08,447 (qtp720239731-20) - Job 1416411618209 abort
> signal successfully sent
>  INFO 2014-11-19 17:06:11,335 (qtp720239731-18) - Manually aborting job
> 1416411742909
>  INFO 2014-11-19 17:06:11,351 (qtp720239731-18) - Job 1416411742909 abort
> signal successfully sent
>  INFO 2014-11-19 17:06:13,689 (qtp720239731-17) - Manually aborting job
> 1416411915906
>  INFO 2014-11-19 17:06:13,704 (qtp720239731-17) - Job 1416411915906 abort
> signal successfully sent
>  INFO 2014-11-19 17:06:15,860 (qtp720239731-16) - Manually aborting job
> 1416412103264
>  INFO 2014-11-19 17:06:15,886 (qtp720239731-16) - Job 1416412103264 abort
> signal successfully sent
>  INFO 2014-11-19 17:06:18,076 (qtp720239731-19) - Manually aborting job
> 1416411677979
>  INFO 2014-11-19 17:06:18,118 (qtp720239731-19) - Job 1416411677979 abort
> signal successfully sent
> ERROR 2014-11-19 17:06:24,765 (Job reset thread) - Exception tossed: ERROR:
> deadlock detected
>   Detail: Process 17695 waits for ShareLock on transaction 572361982;
> blocked by process 16640.
> Process 16640 waits for ShareLock on transaction 572361975; blocked by
> process 17695.
>   Hint: See server log for query details.
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: ERROR: deadlock
> detected
>   Detail: Process 17695 waits for ShareLock on transaction 572361982;
> blocked by process 16640.
> Process 16640 waits for ShareLock on transaction 572361975; blocked by
> process 17695.
>   Hint: See server log for query details.
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:628)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:660)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performUpdate(DBInterfacePostgreSQL.java:254)
>         at
>
> org.apache.manifoldcf.core.database.BaseTable.performUpdate(BaseTable.java:80)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobQueue.clearDocPriorities(JobQueue.java:1046)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobManager.finishJobStops(JobManager.java:8170)
>         at
>
> org.apache.manifoldcf.crawler.system.JobResetThread.run(JobResetThread.java:69)
> Caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected
>   Detail: Process 17695 waits for ShareLock on transaction 572361982;
> blocked by process 16640.
> Process 16640 waits for ShareLock on transaction 572361975; blocked by
> process 17695.
>   Hint: See server log for query details.
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
>         at
> org.apache.manifoldcf.core.database.Database.execute(Database.java:894)
>         at
>
> org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
> ERROR 2014-11-19 17:06:43,054 (Job reset thread) - Exception tossed: ERROR:
> deadlock detected
>   Detail: Process 16640 waits for ShareLock on transaction 572362009;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572362011; blocked by
> process 16640.
>   Hint: See server log for query details.
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: ERROR: deadlock
> detected
>   Detail: Process 16640 waits for ShareLock on transaction 572362009;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572362011; blocked by
> process 16640.
>   Hint: See server log for query details.
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:628)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:660)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performUpdate(DBInterfacePostgreSQL.java:254)
>         at
>
> org.apache.manifoldcf.core.database.BaseTable.performUpdate(BaseTable.java:80)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobQueue.clearDocPriorities(JobQueue.java:1046)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobManager.finishJobStops(JobManager.java:8170)
>         at
>
> org.apache.manifoldcf.crawler.system.JobResetThread.run(JobResetThread.java:69)
> Caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected
>   Detail: Process 16640 waits for ShareLock on transaction 572362009;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572362011; blocked by
> process 16640.
>   Hint: See server log for query details.
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
>         at
> org.apache.manifoldcf.core.database.Database.execute(Database.java:894)
>         at
>
> org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
>
>
>
> **NODE #2**
>
>  INFO 2014-11-19 17:04:22,524 (Job reset thread) - Stopped job
> 1416410450171
>
>
> **NODE #3**
>
> INFO 2014-11-19 17:06:21,994 (Job notification thread) - Found job
> 1416411618209 in need of notification
>  INFO 2014-11-19 17:06:37,105 (Job reset thread) - Stopped job
> 1416411742909
>  INFO 2014-11-19 17:06:38,234 (Job reset thread) - Stopped job
> 1416411915906
> ERROR 2014-11-19 17:06:39,826 (Job reset thread) - Exception tossed: ERROR:
> deadlock detected
>   Detail: Process 16086 waits for ShareLock on transaction 572361994;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572362003; blocked by
> process 16086.
>   Hint: See server log for query details.
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: ERROR: deadlock
> detected
>   Detail: Process 16086 waits for ShareLock on transaction 572361994;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572362003; blocked by
> process 16086.
>   Hint: See server log for query details.
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:628)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:660)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performUpdate(DBInterfacePostgreSQL.java:254)
>         at
>
> org.apache.manifoldcf.core.database.BaseTable.performUpdate(BaseTable.java:80)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobQueue.clearDocPriorities(JobQueue.java:1046)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobManager.finishJobStops(JobManager.java:8170)
>         at
>
> org.apache.manifoldcf.crawler.system.JobResetThread.run(JobResetThread.java:69)
> Caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected
>   Detail: Process 16086 waits for ShareLock on transaction 572361994;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572362003; blocked by
> process 16086.
>   Hint: See server log for query details.
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
>         at
> org.apache.manifoldcf.core.database.Database.execute(Database.java:894)
>         at
>
> org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
>  INFO 2014-11-19 17:06:42,043 (Job notification thread) - Found job
> 1416411742909 in need of notification
>  INFO 2014-11-19 17:06:42,044 (Job notification thread) - Found job
> 1416411915906 in need of notification
> ERROR 2014-11-19 17:06:44,690 (Job reset thread) - Exception tossed: ERROR:
> deadlock detected
>   Detail: Process 16086 waits for ShareLock on transaction 572362024;
> blocked by process 17903.
> Process 17903 waits for ShareLock on transaction 572362026; blocked by
> process 16086.
>   Hint: See server log for query details.
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: ERROR: deadlock
> detected
>   Detail: Process 16086 waits for ShareLock on transaction 572362024;
> blocked by process 17903.
> Process 17903 waits for ShareLock on transaction 572362026; blocked by
> process 16086.
>   Hint: See server log for query details.
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:628)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:660)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performUpdate(DBInterfacePostgreSQL.java:254)
>         at
>
> org.apache.manifoldcf.core.database.BaseTable.performUpdate(BaseTable.java:80)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobQueue.clearDocPriorities(JobQueue.java:1046)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobManager.finishJobStops(JobManager.java:8170)
>         at
>
> org.apache.manifoldcf.crawler.system.JobResetThread.run(JobResetThread.java:69)
> Caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected
>   Detail: Process 16086 waits for ShareLock on transaction 572362024;
> blocked by process 17903.
> Process 17903 waits for ShareLock on transaction 572362026; blocked by
> process 16086.
>   Hint: See server log for query details.
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
>         at
> org.apache.manifoldcf.core.database.Database.execute(Database.java:894)
>         at
>
> org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
>  INFO 2014-11-19 17:06:52,274 (Job notification thread) - Found job
> 1416412103264 in need of notification
>  INFO 2014-11-19 17:07:02,336 (Job notification thread) - Found job
> 1416411677979 in need of notification
>
>
>
> **NODE #4**
>
>  INFO 2014-11-19 17:06:20,875 (Job reset thread) - Stopped job
> 1416411618209
> ERROR 2014-11-19 17:06:22,873 (Job reset thread) - Exception tossed: ERROR:
> deadlock detected
>   Detail: Process 17903 waits for ShareLock on transaction 572361703;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572361970; blocked by
> process 17903.
>   Hint: See server log for query details.
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: ERROR: deadlock
> detected
>   Detail: Process 17903 waits for ShareLock on transaction 572361703;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572361970; blocked by
> process 17903.
>   Hint: See server log for query details.
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:628)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:660)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performUpdate(DBInterfacePostgreSQL.java:254)
>         at
>
> org.apache.manifoldcf.core.database.BaseTable.performUpdate(BaseTable.java:80)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobQueue.clearDocPriorities(JobQueue.java:1046)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobManager.finishJobStops(JobManager.java:8170)
>         at
>
> org.apache.manifoldcf.crawler.system.JobResetThread.run(JobResetThread.java:69)
> Caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected
>   Detail: Process 17903 waits for ShareLock on transaction 572361703;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572361970; blocked by
> process 17903.
>   Hint: See server log for query details.
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
>         at
> org.apache.manifoldcf.core.database.Database.execute(Database.java:894)
>         at
>
> org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
> ERROR 2014-11-19 17:06:41,599 (Job reset thread) - Exception tossed: ERROR:
> deadlock detected
>   Detail: Process 17903 waits for ShareLock on transaction 572362009;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572362004; blocked by
> process 17903.
>   Hint: See server log for query details.
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: ERROR: deadlock
> detected
>   Detail: Process 17903 waits for ShareLock on transaction 572362009;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572362004; blocked by
> process 17903.
>   Hint: See server log for query details.
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:628)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performModification(DBInterfacePostgreSQL.java:660)
>         at
>
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performUpdate(DBInterfacePostgreSQL.java:254)
>         at
>
> org.apache.manifoldcf.core.database.BaseTable.performUpdate(BaseTable.java:80)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobQueue.clearDocPriorities(JobQueue.java:1046)
>         at
>
> org.apache.manifoldcf.crawler.jobs.JobManager.finishJobStops(JobManager.java:8170)
>         at
>
> org.apache.manifoldcf.crawler.system.JobResetThread.run(JobResetThread.java:69)
> Caused by: org.postgresql.util.PSQLException: ERROR: deadlock detected
>   Detail: Process 17903 waits for ShareLock on transaction 572362009;
> blocked by process 17172.
> Process 17172 waits for ShareLock on transaction 572362004; blocked by
> process 17903.
>   Hint: See server log for query details.
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
>         at
>
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
>         at
>
> org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
>         at
> org.apache.manifoldcf.core.database.Database.execute(Database.java:894)
>         at
>
> org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
>  INFO 2014-11-19 17:06:50,652 (Job reset thread) - Stopped job
> 1416412103264
>  INFO 2014-11-19 17:06:56,152 (Job reset thread) - Stopped job
> 1416411677979
>
