manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <>
Subject [jira] [Commented] (CONNECTORS-1367) Alfresco Connector deadlock with postgresql
Date Mon, 16 Jan 2017 08:32:26 GMT


Karl Wright commented on CONNECTORS-1367:

This is a pretty odd-looking stack trace; none of the thread IDs are there, and thus it looks
like it was snapped when the agents process was shutting down, rather than actually running
and stuck.  It does look, however, like an HTTP transaction with Alfresco is hung for some

Thread 19436: (state = IN_NATIVE)
 -, byte[], int, int, int) @bci=0
(Compiled frame; information may be imprecise)
 -, byte[], int, int, int) @bci=8,
line=116 (Compiled frame)
 -[], int, int, int) @bci=79, line=170 (Compiled frame)
 -[], int, int) @bci=11, line=141 (Compiled frame)
 -[], int, int) @bci=16, line=139
(Compiled frame)
 - @bci=68, line=155 (Compiled
@bci=227, line=284 (Compiled frame)
 - org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(
@bci=16, line=140 (Compiled frame)
 - org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(
@bci=2, line=57 (Compiled frame)
 - @bci=38, line=261 (Compiled frame)
 - org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader() @bci=8, line=165
(Compiled frame)
 - org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader() @bci=4, line=167 (Compiled
 - org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(org.apache.http.HttpRequest,
org.apache.http.HttpClientConnection, org.apache.http.protocol.HttpContext) @bci=41, line=272
(Compiled frame)
 - org.apache.http.protocol.HttpRequestExecutor.execute(org.apache.http.HttpRequest, org.apache.http.HttpClientConnection,
org.apache.http.protocol.HttpContext) @bci=39, line=124 (Compiled frame)
 - org.apache.http.impl.execchain.MainClientExec.execute(org.apache.http.conn.routing.HttpRoute,
org.apache.http.client.methods.HttpRequestWrapper, org.apache.http.client.protocol.HttpClientContext,
org.apache.http.client.methods.HttpExecutionAware) @bci=714, line=271 (Compiled frame)
 - org.apache.http.impl.execchain.ProtocolExec.execute(org.apache.http.conn.routing.HttpRoute,
org.apache.http.client.methods.HttpRequestWrapper, org.apache.http.client.protocol.HttpClientContext,
org.apache.http.client.methods.HttpExecutionAware) @bci=447, line=184 (Compiled frame)
 - org.apache.http.impl.execchain.RetryExec.execute(org.apache.http.conn.routing.HttpRoute,
org.apache.http.client.methods.HttpRequestWrapper, org.apache.http.client.protocol.HttpClientContext,
org.apache.http.client.methods.HttpExecutionAware) @bci=39, line=88 (Compiled frame)
 - org.apache.http.impl.execchain.RedirectExec.execute(org.apache.http.conn.routing.HttpRoute,
org.apache.http.client.methods.HttpRequestWrapper, org.apache.http.client.protocol.HttpClientContext,
org.apache.http.client.methods.HttpExecutionAware) @bci=85, line=110 (Compiled frame)
 - org.apache.http.impl.client.InternalHttpClient.doExecute(org.apache.http.HttpHost, org.apache.http.HttpRequest,
org.apache.http.protocol.HttpContext) @bci=168, line=184 (Compiled frame)
 - org.apache.http.impl.client.CloseableHttpClient.execute(org.apache.http.client.methods.HttpUriRequest,
org.apache.http.protocol.HttpContext) @bci=14, line=82 (Compiled frame)
 - org.apache.http.impl.client.CloseableHttpClient.execute(org.apache.http.client.methods.HttpUriRequest)
@bci=6, line=107 (Compiled frame)
 - com.github.maoo.indexer.client.WebScriptsAlfrescoClient.getDocumentsActions(java.lang.String)
@bci=24, line=118 (Compiled frame)
 - com.github.maoo.indexer.client.WebScriptsAlfrescoClient.fetchNodes(long, long, com.github.maoo.indexer.client.AlfrescoFilters)
@bci=32, line=103 (Compiled frame)
 - org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector.addSeedDocuments(org.apache.manifoldcf.crawler.interfaces.ISeedingActivity,
org.apache.manifoldcf.core.interfaces.Specification, java.lang.String, long, int) @bci=140,
line=188 (Compiled frame)
 - @bci=614, line=154 (Interpreted

If any one thread that holds locks gets stuck in this way, it can cause a "train
wreck" in which other transactions eventually time out -- which seems to be what's happening
-- so this could be the smoking gun.
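
To make the "train wreck" concrete, here is a minimal sketch in plain Java. It is not ManifoldCF's locking code; the class name, the single ReentrantLock, and the latch standing in for the hung HTTP read are all assumptions made for illustration. One thread takes a lock and then blocks on something that never completes, and a second thread that needs the same lock gives up after a bounded wait.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; not part of ManifoldCF.
public class LockConvoyDemo {

    /**
     * Returns true if a second thread fails to acquire the lock within
     * waitMillis while the first thread holds it and is blocked.
     */
    public static boolean waiterTimesOut(long waitMillis) throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch lockHeld = new CountDownLatch(1);
        // Stands in for the HTTP response that never arrives.
        CountDownLatch neverAnswered = new CountDownLatch(1);

        Thread stuck = new Thread(() -> {
            lock.lock();
            try {
                lockHeld.countDown();
                neverAnswered.await(); // like a socket read with no timeout
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        stuck.setDaemon(true);
        stuck.start();

        lockHeld.await(); // make sure the stuck thread owns the lock first
        boolean acquired = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            lock.unlock();
        }

        // Release the stuck thread so the demo exits cleanly.
        neverAnswered.countDown();
        stuck.join();
        return !acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("waiter timed out: " + waiterTimesOut(200));
    }
}
```

In the real system the waiters would be other worker threads and database transactions rather than a single tryLock call, but the shape is the same: the timeouts show up far away from the thread that is actually stuck.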

This is occurring while the job is being "seeded".  In jobs that are continuous, seeding happens
periodically.  But if the job is *not* continuous, it occurs once on every job run -- before
any documents are crawled.  [~chalitha.perera], is this a continuous job, or not?  Very important
to know the answer to that question.

One possibility is that the connector's seeding request is causing Alfresco to fall apart
in trying to respond to it, probably because Alfresco is running out of memory.  The thread
that is answering the HTTP request is still alive but the thread that is computing the result
dies, so the connection stays up but the request will never be fulfilled.  I don't know if this is
possible or not, but if it is, you would see evidence for that in the Alfresco logs.  [~maoo],
what do you think?
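
As a rough illustration of that failure mode, the sketch below uses only the JDK. It is an assumption-laden stand-in, not the connector's code: the class name is made up, and a local ServerSocket that accepts a connection and then never writes plays the role of the Alfresco side. With a read timeout set on the client socket the read gives up with SocketTimeoutException; without one it would block indefinitely, which is what the stack trace above resembles.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of the Alfresco connector.
public class SilentServerDemo {

    /**
     * Connects to a local server that accepts the connection but never replies.
     * Returns true if the read gives up with SocketTimeoutException.
     */
    public static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // Without a read timeout, read() below would block for as long
            // as the connection stays open.
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read(); // the accepted side never writes anything
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

Apache HttpClient 4.x exposes a comparable socket timeout through its RequestConfig; whether the indexer client used by the connector sets one is something I have not checked.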

> Alfresco Connector deadlock with postgresql
> -------------------------------------------
>                 Key: CONNECTORS-1367
>                 URL:
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Alfresco webscript connector
>    Affects Versions: ManifoldCF 2.5
>            Reporter: Chalitha Perera
>            Assignee: Karl Wright
>         Attachments: jstack5.out
> The Alfresco connector gets stuck after processing around 35,000 documents, with the postgres manifold
> database in an idle state, and after several days it gets a deadlock message:
> "
> < 2016-12-21 10:54:17.946 UTC >ERROR:  deadlock detected
> < 2016-12-21 10:54:17.946 UTC >DETAIL:  Process 28072 waits for ShareLock on transaction 432231; blocked by process 28038.
>     Process 28038 waits for ShareLock on transaction 432227; blocked by process 28072.
>     Process 28072: UPDATE jobqueue SET failtime=NULL,checktime=$1,failcount=NULL,checkaction=$2 WHERE jobid=$3 AND  (status=$4 OR status=$5)
>     Process 28038: SELECT needpriority FROM jobqueue WHERE id=$1 FOR UPDATE
> < 2016-12-21 10:54:17.946 UTC >HINT:  See server log for query details."
> On a separate note, I ran ant load-pg and it completed successfully.

This message was sent by Atlassian JIRA
