jackrabbit-dev mailing list archives

From "Cody Burleson (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (JCR-3588) Response time higher on Node1 with load when Node2 has no load
Date Fri, 03 May 2013 21:22:16 GMT

     [ https://issues.apache.org/jira/browse/JCR-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cody Burleson updated JCR-3588:
-------------------------------

    Attachment: Screen Shot 2013-05-03 at 4.14.52 PM.png

Also peculiar: see Screen Shot 2013-05-03 at 4.14. In these charts, from positions 1 to roughly 36-37,
there was no user load on one of the nodes, yet Jackrabbit on that node is still keeping up with the
revisions happening on the node that still has load. Despite having no user load, the idle node is
using 30-40% of the CPU, and the Java heap shows a lot of activity from this revision catch-up work.
At position 36-37, we added the load back onto the node. This may be unrelated to the original issue,
or it may be connected; we're not sure. It just seems to use more memory and processor time than
expected for a node that has no user load.
                
> Response time higher on Node1 with load when Node2 has no load
> --------------------------------------------------------------
>
>                 Key: JCR-3588
>                 URL: https://issues.apache.org/jira/browse/JCR-3588
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: clustering
>    Affects Versions: 2.4.3
>         Environment: CentOS 6.4 running WebSphere Application Server 7.0.0.19. Jackrabbit
cluster configuration with 2 WAS servers. Repository on DB2 9.7.
>            Reporter: Cody Burleson
>         Attachments: JackrabbitCluster-ResponseTime.png, Node1repository.xml, Node2repository.xml,
Screen Shot 2013-05-03 at 3.49.52 PM.png, Screen Shot 2013-05-03 at 4.04.32 PM.png, Screen
Shot 2013-05-03 at 4.14.52 PM.png
>
>
> In our performance analysis, we are seeing a strange effect that does not make sense
to us. It may or may not be a defect, but we need to understand why the effect occurs. In
a 2-node cluster, we can run a certain load (reading and writing) directly on Node1 and an
equivalent load (reading and writing) on Node2. We measure the response time on both nodes,
and it's less than 2 seconds. If we stop the load to one of the servers, the response time
on the other server triples (with no additional load). See attached image "JackrabbitCluster-ResponseTime.png".
The left side of the report shows the case where only one node (Node1) has load and Node2 has no load.
In this case, the response times on Node1 are about 6 seconds. Then, on the right side
of the report, we add an equivalent load to Node2, and the response times on Node1 drop
to 2 seconds. So the load on Node1 was constant throughout, yet ADDING load to Node2 actually
improves response time on Node1. Logically, it doesn't make much sense, eh? Someone, please,
at least help us understand why this may be happening.
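
For anyone trying to reproduce the comparison, here is a minimal sketch of how the two scenarios
could be measured and compared. It is not the harness used for the attached charts. The `request`
callable is a hypothetical stand-in for one read/write operation against Node1, and the 2.0 and 6.0
second values in the usage lines are illustrative figures rounded from the description above, not
measured data.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_response_times(request: Callable[[], None], samples: int) -> List[float]:
    """Call `request` `samples` times and return each call's duration in seconds.

    `request` is a hypothetical placeholder for one read/write against a node.
    """
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        request()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Return the mean, median and maximum of the measured durations."""
    return {
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def slowdown_factor(baseline: List[float], observed: List[float]) -> float:
    """Ratio of the observed median response time to the baseline median."""
    return statistics.median(observed) / statistics.median(baseline)


# Illustrative values only: about 2 s with both nodes loaded, about 6 s with Node2 idle.
both_nodes_loaded = [2.0, 2.0, 2.0]
only_node1_loaded = [6.0, 6.0, 6.0]
print(slowdown_factor(both_nodes_loaded, only_node1_loaded))
```

Running `measure_response_times` once with load on both nodes and once with load on Node1 only,
then passing the two result lists to `slowdown_factor`, gives a single number for the effect
described above.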

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
