jackrabbit-users mailing list archives

From "Alexandru Popescu" <the.mindstorm.mailingl...@gmail.com>
Subject Re: possible performance problem (need a way to test it)
Date Mon, 18 Sep 2006 23:51:31 GMT
I don't know the Jackrabbit source base, or the Lucene implementation,
well enough to comment with authority, but at first glance everything
seems okay. So it looks like this is a limitation of Lucene in highly
concurrent environments. Maybe somebody with better knowledge can
comment on this.
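
In case somebody wants to compare latencies at different thread counts
without TestNG, here is a minimal plain-Java sketch of the same idea as
the test quoted below. The class name, the measure/meanMillis helpers and
the placeholder task are all made up for illustration; the real DAO call
would have to be substituted for the placeholder, and nothing here is
Jackrabbit or Lucene API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentTimingSketch {

    /** Runs the task `invocations` times on `threads` workers; returns per-call latencies in ns. */
    static long[] measure(final Runnable task, int invocations, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Each submitted call times a single execution of the task.
            Callable<Long> timed = () -> {
                long start = System.nanoTime();
                task.run();
                return System.nanoTime() - start;
            };
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < invocations; i++) {
                futures.add(pool.submit(timed));
            }
            long[] latencies = new long[invocations];
            for (int i = 0; i < invocations; i++) {
                latencies[i] = futures.get(i).get(); // blocks until call i has finished
            }
            return latencies;
        } finally {
            pool.shutdown();
        }
    }

    /** Mean of the given nanosecond latencies, in milliseconds. */
    static double meanMillis(long[] nanos) {
        long sum = 0;
        for (long n : nanos) {
            sum += n;
        }
        return nanos.length == 0 ? 0.0 : (sum / (double) nanos.length) / 1_000_000.0;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical placeholder: replace the body with the real query call,
        // e.g. the findOrderedContentList(...) invocation from the test.
        Runnable placeholder = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int threads : new int[] {5, 50}) {
            long[] latencies = measure(placeholder, 100, threads);
            System.out.println(threads + " threads: mean " + meanMillis(latencies) + " ms");
        }
    }
}
```

If the mean per-call latency grows much faster than the thread count
once the real query is plugged in, that would at least be consistent
with contention somewhere in the query path, although it would not show
where.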

I have just one small remark about the code in the QueryImpl.execute
method: it looks like the ACLs are checked twice, once in this method
when the UUIDs are first read and a second time in the iterator when
the node is fetched. However, I don't think this has any real impact
on performance.

I am looking forward to any comments, opinions, or advice.

./alex
--
:Architect of InfoQ.com:
 .w( the_mindstorm )p.
Co-founder of InfoQ.com


On 9/19/06, Alexandru Popescu <the.mindstorm.mailinglist@gmail.com> wrote:
> Okay, I am now able to reproduce the problem using an environment as
> close as possible to the real one, with a TestNG test that runs the
> same invocation a hundred times in parallel threads.
>
> [code]
>     @Test(invocationCount=100, threadPoolSize=50)
>     public void fetchRssNews() {
>         m_rssContentDao.findOrderedContentList(0, 15, new Filtering());
>     }
> [/code]
>
> The test was used just to be able to profile the code, and the results
> I am getting are the following: most of the time is spent in these
> two calls:
>
> o.a.j.c.query.lucene.QueryHits.doc(int) ->
> o.a.lucene.search.Hits.doc(int) (8.046ms of 11.468ms)
>
> o.a.j.c.query.lucene.SearchIndex.executeQuery(QueryImpl, Query,
> QName[], boolean[]) -> o.a.lucene.search.Searcher.search(Query, Sort)
> (1.390ms of 11.468ms)
>
> The executed query is:
>
> /jcr:root/news/element(*,cmed:translatable)/* order by @cmed:timestamp
> descending
>
> and there are about 200 nodes under /news.
>
> Do you think there is something I can do to optimize this behavior
> before jumping to caching?
>
> I am getting the impression that if I read node by node and checked
> the properties myself I could get better performance, so I really
> think there is something I can do.
>
> Any help is highly appreciated,
>
> ./alex
> --
> :Architect of InfoQ.com:
>  .w( the_mindstorm )p.
> Co-founder of InfoQ.com
>
>
> > > > This was the first part. Now about the real problem I am seeing: when
> > > > the JCR repo is accessed from multiple concurrent threads (each using
> > > > its own Session) to run queries, we see a huge CPU load and
> > > > the response times grow very quickly:
> > > > - for 5 concurrent threads the query response times are around 200-500
> > > > ms; server load about 0.65-0.7
> > > > - for 100 concurrent threads the query response times are around
> > > > 150000-200000 ms; server load about 7-7.5
> > > >
> > > > As you can see these are very dangerous numbers, and I would
> > > > definitely like to figure out what the problem behind them is, because
> > > > in my application I can expect around 300 threads accessing the
> > > > repository concurrently.
> > > >
> > > > I know I can start looking at different options like caching and
> > > > similar ideas, but understanding the real problem first will help me
> > > > a lot.
> > > >
> > > > Many thanks for any helpful ideas and comments,
> > > >
> > > > ./alex
> > > > --
> > > > :Architect of InfoQ.com:
> > > >  .w( the_mindstorm )p.
> > > > Co-founder of InfoQ.com
> > > >
>
