roller-dev mailing list archives

From "(David) Ming Xia" <david.ming....@ibol.biz>
Subject About weblog view data access
Date Thu, 27 May 2010 00:30:45 GMT




Hi, Dave.

    



  This is still about weblog view data access.
   The handles listed in the Roller property
rendering.weblogMapper.rollerProtectedUrls are all for the user account
console. They do not appear in user-created websites, so they are not a
concern.
What concerns us are requests with the URI pattern
'/roller-ui/rendering/resources', which are specified in theme.xml as
<resource/> elements.   WeblogRequestMapper
validates the handle of an incoming request for a web page's text/html content
and then validates the handle of each further request the browser client sends
as it follows the URL links in that text/html content.  The validating
function is WeblogRequestMapper.isWeblog(String potentialHandle).

 

  Take an example: for a web page that has ten
links to CSS, JS and image files, we get one request for the page and then ten
more for the linked files, eleven requests in total.  For each request Roller
will do the following things:

 

  1. Retrieve a connection instance from the connection pool, or create a
     new JDBC connection.
  2. Retrieve the prepared statement from the server statement cache, or
     create a prepared statement for the named query.
  3. Set the parameter 'handle' and execute the SQL query.
  4. Get all the data for the specified weblog; this includes instances of
     the root category and categories.
  5. Recycle the connection, or close and discard it for GC.
  6. Create a new weblog object and populate data to this object.
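
The per-request cost described above can be sketched in a few lines of Java. This is only an illustration under assumptions, not Roller's code: CountingWeblogStore and UncachedMapperDemo are hypothetical names, an in-memory set stands in for the database, and the request paths are invented. Each call to weblogExists() represents one full round trip through the steps listed above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the database-backed handle lookup.
class CountingWeblogStore {
    private final Set<String> handles = new HashSet<>();
    private int queries = 0;

    CountingWeblogStore(String... existingHandles) {
        for (String handle : existingHandles) {
            handles.add(handle);
        }
    }

    // One call models one round trip: connection, statement, query,
    // object population, connection release.
    boolean weblogExists(String handle) {
        queries++;
        return handles.contains(handle);
    }

    int queryCount() {
        return queries;
    }
}

class UncachedMapperDemo {
    // Validates the first path segment of every request against the store
    // and returns how many requests named an existing weblog.
    static int validateAll(CountingWeblogStore store, List<String> paths) {
        int valid = 0;
        for (String path : paths) {
            String[] parts = path.split("/");
            String handle = parts.length > 1 ? parts[1] : "";
            if (store.weblogExists(handle)) {
                valid++;
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        CountingWeblogStore store = new CountingWeblogStore("myblog");
        List<String> paths = new ArrayList<>();
        paths.add("/myblog/");                      // the text/html page
        for (int i = 0; i < 10; i++) {              // ten linked files (invented paths)
            paths.add("/myblog/resource/file" + i + ".css");
        }
        int valid = validateAll(store, paths);
        System.out.println(valid + " valid requests, "
                + store.queryCount() + " lookups");
    }
}
```

With no cache in front of the store, the number of lookups grows one for one with the number of requests, which is the point of the example.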

 

   So in this
example, for one web page view Roller consumes eleven JDBC connection
instances and creates eleven weblog objects just to check whether the weblog
exists.  If some websites on
Roller take a high volume of HTTP requests, the Roller database could easily be
overwhelmed and end up deadlocked.
With all the later incoming requests waiting in line, memory usage will
hit the ceiling.   The database is now
the single point of failure.
Without the database there to validate the weblog handle for each request
and Last-Modified for each text/html request, we are going to see a blank white
page that goes nowhere.  I believe
this is highly possible.  Looking at the
technical parameters and typical usage of database servers, it is clear that
database servers are not designed for the kind of task Roller is giving them
now, validating each HTTP request.

 

 

    I would suggest that a cache be used for weblog page
views.  Put simply, Roller should have
a cache for weblogs and weblog entries.
Roller users manage their accounts, persist changes to the database and
update the cache with those changes.   Roller
users' passwords are not cached, for security reasons.  Roller viewers retrieve
web content; everything they see comes from the cache, and
they should never touch the database.  Things
like referrer addresses or hit counts will be cached and persisted to the
database at server shutdown, or at an administrator's command.
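
A minimal sketch of this idea, under stated assumptions: WeblogHandleCache and HitCounter are hypothetical names, not existing Roller classes, and the loader and sink arguments stand in for whatever database access the application already has. The sketch loads on a miss rather than preloading everything, which is a simplification of "viewers never touch the database". It is also JVM-local, so it shows the read and write paths only, not distribution across servers.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical cache keyed by weblog handle; V stands for whatever
// weblog representation the application uses.
class WeblogHandleCache<V> {
    private final Map<String, Optional<V>> entries = new ConcurrentHashMap<>();
    private final Function<String, Optional<V>> loader;

    // loader is the slow path (for example a database query), used on a miss.
    WeblogHandleCache(Function<String, Optional<V>> loader) {
        this.loader = loader;
    }

    // Viewer path: answered from the cache; the loader runs once per handle.
    // Unknown handles are remembered as Optional.empty() until invalidated.
    Optional<V> lookup(String handle) {
        return entries.computeIfAbsent(handle, loader);
    }

    // Author path: call after the change has been persisted to the database.
    void update(String handle, V weblog) {
        entries.put(handle, Optional.of(weblog));
    }

    void invalidate(String handle) {
        entries.remove(handle);
    }
}

// Hypothetical in-memory hit counter that is written out in bulk.
class HitCounter {
    private final Map<String, LongAdder> hits = new ConcurrentHashMap<>();

    void record(String handle) {
        hits.computeIfAbsent(handle, h -> new LongAdder()).increment();
    }

    // Hands each accumulated count to the sink (for example a batch UPDATE)
    // and resets it; intended for shutdown or an administrator's command.
    void flush(BiConsumer<String, Long> sink) {
        hits.forEach((handle, adder) -> {
            long count = adder.sumThenReset();
            if (count > 0) {
                sink.accept(handle, count);
            }
        });
    }
}
```

Remembering unknown handles avoids repeated queries for the same bad URL, but a real implementation would need to bound that part of the cache, since the set of possible bad handles is unlimited.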

 

 

   The current caching system does not fit the task I described.  Current Roller caches
are just local hash
maps or hash tables; they are not distributed. There is no synchronization of
weblog content, especially the 'Last-Modified' value, across multiple server threads.
Yet nowadays most production environments
are clustered, composed of multiple JVMs and application server
runtimes.
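
To illustrate why a JVM-local map is a problem in a cluster, here is a small sketch. LocalNode is a hypothetical model of one server with its own cache of Last-Modified timestamps; it is not how Roller stores these values, and the timestamps are arbitrary numbers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of one server node with a JVM-local cache.
class LocalNode {
    private final Map<String, Long> lastModified = new ConcurrentHashMap<>();

    // Called when this node learns that a weblog changed.
    void recordUpdate(String handle, long timestampMillis) {
        lastModified.put(handle, timestampMillis);
    }

    // Conditional GET check: true means this node would answer
    // "304 Not Modified" for the given If-Modified-Since value.
    boolean notModifiedSince(String handle, long ifModifiedSinceMillis) {
        Long cached = lastModified.get(handle);
        return cached != null && cached <= ifModifiedSinceMillis;
    }
}

class StaleCacheDemo {
    public static void main(String[] args) {
        LocalNode nodeA = new LocalNode();
        LocalNode nodeB = new LocalNode();
        nodeA.recordUpdate("myblog", 1000L);
        nodeB.recordUpdate("myblog", 1000L);

        // The author posts through node A only; node B never hears of it.
        nodeA.recordUpdate("myblog", 2000L);

        long ifModifiedSince = 1500L;
        System.out.println("node A says not modified: "
                + nodeA.notModifiedSince("myblog", ifModifiedSince));
        System.out.println("node B says not modified: "
                + nodeB.notModifiedSince("myblog", ifModifiedSince));
    }
}
```

The two nodes disagree about the same weblog, so a viewer routed to node B keeps seeing the old page until that node's cache is refreshed by some other means.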

 

I learned that Ehcache supports a distributed map.  I know that a WebSphere cache instance
implements the IBM distributed map.  The
best solution for Roller is an interface to a third-party distributed cache
accessed with a JNDI lookup; failing that, Roller bundled with Ehcache would
also be very good.
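
A rough sketch of the JNDI idea, with assumptions labeled: CacheLocator and the JNDI name are hypothetical, and the sketch assumes the object bound by the application server implements java.util.Map. When nothing usable is bound, it falls back to a local ConcurrentHashMap, where a bundled cache library could be plugged in instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.naming.InitialContext;
import javax.naming.NamingException;

class CacheLocator {
    // Hypothetical JNDI name; a real deployment would choose its own.
    static final String DEFAULT_JNDI_NAME = "java:comp/env/cache/rollerWeblogCache";

    // Returns the cache bound under jndiName if it is a Map,
    // otherwise a JVM-local fallback.
    @SuppressWarnings("unchecked")
    static Map<String, Object> locate(String jndiName) {
        try {
            Object found = new InitialContext().lookup(jndiName);
            if (found instanceof Map) {
                return (Map<String, Object>) found;
            }
        } catch (NamingException e) {
            // No JNDI provider, or nothing bound under that name:
            // fall through to the local fallback below.
        }
        return new ConcurrentHashMap<>();
    }

    public static void main(String[] args) {
        Map<String, Object> cache = locate(DEFAULT_JNDI_NAME);
        cache.put("myblog", "cached weblog data");
        System.out.println("cache implementation: " + cache.getClass().getName());
    }
}
```

Coding against java.util.Map keeps the rest of the application unaware of which cache product sits behind the lookup.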


Thank you.




David


--- On Wed, 5/26/10, Dave <snoopdave@gmail.com> wrote:

From: Dave <snoopdave@gmail.com>
Subject: Re: Roller's implementation on conditional Get
To: user@roller.apache.org, david.ming.xia@ibol.biz
Date: Wednesday, May 26, 2010, 7:59 AM

On Wed, May 26, 2010 at 12:11 AM, (David) Ming Xia
<david.ming.xia@ibol.biz> wrote:
>    I took a look into it and I found another place that has very intensive
> database queries.
>
>    RequestMappingFilter.doFilter() --> WeblogRequestMapper.handleRequest().
>
>   RequestMappingFilter's URL mapping is /*, so it checks every HTTP request.
>
>   WeblogRequestMapper.handleRequest() verifies ALL requests, I mean, including
> those for css, js and image files, with named JPA queries.
>
>
>   Actually, both PageServlet and RequestMappingFilter query the weblog by
> handle. It looks like the database is used as a hashtable in these two
> functions, while a database is usually used for account data transactions
> and relational data management.
>
>   Now for each web page request there are at least 'eleven' database queries,
> one for the text/html content in PageServlet and ten requests in the mapping
> filter for everything including the text/html.
>
>   I feel that there could be even more database wires, since many people work
> on Roller and everyone tends to add some more wires.
>
>    It seems that there should be a top-down design solution for this issue.
>
>     I would like to hear something from you.

Hi David,

You are correct, WeblogRequestMapper is invoked on every request, but
does nothing when it encounters URLs that begin with these patterns:

   rendering.weblogMapper.rollerProtectedUrls=\
   roller-ui,images,theme,themes,CommentAuthenticatorServlet,\
   index.jsp,favicon.ico,robots.txt,\
   page,flavor,rss,atom,language,search,comments,rsd,resource,xmlrpc,planetrss

It ignores static theme resources (images, CSS, JS, etc.) and
everything else that is not dynamically generated by a weblog page
template. Perhaps the problem is not quite as bad as you think.
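
As an illustration only, a check of this kind could look like the sketch below. ProtectedUrlCheck is a hypothetical class, not the actual WeblogRequestMapper code, and it assumes the match is made on the first path segment of the request.

```java
import java.util.HashSet;
import java.util.Set;

class ProtectedUrlCheck {
    private final Set<String> protectedNames = new HashSet<>();

    // Parses a comma-separated value such as the property quoted above.
    ProtectedUrlCheck(String commaSeparated) {
        for (String name : commaSeparated.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                protectedNames.add(trimmed);
            }
        }
    }

    // Returns true when the first path segment is a protected name,
    // meaning no weblog lookup is needed for this request.
    boolean isProtected(String servletPath) {
        String path = servletPath.startsWith("/")
                ? servletPath.substring(1) : servletPath;
        int slash = path.indexOf('/');
        String first = slash >= 0 ? path.substring(0, slash) : path;
        return protectedNames.contains(first);
    }

    public static void main(String[] args) {
        ProtectedUrlCheck check = new ProtectedUrlCheck(
                "roller-ui,images,theme,themes,CommentAuthenticatorServlet,"
                + "index.jsp,favicon.ico,robots.txt,"
                + "page,flavor,rss,atom,language,search,comments,rsd,resource,"
                + "xmlrpc,planetrss");
        System.out.println(check.isProtected("/roller-ui/rendering/resources/x.css"));
        System.out.println(check.isProtected("/myblog/entry/hello"));
    }
}
```

Under that assumption, only requests whose first segment is not in the list would reach the database lookup for a weblog handle.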

There have not been that many people working on Roller and the ones
that have worked on the code have been pretty disciplined about when
database calls are made. But of course, even disciplined developers
make mistakes. I'm sure there is much room for improvement and I
encourage you to continue your research into performance bottlenecks.

If you have a proposal for a top-down solution, or some patches to
improve things -- I'd be happy to review them or even commit them for
you if they look good.

- Dave
