polygene-dev mailing list archives

From Tasos Parisinos <tas...@projectbeagle.com>
Subject Re: [qi4j-dev] Using Qi4j as a skeleton framework in a high throughput, highly concurrent servlet deployment (and problems with race conditions)
Date Fri, 03 Apr 2015 18:51:08 GMT
Hi again

I would also like to add that if any of you Zest developers can suggest,
pinpoint or at least narrow down the pieces of code that implement the
problematic parts, it would be tremendously helpful to us, so that we can
refactor this code and try to propose a solution on our own.

For us, it is paramount to resolve this as fast as possible, so that we can
carry on implementing our core business code.

I would also like to take this occasion to ask about joining Zest's core
developer team. I'm a great fan of the framework, and we have based our
platform on it.

As CTO of projectbeagle I'm also very eager to contribute parts of our
implementation back to the Open Source Community as a Qi4j (well... Zest)
library, extension or tool!

Best Regards
Tasos Parisinos






On Fri, Apr 3, 2015 at 3:58 PM, Tasos Parisinos <tasosp@projectbeagle.com>
wrote:

> Hello Niclas and all
>
> I'll start from the bottom of your response and work my way up.
>
> First of all, thanks for your response, I appreciate it.
> Congratulations to the whole Qi4j team for becoming an ASF project,
> although I prefer the old name... Nevertheless it is a milestone for this
> awesome framework.
>
> About performance. We have been writing an availability query for a
> bed-bank. These queries are massive, working on tens of tables at once, on
> big data. So for us the question of throughput is not only code related.
> In the final picture we will be talking about massive clusters of
> databases and servlet containers that will be able to spit out 15,000 ACID
> transactions per second.
>
> For our prototyping phase, achieving 5000 of them running the full query
> on test/sample data on a single machine was a breakthrough on its own. And
> we haven't really started to push this system yet; so far we have only
> done code and basic system optimizations. This will grow.
>
> Oh by the way, we are www.projectbeagle.com, based in Greece.
>
> Our first attempt was to have a single Qi4j runtime and application PER
> request thread. Ours has become a non-trivial application with multiple
> services, lots of layers and modules, so assembling it takes time. We
> can't afford this on every request. That's why we moved all the assembly
> code to be executed at deployment time. All requests (all servicing
> threads) will use this single application to perform DI and composition.
> In the future, a secondary, contextual Qi4j application may be added to
> the picture.
>
> So, when we did that, throughput skyrocketed but race conditions started.
> Let me give you some examples. All are related to composition, with
> either value or transient builders and their factories. We don't use any
> kind of entity composites (we have Hibernate as ORM and we do persistence
> in a tricky way, but that is another story). All composites work fine once
> built; we have no problem with them.
>
> Here is a small code example from our project's QueryBuilder.
> QueryBuilder has multiple APIs (multiple interfaces) and each is
> implemented by a different Mixin (abstract classes). This is its Hibernate
> implementation. We also have a mock one. This is one of the QueryBuilder
> API methods; it creates a WHERE clause (field >= value) for an SQL query:
>
> @Override
> @Factory
> public <T> Clause ge(String field, T value)
> {
>    synchronized(selfContainer) {
>       ValueBuilder<Clause> builder = selfContainer.newValueBuilder(Clause.class);
>
>       builder.prototype()
>              .expression()
>              .set(Restrictions.ge(field, value));
>
>       return builder.newInstance();
>    }
> }
>
>
> These are called very, very often in the project. After all it is a query
> engine. Variable 'selfContainer' is injected as
>
> @Structure
> protected Module selfContainer;
>
>
> When we don't lock the builder factory (the module) the way we do above,
> we get all sorts of race conditions. For example, when newInstance() is
> called it can fail with a constraint violation exception saying that
> expression is not optional. But the call Restrictions.ge() can never
> return null. So by the time one thread comes to call newInstance(),
> another thread has already interfered with the builder factory. The
> builders themselves, as you see, are local variables (but they may not be
> truly local, depending on how they are implemented inside their factory).
>
> There are other ways it can fail, for example by saying that the builder
> can't find a proper fragment with a ge implementation. All these errors
> are so absurd for such simple code that they can only be race conditions.
> I will collect as many exception dumps of such errors as I can and send
> them to you as an attachment in a future message.
>
> When we synchronize in this fashion, the problems go away. But this has
> two basic caveats:
>
> 1. Performance penalty (obvious)
> 2. A Schrödinger's cat situation. We don't know if the problem went away
> because we synchronize, or because concurrency falls to such a degree that
> the probability of a race condition drops dramatically, only for it to
> reappear on production machines later on
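
One way to probe the second caveat is to run the unsynchronized call under
deliberately high contention and count how often it fails. The sketch below
is a hypothetical plain-Java harness: ContentionProbe and countFailures are
illustrative names, not part of Qi4j, and the ge(...) call would be passed in
as the task. A count of zero does not prove that no race exists, but a
non-zero count without the lock and zero with it, under the same load, at
least suggests that the lock itself is what matters.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical test harness for illustration; not part of Qi4j.
final class ContentionProbe {

    // Runs `task` on `threads` threads, `iterations` times each. All threads
    // are released together to maximise overlap. Returns how many calls threw.
    static int countFailures(int threads, int iterations, Callable<?> task) {
        AtomicInteger failures = new AtomicInteger();
        CountDownLatch ready = new CountDownLatch(threads);
        CountDownLatch go = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                ready.countDown();
                try {
                    go.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < iterations; i++) {
                    try {
                        task.call();
                    } catch (Exception e) {
                        failures.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        try {
            ready.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        go.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                // If the caller is interrupted the count may be partial.
                Thread.currentThread().interrupt();
            }
        }
        return failures.get();
    }

    public static void main(String[] args) {
        // Placeholder task; in practice this would wrap the call under test.
        int failed = countFailures(8, 1000, () -> "ok");
        System.out.println("failed calls: " + failed);
    }
}
```

Running the same probe once with and once without the synchronized block keeps
the load identical, so any difference in the failure count is not explained by
reduced concurrency alone.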
>
>
> Best regards
> Tasos Parisinos
>
> On Thu, Apr 2, 2015 at 11:03 AM, Niclas Hedhman <niclas@hedhman.org>
> wrote:
>
>>
>> The general "rule" is that Factories (i.e. implemented by Module
>> nowadays) should be thread safe, Builders are NOT thread-safe, and are
>> expected to be created at each use. Are you trying to re-use the Builders?
>> If not, i.e. you do newXyzBuilder() on each use, and you are seeing
>> threading issues, then that is bug(s) and I would love to get hold of the
>> details.
>>
>> ValueComposites -> thread-safe by definition, once created.
>>
>> EntityComposites -> MUST NOT be handed between threads, and are therefore
>> indirectly thread-safe.
>>
>> TransientComposites -> Internals are expected to be thread-safe, but
>> changes at 'user level' need to be taken care of.
>>
>> ServiceComposites -> Internals are expected to be thread-safe, but user
>> level might need care.
>>
>> ConfigurationComposites -> They are entities, and therefore inherit their
>> concurrency characteristics.
>>
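
The rule above (share the factory, create a fresh builder for each use) can be
illustrated without Qi4j. In this minimal sketch, Clause, ClauseBuilder and
ClauseFactory are hypothetical stand-ins rather than the framework's types,
and the factory is assumed to hold no mutable state. Each task asks the shared
factory for its own builder, so no builder is ever visible to two threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-ins for illustration only; none of these are Qi4j types.
final class BuilderPerUseSketch {

    // Immutable value: safe to share between threads once created.
    static final class Clause {
        final String expression;
        Clause(String expression) { this.expression = expression; }
    }

    // Mutable and NOT thread safe: must stay confined to the calling thread.
    static final class ClauseBuilder {
        private String expression;
        ClauseBuilder expression(String value) { this.expression = value; return this; }
        Clause newInstance() {
            if (expression == null) {
                throw new IllegalStateException("expression is not optional");
            }
            return new Clause(expression);
        }
    }

    // Holds no mutable state, so one instance can serve every thread.
    static final class ClauseFactory {
        ClauseBuilder newBuilder() { return new ClauseBuilder(); }
    }

    // Builds `count` clauses on a thread pool, one fresh builder per task.
    static List<String> buildConcurrently(int count) {
        ClauseFactory factory = new ClauseFactory();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
            List<Future<Clause>> pending = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final String expression = "field >= " + i;
                Callable<Clause> task =
                        () -> factory.newBuilder().expression(expression).newInstance();
                pending.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<Clause> future : pending) {
                results.add(future.get().expression);
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("clause construction failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(buildConcurrently(4));
    }
}
```

If a builder were stored in a field and reused across requests, two threads
could overwrite each other's expression before newInstance() runs, which is the
kind of failure the rule is meant to exclude.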
>>
>> Qi4j isn't really intended for being a speed demon, so 15000 tx/sec
>> sounds a bit too ambitious to me. Please report back what kind of numbers
>> you will eventually manage, even if it is not good enough for you.
>>
>> Niclas
>>
>> P.S. Qi4j has just been accepted into the Apache Software Foundation, and
>> will emerge as Apache Zest. dev@zest.apache.org is CC'd for that reason.
>>
>>
>> On Wed, Apr 1, 2015 at 10:50 PM, Tasos Parisinos <
>> tasosp@projectbeagle.com> wrote:
>>
>>> Thanks for your reply, Kent
>>>
>>> I agree with you that builder instances should be created, used and
>>> discarded inside a single request (a single thread from the servlet
>>> container pool). The builder factories though, like the application
>>> itself, should be shared across all request threads (in a synchronized
>>> manner) in order to avoid instantiating such an application PER thread,
>>> as this would greatly compromise performance. The use of putIfAbsent in
>>> that context seems to be correct. I'll give it a try and update you with
>>> the results
>>>
>>> On Wednesday, April 1, 2015 at 10:26:16 PM UTC+3, kent.soelvsten wrote:
>>>>
>>>>  I am not an expert so it might be the blind leading the deaf ......
>>>>
>>>> but I sense a potential problem with concurrent access to various
>>>> variants of ValueBuilderFactory#newValueBuilder and
>>>> TransientBuilderFactory#newTransientBuilder.
>>>> (the internal usage of ConcurrentHashMap inside TypeLookup - shouldn't
>>>> we use putIfAbsent?).
>>>>
>>>> So those would be good candidates for synchronization. If that solves
>>>> your problem I believe you might have found a bug - and a work-around.
>>>> ValueBuilder and TransientBuilder instances should probably be created,
>>>> used and discarded inside a single web request and not reused.
>>>>
>>>> /Kent
>>>>
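
The putIfAbsent question can be illustrated with a plain ConcurrentHashMap.
This sketch assumes a type-keyed cache; LookupCacheSketch is a hypothetical
stand-in and makes no claim about what TypeLookup actually does internally.
It only contrasts a check-then-put sequence, which is not atomic even on a
thread-safe map, with a single atomic putIfAbsent call.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a type-keyed lookup cache; not the real TypeLookup.
final class LookupCacheSketch {
    private final ConcurrentMap<Class<?>, Object> cache = new ConcurrentHashMap<>();

    // Check-then-act: two threads can both observe null, both create a model,
    // and each return its own instance, even though the map is thread safe.
    Object racyLookup(Class<?> type) {
        Object model = cache.get(type);
        if (model == null) {
            model = new Object();
            cache.put(type, model);
        }
        return model;
    }

    // putIfAbsent publishes atomically: every caller gets the winning instance.
    Object safeLookup(Class<?> type) {
        Object candidate = new Object();
        Object existing = cache.putIfAbsent(type, candidate);
        return existing != null ? existing : candidate;
    }

    public static void main(String[] args) {
        LookupCacheSketch lookup = new LookupCacheSketch();
        Object first = lookup.safeLookup(String.class);
        Object second = lookup.safeLookup(String.class);
        System.out.println("same instance: " + (first == second));
    }
}
```

The trade-off is that safeLookup may create a candidate that is then thrown
away; computeIfAbsent avoids that when creating the value is expensive.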
>>>>
>>>> On 01-04-2015 at 20:07, Tasos Parisinos wrote:
>>>>
>>>> Hi all
>>>>
>>>> Let me describe my problem. We have implemented a servlet (deployed
>>>> in Tomcat) that takes a REST request and, based on its query parameters,
>>>> builds and executes a single query (using Hibernate ORM) within a JTA
>>>> transaction (using Atomikos). The application specifics are not important;
>>>> what is important is that we need high throughput (15,000 trx/sec is our
>>>> objective).
>>>>
>>>> We have implemented all infrastructure code using Qi4j for COP and DI
>>>> as well as Property<T> data validation (constraint annotations). At
>>>> deployment time (in a separate thread) we assemble and activate two Qi4j
>>>> runtimes, each with a Qi4j application. The first is used only during
>>>> deployment, while the second is used in ALL threads that serve requests.
>>>> This second application starts various ServiceComposites while the
>>>> servlet deploys, for eager initialization (logger service, mapping
>>>> service, repository service, rest service, application services, domain
>>>> services, transaction service, token service to name only some). We
>>>> implement our Use Cases with a DCI design.
>>>>
>>>> These services and DCI code use various ValueBuilder<T> and
>>>> TransientBuilder<T> instances to do composition.
>>>>
>>>> The problem is:
>>>>
>>>> Because ALL request threads use the same Qi4j application, we have
>>>> various race conditions that are mainly associated with the various
>>>> builders. These race conditions appear when the servlet serves more than
>>>> 2000 trx/sec. Sacrificing some throughput we can synchronize shared
>>>> variables, but to minimize the performance impact we need to know:
>>>>
>>>> 1. What is the best practice for such cases?
>>>> 2. Which part of ValueBuilderFactory, ValueBuilder<T>,
>>>> TransientBuilderFactory, TransientBuilder<T> is best to synchronize?
>>>>
>>>> Thanks in advance
>>>>
>>>>  --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "qi4j-dev" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to qi4j-dev+u...@googlegroups.com.
>>>> To post to this group, send email to qi4j...@googlegroups.com.
>>>> Visit this group at http://groups.google.com/group/qi4j-dev.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Niclas Hedhman, Software Developer
>> http://www.qi4j.org - New Energy for Java
>>
>
>
