batchee-user mailing list archives

From Mark Struberg <strub...@yahoo.de>
Subject Re: TomEE mem leak using batchee with JTA transactions
Date Thu, 05 Mar 2015 09:35:14 GMT
> You didn't get it: if you use the same CDI bean in the front end and in the batch (instead of sharing the logic through an unscoped component), I bet the day will come when you use HttpServlet in it (a caricature, but it makes the point), and then your batch will be broken.

This has nothing to do with CDI and how you make your beans. If someone has a clean distinction
of backend services in some backend jars then all is fine. If someone packages a servlet-api
dependency into a jar which he uses in batches then it’s not the system which is to blame.
It’s rather like the old saying: „A fool with a tool is still a fool!“

> Didn't check but pretty sure it is wrong since it is against the spec.
Why should that be against the spec? JTA is set up, EJBs are set up. Why should CDI not be
set up? Who prevents the container from doing so? What section of the spec are you referring
to?
According to the CDI specification the @RequestScoped context MUST even be set up if you use
a CDI bean as batch artifact (batchlet, reader, writer, processor, …). So where in the batch
spec does this get overruled?

LieGrue,
strub


> On 05.03.2015 at 09:57, Romain Manni-Bucau <rmannibucau@gmail.com> wrote:
> 
> 
> 2015-03-05 9:38 GMT+01:00 Mark Struberg <struberg@yahoo.de>:
> > Issue reusing such shared beans is in long term it is broken
> > (you take the risk the logic moves to web side for instance)
> Why? It’s just a simple jar dependency and that’s it. If you don’t package it correctly
then it’s a clear user error.
> 
> 
> You didn't get it: if you use the same CDI bean in the front end and in the batch (instead of sharing the logic through an unscoped component), I bet the day will come when you use HttpServlet in it (a caricature, but it makes the point), and then your batch will be broken.
>  
> > you already break portability, if you care about it, since the ee6 module doesn't work with JBeret for instance
> Bullshit, in jberet you don’t need our ee-module as full EE integration is the default
behaviour over there. Actually it is the ONLY mode they know. In BatchEE we also support plain
SE and a mixture.
> With TomEE we probably also don’t need it as TomEE automatically finds the TransactionManager.
At least we have _much_ better ways to provide the same in TomEE. But it still works fine.
> WebSphere 8.0 and 8.5 have a reported PMR because they fail to give you a JTA-aware TransactionManager in the specified JNDI location: you will always only get a 'resource local' TransactionManager. But this has been open for a year and is still unfixed, so I gave up and built this portable workaround.
> 
> 
> Didn't check but pretty sure it is wrong since it is against the spec. I intentionally didn't want to provide CDI scopes in BatchEE since:
> 1) it is not in the spec
> 2) it is opposed to the EE concurrency utilities, which are likely the closest pool we could use
>  
> > It is better to change the code to aggregate the logic but use different beans (composition).
> absolutely no need. Where would you delegate to in your composition? Makes no sense to
me at all.
> 
> 
> I know cause you use this broken pattern but believe me you just rely on a workaround.
>  
> LieGrue,
> strub
> 
> 
> > On 05.03.2015 at 09:18, Romain Manni-Bucau <rmannibucau@gmail.com> wrote:
> >
> > The issue with reusing such shared beans is that it breaks in the long term (you take the risk that the logic moves to the web side, for instance). Besides, you already break portability, if you care about it, since the ee6 module doesn't work with JBeret, for instance.
> >
> > It is better to change the code to aggregate the logic but use different beans (composition).
> >
> > Finally, about step scope: it is useful when you rely on bean lookup (BeanProvider) but less so if you use direct injection.
> >
> > On 5 March 2015 at 09:02, "Mark Struberg" <struberg@yahoo.de> wrote:
> > For JSF projects I’d rather use @RequestScoped EntityManager like explained before.
> > Especially if you don’t already have all entities wrapped in DTOs.
> > The only thing you need to do is to replace
> >
> > @PersistenceContext
> > private EntityManager em;
> >
> > with
> >
> > @Inject
> > private EntityManager em;
> >
> > and then your producer method (which you have already as I’ve seen in your sample
code?) will get used.
> >
> >
> > And even if you have DTOs: dealing with all those DTOs is really not that easy in practice. Most people e.g. totally trash their locking…
> > If you have DTOs and are aware of all their pitfalls (direct JPA entity usage has its own set of brokenness as well, of course), then I’d go @Stateless.
> >
> > Be aware that you e.g. also need an @Stateless Facade for ALL your JSF backing beans
which invoke more than a single EJB backend call (otherwise you will end up with n different
transactions in a single page request -> rollback would be broken).
> >
> > LieGrue,
> > strub
> >
> >
> > > On 05.03.2015 at 08:51, Karl Kildén <karl.kilden@gmail.com> wrote:
> > >
> > > StepScoped does not seem very useful. JobScoped seems really great but it did
not work for me on my first try so I gave it up :)
> > >
> > > Thanks for explaining some more Mark. Would you keep the ee6 module for a tomee
- stateless - jsf project? Some batches and rest to obtain the data...
> > >
> > > On 5 March 2015 at 07:49, Mark Struberg <struberg@yahoo.de> wrote:
> > > Yes, the main benefit of the entitymanager-per-request is if you e.g. use JSF
and like to have lazy loading in your render-response phase.
> > > Or if you use JSP you touch entity methods in your tags or rendering without
having the whole page in a big transaction wrapper.
> > >
> > > If you use DTOs then it does not add much benefit. But be aware that dealing
with DTOs can be _very_ tricky. E.g. you do not always get the id and version when you build
your DTOs (but only at the time the flush to the db happens in JPA). So your application might
be plastered with em.flush() which can heavily slow down your app.
> > > You also have to manually do the optimistic locking check and ACTIVELY maintain
the version in your DTOs (which sometimes is pure pain).
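
To illustrate the manual version check described above, here is a minimal plain-Java sketch. It is only an illustration under stated assumptions: `CustomerRecord`, `CustomerDto` and `applyUpdate` are hypothetical names invented for this example, not part of JPA, DeltaSpike or BatchEE. On managed entities JPA performs this check itself for a `@Version` field; with DTOs the application has to carry the version around and compare it by hand.

```java
import java.util.ConcurrentModificationException;

public class DtoVersionCheck {

    /** Hypothetical stand-in for a persisted row that has a version column. */
    static final class CustomerRecord {
        long version;
        String name;

        CustomerRecord(long version, String name) {
            this.version = version;
            this.name = name;
        }
    }

    /** Hypothetical DTO: it must carry the version it was read with. */
    static final class CustomerDto {
        final long version;
        final String name;

        CustomerDto(long version, String name) {
            this.version = version;
            this.name = name;
        }
    }

    /** Applies the DTO only if the record was not changed since the DTO was built. */
    static void applyUpdate(CustomerRecord record, CustomerDto dto) {
        if (record.version != dto.version) {
            throw new ConcurrentModificationException(
                    "stale DTO: record is at version " + record.version
                            + " but DTO was read at version " + dto.version);
        }
        record.name = dto.name;
        record.version++; // mirrors the increment JPA applies to a @Version field
    }

    public static void main(String[] args) {
        CustomerRecord record = new CustomerRecord(1, "old");
        applyUpdate(record, new CustomerDto(1, "new")); // accepted
        try {
            applyUpdate(record, new CustomerDto(1, "late")); // stale, rejected
        } catch (ConcurrentModificationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The second update fails because the DTO still carries the version it was read with, which is the bookkeeping the message above calls "pure pain" when it has to be maintained across every DTO.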
> > >
> > > Otoh it nicely integrates with EJB, CDI and batches.
> > >
> > > LieGrue,
> > > strub
> > >
> > >
> > >
> > >
> > > > > On 04.03.2015 at 22:06, Karl Kildén <karl.kilden@gmail.com> wrote:
> > > >
> > > > OK. So I am not sure if I should use that module or not :)
> > > >
> > > > I will remove requestscoped first thing tomorrow. The thing is I was thinking
about doing the thing where you inject the EntityManagerFactory instead and manually produce
entitymanager and give them requestscoped so the entitymanager would live for a jsf request.
But then again we are already used to the entities getting detached when leaving stateless
so I will probably never introduce it
> > > >
> > > > On 4 March 2015 at 17:07, Romain Manni-Bucau <rmannibucau@gmail.com>
wrote:
> > > > batchee-ee6 is the one: it uses an EJB, so the request scope is implicitly started. Remove it and you should get a context-not-active exception. jbatch just uses a plain old ThreadPoolExecutor.
> > > >
> > > >
> > > > Romain Manni-Bucau
> > > > @rmannibucau |  Blog | Github | LinkedIn | Tomitriber
> > > >
> > > > 2015-03-04 16:47 GMT+01:00 Karl Kildén <karl.kilden@gmail.com>:
> > > > Well I have it @RequestScoped and @PersistenceContext because of a mistake
and it works everywhere including stateless and Jbatch and I do no tricks. I will however
remove it and try again because it does not make sense.
> > > >
> > > > I copied my dependencies from Caroline or something:
> > > >
> > > >               <dependency>
> > > >                       <groupId>org.apache.geronimo.specs</groupId>
> > > >                       <artifactId>geronimo-jbatch_1.0_spec</artifactId>
> > > >                       <version>${jbatch-api.version}</version>
> > > >                       <scope>compile</scope>
> > > >                       <!-- this JSR spec API is not yet provided in our EE6 containers -->
> > > >               </dependency>
> > > >
> > > >               <dependency>
> > > >                       <groupId>org.apache.batchee</groupId>
> > > >                       <artifactId>batchee-jbatch</artifactId>
> > > >                       <version>${batchee.version}</version>
> > > >               </dependency>
> > > >               <dependency>
> > > >                       <groupId>org.apache.batchee</groupId>
> > > >                       <artifactId>batchee-extras</artifactId>
> > > >                       <version>${batchee.version}</version>
> > > >               </dependency>
> > > >               <dependency>
> > > >                       <groupId>org.apache.batchee</groupId>
> > > >                       <artifactId>batchee-jsefa</artifactId>
> > > >                       <version>${batchee.version}</version>
> > > >               </dependency>
> > > >               <dependency>
> > > >                       <groupId>org.apache.batchee</groupId>
> > > >                       <artifactId>batchee-cdi</artifactId>
> > > >                       <version>${batchee.version}</version>
> > > >               </dependency>
> > > >               <dependency>
> > > >                       <groupId>org.apache.batchee</groupId>
> > > >                       <artifactId>batchee-ee6</artifactId>
> > > >                       <version>${batchee.version}</version>
> > > >               </dependency>
> > > >
> > > >
> > > >
> > > > Does that mean I have that extra module that uses stateless instead activated
or not? Would be good to know how the batch threads are started...
> > > >
> > > > On 4 March 2015 at 13:23, Romain Manni-Bucau <rmannibucau@gmail.com>
wrote:
> > > > @Mark: this has no link with EE 6 or 7, this is just a feature you want, which is fine. JBatch doesn't deal with request scoped at all, for instance. That said, for batches we have @JobScoped and @StepScoped, which are still experimental in batchee-cdi but can be more suitable. I know you are used to it, but I find it nonsensical to name something "request scoped" when it is not bound to any HTTP request; but that's another topic ;)
> > > >
> > > >
> > > > Romain Manni-Bucau
> > > > @rmannibucau |  Blog | Github | LinkedIn | Tomitriber
> > > >
> > > > 2015-03-04 13:11 GMT+01:00 Mark Struberg <struberg@yahoo.de>:
> > > > I did not read the full thread, but @Stateless and a @RequestScoped EntityManager
doesn’t make sense.
> > > > @Stateless basically _only_ works well with @PersistenceContext. If you use DeltaSpike JPA then I’d rather use @ApplicationScoped + @Transactional (from DeltaSpike, not the half-broken one from EE7).
> > > >
> > > >
> > > > The EE support module btw is not just for WAS - it’s for all environments which support EE but not yet EE7. The point is that by wrapping the creation of new threads in an @Asynchronous EJB call you get all the ThreadLocals set up for free. And it’s even needed on some EE7 containers, as the concurrency-utils spec doesn’t define that the context for @RequestScoped needs to get started. Some containers do it, others don’t…
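
The idea that a wrapper around each task can set up and tear down per-thread state can be sketched in plain Java. This is only an analogy under stated assumptions, not how BatchEE, the ee6 module or any container implements it: `CONTEXT` and `withContext` are hypothetical names, and a real container manages far more than a single `ThreadLocal`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ContextPerTask {

    /** Hypothetical stand-in for container state that is kept per thread. */
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /** Wraps a task so the context is set before it runs and removed afterwards. */
    static <T> Callable<T> withContext(String name, Callable<T> task) {
        return () -> {
            CONTEXT.set(name);
            try {
                return task.call();
            } finally {
                // Without this, the pooled thread keeps referencing the value
                // after the task has finished.
                CONTEXT.remove();
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Callable<String> readContext = CONTEXT::get;
            // The wrapped task sees the context that was set up for it.
            String seen = pool.submit(withContext("job-1", readContext)).get();
            // A later task on the same pooled thread sees no leftover state.
            String after = pool.submit(readContext).get();
            System.out.println(seen + " / " + after);
        } finally {
            pool.shutdown();
        }
    }
}
```

The `finally` block is the part that matters for long-lived pool threads: state that is set but never removed stays reachable for as long as the thread lives.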
> > > >
> > > > LieGrue,
> > > > strub
> > > >
> > > >
> > > >
> > > > > > On 02.03.2015 at 22:46, Romain Manni-Bucau <rmannibucau@gmail.com> wrote:
> > > > >
> > > > > It depends on your configuration for both, but a thread dump will tell you in 2s
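
A thread dump is usually taken with the JDK's `jstack <pid>` tool, but the same information can be collected from inside the JVM. The following is a minimal sketch using only the standard `Thread.getAllStackTraces()` call; the class name `ThreadDump` and the output layout are choices made for this example.

```java
import java.util.Map;

public class ThreadDump {

    /** Builds a textual dump of every live thread and its current stack. */
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Threads that are blocked waiting for a pooled resource, such as a database connection, typically show up parked inside the pool's acquire call, which is why the dump answers the question quickly.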
> > > > >
> > > > > On 2 March 2015 at 22:35, <karl.kilden@gmail.com> wrote:
> > > > > Hrmm. Probably not. But maybe, I would expect a clear error message
though? Maybe some other pool like stateless? Or will it get tired of waiting and throw?
> > > > >
> > > > > Sent from my iPhone
> > > > >
> > > > > On 2 Mar 2015 at 21:58, Romain Manni-Bucau <rmannibucau@gmail.com> wrote:
> > > > >
> > > > >> Full db connection pool?
> > > > >>
> > > > > >> On 2 March 2015 at 21:04, "Karl Kildén" <karl.kilden@gmail.com> wrote:
> > > > >> Hi Romain, I removed all @Async usage and now it's the request
thread that hangs :D
> > > > >>
> > > > >> Actually when I dump the thread it seems to work forever being
here and there inside Eclipselink internals. Wonder if I triggered some kind of endless loop.
It looks like it because my heap is going way up and down and I am the only one using the
app and whatever task I started should be done aaaages ago.
> > > > >>
> > > > >> Big help getting my attention away from batch and async :-)
> > > > >>
> > > > >> I will keep analyzing. If it's not local to my app I will try
to reproduce it in a sample (but it's always quite hard to do that :/)
> > > > >>
> > > > >> thanks again
> > > > >>
> > > > >>
> > > > >>
> > > > >> On 2 March 2015 at 20:07, Romain Manni-Bucau <rmannibucau@gmail.com>
wrote:
> > > > >> yes surely
> > > > >>
> > > > >> if you can put some effort to create a github project it can
really help since we'll identify the issue really faster (and where it comes from ;))
> > > > >>
> > > > >>
> > > > >> Romain Manni-Bucau
> > > > >> @rmannibucau |  Blog | Github | LinkedIn | Tomitriber
> > > > >>
> > > > >> 2015-03-02 19:32 GMT+01:00 Karl Kildén <karl.kilden@gmail.com>:
> > > > > >> Romain, you are right, I am too tired now... Maybe I am quite stupid for putting @RequestScoped on it, since that is how I used to do it when I did Tomcat. It should not even do anything when I think about it.
> > > > >>
> > > > >> This problem seems very related to how I use @Async. Maybe I
should move my topic with a new mail to tomee list?
> > > > >>
> > > > >> On 2 March 2015 at 19:27, Romain Manni-Bucau <rmannibucau@gmail.com>
wrote:
> > > > >> well
> > > > >>
> > > > > >> DeltaSpike Data doesn't want @RequestScoped, it just uses the contextual entity manager - this comes from what JSF guys do AFAIK.
> > > > > >>
> > > > > >> I wonder if you could reproduce it with OpenJPA, or if it is due to EclipseLink itself storing some state somewhere. Any idea?
> > > > >>
> > > > >>
> > > > >>
> > > > >> Romain Manni-Bucau
> > > > >> @rmannibucau |  Blog | Github | LinkedIn | Tomitriber
> > > > >>
> > > > >> 2015-03-02 19:13 GMT+01:00 Karl Kildén <karl.kilden@gmail.com>:
> > > > >> Romain,
> > > > >>
> > > > >> Deltaspike Data wants a @RequestScoped entityManager. If I want
to use Data module from my batches, how to combine that?
> > > > >>
> > > > >> Also, this whole problem seems linked to @Async not batch (I
thought batch was implemented with @Async)
> > > > >>
> > > > >> On 2 March 2015 at 18:50, Romain Manni-Bucau <rmannibucau@gmail.com>
wrote:
> > > > > >> The BatchEE default impl shouldn't be @Async, except if you imported the module Mark added for WAS - but your thread naming is closer to TomEE ;).
> > > > > >>
> > > > > >> Batches are asynchronous by design, so there is no need for @Async to launch them.
> > > > > >>
> > > > > >> Then it all depends on your @RequestScoped. If it matches nothing the container handles (HTTP request or synchronous EJB call) then you should handle it yourself, but that sounds like a workaround more than a fix, which would be using a correct scope.
> > > > >>
> > > > >>
> > > > >>
> > > > >> Romain Manni-Bucau
> > > > >> @rmannibucau |  Blog | Github | LinkedIn | Tomitriber
> > > > >>
> > > > >> 2015-03-02 18:44 GMT+01:00 Karl Kildén <karl.kilden@gmail.com>:
> > > > >> I was wrong - this problem is in many other places not just batches!
> > > > >>
> > > > >> regarding batch:
> > > > >>
> > > > > >> Interesting, I have not done anything (that I know of) to enable request scope...
> > > > >>
> > > > >> I thought Mark once told me that the impl in batchee for creating
threads is actually @Asynchronous. I also kind of recall not getting any extra threads in
my batchee jobs until I increased the @Async thread pool.
> > > > >>
> > > > >> I do use @Async myself also here and there... In fact I think
in one or two cases Asynchronous will start the batch. I use <class>org.apache.deltaspike.jpa.impl.transaction.EnvironmentAwareTransactionStrategy</class>
> > > > >>
> > > > >> Then I use this producer:
> > > > >>
> > > > >>      @PersistenceContext(unitName = APP_NAME)
> > > > >>      private EntityManager entityManager;
> > > > >>
> > > > >>      @Produces
> > > > >>      @RequestScoped
> > > > >>      protected EntityManager createEntityManager() {
> > > > >>              return this.entityManager;
> > > > >>      }
> > > > >>
> > > > >>
> > > > >>
> > > > >> And a normal stateless that uses either the entityManager or
a repository from deltaspike data (actually almost always the repository). This is the only
way I produce entityManagers.
> > > > >>
> > > > >>
> > > > >> Anyways my problem seems to be also in JSF @ViewScoped beans
and whatnot. Can it be that I must dispose my entitymanagers myself somehow?
> > > > >>
> > > > >>
> > > > >>
> > > > >> On 2 March 2015 at 18:15, Romain Manni-Bucau <rmannibucau@gmail.com>
wrote:
> > > > >> Hmm
> > > > >>
> > > > > >> For a batch this code doesn't mean anything - request scope. Did you hack something around DeltaSpike to make it work?
> > > > > >>
> > > > > >> If this entity manager is used in an EJB this should be fine; if not, then you need to ensure transactions are handled as you expect - that should be the case with BatchEE, but it doesn't cost anything to validate it.
> > > > > >>
> > > > > >> Finally, do you use @Asynchronous in your code? Otherwise you shouldn't see it.
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >> Romain Manni-Bucau
> > > > >> @rmannibucau |  Blog | Github | LinkedIn | Tomitriber
> > > > >>
> > > > >> 2015-03-02 18:10 GMT+01:00 Karl Kildén <karl.kilden@gmail.com>:
> > > > >> Hello,
> > > > >>
> > > > >> I have some @Stateless that I use from batches. After the job
has finished I can see after a heap dump that the async thread seems to keep a reference to
the RepeatableWriteUnitOfWork. When I google I understand that this is the EclipseLink entitymanager
and since nobody seems to have called clear on it my heap is getting pretty full...
> > > > >>
> > > > > >> I have defined my batches with the normal read, process, write steps. They are @Named and simply inject my @Stateless. The @Stateless uses an EntityManager, which is produced like this:
> > > > >>
> > > > >>      @PersistenceContext(unitName = APP_NAME)
> > > > >>      private EntityManager entityManager;
> > > > >>
> > > > >>      @Produces
> > > > >>      @RequestScoped
> > > > >>      protected EntityManager createEntityManager() {
> > > > >>              return this.entityManager;
> > > > >>      }
> > > > >>
> > > > >>
> > > > >> Not sure if I am missing some kind of disposal here?  I don't
think so because only the jobs get the UnitOfWork stuck on the heap.
> > > > >>
> > > > > >> Not sure I understand any of this very well. I can just clearly see that my entire heap is now RepeatableWriteUnitOfWork tied to @Asynchronous threads.
> > > > > >>
> > > > > >> My memory dump could of course be sent to someone, or I could share my desktop, if someone wants to help me understand this... Or maybe a pointer on where to debug?
> > > > >>
> > > > >> cheers
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> >

