cayenne-user mailing list archives

From "Juergen Saar" <juer...@jsaar.org>
Subject Re: Batch processing with large data sets
Date Thu, 03 Aug 2006 06:37:16 GMT
Hi Nikolai,

Please keep me informed ...
I've tried a lot, but in every case the reading DataContext ran out of
memory once the tables held enough data.

The problem with this root context is that it is not possible to throw it
out and continue with a new one ...

--- Juergen ---

2006/8/3, Nikolai Raitsev <nikolai.raitsev@gmail.com>:
>
> Mike, you are very fast with answering :)
>
> Thank you!! I will try it out tomorrow
>
> Nikolai
>
> 2006/8/2, Mike Kienenberger <mkienenb@gmail.com>:
> >
> > I'm not quite answering your specific question, but one of the things
> > people have done in the past is to throw out the old Cayenne DataContext
> > periodically (every N objects written out) and create a new one.
> >
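> > For example, a rough sketch of that pattern (untested; assuming the
> > Cayenne 1.2 API, classes from org.objectstyle.cayenne.access, and a
> > made-up batch size of 1000):
> >
> > int batchSize = 1000; // hypothetical value, tune to taste
> > DataContext context = DataContext.createDataContext();
> > for (int i = 0; i < nSize; i++) {
> >     // ... create or modify objects registered with "context" ...
> >     if ((i + 1) % batchSize == 0) {
> >         context.commitChanges(); // flush this batch to the database
> >         // replace the context so its ObjectStore (and everything
> >         // cached in it) becomes garbage-collectable
> >         context = DataContext.createDataContext();
> >     }
> > }
> > context.commitChanges(); // commit whatever is left over
> >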
> > Also, you can use performIteratedQuery instead of performQuery to fetch
> > only a limited number of records at a time for processing.
> >
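> > A rough sketch of the iterated read (again untested, assuming the
> > Cayenne 1.2 API; performIteratedQuery, nextDataRow and close all throw
> > CayenneException, handling omitted here). The ResultIterator keeps a
> > database connection open, so it must be closed in a finally block:
> >
> > SelectQuery query = new SelectQuery(InterfaceObj.class);
> > ResultIterator rows = dataContext.performIteratedQuery(query);
> > try {
> >     while (rows.hasNextRow()) {
> >         // rows are streamed one at a time as raw DataRows ...
> >         DataRow row = (DataRow) rows.nextDataRow();
> >         // ... and turned into objects only when needed
> >         InterfaceObj src = (InterfaceObj)
> >                 dataContext.objectFromDataRow(InterfaceObj.class, row, false);
> >         // process "src", e.g. copy its values to a BaseObj
> >     }
> > }
> > finally {
> >     rows.close(); // releases the underlying connection
> > }
> >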
> > On 8/2/06, Nikolai Raitsev <nikolai.raitsev@gmail.com> wrote:
> > > Hello all, I hope this is my last question... :)
> > >
> > > Example:
> > >
> > > I would like to perform a copy from table 1 ("InterfaceTable") to
> > > table 2 ("BaseTable").
> > >
> > > On both tables I have defined business objects (with Cayenne Modeler:
> > > InterfaceObj and BaseObj).
> > >
> > > BaseObj has plausibility (validate) methods that check its attributes
> > > for correctness.
> > >
> > > The copying process works like this:
> > >
> > > // open DataContext
> > > dataContext = DataContext.createDataContext();
> > >
> > > // read InterfaceData
> > > SelectQuery selQueryInterface = new SelectQuery(sClassNameInterface);
> > > dataObjectsInInterface = new DataObjectList(dataContext.performQuery(selQueryInterface));
> > >
> > > // read BaseData
> > > SelectQuery selQueryBasis = new SelectQuery(sClassNameBasis);
> > > dataObjectsInBasis = new DataObjectList(dataContext.performQuery(selQueryBasis));
> > >
> > > int nSizeBasisTable = dataObjectsInBasis.size();
> > > int nSize = dataObjectsInInterface.size();
> > >
> > > // transfer data
> > > for(int i = 0; i < nSize; i++)
> > > {
> > >     dataObjectInt = (CayenneDataObject) dataObjectsInInterface.get(i);
> > >
> > >     if(nSizeBasisTable > 0) // here possible updates
> > >     {
> > >         // locate BaseObj with primary keys from dataObjectInt,
> > >         // update if a BaseObj was found
> > >         if(locateInBasisData())
> > >         {
> > >             dataObjectBasis.setValuesFromInterface(dataObjectInt);
> > >         }
> > >     }
> > >     else // here inserts
> > >     {
> > >         dataObjectBasis = (MaDataObjectBasis) dataContext.createAndRegisterNewObject(sClassNameBasis);
> > >         dataObjectBasis.setValuesFromInterface(dataObjectInt);
> > >         countInsert++;
> > >     }
> > >
> > >     // validate data
> > >     try
> > >     {
> > >         ValidationResult validationResult = new ValidationResult();
> > >         dataObjectBasis.validateForSave(validationResult);
> > >         if(validationResult.hasFailures())
> > >         {
> > >             // do something with failures
> > >         }
> > >     }
> > >     catch(ValidationException vex)
> > >     {
> > >         // handle/log the validation exception
> > >     }
> > > }
> > >
> > > // and now commitChanges
> > > dataContext.setValidatingObjectsOnCommit(false);
> > > dataContext.commitChanges();
> > >
> > > // end
> > >
> > > In the //read BaseData section, all data from the "BaseTable" is fetched
> > > up front so that locateInBasisData is fast (otherwise I would need a
> > > select query on "BaseTable" for every object from "InterfaceTable").
> > >
> > > With this procedure I can process 30,000 data records (in both tables)
> > > in under 10 seconds without needing much memory.
> > >
> > > but...
> > >
> > > In extreme cases there can be far more data records: 500,000,
> > > 1,000,000 and so on.
> > >
> > > Of course, I would like to use my procedure for small data sets as well
> > > as large ones. So that memory does not run out with large data sets, I
> > > need a way to cache the data locally in files. Is that possible with
> > > Cayenne? I mean, is it possible to configure Cayenne (a property?) so
> > > that its ObjectStore is swapped out to files and Cayenne works with that?
> > >
> > > I have read the tips here:
> > > http://cwiki.apache.org/CAYDOC/performance-tuning.html, but they are not
> > > a solution for very large data sets (memory can still run out)...
> > >
> > > I hope my question is clear; if not, please mail me ;)
> > >
> > > Thanks to all,
> > >
> > > Nikolai
> > >
> > > P.S. I use a standalone environment, not an app or web server.
> > >
> > >
> >
>
>
