cayenne-user mailing list archives

From "Nikolai Raitsev" <nikolai.rait...@gmail.com>
Subject Re: Batch processing with large data sets
Date Wed, 02 Aug 2006 22:02:07 GMT
Mike, you are very fast with answering :)

Thank you!! I will try it out tomorrow.

Nikolai

2006/8/2, Mike Kienenberger <mkienenb@gmail.com>:
>
> I'm not quite answering your specific question, but one thing
> people have done in the past is to throw out the old Cayenne
> DataContext periodically (every N objects written out) and
> create a new one.
>
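The "throw out the context every N objects" idea can be sketched without any Cayenne types. This is only an illustration under assumptions: `UnitOfWork`, `BatchCommitSketch`, and `copyInBatches` are hypothetical names standing in for a persistence context and the copy loop, not Cayenne's API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a persistence context (NOT Cayenne's DataContext API).
interface UnitOfWork {
    void register(String row);   // track one new or changed object
    void commit();               // flush tracked objects to the database
}

final class BatchCommitSketch {
    /**
     * Copies rows, committing and then replacing the unit of work every
     * batchSize rows, so only one batch of objects is referenced at a time.
     * Returns the number of commits performed.
     */
    static int copyInBatches(Iterator<String> rows, Supplier<UnitOfWork> factory, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        UnitOfWork uow = factory.get();
        int pending = 0;
        int commits = 0;
        while (rows.hasNext()) {
            uow.register(rows.next());
            pending++;
            if (pending == batchSize) {
                uow.commit();
                commits++;
                uow = factory.get();   // drop the old context so its objects can be collected
                pending = 0;
            }
        }
        if (pending > 0) {             // commit the final partial batch
            uow.commit();
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add("row-" + i);
        }
        List<Integer> batchSizes = new ArrayList<>();
        // In-memory unit of work that only records how many rows each commit saw.
        Supplier<UnitOfWork> factory = () -> new UnitOfWork() {
            private final List<String> tracked = new ArrayList<>();
            public void register(String row) { tracked.add(row); }
            public void commit() { batchSizes.add(tracked.size()); }
        };
        int commits = copyInBatches(rows.iterator(), factory, 4);
        System.out.println(commits + " commits, batch sizes " + batchSizes);
    }
}
```

With a real context, the factory would create a fresh context and `commit` would call its commit method; the batch size is a tuning knob that trades memory against the number of commits.
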
> Also, you can use performIteratedQuery instead of performQuery to only
> fetch a limited number of records for processing at a time.
>
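The point of an iterated query is that rows are pulled one at a time from an open cursor instead of being loaded into one big list, and that the cursor must be closed when done. A rough sketch of that access pattern follows; `RowCursor`, `IteratedReadSketch`, `forEachRow`, and `over` are hypothetical names used for illustration, not Cayenne types.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical cursor over query results; a stand-in, not a Cayenne type.
interface RowCursor extends AutoCloseable {
    boolean hasNextRow();
    String nextRow();
    @Override
    void close();   // releases the underlying connection
}

final class IteratedReadSketch {
    /** Feeds rows to the handler one at a time and always closes the cursor. Returns rows seen. */
    static int forEachRow(RowCursor cursor, Consumer<String> handler) {
        int count = 0;
        try (RowCursor c = cursor) {   // closed even if the handler throws
            while (c.hasNextRow()) {
                handler.accept(c.nextRow());
                count++;
            }
        }
        return count;
    }

    /** In-memory cursor used only to exercise the sketch; records whether it was closed. */
    static RowCursor over(List<String> rows, boolean[] closedFlag) {
        Iterator<String> it = rows.iterator();
        return new RowCursor() {
            public boolean hasNextRow() { return it.hasNext(); }
            public String nextRow() { return it.next(); }
            public void close() { closedFlag[0] = true; }
        };
    }

    public static void main(String[] args) {
        boolean[] closed = {false};
        int n = forEachRow(over(Arrays.asList("a", "b", "c"), closed),
                row -> System.out.println(row));
        System.out.println(n + " rows, closed=" + closed[0]);
    }
}
```

Only the row currently being handled needs to be in memory, so the size of the source table no longer determines the heap requirement.
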
> On 8/2/06, Nikolai Raitsev <nikolai.raitsev@gmail.com> wrote:
> > Hello all, I hope this is my last question... :)
> >
> > Example:
> >
> > I would like to copy data from table 1 ("InterfaceTable") to table 2
> > ("BaseTable").
> >
> > For both tables I have defined business objects with CayenneModeler
> > (InterfaceObj and BaseObj).
> >
> > BaseObj has plausibility (validate) methods that check its attributes
> > for correctness.
> >
> > The copying process runs like this:
> >
> > //open DataContext
> > dataContext = DataContext.createDataContext ();
> >
> > //read InterfaceData
> > SelectQuery selQueryInterface = new SelectQuery(sClassNameInterface);
> > dataObjectsInInterface = new DataObjectList(
> >         dataContext.performQuery(selQueryInterface));
> >
> > //read BaseData
> > SelectQuery selQueryBasis = new SelectQuery(sClassNameBasis);
> > dataObjectsInBasis = new DataObjectList(
> >         dataContext.performQuery(selQueryBasis));
> >
> >
> > int nSizeBasisTable = dataObjectsInBasis.size();
> >
> > int nSize = dataObjectsInInterface.size();
> >
> > //transfer data
> >
> > for(int i = 0; i<nSize; i++)
> >  {
> >             dataObjectInt = (CayenneDataObject) dataObjectsInInterface.get(i);
> >
> >             if(nSizeBasisTable > 1) //here possible updates
> >             {
> >                 //locate BaseObj with primary keys from dataObjectInt,
> >                 //update if a BaseObj was found
> >                 if(locateInBasisData())
> >                 {
> >                     dataObjectBasis.setValuesFromInterface(dataObjectInt);
> >                 }
> >             }
> >             else//here inserts
> >             {
> >                 dataObjectBasis = (MaDataObjectBasis)
> >                     dataContext.createAndRegisterNewObject(sClassNameBasis);
> >
> >                 dataObjectBasis.setValuesFromInterface(dataObjectInt);
> >                 countInsert++;
> >             }
> >
> >           //Validate Data
> >          try
> >          {
> >             ValidationResult validationResult = new ValidationResult();
> >
> >             dataObjectBasis.validateForSave(validationResult);
> >             if(validationResult.hasFailures())
> >             {
> >                 //do something with failures
> >             }
> >
> >         }
> >         catch(ValidationException vex)
> >         {
> >
> >         }
> >
> > }
> >
> >
> > //and now commitChanges
> > dataContext.setValidatingObjectsOnCommit(false);
> > dataContext.commitChanges();
> >
> > //end
> >
> > In the //read BaseData section, all rows of "BaseTable" are fetched up
> > front so that locateInBasisData is fast (otherwise I would need a select
> > query on "BaseTable" for every object from "InterfaceTable").
> >
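The prefetch described above amounts to an update-or-insert against an in-memory index keyed by primary key. A minimal sketch of that idea, assuming integer keys and string values as stand-ins for the real data objects (`UpsertByKeySketch` and `upsert` are hypothetical names, and this is not the original locateInBasisData):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class UpsertByKeySketch {
    /**
     * Merges incoming rows into base rows keyed by primary key.
     * Existing keys are updated, missing keys are inserted.
     * Returns {updates, inserts}.
     */
    static int[] upsert(Map<Integer, String> base, Map<Integer, String> incoming) {
        int updates = 0;
        int inserts = 0;
        for (Map.Entry<Integer, String> e : incoming.entrySet()) {
            if (base.containsKey(e.getKey())) {   // constant-time lookup, no extra query
                updates++;
            } else {
                inserts++;
            }
            base.put(e.getKey(), e.getValue());
        }
        return new int[] {updates, inserts};
    }

    public static void main(String[] args) {
        Map<Integer, String> base = new HashMap<>();
        base.put(1, "old-a");
        base.put(2, "old-b");
        Map<Integer, String> incoming = new LinkedHashMap<>();
        incoming.put(2, "new-b");   // key exists: update
        incoming.put(3, "new-c");   // key missing: insert
        int[] result = upsert(base, incoming);
        System.out.println(result[0] + " updated, " + result[1] + " inserted");
    }
}
```

The index makes each lookup cheap, but it also keeps every base row in memory, which is exactly the cost that grows with very large tables.
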
> > With this procedure I can process 30,000 data records (in both
> > tables) without problems in under 10 seconds and without much memory.
> >
> > but...
> >
> > In extreme cases there can be far more data records: 500,000,
> > 1,000,000… and so on.
> >
> > Of course, I would like to use the same procedure for small and large
> > data sets. So that memory does not run out with large data sets, I
> > need a way to cache the data locally in files.
> > Is that possible with Cayenne? I mean, is it possible to configure Cayenne
> > (with a property?) so that its ObjectStore is swapped out to files and
> > works from there?
> >
> > I have read the tips here:
> > http://cwiki.apache.org/CAYDOC/performance-tuning.html, but they are not a
> > solution for very large data sets (memory can still run out)...
> >
> > I hope my question is clear; if not, please mail me ;)
> >
> > Thanks to all,
> >
> > Nikolai
> >
> > P.S. I use a standard environment, not an application or web server.
> >
> >
>
