From: Michael Gentry
Date: Fri, 13 Nov 2009 11:05:45 -0500
Subject: Re: Object Caching
To: user@cayenne.apache.org

Hi Hans,

Even using a paginated query in Cayenne, it would've eventually
pulled everything into memory. The paginated query is really designed
to be used in a UI where the user is going to see a limited amount of
data and probably won't page too many times. The iterated query is the
best approach for processing a large number of records in Cayenne.

Good luck!

mrg


On Fri, Nov 13, 2009 at 10:49 AM, Hans Pikkemaat wrote:
> Hi,
>
> Of course I don't want to load the whole thing into memory.
> I want to run the query and use an iterator to go through the results.
> With paging, the JDBC driver is able to produce chunks, which prevents
> the whole result set from being loaded into memory.
>
> I was trying to accomplish the same thing using Cayenne, but clearly
> without success.
>
> So I'm going to fall back to a Cayenne iterated query or even plain JDBC.
>
> tx
>
> Hans
>
>
> Michael Gentry wrote:
>>
>> I'm not exactly sure what you are trying to accomplish, but could you
>> use plain SQL to do the job (run it from an SQL prompt)? That's the
>> approach I normally take when I have to do updates to large amounts of
>> data, especially for a one-off task or something ill-suited to Java
>> code. Even if you were using raw JDBC (no ORM) and tried to pull back
>> 2.5 million records, it would be difficult. I don't know the size of
>> the data record you are using, but even at 1k per record (not an
>> unreasonable size) it would require 2.5 GB of RAM just to hold the
>> records.
>>
>> mrg
>>
>>
>> On Fri, Nov 13, 2009 at 10:20 AM, Hans Pikkemaat wrote:
>>
>>> Hi,
>>>
>>> That was the initial approach I tried. The problem with this is that
>>> I cannot manually create relations between objects constructed from
>>> data rows. This means that when I access the detail table through the
>>> relation, it will execute a query to get them from the database.
>>>
>>> If I have 100 main records, it runs 100 queries to get all the
>>> details. This does not perform well. I need to run one query which
>>> does a left join and gets all the data in one go.
>>>
>>> But I totally agree with you that ORM is too much overhead here. I
>>> don't need caching or anything like that. Actually, I'm trying to
>>> prevent it from caching the records. I'm working on a solution now
>>> that uses the iterated query to return data rows, where I construct
>>> the new objects and the relationships between them myself.
>>>
>>> tx
>>>
>>> Hans
>>>
>>>
>>> Michael Gentry wrote:
>>>>
>>>> Not just Cayenne, Hans. No ORM efficiently handles the scale you are
>>>> talking about. You need to find a way to break your query down into
>>>> smaller chunks to process. What you are doing might be workable with
>>>> 50k records, but not 2.5m. Find a way to break your query down into
>>>> smaller units to process, or explore what Andrus suggested with
>>>> ResultIterator:
>>>>
>>>> http://cayenne.apache.org/doc/iterating-through-data-rows.html
>>>>
>>>> If you can loop over one record at a time and process it (thereby
>>>> letting the garbage collector clean out the ones you have processed),
>>>> then your memory usage should be somewhat stable and manageable, even
>>>> if the initial query takes a while.
>>>>
>>>> mrg
>>>>
>>>>
>>>> On Fri, Nov 13, 2009 at 7:09 AM, Hans Pikkemaat wrote:
>>>>
>>>>> Anyway, my conclusion is indeed: don't use Cayenne for large query
>>>>> processing.
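[Editor's note: the advice in this thread boils down to two points: the full result set is too big for memory, and streaming one row at a time keeps memory usage flat. The sketch below illustrates both in plain Java. The real Cayenne calls would be along the lines of `DataContext.performIteratedQuery()` returning a `ResultIterator` (see the linked docs); since that needs a live database, a plain `Iterator` over fake data rows stands in for it here, and `fakeRows` is a hypothetical helper, not a Cayenne API.]

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class StreamingSketch {

    // Back-of-envelope check from the thread: 2.5 million rows at
    // roughly 1k (1000 bytes) each is about 2.5 GB, far too much to
    // materialize in the heap at once.
    static long bytesNeeded(long rows, long bytesPerRow) {
        return rows * bytesPerRow;
    }

    public static void main(String[] args) {
        long total = bytesNeeded(2_500_000L, 1000L);
        System.out.printf("%.1f GB%n", total / 1e9);

        // Streaming pattern: process one row at a time so each row
        // becomes garbage-collectable as soon as you move past it.
        // With Cayenne this loop would use ResultIterator's
        // hasNextRow()/nextDataRow(), closed in a finally block to
        // release the underlying JDBC connection.
        Iterator<Map<String, Object>> rows = fakeRows(5);
        long processed = 0;
        while (rows.hasNext()) {
            Map<String, Object> row = rows.next();
            processed++; // stand-in for real per-row work on `row`
        }
        System.out.println("processed=" + processed);
    }

    // Hypothetical stand-in for a database cursor: yields rows lazily
    // instead of building the whole result set up front.
    static Iterator<Map<String, Object>> fakeRows(final int n) {
        return new Iterator<Map<String, Object>>() {
            int i = 0;
            public boolean hasNext() { return i < n; }
            public Map<String, Object> next() {
                Map<String, Object> row = new HashMap<>();
                row.put("ID", i++);
                return row;
            }
        };
    }
}
```

The same idea applies at the JDBC level: configure the driver to fetch in chunks (e.g. `Statement.setFetchSize`) rather than buffering the whole result set client-side.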