cayenne-user mailing list archives

From John Huss <johnth...@gmail.com>
Subject Re: Partitioning a query result..
Date Thu, 15 Dec 2016 00:06:49 GMT
Unless your DB disk is striped into at least four parts this won't be
faster.
On Wed, Dec 14, 2016 at 5:46 PM Tony Giaccone <tgiaccone@gmail.com> wrote:

> I want to speed things up by running multiple instances of a job that
> fetches data from a table, so that, for example, if I need to process 10,000
> rows, the query runs on each instance and returns 4 sets of 2,500 rows, one
> for each instance, with no duplication.
>
> My first thought in SQL was to add something like this to the where
> clause..
>
> and MOD(ID, INSTANCE_COUNT) = INSTANCE_ID;
>
> so that if the instance count was 4 then the instance IDs would run
> 0,1,2,3.
>
> I'm not quite sure how you would structure that using the query API. Any
> suggestions about that?
>
> And there are some problems with this idea: you have to be certain your
> IDs increase in a manner that aligns with your math so that the
> partitions are equal in size.
> For example, if your sequence increments by 20, then you would have to
> adjust the math to get the right partitioning, and that is the problem
> with this technique.
> It's brittle; it depends on keeping a bunch of things in sync.
>
> Does anyone have another idea of how to segment out rows that would yield a
> solution that's not quite so brittle?
>
>
>
> Tony Giaccone
>
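To make the brittleness concrete, here is a small sketch (not from the thread; the class and method names are mine) of what happens to the proposed MOD(ID, INSTANCE_COUNT) = INSTANCE_ID split when the ID sequence increments by 20: since every such ID is divisible by 4, all rows land on instance 0 and the other three instances get nothing.

```java
// Demonstrates the skew in MOD-based partitioning when the ID sequence
// increment (20) shares a factor with the instance count (4).
public class ModPartitionDemo {
    // Partition assignment matching the proposed WHERE clause:
    // MOD(ID, INSTANCE_COUNT) = INSTANCE_ID
    static int partition(long id, int instanceCount) {
        return (int) Math.floorMod(id, instanceCount);
    }

    public static void main(String[] args) {
        int[] counts = new int[4];
        // Simulate 10,000 IDs from a sequence that increments by 20.
        for (long id = 0; id < 10_000L * 20; id += 20) {
            counts[partition(id, 4)]++;
        }
        // Every ID is a multiple of 20, hence a multiple of 4, so
        // MOD(id, 4) is always 0: instance 0 gets all 10,000 rows.
        System.out.println(java.util.Arrays.toString(counts));
        // prints [10000, 0, 0, 0]
    }
}
```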

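One way to make the split insensitive to the sequence increment, offered as a sketch rather than a Cayenne recipe, is to take the modulus of a hash of the ID instead of the ID itself. The mixing function below uses the well-known MurmurHash3 64-bit finalizer; on the database side the equivalent predicate would need whatever hash function your DB provides, and in Cayenne it could be carried in raw SQL (e.g. via SQLTemplate) rather than the expression API.

```java
// Hash-then-mod partitioning: balanced regardless of how the ID
// sequence increments, because the hash scrambles divisibility.
public class HashPartitionDemo {
    // MurmurHash3 64-bit finalizer: a cheap, well-distributed mix.
    static long mix(long x) {
        x ^= x >>> 33;
        x *= 0xff51afd7ed558ccdL;
        x ^= x >>> 33;
        x *= 0xc4ceb9fe1a85ec53L;
        x ^= x >>> 33;
        return x;
    }

    static int partition(long id, int instanceCount) {
        return (int) Math.floorMod(mix(id), instanceCount);
    }

    public static void main(String[] args) {
        int[] counts = new int[4];
        // Same skew-prone sequence as before: 10,000 IDs, step 20.
        for (long id = 0; id < 10_000L * 20; id += 20) {
            counts[partition(id, 4)]++;
        }
        // The four partitions should now be roughly balanced.
        System.out.println(java.util.Arrays.toString(counts));
    }
}
```

The assignment is still deterministic, so each job instance fetches a stable, non-overlapping slice; only the predicate changes from MOD(ID, n) to MOD(hash(ID), n).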