cassandra-user mailing list archives

From "Lu, Boying" <Boying...@emc.com>
Subject RE: A question to 'paging' support in DataStax java driver
Date Tue, 10 May 2016 05:08:39 GMT
I filed a JIRA https://issues.apache.org/jira/browse/CASSANDRA-11741 to track this.

From: DuyHai Doan [mailto:doanduyhai@gmail.com]
Sent: May 10, 2016 12:47
To: user@cassandra.apache.org
Subject: Re: A question to 'paging' support in DataStax java driver

I guess it's technically possible, but then we'd need to update the binary protocol. Just
create a JIRA and ask for this feature.

On Tue, May 10, 2016 at 5:00 AM, Lu, Boying <Boying.Lu@emc.com> wrote:
Thanks very much.

I understand that the data needs to be read from the DB to get the next ‘PagingState’.

But is it possible not to return that data to the client side, and return only the ‘PagingState’?
I.e. the data is read on the server side but not returned to the client side, which would save
some bandwidth between client and server.


From: DuyHai Doan [mailto:doanduyhai@gmail.com]
Sent: May 9, 2016 21:06
To: user@cassandra.apache.org
Subject: Re: A question to 'paging' support in DataStax java driver

In a truly consistent world (should I say "snapshot isolation" world instead), re-reading
the same page should yield the same results no matter how many new inserts have occurred since
the last page read.

Caching previous pages at the app level can be a solution, but it is not viable if the amount
of data is huge. You'll also need a cache layer and have to deal with cache invalidation, etc.
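
As a rough illustration of the app-level cache mentioned above, here is a minimal sketch of a
bounded page cache. The class and method names (PageCacheSketch, invalidateAll, ...) are
hypothetical and not part of any driver. It evicts the least recently used page once more than
maxPages are held, and it does not solve invalidation: the application would still have to
decide when cached pages are stale.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical app-level page cache; illustrative only, not part of any driver.
class PageCacheSketch<R> {
    private final LinkedHashMap<Integer, List<R>> pages;

    PageCacheSketch(final int maxPages) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.pages = new LinkedHashMap<Integer, List<R>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, List<R>> eldest) {
                return size() > maxPages;  // evict the least recently used page
            }
        };
    }

    void put(int pageNumber, List<R> rows) { pages.put(pageNumber, rows); }

    // Returns the cached rows, or null on a cache miss.
    List<R> get(int pageNumber) { return pages.get(pageNumber); }

    // Crude invalidation: drop everything, e.g. after the app learns of new writes.
    void invalidateAll() { pages.clear(); }

    int size() { return pages.size(); }

    public static void main(String[] args) {
        PageCacheSketch<String> cache = new PageCacheSketch<>(2);
        cache.put(1, Arrays.asList("a", "b"));
        cache.put(2, Arrays.asList("c", "d"));
        cache.get(1);                           // page 1 is now the most recently used
        cache.put(3, Arrays.asList("e", "f"));  // evicts page 2
        System.out.println(cache.get(2));       // prints "null"
    }
}
```

The size bound is what makes this workable for small result sets only; with a huge amount of
data most "jump back" requests would miss the cache anyway.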

The point is, providing snapshot isolation in a distributed system is hard without some sort
of synchronous coordination, e.g. a global lock (read http://www.bailis.org/papers/hat-vldb2014.pdf).


On Mon, May 9, 2016 at 2:17 PM, Bhuvan Rawal <bhu1rawal@gmail.com> wrote:
Hi Doan,

What does this have to do with eventual consistency? Let's assume a scenario with complete
consistency: we are at page X, and at the same time some inserts/updates happen on page X-2
and we jump back to it.
The user will see an inconsistent page in that case as well, right? Also, in such cases how would
you design a user-facing application (cache previous pages at the app level?)

Regards,
Bhuvan

On Mon, May 9, 2016 at 4:18 PM, DuyHai Doan <doanduyhai@gmail.com> wrote:
"Is it possible to just return PagingState object without returning data?" --> No

Simply because before reading the actual data for each page of N rows, you cannot know at
which token value a page of data starts...

And it is worse than that: with paging you don't have any isolation. Let's suppose you keep
in your application/web front-end the paging states for pages 1, 2 and 3. Since there are concurrent
inserts on the cluster at the same time, when you re-use paging state 2, for example, you
may not get the same results as the previous read.

And this is inevitable in an eventually consistent distributed DB world.
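
To make the lack of isolation concrete, here is a minimal sketch using a toy in-memory model,
not the DataStax driver. It assumes, purely for illustration, that a paging state is just "the
last key of the previous page" and that rows are kept sorted by key; the names
PagingIsolationSketch and readPage are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy in-memory model of cursor-based paging; NOT the DataStax driver API.
class PagingIsolationSketch {
    // Sorted row keys, standing in for rows ordered by token.
    private final TreeSet<Integer> rows = new TreeSet<>();

    void insert(int key) { rows.add(key); }

    // Assumed stand-in for a paging state: the last key of the previous page
    // (null means "start from the beginning").
    List<Integer> readPage(Integer afterKey, int pageSize) {
        List<Integer> page = new ArrayList<>();
        Iterable<Integer> source = (afterKey == null) ? rows : rows.tailSet(afterKey, false);
        for (Integer key : source) {
            if (page.size() == pageSize) break;
            page.add(key);
        }
        return page;
    }

    public static void main(String[] args) {
        PagingIsolationSketch table = new PagingIsolationSketch();
        for (int k = 10; k <= 60; k += 10) table.insert(k);    // keys 10, 20, ..., 60

        List<Integer> page1 = table.readPage(null, 2);         // [10, 20]
        Integer stateForPage2 = page1.get(page1.size() - 1);   // 20
        List<Integer> firstRead = table.readPage(stateForPage2, 2);

        table.insert(25);  // concurrent insert that lands inside the range of page 2

        List<Integer> secondRead = table.readPage(stateForPage2, 2);
        System.out.println(firstRead + " vs " + secondRead);   // [30, 40] vs [25, 30]
    }
}
```

The same saved state yields two different pages because nothing pins the read to a snapshot.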

On Mon, May 9, 2016 at 12:25 PM, Lu, Boying <Boying.Lu@emc.com> wrote:
Hi, All,

We are considering using the DataStax Java driver in our code. One important feature provided
by the driver that we want to use is ‘paging’.
But according to https://datastax.github.io/java-driver/3.0.0/manual/paging/, it seems
that we can’t jump between pages.

Is it possible to just return PagingState object without returning data? E.g. if I want to
jump to page 5 from page 1, I need to go through each page from page 1 to page 5. Is it
possible to just return the PagingState objects of pages 1, 2, 3 and 4 without the actual
data of each page? This would save some bandwidth at least.
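
The sequential walk described above can be sketched with a toy in-memory model. This is not
the driver API: the names PageWalkSketch, fetch and cursorForPage are hypothetical, and the
"cursor" (the last key read) is only an assumed stand-in for a PagingState. The sketch counts
how many rows have to be read before the start of the target page is known.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model of walking pages one by one; NOT the DataStax driver API.
class PageWalkSketch {
    private final TreeSet<Integer> rows = new TreeSet<>();
    int rowsRead = 0;  // rows that had to be read (and, in this model, shipped to the caller)

    PageWalkSketch(int... keys) {
        for (int k : keys) rows.add(k);
    }

    // Returns the page that starts strictly after 'afterKey' (null = from the start).
    List<Integer> fetch(Integer afterKey, int pageSize) {
        List<Integer> page = new ArrayList<>();
        Iterable<Integer> source = (afterKey == null) ? rows : rows.tailSet(afterKey, false);
        for (Integer key : source) {
            if (page.size() == pageSize) break;
            page.add(key);
        }
        rowsRead += page.size();
        return page;
    }

    // Walks pages 1..targetPage-1 and returns the cursor at which targetPage starts.
    // Returns null for page 1; throws if the data runs out before targetPage.
    Integer cursorForPage(int targetPage, int pageSize) {
        Integer cursor = null;
        for (int p = 1; p < targetPage; p++) {
            List<Integer> page = fetch(cursor, pageSize);
            if (page.size() < pageSize) {
                throw new IllegalArgumentException("no page " + targetPage);
            }
            cursor = page.get(page.size() - 1);  // only known once the page has been read
        }
        return cursor;
    }

    public static void main(String[] args) {
        PageWalkSketch table = new PageWalkSketch(3, 8, 15, 16, 23, 42, 50, 51, 77, 90);
        Integer cursor = table.cursorForPage(5, 2);
        System.out.println("cursor=" + cursor + ", rows read=" + table.rowsRead);
        // prints "cursor=51, rows read=8"
        System.out.println(table.fetch(cursor, 2));  // prints "[77, 90]"
    }
}
```

Because the keys are irregularly spaced, the start of page 5 cannot be computed from the page
number alone; all rows of pages 1 to 4 are read just to find it.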

Thanks in advance.

Boying





