hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-13099) Scans as in DynamoDB
Date Thu, 26 Feb 2015 07:07:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338032#comment-14338032
] 

Lars Hofhansl edited comment on HBASE-13099 at 2/26/15 7:06 AM:
----------------------------------------------------------------

That's what small scans do (in a nutshell), when they are not small :)

That does mean that at every 1 MB chunk we need to reseek all {region|store|storeFile}Scanners.
I.e., the server-side state allows us to avoid the expensive seeking on each RPC. Maybe with 1 MB chunks
it does not matter (but you can pull 1 MB over 1 GbE in < 10 ms, which is less than the seek
time of an HDD).

Some of the chunking logic we get with HBASE-12976.
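
As a rough check of the transfer-time figure in the comment above, the arithmetic can be sketched as follows. The link rate is the nominal 1 Gbit/s with no protocol overhead, the chunk is taken as a decimal megabyte, and the 10 ms HDD seek time is an assumed typical value rather than a measurement; `transfer_ms` is a hypothetical helper name.

```python
# Back-of-the-envelope check; every figure here is a nominal assumption.
CHUNK_BYTES = 1_000_000             # 1 MB chunk (decimal megabyte assumed)
LINK_BITS_PER_SEC = 1_000_000_000   # nominal 1 Gbit/s Ethernet, overhead ignored
ASSUMED_HDD_SEEK_MS = 10.0          # assumed typical HDD seek time, not measured


def transfer_ms(size_bytes: int, bits_per_sec: int) -> float:
    """Milliseconds needed to push size_bytes over a link at the given raw rate."""
    return size_bytes * 8 * 1000 / bits_per_sec


wire_ms = transfer_ms(CHUNK_BYTES, LINK_BITS_PER_SEC)
print(f"transfer: {wire_ms:.1f} ms, assumed seek: {ASSUMED_HDD_SEEK_MS:.1f} ms")
```

Under these assumptions the wire time and a single disk seek are of the same order, which is why an extra reseek per chunk may not be negligible.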



was (Author: lhofhansl):
That's what small scans do (in a nutshell), when they are not small :)

That does mean that at every 1mb chunk we need to reseek are {region|store|storeFile}Scanners.
I.e. the server state allows us to avoid the expensive seeking each RPC. Maybe with 1mb chunks
it does not matter. (but you can pull 1mb over 1ge in < 10ms, which is less then the seek
time of an HDD).

Some of the chunking logic we get with HBASE-12976.


> Scans as in DynamoDB
> --------------------
>
>                 Key: HBASE-13099
>                 URL: https://issues.apache.org/jira/browse/HBASE-13099
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: Client, regionserver
>            Reporter: Nicolas Liochon
>
> cc: [~saint.ack@gmail.com] - as discussed offline.
> DynamoDB has a very simple way to manage scans server side:
> ??citation??
> The data returned from a Query or Scan operation is limited to 1 MB; this means that
> if you scan a table that has more than 1 MB of data, you'll need to perform another Scan operation
> to continue to the next 1 MB of data in the table.
> If you query or scan for specific attributes that match values that amount to more than
> 1 MB of data, you'll need to perform another Query or Scan request for the next 1 MB of data.
> To do this, take the LastEvaluatedKey value from the previous request, and use that value
> as the ExclusiveStartKey in the next request. This will let you progressively query or scan
> for new data in 1 MB increments.
> When the entire result set from a Query or Scan has been processed, the LastEvaluatedKey
> is null. This indicates that the result set is complete (i.e. the operation processed the
> “last page” of data).
> ??citation??
> This means that there is no state server side: the work is done client side.
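
The paging scheme quoted above can be sketched with a small in-memory simulation. This is an illustration only, not HBase or DynamoDB client code: `ROWS`, `scan_page`, and `scan_all` are hypothetical names, the table is a sorted Python list, and the page limit is 1000 bytes instead of 1 MB so the example stays small. The `bisect` call stands in for the seek that a stateless server has to repeat on every request.

```python
import bisect
from typing import List, Optional, Tuple

# Hypothetical in-memory "table": (row key, value) pairs sorted by key.
ROWS: List[Tuple[str, str]] = [(f"row{i:03d}", "x" * 100) for i in range(25)]
KEYS = [key for key, _ in ROWS]


def scan_page(exclusive_start_key: Optional[str], max_bytes: int):
    """Serve one page with no server-side scanner state.

    Returns (items, last_evaluated_key); the key is None once the
    end of the table has been reached.
    """
    # Every request seeks afresh: this is the per-chunk reseek cost.
    if exclusive_start_key is None:
        i = 0
    else:
        i = bisect.bisect_right(KEYS, exclusive_start_key)
    items, used = [], 0
    while i < len(ROWS):
        size = len(ROWS[i][0]) + len(ROWS[i][1])
        if items and used + size > max_bytes:
            break  # page is full; always return at least one row
        items.append(ROWS[i])
        used += size
        i += 1
    last_evaluated_key = items[-1][0] if items and i < len(ROWS) else None
    return items, last_evaluated_key


def scan_all(max_bytes: int = 1000):
    """Client loop: feed the last key of each page back as the next start key."""
    results, start_key, requests = [], None, 0
    while True:
        items, start_key = scan_page(start_key, max_bytes)
        results.extend(items)
        requests += 1
        if start_key is None:
            return results, requests


rows, requests = scan_all()
print(len(rows), "rows in", requests, "requests")
```

All continuation state lives in the key the client sends back, so any server holding the data can answer the next request; the price is one fresh seek per page.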



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
