hbase-dev mailing list archives

From "Erik Holstad (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1304) New client server implementation of how gets and puts are handled.
Date Sun, 17 May 2009 08:23:45 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710181#action_12710181 ]

Erik Holstad commented on HBASE-1304:
-------------------------------------

@Ryan
The way I see it, the fact that deletes only apply to earlier files is not going to speed
up the early-out scenario in all cases. Where it helps is when a query doesn't need to touch
any files and can get its data from memcache alone, since there are then no deletes left in
memcache to process. That deletes, in the new implementation, only apply to older files is
more of a byproduct of the fact that deletes in memcache are applied immediately to the data
in there.

Whether that is the right approach is a different story. The reason I think it makes sense
is that deletes take up a lot of resources and time when processing data, so I would like
them to be as efficient as possible. The best thing would be to apply them to the whole
store as soon as they come in, but since that is not realistic we have to do something else.
So by deleting everything in memcache that is affected by the incoming delete we save time
and space: there is less data to process and there are fewer flush calls, which leads to
fewer compactions of any kind.
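
To make the idea concrete, here is a minimal sketch of applying a delete to memcache as soon
as it arrives. It is not the code from the patch: the class MemcacheSketch, its methods, the
string keys and the "delete everything at or before this timestamp" semantics are all
assumptions made for illustration. The delete itself is kept so that it can still be written
out at flush time and mask data in older files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical, simplified stand-in for memcache; not the HBase API. */
public class MemcacheSketch {
    // "row/column" -> (timestamp -> value), timestamps in ascending order.
    private final TreeMap<String, TreeMap<Long, String>> cells = new TreeMap<>();
    // Deletes retained for the next flush so they still apply to older files.
    private final List<String> pendingDeletes = new ArrayList<>();

    public void put(String row, String column, long ts, String value) {
        cells.computeIfAbsent(row + "/" + column, k -> new TreeMap<>()).put(ts, value);
    }

    /** Assumed semantics: remove every version of row/column with timestamp <= ts. */
    public void delete(String row, String column, long ts) {
        String key = row + "/" + column;
        TreeMap<Long, String> versions = cells.get(key);
        if (versions != null) {
            // headMap is a view, so clearing it removes the entries from memcache.
            versions.headMap(ts, true).clear();
            if (versions.isEmpty()) {
                cells.remove(key);
            }
        }
        pendingDeletes.add(key + "@" + ts);
    }

    /** Newest surviving value, or null. No delete checks are needed on the read path. */
    public String get(String row, String column) {
        TreeMap<Long, String> versions = cells.get(row + "/" + column);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    /** Number of values currently held, i.e. what a flush would have to write. */
    public int cellCount() {
        int n = 0;
        for (TreeMap<Long, String> versions : cells.values()) {
            n += versions.size();
        }
        return n;
    }

    public List<String> pendingDeletes() {
        return pendingDeletes;
    }
}
```

With this layout a get served purely from memcache never has to look at a delete, and the
deleted values no longer count toward the size that triggers a flush.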

The above reasoning might not make sense in all cases, but for a majority I think it does.

When it comes to minor compactions, I'm not sure if you are worried about them taking longer
than before, when we "just" merged the results. If that is the case: most of the work in
that merge is finding out which KeyValue should come next, so actually deleting the entries
affected by a delete wouldn't add that much overhead.
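
A rough sketch of that argument, under stated assumptions: the files being compacted are
already sorted by key and then newest timestamp first, a delete sorts ahead of a put with the
same key and timestamp, and a delete masks every version at or before its own timestamp. The
names MinorCompactionSketch, KV and Cursor are made up for this example and are not the
classes from the patch. The heap does the expensive part (choosing the next entry); dropping
masked entries is one extra comparison per entry.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical k-way merge that drops puts masked by a delete. */
public class MinorCompactionSketch {
    static final class KV {
        final String key; final long ts; final boolean isDelete; final String value;
        KV(String key, long ts, boolean isDelete, String value) {
            this.key = key; this.ts = ts; this.isDelete = isDelete; this.value = value;
        }
    }

    // Key ascending, newest timestamp first, delete before put on a tie.
    static final Comparator<KV> ORDER = (a, b) -> {
        int c = a.key.compareTo(b.key);
        if (c != 0) return c;
        c = Long.compare(b.ts, a.ts);
        if (c != 0) return c;
        return Boolean.compare(b.isDelete, a.isDelete);
    };

    /** One open position in one sorted file. */
    static final class Cursor {
        final Iterator<KV> it; KV head;
        Cursor(Iterator<KV> it) { this.it = it; this.head = it.next(); }
        boolean advance() {
            if (!it.hasNext()) return false;
            head = it.next();
            return true;
        }
    }

    static List<KV> merge(List<List<KV>> files) {
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((x, y) -> ORDER.compare(x.head, y.head));
        for (List<KV> file : files) {
            if (!file.isEmpty()) heap.add(new Cursor(file.iterator()));
        }
        List<KV> out = new ArrayList<>();
        String deleteKey = null;   // key of the newest delete seen so far
        long deleteTs = Long.MIN_VALUE;
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();          // the costly step: pick the next KeyValue
            KV kv = c.head;
            if (c.advance()) heap.add(c);
            if (kv.isDelete) {
                // Keep the marker: files outside this compaction may hold older data.
                out.add(kv);
                if (!kv.key.equals(deleteKey)) { deleteKey = kv.key; deleteTs = kv.ts; }
            } else if (kv.key.equals(deleteKey) && kv.ts <= deleteTs) {
                // Masked by the delete: the only extra work is this comparison.
            } else {
                out.add(kv);
            }
        }
        return out;
    }
}
```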

What are your concerns when it comes to removing deleted KeyValues in a minor compaction?
They are still going to be removed eventually, and there is currently no way to undo a
delete to get them back, so the way I see it they are just a burden for the system. What
kind of behaviour would you like to see?

Regards Erik

> New client server implementation of how gets and puts are handled. 
> -------------------------------------------------------------------
>
>                 Key: HBASE-1304
>                 URL: https://issues.apache.org/jira/browse/HBASE-1304
>             Project: Hadoop HBase
>          Issue Type: Improvement
>    Affects Versions: 0.20.0
>            Reporter: Erik Holstad
>            Assignee: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: hbase-1304-v1.patch, HBASE-1304-v2.patch, HBASE-1304-v3.patch, HBASE-1304-v4.patch,
> HBASE-1304-v5.patch, HBASE-1304-v6.patch, HBASE-1304-v7.patch
>
>
> Creating an issue where the implementation of the new client and server will go. Leaving
> HBASE-1249 as a discussion forum; code and patches will go here.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

