hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8496) Implement tags and the internals of how a tag should look like
Date Wed, 05 Jun 2013 02:54:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675541#comment-13675541 ]

ramkrishna.s.vasudevan commented on HBASE-8496:
-----------------------------------------------

The structure of a tag may look like this
{color:red}
<1 byte type code><2 byte tag length><tag>
{color}

We need to provide some TagIterators inside CellUtil so that we will be able to iterate
over the tag array.
The iterator must use the above tag structure to parse each tag.
Other utility methods may also be needed for this, like getNumTags() and, given a type, get the
tags of that type.
If the structure carries the type in it, then it may not be possible to
have validation on the client side for specific tag types.
The reason for having a type is to support different use cases for tags, and the CP (coprocessor) that we add for
each use case should help us achieve it.
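
A minimal sketch of how such an iterator and the helper methods could look, assuming the
<1 byte type code><2 byte tag length><tag> layout above. The class TagSketch, the nested Tag
class and the method getTagsOfType() are hypothetical names for illustration, not existing HBase
API, and treating the 2 byte length as unsigned is also an assumption.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch of tag parsing; not the HBase API. */
final class TagSketch {
    /** One parsed tag: a type code plus its payload. */
    static final class Tag {
        final byte type;
        final byte[] value;
        Tag(byte type, byte[] value) { this.type = type; this.value = value; }
    }

    /** Serializes one tag as <1 byte type code><2 byte tag length><tag>. */
    static byte[] encode(byte type, byte[] value) {
        if (value.length > 0xFFFF) {
            throw new IllegalArgumentException("tag does not fit a 2 byte length");
        }
        ByteBuffer buf = ByteBuffer.allocate(1 + 2 + value.length);
        buf.put(type).putShort((short) value.length).put(value);
        return buf.array();
    }

    /** Walks a concatenated tag array, decoding one tag per next() call. */
    static Iterator<Tag> iterator(byte[] tagArray) {
        final ByteBuffer buf = ByteBuffer.wrap(tagArray);
        return new Iterator<Tag>() {
            @Override public boolean hasNext() { return buf.hasRemaining(); }
            @Override public Tag next() {
                if (!buf.hasRemaining()) throw new NoSuchElementException();
                byte type = buf.get();
                int length = buf.getShort() & 0xFFFF; // assumption: unsigned length
                byte[] value = new byte[length];
                buf.get(value);
                return new Tag(type, value);
            }
            @Override public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    /** Counts the tags in a tag array. */
    static int getNumTags(byte[] tagArray) {
        int count = 0;
        for (Iterator<Tag> it = iterator(tagArray); it.hasNext(); it.next()) count++;
        return count;
    }

    /** Returns only the tags whose type code matches. */
    static List<Tag> getTagsOfType(byte[] tagArray, byte type) {
        List<Tag> result = new ArrayList<>();
        for (Iterator<Tag> it = iterator(tagArray); it.hasNext(); ) {
            Tag tag = it.next();
            if (tag.type == type) result.add(tag);
        }
        return result;
    }
}
```

Since every tag carries its own length, a reader can skip tag types it does not understand
without knowing their payload format.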

We also need to identify different use cases for tags other than Visibility and ACLs so that
we can ensure that we provide proper client support for tags.  Currently the idea is to go
with the CP based approach.
From the client perspective, will the tags now be added as part of Puts?
Put.add(KeyValue) will now have an option to pass a tag array. One more option that we thought
of is to have OperationAttributes and set the tags there.
Tried out different options for getting tags working with KeyValues and the existing formats.
The KV can be modified to
{color:red}
<keylength><valuelength><keyarray><valuearray><taglength><tagarray>
{color}
So if a KV does not have any tags, the taglength will still be written as 0 but there will not be any
tag array.
This will involve some changes in the format of the HFile writer and reader; probably a new
version of the Writer/Reader is needed.  (A minor version should be enough?)
In the case of encoders, the base encoder BufferedDataEncoder will be tag aware, and currently
no encoding logic is applied to the tag part.  It is just written and parsed so that during a
scan we are able to get the tags in the output KVs.
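
A rough sketch of the modified layout
<keylength><valuelength><keyarray><valuearray><taglength><tagarray>, to show that a KV without
tags still pays for the taglength field. The class TaggedKvSketch is a hypothetical name. The
4 byte key and value lengths follow the existing KeyValue layout; the 2 byte taglength is an
assumption, since the width is not stated above.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of the always-present taglength layout; not the HBase API. */
final class TaggedKvSketch {
    /** Writes <keylength><valuelength><keyarray><valuearray><taglength><tagarray>. */
    static byte[] write(byte[] key, byte[] value, byte[] tags) {
        if (tags.length > 0xFFFF) {
            throw new IllegalArgumentException("tag array does not fit a 2 byte length");
        }
        ByteBuffer buf = ByteBuffer.allocate(
                4 + 4 + key.length + value.length + 2 + tags.length);
        buf.putInt(key.length).putInt(value.length).put(key).put(value);
        // The taglength is written even when it is 0; only the tag array is omitted.
        buf.putShort((short) tags.length).put(tags);
        return buf.array();
    }

    /** Skips the key and value and returns the tag array (empty if taglength is 0). */
    static byte[] readTags(byte[] kv) {
        ByteBuffer buf = ByteBuffer.wrap(kv);
        int keyLength = buf.getInt();
        int valueLength = buf.getInt();
        buf.position(buf.position() + keyLength + valueLength);
        int tagLength = buf.getShort() & 0xFFFF; // assumption: 2 byte unsigned taglength
        byte[] tags = new byte[tagLength];
        buf.get(tags);
        return tags;
    }
}
```

Under these assumptions, every KV in a file with no tags at all grows by the 2 bytes of the
taglength field.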

The same applies to the PrefixTree codec.  In this case backward compatibility should
be taken care of.

In case we don't want to do the above, one more thing that can be done is
        {color:red}
	<Existing KV format><int – negative integer indicating the length of the tag><tag array>
        {color}
Here the negative length is written only when there is a tag, and the existing KV format is left
untouched when there is no tag.
In this approach we would, every time, read the next KV's keylength and then decide whether
a tag is present.  If not, we just rewind the position of the buffer.
This has a performance impact but does not involve changes to the HFile formats.
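
The peek-and-rewind step of the negative length approach could be sketched as below. The class
NegativeTagLengthSketch and its Entry type are hypothetical names, and the sketch relies on the
assumption that a keylength is never negative, so a negative int can only be a tag length
marker.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the negative tag length marker; not the HBase API. */
final class NegativeTagLengthSketch {
    static final class Entry {
        final byte[] key;
        final byte[] value;
        final byte[] tags;
        Entry(byte[] key, byte[] value, byte[] tags) {
            this.key = key; this.value = value; this.tags = tags;
        }
    }

    /** Writes the existing KV format, plus <negative length><tag array> only if tagged. */
    static void write(ByteBuffer out, byte[] key, byte[] value, byte[] tags) {
        out.putInt(key.length).putInt(value.length).put(key).put(value);
        if (tags.length > 0) {
            out.putInt(-tags.length).put(tags);
        }
    }

    /** Reads KVs back, peeking at the next int after each KV to detect tags. */
    static List<Entry> readAll(ByteBuffer in) {
        List<Entry> entries = new ArrayList<>();
        while (in.hasRemaining()) {
            byte[] key = new byte[in.getInt()];
            byte[] value = new byte[in.getInt()];
            in.get(key);
            in.get(value);
            byte[] tags = new byte[0];
            if (in.remaining() >= 4) {
                in.mark();
                int next = in.getInt();
                if (next < 0) {
                    tags = new byte[-next]; // negative int: a tag array follows
                    in.get(tags);
                } else {
                    in.reset(); // it was the next KV's keylength, so rewind
                }
            }
            entries.add(new Entry(key, value, tags));
        }
        return entries;
    }
}
```

The extra getInt() plus reset() after every untagged KV is where the read-side cost of this
approach comes from.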


So in both cases we tend to write the tag info whether or not the user needs it.  One
way to avoid this could be to do what we do for MemstoreTS.
Add a metadata entry to the HFile saying tagpresent = true/false based on the KVs in that HFile.
Even if there is only one KV with a tag, this metadata will be true.
Then on compaction we will read this metadata and decide whether to compact data with tags or
without tags.
The advantage is that for scenarios where there are no tags we will not have a drop in
read performance (this applies after compaction is done).
The downside of this approach is that the KeyValue format itself now has 2 representations.
Sometimes the KV that we retrieve will have tag info and sometimes it will not.
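
The tagpresent decision could be sketched roughly as follows. TagPresenceSketch and both method
names are hypothetical, and how the flag is actually stored in the HFile metadata is left out.

```java
import java.util.List;

/** Hypothetical sketch of the tagpresent flag logic; not the HBase API. */
final class TagPresenceSketch {
    /** Flag for one file: true if at least one KV in it carries a tag array. */
    static boolean tagPresent(List<byte[]> tagArrayPerKv) {
        for (byte[] tags : tagArrayPerKv) {
            if (tags != null && tags.length > 0) return true;
        }
        return false;
    }

    /** Compaction output is written with tags if any input file has the flag set. */
    static boolean compactWithTags(List<Boolean> inputFileFlags) {
        for (boolean flag : inputFileFlags) {
            if (flag) return true;
        }
        return false;
    }
}
```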
Thanks to Anoop and Andy for their suggestions/inputs.

I have some patches ready for the above approaches except for the optional tag part.  Wanted
to know if that can be provided as a feature in the future?  Anyway, I will try out the optional
part also to see what type of changes/issues we may face while implementing it.
Comments/feedback welcome.  If there are any other ideas, I am open to hearing them also.
                
> Implement tags and the internals of how a tag should look like
> --------------------------------------------------------------
>
>                 Key: HBASE-8496
>                 URL: https://issues.apache.org/jira/browse/HBASE-8496
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.98.0
>
>
> The intent of this JIRA comes from HBASE-7897.
> This would help us to decide on the structure and format of how the tags should look like.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
