hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: HBase type support
Date Tue, 19 Mar 2013 00:42:09 GMT
Thanks for the clarification Doug. 

Back to my point, I was saying that MD5 and SHA-1 are already part of the Java package so
if you're running Java 1.6_xx or Java 1.7_xx, you will have MD5 available.  So it could be
a good thing. 
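
A minimal sketch of that point, assuming only the stock JDK: java.security.MessageDigest provides both "MD5" and "SHA-1" with no extra jars. The class name RowKeyHash and the helper hashedRowKey (which prepends a few digest bytes to a rowkey to spread keys across regions) are hypothetical names made up for illustration; they are not part of HBase or of HBASE-7221. StandardCharsets requires Java 7; on 1.6 use getBytes("UTF-8") instead.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class RowKeyHash {

    // Digest `key` with a JDK-provided algorithm name such as "MD5" or "SHA-1".
    public static byte[] digest(String algorithm, byte[] key) {
        try {
            return MessageDigest.getInstance(algorithm).digest(key);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(algorithm + " is not available in this JVM", e);
        }
    }

    // Hypothetical helper: prepend the first `prefixLen` bytes of the key's MD5
    // digest to the original key, so that sequential keys no longer sort together.
    public static byte[] hashedRowKey(byte[] key, int prefixLen) {
        byte[] md5 = digest("MD5", key);
        byte[] out = new byte[prefixLen + key.length];
        System.arraycopy(md5, 0, out, 0, prefixLen);
        System.arraycopy(key, 0, out, prefixLen, key.length);
        return out;
    }

    // Render bytes as lowercase hex for display.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] key = "row-00042".getBytes(StandardCharsets.UTF_8);
        System.out.println("MD5:   " + toHex(digest("MD5", key)));
        System.out.println("SHA-1: " + toHex(digest("SHA-1", key)));
        System.out.println("key:   " + toHex(hashedRowKey(key, 4)));
    }
}
```

MD5 yields 16 bytes and SHA-1 yields 20, so for rowkey distribution a short prefix of either is usually enough; the original key is kept after the prefix so the row can still be read back by recomputing the hash.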


Murmur is released under MIT... Is there going to be a licensing issue? (Thinking back to
the delay in getting Snappy.) Note: I don't know, which is why I am asking, so I don't want
to be accused of FUD. 
:-P

On Mar 18, 2013, at 2:16 PM, Doug Meil <doug.meil@explorysmedical.com> wrote:

> 
> Sorry I'm late to this thread but I was the guy behind HBASE-7221 and the
> algorithms specifically mentioned were MD5 and Murmur (not SHA-1).  An
> implementation of Murmur already exists in HBase, and the MD5
> implementation was the one that ships with Java.
> 
> The intent was to include hashing appropriate for use with key
> distribution of rowkeys in tables as is often suggested on the dist-lists.
> SHA-1 is probably overkill for the rowkey case, but I wouldn't want to
> stop anybody from using SHA-1 if it was appropriate for their needs.
> 
> 
> 
> 
> 
> On 3/18/13 8:02 AM, "Michel Segel" <michael_segel@hotmail.com> wrote:
> 
>> Andrew, 
>> 
>> I was aware of your employer, and I am pretty sure that they have
>> already dealt with the issue of exporting encryption software and
>> probably hardware too.
>> 
>> Neither of us are lawyers, and from what I do know of dealing with
>> government bureaucracies, it's not always as simple as just filing the
>> correct paperwork. (Sometimes it is, sometimes not so much, YMMV...)
>> 
>> Putting the hooks for encryption is probably a good idea. Shipping the
>> encryption w the release or making it part of the official release, not
>> so much. Sorry, I'm being a bit conservative here.
>> 
>> IMHO I think fixing other issues would be of a higher priority, but
>> that's just me;-)
>> 
>> Sent from a remote device. Please excuse any typos...
>> 
>> Mike Segel
>> 
>> On Mar 17, 2013, at 12:12 PM, Andrew Purtell <apurtell@apache.org> wrote:
>> 
>>>> This then leads to another question... suppose Apache does add
>>>> encryption to Hadoop. While the Apache organization does have the proper
>>>> paperwork in place, what then happens to Cloudera, Hortonworks, EMC,
>>>> IBM, Intel, etc?
>>> 
>>> Well I can't put that question aside since you've brought it up now
>>> twice and encryption feature candidates for Apache Hadoop and Apache
>>> HBase are something I have been working on. It's a valid question, but
>>> since as you admit you don't know what you are talking about, perhaps
>>> stating uninformed opinions can be avoided. Only the latter is what I
>>> object to. I think the short answer is as an Apache contributor I'm
>>> concerned about the Apache product. Downstream repackagers can take
>>> whatever action needed, including changes, since it is open source, or
>>> feedback about it representing a hardship. At this point I have heard
>>> nothing like that. I work for Intel and can say we are good with it.
>>> 
>>> On Sunday, March 17, 2013, Michael Segel wrote:
>>> 
>>>> It's not a question of FUD, but that certain types of
>>>> encryption/decryption code fall under the munitions act.
>>>> See: http://www.fas.org/irp/offdocs/eo_crypt_9611_memo.htm
>>>> 
>>>> Having said that, there is this:
>>>> http://www.bis.doc.gov/encryption/encfaqs6_17_02.html
>>>> 
>>>> In short, I don't as a habit export/import encryption technology so I
>>>> am not up to speed on the current state of the laws.
>>>> Which is why I have to question the current state of the US encryption
>>>> laws.
>>>> 
>>>> This then leads to another question... suppose Apache does add
>>>> encryption to Hadoop. While the Apache organization does have the
>>>> proper paperwork in place, what then happens to Cloudera, Hortonworks,
>>>> EMC, IBM, Intel, etc?
>>>> 
>>>> But let's put that question aside.
>>>> 
>>>> The point I was trying to make was that the core Sun JVM does support
>>>> MD5 and SHA-1 out of the box, so that anyone running Hadoop and using
>>>> the 1.6_xx or the 1.7_xx versions of the JVM will have these packages.
>>>> 
>>>> Adding hooks that use these classes is a no-brainer.  However, beyond
>>>> this... you tell me.
>>>> 
>>>> -Mike
>>>> 
>>>> On Mar 16, 2013, at 7:59 AM, Andrew Purtell <apurtell@apache.org>
>>>> wrote:
>>>> 
>>>>> The ASF avails itself of an exception to crypto export which only
>>>>> requires a bit of PMC housekeeping at release time. So "is not [ok]"
>>>>> is FUD. I humbly request we refrain from FUD here. See
>>>>> http://www.apache.org/dev/crypto.html. To the best of our knowledge we
>>>>> expect this to continue, though the ASF has not updated this policy
>>>>> yet for recent regulation updates.
>>>>> 
>>>>> On Saturday, March 16, 2013, Michel Segel wrote:
>>>>> 
>>>>>> I also want to add that you could add MD5 and SHA-1, but I'd check
>>>>>> on US laws... I think these are ok, however other
>>>>>> encryption/decryption code is not.
>>>>>> 
>>>>>> They are part of the std sun java libraries ...
>>>>>> 
>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>> 
>>>>>> Mike Segel
>>>>>> 
>>>>>> On Mar 16, 2013, at 7:18 AM, Michel Segel <michael_segel@hotmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Isn't that what you get through add on frameworks like TSDB and
>>>>>>> Kiji? Maybe not on the client side, but frameworks that extend
>>>>>>> HBase...
>>>>>>> 
>>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>>> 
>>>>>>> Mike Segel
>>>>>>> 
>>>>>>> On Mar 16, 2013, at 12:45 AM, lars hofhansl <larsh@apache.org>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> I think generally we should keep HBase a byte[] based key value
>>>>>>>> store.
>>>>>>>> What we should add to HBase are tools that would allow client side
>>>>>>>> apps (or libraries) to build functionality on top of plain HBase.
>>>>>>>> 
>>>>>>>> Serialization that maintains a correct semantic sort order is
>>>>>>>> important as a building block, so is code that can build up
>>>>>>>> correctly serialized and sortable compound keys, as well as hashing
>>>>>>>> algorithms.
>>>>>>>> 
>>>>>>>> Where I would draw the line is adding types to HBase itself. As long
>>>>>>>> as one can write a client, or Filters, or Coprocessors with the tools
>>>>>>>> provided by HBase we're good. Higher level functionality can then be
>>>>>>>> built on top of HBase.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> For example, maybe we need to add better access API to the HBase WAL
>>>>>>>> in order to have an external library implement idempotent
>>>>>>>> transactions (which can be used to implement 2ndary indexes).
>>>>>>>> Maybe some other primitives have to be exposed in order to allow an
>>>>>>>> external library to implement full transactions.
>>>>>>>> Or we might need a statistics framework (such as the one that Jesse
>>>>>>>> is working on).
>>>>>>>> 
>>>>>>>> These are all building blocks that do not presume specific access
>>>>>>>> patterns or clients, but can be used to implement them.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> As usual, just my $0.02.
>>>>>>>> 
>>>>>>>> -- Lars
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> ________________________________
>>>>>>>> From: Nick Dimiduk <ndimiduk@gmail.com>
>>>>>>>> To: user@hbase.apache.org
>>>>>>>> Sent: Friday, March 15, 2013 10:57 AM
>>>>>>>> Subject: Re: HBase type support
>>>>>>>> 
>>>>>>>> I'm talking about MD5, SHA1, etc. It's something explicitly
>>>>>>>> mentioned
>>>>>>>> in HBASE-7221.
>>>>>>>> 
>>>>>>>> On Fri, Mar 15, 2013 at 10:55 AM, James Taylor
>>>>>>>> <jtaylor@salesforce.com> wrote:
>>>>>>>> 
>>>>>>>>> Hi Nick,
>>>>>>>>> What do you mean by "hashing algorithms"?
>>>>>>>>> Thanks,
>>>>>>>>> James
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 03/15/2013 10:11 AM, Nick Dimiduk wrote:
>>>>>>>>> 
>>>>>>>>>> Hi David,
>>>>>>>>>> 
>>>>>>>>>> Native support for a handful of hashing algorithms has also been
>>> 
>>> 
>>> 
>>> -- 
>>> Best regards,
>>> 
>>>  - Andy
>>> 
>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>> (via Tom White)
>> 
> 
> 
> 
> 