drill-issues mailing list archives

From "Sorabh Hamirwasia (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-5501) Improve the negotiation of max_wrapped_size for encryption.
Date Wed, 10 May 2017 20:37:04 GMT
Sorabh Hamirwasia created DRILL-5501:

             Summary: Improve the negotiation of max_wrapped_size for encryption.
                 Key: DRILL-5501
                 URL: https://issues.apache.org/jira/browse/DRILL-5501
             Project: Apache Drill
          Issue Type: Improvement
            Reporter: Sorabh Hamirwasia
            Assignee: Sorabh Hamirwasia
             Fix For: Future

With 1.11, Drill will support encryption using the SASL framework. As part of encryption
negotiation, SASL exposes a number of parameters such as QOP, strength, maxbuffer and rawsendsize.
Details on these parameters can be found [here|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/javax/security/sasl/Sasl.java#Sasl].
This JIRA specifically concerns the _maxbuffer_ and _rawsendsize_ parameters.

*rawsendsize* is the maximum plain-text size that an application should pass to the wrap function
of a mechanism so that the encoded buffer it produces does not exceed *maxbuffer*. The application
retrieves it after _maxbuffer_ has been negotiated.
*maxbuffer* is the maximum size of the (encoded) buffer that the client/server side
agrees to receive. It is configurable in Drill using the
*encryption.sasl.max_wrapped_size* configuration for client and bit-to-bit connections. This
parameter is global for all configured mechanisms. As an optimization, each connection's
SaslDecryptionHandler also uses this configuration to create a pre-allocated
buffer of that size. Since no encrypted chunk will exceed the configured value,
the same buffer can be re-used to copy each encrypted chunk from the wire and decrypt
it, instead of creating a new buffer each time a message is received. Since GSSAPI (or
Kerberos) is currently the only mechanism that Drill supports with encryption, having
this global parameter is fine. But if more mechanisms are supported in the future, it can
become an issue when a mechanism doesn't support negotiation of this parameter and instead
defines it internally as a fixed value.
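
As a rough illustration of how _maxbuffer_ reaches the Java SASL layer, the sketch below builds the property map that an application would pass when creating a saslClient/saslServer. The class and method names (SaslPropsSketch, forEncryption) are hypothetical and not Drill code; only the Sasl.QOP and Sasl.MAX_BUFFER constants come from javax.security.sasl.Sasl.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

// Hypothetical helper for illustration only; not part of Drill.
final class SaslPropsSketch {

  // Builds the property map an application would hand to
  // Sasl.createSaslClient(...) / Sasl.createSaslServer(...).
  // maxWrappedSize stands in for encryption.sasl.max_wrapped_size.
  static Map<String, String> forEncryption(int maxWrappedSize) {
    Map<String, String> props = new HashMap<>();
    // "auth-conf" requests authentication plus confidentiality (encryption).
    props.put(Sasl.QOP, "auth-conf");
    // Largest encoded buffer this side agrees to receive.
    props.put(Sasl.MAX_BUFFER, Integer.toString(maxWrappedSize));
    return props;
  }

  public static void main(String[] args) {
    // Request a 1MB receive buffer.
    System.out.println(forEncryption(1024 * 1024));
  }
}
```

_rawsendsize_ is not set here: it is only meaningful once negotiation has completed, at which point it can be read with getNegotiatedProperty(Sasl.RAW_SEND_SIZE).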

As per [SASL RFC|https://tools.ietf.org/html/rfc4422#section-3.7]:
_The maximum size that each side expects is fixed by the mechanism, either through negotiation
or by its specification_

This means the parameter can either be negotiated or be fixed by the mechanism. For example,
suppose the parameter is configured to 1MB and two mechanisms are
configured: {kerberos, custom}. The custom mechanism defines a fixed value of 64K for this
parameter, whereas kerberos can negotiate the 1MB size (since the max allowed by GSSAPI is 16MB).
Each connection will then have a pre-defined buffer of 1MB allocated in its SaslDecryptionHandler.
For a connection using the custom mechanism this wastes memory, since the maximum encoded
buffer it will ever receive is 64K. To resolve this issue, the following solution is proposed:

1) Use the Drill configuration _max_wrapped_size_ as the global value of the _maxbuffer_ parameter
for all mechanisms that support negotiation. For mechanisms that have their own pre-defined
value of _maxbuffer_, the configured value will be ignored.
2) In Drill we implement a factory (like KerberosFactory / PlainFactory) for each supported
mechanism. Each factory will be aware of the behavior of its underlying mechanism
and will use the configured value accordingly, i.e. either bounds-checking it or ignoring it
entirely. For example:
* The Kerberos factory will know that it supports negotiation of _maxbuffer_ up to a max value of
16MB. So it can take the Drill configured value and perform the bounds check before setting
it in the SASL layer (i.e. when saslClient/saslServer are created for negotiation).
* The custom factory will ignore this configuration value, since its underlying mechanism has
a fixed value of _maxbuffer_, and will use that instead.

3) Once the SASL layer is created, negotiation for the connection will happen based on the
chosen mechanism. After negotiation completes, Drill can retrieve the value of *maxbuffer*
and the corresponding *rawsendsize* using saslClient/saslServer.getNegotiatedProperty() and set
them in the EncryptionContext instance of that connection.
4) I didn't find that the value of the *maxbuffer* parameter is updated internally by the
mechanism implementation based on negotiation (I looked at GSSAPI). So it looks to me like the
mechanism expects the application to pass a correct value within bounds. Hence the configured
value needs to be bounds-checked in the corresponding factory (as mentioned in step 2), so that
when the parameter value is retrieved after negotiation, the connection gets the correct value
in its EncryptionContext.
5) Later, when security handlers are added for each connection, its SaslDecryptionHandler
will use the buffer size in the EncryptionContext (which was updated after negotiation) to allocate
the buffer.
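
The per-mechanism factory behavior from step 2 could look roughly like the sketch below. All type and method names here (MechanismFactorySketch, effectiveMaxBuffer, and the two implementations) are hypothetical and are not Drill's existing factory API. The 16MB and 64K limits are the figures quoted in this issue, not values read from any mechanism implementation.

```java
// Hypothetical interface for illustration; not Drill's actual factory API.
interface MechanismFactorySketch {
  // Returns the maxbuffer value to hand to the SASL layer for this mechanism,
  // given the Drill-configured max_wrapped_size.
  int effectiveMaxBuffer(int configuredMaxWrappedSize);
}

// Mechanism that supports negotiation: validate the configured value.
final class KerberosFactorySketch implements MechanismFactorySketch {
  // Upper bound for GSSAPI as stated in this issue (assumption for the sketch).
  static final int MAX_NEGOTIABLE = 16 * 1024 * 1024;

  @Override
  public int effectiveMaxBuffer(int configuredMaxWrappedSize) {
    if (configuredMaxWrappedSize <= 0 || configuredMaxWrappedSize > MAX_NEGOTIABLE) {
      throw new IllegalArgumentException(
          "max_wrapped_size must be in (0, " + MAX_NEGOTIABLE + "], got "
              + configuredMaxWrappedSize);
    }
    return configuredMaxWrappedSize;
  }
}

// Mechanism with a fixed maxbuffer: the configured value is ignored.
final class CustomFactorySketch implements MechanismFactorySketch {
  // Fixed size from the example in this issue (assumption for the sketch).
  static final int FIXED_MAX_BUFFER = 64 * 1024;

  @Override
  public int effectiveMaxBuffer(int configuredMaxWrappedSize) {
    return FIXED_MAX_BUFFER;
  }
}
```

Rejecting an out-of-range value is one possible policy; a factory could instead clamp the value to the bound and log a warning.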

The above solution resolves the issue seen in the example discussed earlier. The kerberos
mechanism will now negotiate a 1MB max encoded buffer size, since that is within its max bound of
16MB, whereas the custom mechanism will ignore the configured value and use the fixed size of 64K
defined by the mechanism. Later, when SASL negotiation is completed, the connection using Kerberos
will set EncryptionContext.maxWrappedSize to 1MB and the connection using the custom mechanism
will set its EncryptionContext.maxWrappedSize to 64K. Inside the SaslDecryptionHandler of each
connection, a buffer of the corresponding size will be allocated, so no memory will be
wasted.
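
A minimal sketch of this post-negotiation step follows, under stated assumptions. EncryptionContextSketch is a stand-in for Drill's EncryptionContext, and NegotiatedSizes is a hypothetical helper. The lookup is passed as a Function so that saslClient::getNegotiatedProperty or saslServer::getNegotiatedProperty can be supplied, and the values are parsed defensively because getNegotiatedProperty returns Object.

```java
import java.util.function.Function;
import javax.security.sasl.Sasl;

// Stand-in for the per-connection EncryptionContext; not Drill's class.
final class EncryptionContextSketch {
  int maxWrappedSize;  // negotiated maxbuffer (encoded size)
  int wrapSizeLimit;   // negotiated rawsendsize (plain-text size)

  EncryptionContextSketch(int maxWrappedSize, int wrapSizeLimit) {
    this.maxWrappedSize = maxWrappedSize;
    this.wrapSizeLimit = wrapSizeLimit;
  }
}

// Hypothetical helper that copies negotiated values into the context.
final class NegotiatedSizes {

  // Converts a negotiated property to int; keeps the fallback if absent.
  static int asInt(Object value, int fallback) {
    if (value == null) {
      return fallback;
    }
    if (value instanceof Number) {
      return ((Number) value).intValue();
    }
    return Integer.parseInt(value.toString().trim());
  }

  // 'negotiated' is expected to be saslClient::getNegotiatedProperty
  // (or the saslServer equivalent) once negotiation is complete.
  static void apply(Function<String, Object> negotiated, EncryptionContextSketch ctx) {
    ctx.maxWrappedSize = asInt(negotiated.apply(Sasl.MAX_BUFFER), ctx.maxWrappedSize);
    ctx.wrapSizeLimit = asInt(negotiated.apply(Sasl.RAW_SEND_SIZE), ctx.wrapSizeLimit);
  }

  // What a decryption handler would do when it is added to the connection:
  // allocate its re-usable buffer from the negotiated size.
  static byte[] allocateDecryptBuffer(EncryptionContextSketch ctx) {
    return new byte[ctx.maxWrappedSize];
  }
}
```

With this shape, a connection whose mechanism reports 64K allocates a 64K buffer even if the global configuration says 1MB.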

With the above approach we achieve the following:
* A global config value for mechanisms that are capable of negotiation.
* A custom mechanism that doesn't support negotiating the maxbuffer value can still use the
optimization without wasting memory in SaslDecryptionHandler.
* Connections using different mechanisms will have different memory footprints
for the re-usable buffer.
** For this we can have a counter for the total memory in use by SaslDecryptionHandler re-usable
buffers for each connection type (user/control/data) across all such connections.
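
One possible shape for such a counter is sketched below. The names ConnectionTypeSketch and DecryptionBufferStats are hypothetical, and how the counter would be wired into Drill's handlers or metrics is left open.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Connection types named in this issue; the enum itself is hypothetical.
enum ConnectionTypeSketch { USER, CONTROL, DATA }

// Tracks bytes held by re-usable decryption buffers, per connection type.
final class DecryptionBufferStats {
  private final Map<ConnectionTypeSketch, AtomicLong> bytesInUse =
      new EnumMap<>(ConnectionTypeSketch.class);

  DecryptionBufferStats() {
    // Populate every key up front so the map is never modified afterwards;
    // only the AtomicLong values change, which is safe across threads.
    for (ConnectionTypeSketch type : ConnectionTypeSketch.values()) {
      bytesInUse.put(type, new AtomicLong());
    }
  }

  // Call when a handler allocates its buffer.
  void onAllocate(ConnectionTypeSketch type, int bufferSize) {
    bytesInUse.get(type).addAndGet(bufferSize);
  }

  // Call when the connection closes and the buffer is released.
  void onRelease(ConnectionTypeSketch type, int bufferSize) {
    bytesInUse.get(type).addAndGet(-bufferSize);
  }

  long inUse(ConnectionTypeSketch type) {
    return bytesInUse.get(type).get();
  }
}
```

For example, one user connection with a 1MB buffer plus one with a 64K buffer would report 1MB + 64K for the user connection type until either connection releases its buffer.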

This message was sent by Atlassian JIRA
