kafka-dev mailing list archives

From Gwen Shapira <g...@confluent.io>
Subject Re: Gauging Interest in adding Encryption to Kafka
Date Mon, 03 Aug 2015 17:22:59 GMT
If I understand you correctly, you are saying that the kerberos keytab that
the broker uses to authenticate with the KMS will be somewhere on the
broker node and can be used by a malicious admin.

I agree this is a valid concern.
I am not opposed to client-side encryption; I am more concerned that the
modifications it will require in the Kafka broker implementation make the
idea impractical. And obviously, as in any security discussion, there are
lots of details regarding key exchange, management and protection that are
critical.

Perhaps given a design doc, we can better evaluate the proposed tradeoffs.

Gwen



On Sat, Aug 1, 2015 at 10:10 AM, Don Bosco Durai <bosco@apache.org> wrote:

> >Any reason you think it's better to let the clients handle it?
> Gwen, I agree with Todd: depending on the goal, the requirements might
> vary. If the goal is that someone who steals the disk should not be
> able to access the data, then encrypting at the broker is enough. However,
> if the requirement is that the admin/operator should not be able to access
> the data, then client side is the only option.
>
> Hadoop/HDFS transparent data encryption has a similar philosophy, where
> the actual encryption/decryption happens at the client side.
>
> >1. Key management
> Hadoop common has a KMS. And there are industry standards like KMIP. If
> Broker does the encrypt/decrypt, then the solution is much easier. If the
> client does it, then sharing the key would be a challenge. It might be
> even necessary to use asymmetric encryption to limit sharing of the keys.
>
> Bosco
>
>
>
>
> On 7/31/15, 9:31 PM, "Jiangjie Qin" <jqin@linkedin.com.INVALID> wrote:
>
> >I agree with Todd, the major concern I have is still the complexity on
> >broker which can kill the performance - which is a key advantage of Kafka. I
> >think there are two separate issues here:
> >1. Key management
> >2. the actual encryption/decryption work.
> >
> >Personally I think it might be OK to have [1] supported in Kafka given we
> >might need to be compatible with different key management systems anyway.
> >But we should just make Kafka compatible with other key management systems
> >instead of letting Kafka itself manage the keys. For [2], I think we
> >should keep it on the client side.
> >
> >Jiangjie (Becket) Qin
> >
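The "make Kafka compatible with other key management systems" idea above can be sketched as a small plug-in interface. This is only an illustration under stated assumptions: `KeyProvider` and `ConfigKeyProvider` are hypothetical names, not part of Kafka, and the config-backed implementation stands in for the "key in a config file" case mentioned elsewhere in the thread. A KMS-, KMIP- or hardware-backed provider would implement the same interface.

```java
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical plug-in point: encryption code only ever talks to this interface,
// never to a specific key management system.
interface KeyProvider {
    // Returns the key registered under keyId, or throws if the id is unknown.
    SecretKey getKey(String keyId);
}

// Simplest possible backend: base64-encoded AES keys handed in as a map,
// e.g. after being read from a config file.
final class ConfigKeyProvider implements KeyProvider {
    private final Map<String, SecretKey> keys = new HashMap<>();

    ConfigKeyProvider(Map<String, String> base64KeysById) {
        for (Map.Entry<String, String> e : base64KeysById.entrySet()) {
            byte[] raw = Base64.getDecoder().decode(e.getValue());
            keys.put(e.getKey(), new SecretKeySpec(raw, "AES"));
        }
    }

    @Override
    public SecretKey getKey(String keyId) {
        SecretKey key = keys.get(keyId);
        if (key == null) {
            throw new IllegalArgumentException("Unknown key id: " + keyId);
        }
        return key;
    }
}
```

Looking keys up by id also leaves room for rotation: old ids can stay resolvable for as long as data encrypted under them is retained.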
> >On Fri, Jul 31, 2015 at 5:06 PM, Todd Palino <tpalino@gmail.com> wrote:
> >
> >> 1 - Yes, authorization combined with encryption does get us most of the
> >> way there. However, depending on the auditor it might not be good enough.
> >> The problem is that if you are encrypting at the broker, then by
> >> definition anyone who has access to the broker (i.e. operations staff)
> >> has access to the data. Consider the case where you are passing salary
> >> and other information through the system, and those people do not need a
> >> view of it. I admit, the 90% solution might be better here than going for
> >> a perfect solution, but it is something to think about.
> >>
> >> 2 - My worry is people wanting to integrate with different key systems.
> >> For example, one person may be fine with providing it in a config file,
> >> while someone else may want to use the solution from vendor A, someone
> >> else wants vendor B, and yet another person wants this obscure
> >> hardware-based solution that exists elsewhere.
> >>
> >> The compaction concern is definitely a good one I hadn't thought of. I'm
> >> wondering if it's reasonable to just say that compaction will not work
> >> properly with encrypted keys if you do not have consistent encryption
> >> (that is, the same string encrypts to the same string every time).
> >>
> >> Ultimately I don't like the idea of the broker doing any encrypt/decrypt
> >> steps OR compression/decompression. This is all CPU overhead that you're
> >> concentrating in one place instead of distributing the load out to the
> >> clients. Now yes, I know that the broker decompresses to check the CRC
> >> and assign offsets and then compresses, and we can potentially avoid the
> >> compression step by assigning the batch an offset and a count instead,
> >> but we still need to consider the CRC. Adding encrypt/decrypt steps adds
> >> even more overhead and it's going to get very difficult to handle even 2
> >> Gbits worth of traffic at that rate.
> >>
> >> There are other situations that concern me, such as revocation of keys,
> >> and I'm not sure whether it is better with client-based or server-based
> >> encryption. For example, if I want to revoke a key with client-based
> >> encryption it becomes similar to how we handle Avro schemas (internally)
> >> now - you change keys, and depending on what your desire is you either
> >> expire out the data for some period of time with the older keys, or you
> >> just let it sit there and your consuming clients won't have an issue.
> >> With broker-based encryption, the broker has to work with the multiple
> >> keys per-topic.
> >>
> >> -Todd
> >>
> >>
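The "consistent encryption" point above (the same string encrypts to the same string every time) can be illustrated with a small sketch. It is an assumption-laden toy, not a vetted scheme: `KeyEncryptionDemo` and its method names are hypothetical, and the deterministic variant simply derives the AES-GCM IV from an HMAC of the plaintext. A standardized deterministic mode such as AES-SIV would be the more careful choice, and any deterministic scheme reveals which record keys are equal.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class KeyEncryptionDemo {
    private static final int IV_LEN = 12;     // bytes, the usual GCM IV size
    private static final int TAG_BITS = 128;  // GCM authentication tag size

    // Randomized: a fresh IV per call, so equal record keys produce different
    // ciphertexts and log compaction would treat them as unrelated keys.
    static byte[] encryptRandomized(SecretKey encKey, byte[] plain)
            throws GeneralSecurityException {
        byte[] iv = new byte[IV_LEN];
        new SecureRandom().nextBytes(iv);
        return encryptWithIv(encKey, iv, plain);
    }

    // Deterministic (illustrative only): the IV is the first 12 bytes of
    // HMAC-SHA256(macKey, plain), so equal record keys produce equal ciphertexts.
    static byte[] encryptDeterministic(SecretKey encKey, SecretKey macKey, byte[] plain)
            throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(macKey);
        byte[] iv = Arrays.copyOf(mac.doFinal(plain), IV_LEN);
        return encryptWithIv(encKey, iv, plain);
    }

    // Output layout: IV followed by ciphertext-plus-tag.
    private static byte[] encryptWithIv(SecretKey encKey, byte[] iv, byte[] plain)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, encKey, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(plain);
        byte[] out = Arrays.copyOf(iv, iv.length + ct.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return out;
    }
}
```

With the deterministic variant, a compacted topic keyed on the encrypted bytes would still collapse repeated writes for the same logical key. With the randomized variant it would not.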
> >> On Fri, Jul 31, 2015 at 2:38 PM, Gwen Shapira <gshapira@cloudera.com>
> >> wrote:
> >>
> >> > Good points :)
> >> >
> >> > 1) Kafka already (pending commit) has an authorization layer, so
> >> > theoretically we are good for SOX, HIPAA, PCI, etc. Transparent broker
> >> > encryption will support PCI
> >> > never-let-unencrypted-card-number-hit-disk.
> >> >
> >> > 2) Agree on Key Management being a complete PITA. It may be better to
> >> > centralize this pain in the broker rather than distributing it to
> >> > clients. Any reason you think it's better to let the clients handle it?
> >> > The way I see it, we'll need to handle key management the way we did
> >> > authorization - give an API for interfacing with existing systems.
> >> >
> >> > More important, we need the broker to be able to decrypt and encrypt
> >> > in order to support compaction (unless we can find a cool
> >> > key-uniqueness-preserving encryption algorithm, but this may not be as
> >> > secure). I think we also need the broker to be able to re-compress
> >> > data, and since we always encrypt compressed bits (compressing
> >> > encrypted bits doesn't compress), we need the broker to decrypt before
> >> > re-compressing.
> >> >
> >> >
> >> >
> >> > On Fri, Jul 31, 2015 at 2:27 PM, Todd Palino <tpalino@gmail.com>
> >> > wrote:
> >> > > It does limit it to clients that have an implementation for
> >> > > encryption, however encryption on the client side is better from an
> >> > > auditing point of view (whether that is SOX, HIPAA, PCI, or something
> >> > > else). Most of those types of standards are based around allowing
> >> > > visibility of data to just the people who need it. That includes the
> >> > > admins of the system (who are often not the people who use the data).
> >> > >
> >> > > Additionally, key management is a royal pain, and there are lots of
> >> > > different types of systems that one may want to use. This is a pretty
> >> > > big complication for the brokers.
> >> > >
> >> > > -Todd
> >> > >
> >> > >
> >> > > On Fri, Jul 31, 2015 at 2:21 PM, Gwen Shapira <gshapira@cloudera.com>
> >> > > wrote:
> >> > >
> >> > >> I've seen interest in HDFS-like "encryption zones" in Kafka.
> >> > >>
> >> > >> This has the advantage of magically encrypting data at rest
> >> > >> regardless of which client is used as a producer.
> >> > >> Adding it on the client side limits the feature to the Java client.
> >> > >>
> >> > >> Gwen
> >> > >>
> >> > >> On Fri, Jul 31, 2015 at 1:20 PM, eugene miretsky
> >> > >> <eugene.miretsky@gmail.com> wrote:
> >> > >> > I think that Hadoop and Cassandra do [1] (Transparent Encryption)
> >> > >> >
> >> > >> > We're doing [2] (on a side note, for [2] you still need
> >> > >> > authentication on the producer side - you don't want an unauthorized
> >> > >> > user writing garbage). Right now we have the 'user' doing the
> >> > >> > encryption and submitting raw bytes to the producer. I was suggesting
> >> > >> > implementing an encryptor in the producer itself - I think it's
> >> > >> > cleaner and can be reused by other users (instead of having to do
> >> > >> > their own encryption)
> >> > >> >
> >> > >> > Cheers,
> >> > >> > Eugene
> >> > >> >
> >> > >> > On Fri, Jul 31, 2015 at 4:04 PM, Jiangjie Qin
> >> > >> > <jqin@linkedin.com.invalid> wrote:
> >> > >> >
> >> > >> >> I think the goal here is to make the actual message stored on the
> >> > >> >> broker encrypted, because after we have SSL, the transmission
> >> > >> >> would be encrypted.
> >> > >> >>
> >> > >> >> In general there might be two approaches:
> >> > >> >> 1. Broker does the encryption/decryption
> >> > >> >> 2. Client does the encryption/decryption
> >> > >> >>
> >> > >> >> From a performance point of view, I would prefer [2]. It is just
> >> > >> >> that in that case, maybe the user does not necessarily need to use
> >> > >> >> SSL anymore because the data would be encrypted anyway.
> >> > >> >>
> >> > >> >> If we let the client do the encryption, there are also two ways to
> >> > >> >> do so - either we let the producer take an encryptor, or users can
> >> > >> >> do serialization/encryption outside the producer and send raw
> >> > >> >> bytes. The only difference between the two might be flexibility.
> >> > >> >> For example, if someone wants to know the actual bytes of a message
> >> > >> >> that got sent over the wire, doing it outside the producer would
> >> > >> >> probably be preferable.
> >> > >> >>
> >> > >> >> Jiangjie (Becket) Qin
> >> > >> >>
> >> > >> >> On Thu, Jul 30, 2015 at 12:16 PM, eugene miretsky
> >> > >> >> <eugene.miretsky@gmail.com> wrote:
> >> > >> >>
> >> > >> >> > Hi,
> >> > >> >> >
> >> > >> >> > Based on the security wiki page
> >> > >> >> > <https://cwiki.apache.org/confluence/display/KAFKA/Security>
> >> > >> >> > encryption of data at rest is out of scope for the time being.
> >> > >> >> > However, we are implementing encryption in Kafka and would like
> >> > >> >> > to see if there is interest in submitting a patch for it.
> >> > >> >> >
> >> > >> >> > I suppose that one way to implement encryption would be to add
> >> > >> >> > an 'encrypted key' field to the Message/MessageSet structures in
> >> > >> >> > the wire protocol - however, this is a very big and fundamental
> >> > >> >> > change.
> >> > >> >> >
> >> > >> >> > A simpler way to add encryption support would be:
> >> > >> >> > 1) Custom Serializer, but it wouldn't be compatible with other
> >> > >> >> > custom serializers (Avro, etc.)
> >> > >> >> > 2) Add a step in KafkaProducer after serialization to encrypt the
> >> > >> >> > data before it is submitted to the accumulator (encryption is
> >> > >> >> > done in the submitting thread, not in the producer io thread)
> >> > >> >> >
> >> > >> >> > Is there interest in adding #2 to Kafka?
> >> > >> >> >
> >> > >> >> > Cheers,
> >> > >> >> > Eugene
> >> > >> >> >
> >> > >> >>
> >> > >>
> >> >
> >>
>
>
>
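For reference, the "encrypt after serialization" step proposed in the original message can be sketched outside of Kafka as a wrapper around any existing serializer. Everything here is an assumption made for illustration: `EncryptingSerializer` is a hypothetical class, a plain `Function<T, byte[]>` stands in for Kafka's `Serializer` interface so the sketch has no Kafka dependency, and AES-GCM with a random per-message IV is just one reasonable cipher choice. Key distribution, the hard part discussed throughout the thread, is left out.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.function.Function;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical wrapper: serialize with any existing serializer (Avro, String, ...),
// then encrypt the resulting bytes on the calling thread.
final class EncryptingSerializer<T> {
    private static final int IV_LEN = 12;     // bytes
    private static final int TAG_BITS = 128;  // GCM authentication tag size

    private final Function<T, byte[]> inner;  // stand-in for a Kafka Serializer
    private final SecretKey key;
    private final SecureRandom random = new SecureRandom();

    EncryptingSerializer(Function<T, byte[]> inner, SecretKey key) {
        this.inner = inner;
        this.key = key;
    }

    // Producer side. Output layout: IV followed by ciphertext-plus-tag.
    byte[] serialize(T value) throws GeneralSecurityException {
        byte[] plain = inner.apply(value);
        byte[] iv = new byte[IV_LEN];
        random.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(plain);
        byte[] out = Arrays.copyOf(iv, IV_LEN + ct.length);
        System.arraycopy(ct, 0, out, IV_LEN, ct.length);
        return out;
    }

    // Consumer side: reads the IV prefix, then decrypts and verifies the tag.
    static byte[] decrypt(SecretKey key, byte[] message) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, message, 0, IV_LEN));
        return cipher.doFinal(message, IV_LEN, message.length - IV_LEN);
    }
}
```

Because the wrapper delegates to an inner serializer, it sidesteps the incompatibility with other custom serializers raised for option 1, while the broker only ever sees opaque bytes. Each message grows by 28 bytes (IV plus tag) under these choices.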
