kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthias J. Sax" <matth...@confluent.io>
Subject Re: [DISCUSS] KIP-276: Add StreamsConfig prefix for different consumers
Date Thu, 12 Apr 2018 00:35:31 GMT
Thanks for the explanation.

I am not sure if I follow your objection about inheritance. Basically,
every config that is not overwritten inherits the default setting -- so
there is actually already a hierarchy in place from my point of view.

With regard to security config, I guess the issue is, that it's not one
parameter, but multiple and if you only overwrite some but not others
you end up with a bad config -- for this case, my personal
recommendation would be, to not use the base config but put a specific
config for each client to avoid confusion. However, most configs are
"single parameter" configs and I don't see any issue for those with
regard to inheritance.

About your proposal, you basically say the following: if for any client,
at least one client specific config is set (eg, restore.consumer.X),
even if base configs aare specified (consumer.A and consumer.B), the
restore consumer should only set restore.consumer.X and use the default
for config A and B and ignore consumer.A and consumer.B setting? While I
understand why your argue for this, it does not seem to be intuitive.
It's basically a conditional inheritance and maybe even more complex
that what the KIP proposes?

Having said this, if we follow your proposal, it might even be better to
do a more radical change and drop inheritance completely. Ie, config `X`
would be ignored (if it's not a Streams Config), and users need to
prefix it for all cases with either `producer.`, `main.consumer.`,
`restore.consumer.`, or `global.consumer`.

We could introduce this, however, it breaks the current pattern, because
inheritance is already in place. For configs that are common for
producer and consumer (for example `request.timeout.ms`) today users can
specify the following:

request.timeout.ms=X
consumer.request.timeout.ms=Y
producer.retries=Z

With this, producer set both `request.timeout` as well as `retries`.

If we adapt your suggestion we make a backward incompatible change.
Thus, only if we feel strong about your argument (it has a valid concern
about configs with multiple parameters) we should change the behavior
from my point of view. The KIP only introduce one more level in the
hierarchy.


The last point:

>> Why? what are those use cases? What case isn't handled by the configs we
>> have today? It seems weird that different consumers within the same app
>> would need different client-side settings.

For example, it restore consumer is only used when we re-create a store
-- for this case, high throughput is important while latency is not.
Thus, one might want to set different config compared to the main
consumer. Similar to the global consumer: the usual pattern is, that the
global consumer bootstraps some stores at startup and is mainly idle
during regular processing -- thus, poll() interval could be larger to
reduce sending unnecessary requests to the broker.

Hope this makes sense.


-Matthias




On 4/11/18 3:19 PM, Ewen Cheslack-Postava wrote:
> On Thu, Apr 5, 2018 at 3:28 PM, Matthias J. Sax <matthias@confluent.io>
> wrote:
> 
>> Ewen,
>>
>> I cannot completely follow your argument. Can you elaborate a little
>> bit? After reading you mail, I am not sure if you prefer config
>> inheritance or not? And if, to what extend?
>>
> 
> So we've had issues with config complexity in a couple of places. On the
> one hand I think having the ability to specify configs separately for
> different clients is important. In that regard namespacing the different
> configs is important. This is particularly important when you can have
> conflicting configs. This comes up frequently with interceptors because the
> producer and consumer use the same config name, but require classes that
> implement different interfaces. If you have both a producer and consumer,
> you need to have this namespacing.
> 
> On the other hand, in the most common case where the user wants the same
> configs for everything (i.e. just pass the bootstrap brokers, or share
> common authentication configs), the namespacing becomes a pain. We have
> this issue in connect for worker, producer and consumer settings, where if
> you want to connect to a secured cluster you duplicate the info.
> 
> So because of that I like the *idea* of inheritance, but all the proposals
> I've seen so far basically just say "use base configuration and add
> overrides". See notes below about why I don't think this is a good idea.
> 
> 
>>
>>
>>> In that mode if you override *anything*, you
>>>> should specify *everything*.
>>
>> Can you give an example for this?
>>
> 
> Security is the most obvious example of this. If you're overriding any
> security config, you clearly have a different security setup than the
> parent and inheriting stuff is just confusing and requires digging through
> logs to see the "effective" config if something isn't working.
> 
> One way to keep this simpler to understand is that you can use the same
> config from a base config entirely or, if you specify a single config with
> the special prefix then you *only* use configs. There might be cases where
> you duplicate the same settings, but the all or nothing approach makes it
> easy to understand what is happening and you still only pay the cost of
> larger configs if you need the advanced functionality of customizing
> different clients.
> 
> The unfortunate side effect of this is that some things, e.g. bootstrap
> brokers, probably don't make sense to customize under the different types
> of clients but you'd still need to duplicate those settings. This is why I
> was saying I haven't really seen a clean solution to this.
> 
> 
> 
>>
>>
>>> But if it was just, e.g. group.id, client.id,
>>>> and offset.reset that needed adjustment for a restoreConsumer, that
>> would
>>>> be the default and everything else is inherited.
>>
>> Don't understand this part.
>>
>>
>>> I think
>>>> inheritance in these configuration setups needs to be approached very
>>>> carefully.
>>
>> Agreed. I think that our approach is carefully designed though.
>>
>>
>> For follow up on Guozhang's argument, I agree with him, that for most
>> users the existing prefix would be good enough and the three new
>> prefixes are for advanced users.
>>
> 
> This is maybe where the biggest gap in the proposal is. It just says:
> 
>> For some use cases, it's required to set different configs for different
> consumers.
> 
> Why? what are those use cases? What case isn't handled by the configs we
> have today? It seems weird that different consumers within the same app
> would need different client-side settings.
> 
> -Ewen
> 
> 
>>
>>
>>
>> -Matthias
>>
>> On 4/5/18 11:37 AM, Guozhang Wang wrote:
>>> Ewen, thanks for your thoughts.
>>>
>>> Just to clarify the current KIP does not propose for four new prefixes,
>> but
>>> three plus the existing one. So it is
>>>
>>> "consumer"
>>> "main.consumer"
>>> "restore.consumer"
>>> "global.consumer"
>>>
>>>
>>> If we design the configs for the first time, I'd be in favor of only
>>> keeping the last three prefixes. But as of now we have a compatibility
>>> issue to consider. So the question is if it is worthwhile to break the
>>> compatibility and enforce users to make code changes. My rationale is
>> that:
>>>
>>> 1) for normal users they would not bother overriding configs for
>> different
>>> types of consumers, where "consumer" prefix is good enough for them; and
>>> today they probably have already made those overrides via "consumer".
>>>
>>> 2) for advanced users they would need some additional overrides for
>>> different types of consumers, and they would go ahead and learn about the
>>> other three prefixes and set them there.
>>>
>>>
>>> I agree that four prefixes would be more confusing, but if we think use
>>> case 1)'s popularity is much larger than use case 2), which by the way we
>>> can still debate on, then I'd argue it's better to not force normal user
>>> groups from 1) to make code changes to make advanced users from 2) less
>>> confused about the hierarchy.
>>>
>>>
>>> Guozhang
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Apr 4, 2018 at 11:23 PM, Ewen Cheslack-Postava <
>> ewen@confluent.io>
>>> wrote:
>>>
>>>> I think this model is more confusing than it needs to be.
>>>>
>>>> We end up with 4 prefixes despite only have 3 types of consumers. We
>> have
>>>> prefixes for: "base", "main", "global", and "restore". However, we only
>>>> instantiate consumers of type "main", "global", and "restore".
>>>>
>>>> Until now, we've only had two types of configs mapping to two types of
>>>> consumers, despite internally having some common shared configs as a
>>>> baseline to bootstrap the two "public" ones (see
>>>> StreamsConfig.getCommonConsumerConfigs). Do we want to complicate this
>> to
>>>> 4
>>>> levels of "public" configs when there are only 3 types of concrete
>> configs
>>>> we instantiate?
>>>>
>>>> More generally, I worry that we're optimizing too much to avoid
>> copy/paste
>>>> in less common cases to the point that we would confuse users with yet
>> more
>>>> concepts before they can even write their configuration. What if we
>> took an
>>>> (perhaps modified) all or nothing approach to inheriting from the the
>>>> "main" consumer properties? In that mode if you override *anything*, you
>>>> should specify *everything*. But if it was just, e.g. group.id,
>> client.id,
>>>> and offset.reset that needed adjustment for a restoreConsumer, that
>> would
>>>> be the default and everything else is inherited. Same deal for a clearly
>>>> specified set of configs for global consumers that required override.
>>>>
>>>> I feel like I'm also concurrently seeing the opposite side of this
>> problem
>>>> in Connect where we namespaced and didn't proactively implement
>>>> inheritance; and while people find the config duplication annoying (and
>>>> confusing!), we inevitably find cases where they need it. I think
>>>> inheritance in these configuration setups needs to be approached very
>>>> carefully. Admittedly, some of the challenges in Connect don't appear
>> here
>>>> (e.g. conflicts in producer/consumer config naming, since this is a
>>>> Consumer-only KIP), but similar problems arise.
>>>>
>>>> -Ewen
>>>>
>>>> On Wed, Apr 4, 2018 at 10:56 PM, Boyang Chen <bchen11@outlook.com>
>> wrote:
>>>>
>>>>> Thanks Guozhang! I already updated the pull request and KIP to
>> deprecate
>>>>> getConsumerConfigs() function. Do you think we could move to a voting
>>>> stage
>>>>> now?
>>>>>
>>>>>
>>>>> ________________________________
>>>>> From: Guozhang Wang <wangguoz@gmail.com>
>>>>> Sent: Thursday, April 5, 2018 9:52 AM
>>>>> To: dev@kafka.apache.org
>>>>> Subject: Re: [DISCUSS] KIP-276: Add StreamsConfig prefix for different
>>>>> consumers
>>>>>
>>>>> I agree that renaming the method in this case may not worth it. Let's
>>>> keep
>>>>> the existing function names.
>>>>>
>>>>> On Wed, Apr 4, 2018 at 6:06 PM, Matthias J. Sax <matthias@confluent.io
>>>
>>>>> wrote:
>>>>>
>>>>>> Thanks for updating the KIP.
>>>>>>
>>>>>> One more comment. Even if we don't expect users to call
>>>>>> `StreamsConfig#getConsumerConfigs()` it is still public API. Thus,
we
>>>>>> cannot just rename the method.
>>>>>>
>>>>>> I think we have two options: either we keep the current name or we
>>>>>> deprecate the method and introduce `getMainConsumerConfigs()` in
>>>>>> parallel. The deprecated method would just call the new method.
>>>>>>
>>>>>> I am not sure if we gain a lot renaming the method, thus I have a
>>>> slight
>>>>>> preference in just keeping the method name as is. (But I am also
ok to
>>>>>> rename it, if people prefer this)
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>>
>>>>>> On 4/3/18 1:59 PM, Bill Bejeck wrote:
>>>>>>> Boyang,
>>>>>>>
>>>>>>> Thanks for making changes to the KIP,  I'm +1 on the updated
version.
>>>>>>>
>>>>>>> -Bill
>>>>>>>
>>>>>>> On Tue, Apr 3, 2018 at 1:14 AM, Boyang Chen <bchen11@outlook.com>
>>>>> wrote:
>>>>>>>
>>>>>>>> Hey friends,
>>>>>>>>
>>>>>>>>
>>>>>>>> both KIP<https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>>>> 276+Add+StreamsConfig+prefix+for+different+consumers>
and pull
>>>>> request<
>>>>>>>> https://github.com/apache/kafka/pull/4805> are updated.
Feel free
>>>> to
>>>>>> take
>>>>>>>> another look.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for your valuable feedback!
>>>>>>>>
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Boyang
>>>>>>>>
>>>>>>>> ________________________________
>>>>>>>> From: Boyang Chen <bchen11@outlook.com>
>>>>>>>> Sent: Tuesday, April 3, 2018 11:39 AM
>>>>>>>> To: dev@kafka.apache.org
>>>>>>>> Subject: Re: [DISCUSS] KIP-276: Add StreamsConfig prefix
for
>>>> different
>>>>>>>> consumers
>>>>>>>>
>>>>>>>> Thanks Matthias, Ted and Guozhang for the inputs. I shall
address
>>>> them
>>>>>> in
>>>>>>>> next round.
>>>>>>>>
>>>>>>>>
>>>>>>>> ________________________________
>>>>>>>> From: Matthias J. Sax <matthias@confluent.io>
>>>>>>>> Sent: Tuesday, April 3, 2018 4:43 AM
>>>>>>>> To: dev@kafka.apache.org
>>>>>>>> Subject: Re: [DISCUSS] KIP-276: Add StreamsConfig prefix
for
>>>> different
>>>>>>>> consumers
>>>>>>>>
>>>>>>>> Yes, your examples make sense to me. That was the idea behind
the
>>>>>> proposal.
>>>>>>>>
>>>>>>>>
>>>>>>>> -Matthias
>>>>>>>>
>>>>>>>> On 4/2/18 11:25 AM, Guozhang Wang wrote:
>>>>>>>>> @Matthias
>>>>>>>>>
>>>>>>>>> That's a good question: I think adding another config
for the main
>>>>>>>> consumer
>>>>>>>>> makes good tradeoffs between compatibility and new user
>>>> convenience.
>>>>>> Just
>>>>>>>>> to clarify, from user's pov on upgrade:
>>>>>>>>>
>>>>>>>>> 1) I'm already overriding some consumer configs, and
now I want to
>>>>>>>> override
>>>>>>>>> these values differently for restore consumers, I'd add
one new
>>>> line
>>>>>> for
>>>>>>>>> the restore consumer prefix.
>>>>>>>>>
>>>>>>>>> 2) I'm already overriding some consumer configs, and
now I want to
>>>>> NOT
>>>>>>>>> overriding them for restore consumers, I'd change my
override from
>>>>>>>>> `consumer.X` to `main.consumer.X`.
>>>>>>>>>
>>>>>>>>> 3) I'm new and have not any consumer overrides, and now
if I want
>>>> to
>>>>>>>>> override some, I'd use `main.consumer`, `restore.consumer`
for
>>>>> specific
>>>>>>>>> consumer types, and ONLY consider `consumer` for the
ones that I
>>>> want
>>>>>> to
>>>>>>>>> apply universally.
>>>>>>>>>
>>>>>>>>> 4) I'm already overriding some consumer configs and I'm
happy with
>>>>>> what I
>>>>>>>>> get, I do not change anything.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Guozhang
>>>>>>>>>
>>>>>>>>> On Mon, Apr 2, 2018 at 11:10 AM, Ted Yu <yuzhihong@gmail.com>
>>>> wrote:
>>>>>>>>>
>>>>>>>>>> bq. to introduce one more prefix `main.consumer.`
>>>>>>>>>>
>>>>>>>>>> Makes sense.
>>>>>>>>>>
>>>>>>>>>> On Mon, Apr 2, 2018 at 11:06 AM, Matthias J. Sax
<
>>>>>> matthias@confluent.io
>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Boyang,
>>>>>>>>>>>
>>>>>>>>>>> thanks a lot for the KIP.
>>>>>>>>>>>
>>>>>>>>>>> Couple of questions:
>>>>>>>>>>>
>>>>>>>>>>>> (MODIFIED) public Map<String, Object>
getRestoreConsumerConfigs(
>>>>>> final
>>>>>>>>>>> String clientId);
>>>>>>>>>>>
>>>>>>>>>>> You mean that the implementation/semantics of
this method
>>>> changes,
>>>>>> but
>>>>>>>>>>> not the API itself? Might be good to add this
as "comment style"
>>>>>>>> instead
>>>>>>>>>>> of "(MODIFIED)" prefix.
>>>>>>>>>>>
>>>>>>>>>>>> By rewriting the getRestoreConsumerConfigs()
function and adding
>>>>> the
>>>>>>>>>>> getGlobalConsumerConfigs() function, if one user
uses
>>>>>>>>>>> restoreConsumerPrefix() or globalConsumerPrefix()
when adding new
>>>>>>>>>>> configurations, the configs shall overwrite base
consumer config.
>>>>> If
>>>>>>>> not
>>>>>>>>>>> specified, restore consumer and global consumer
shall share the
>>>>> same
>>>>>>>>>> config
>>>>>>>>>>> with base consumer.
>>>>>>>>>>>
>>>>>>>>>>> While this does make sense for backward compatibility,
I am
>>>> wonder
>>>>> if
>>>>>>>> it
>>>>>>>>>>> makes the config "inheritance logic" (ie, hierarchy)
too complex?
>>>>> We
>>>>>>>>>>> basically introduce a second level of overwrites.
It might be
>>>>> simpler
>>>>>>>> to
>>>>>>>>>>> not introduce this hierarchy with the cost to
break backward
>>>>>>>>>> compatibility.
>>>>>>>>>>>
>>>>>>>>>>> For example, config `request.timeout.ms`:
>>>>>>>>>>>
>>>>>>>>>>> User sets `request.timeout.ms=<user-value>`
>>>>>>>>>>> To change it for the main consumer, user also
sets
>>>>>>>>>>> `consumer.request.timeout.ms=<consumer-value>`
>>>>>>>>>>>
>>>>>>>>>>> If user only wants to change the config for main
consumer, but
>>>> not
>>>>>> for
>>>>>>>>>>> global or restore consumer, user needs to add
two more configs:
>>>>>>>>>>>
>>>>>>>>>>> `restore.consumer.request.timeout.ms=<user-value>`
>>>>>>>>>>> and
>>>>>>>>>>> `global.consumer.request.timeout.ms=<user-value>`
>>>>>>>>>>>
>>>>>>>>>>> to reset both back to the default config. IMHO,
this is not an
>>>>>> optimal
>>>>>>>>>>> user experience. Thus, it might be worth to change
the semantics
>>>>> for
>>>>>>>>>>> `consumer.` prefix to only apply those configs
to the main
>>>>> consumer.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Not sure what other think what the better solution
is (I am not
>>>>> sure
>>>>>> by
>>>>>>>>>>> myself to be honest---just wanted to point it
out and discuss the
>>>>>>>>>>> pros/cons for both).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Another though would be, to introduce one more
prefix
>>>>>> `main.consumer.`
>>>>>>>>>>> -- using this, the existing `consumer.` prefix
would apply to all
>>>>>>>>>>> consumers (keeping it's current semantics) while
we have
>>>> overwrites
>>>>>> for
>>>>>>>>>>> all three consumers -- this allow to directly
set `main.consumer`
>>>>>>>>>>> instead of `consumer` avoiding the weird pattern
from my example
>>>>>> above
>>>>>>>>>>> and preserves backward compatibility. Ie, if
we introduce an
>>>>>> hierarchy
>>>>>>>>>>> of overwrite, a "full" hierarchy might be better
than a "partial"
>>>>>>>>>>> hierarchy.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Looking forward to your thoughts.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -Matthias
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 4/1/18 5:55 PM, Guozhang Wang wrote:
>>>>>>>>>>>> Thanks for the KIP Boyang, the KIP looks
good to me.
>>>>>>>>>>>>
>>>>>>>>>>>> For config values, we use underscore for
keeping a single word;
>>>>> for
>>>>>>>>>>> config
>>>>>>>>>>>> keys, though, we do not use underscores or
dashes. I'd suggest
>>>>> using
>>>>>>>>>> dots
>>>>>>>>>>>> to be consistent with others.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Otherwise I'm +1 on the KIP.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Apr 1, 2018 at 10:56 AM, Ted Yu <yuzhihong@gmail.com>
>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Looks good overall.
>>>>>>>>>>>>>
>>>>>>>>>>>>> public static final String RESTORE_CONSUMER_PREFIX
=
>>>>>>>>>>> "restore-consumer.";
>>>>>>>>>>>>>
>>>>>>>>>>>>> For other constants in StreamsConfig,
underscore is used
>>>> instead
>>>>> of
>>>>>>>>>>> dash.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Apr 1, 2018 at 9:38 AM, Boyang
Chen <
>>>> bchen11@outlook.com
>>>>>>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hey friends,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to start a discussion
thread for KIP 276:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>>>>>>>>>> 276+Add+StreamsConfig+prefix+for+different+consumers
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And pull request is here:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/4805
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [https://avatars3.githubusercontent.com/u/5845561?s=400&v=4
>>>>>>>>>> ]<https://
>>>>>>>>>>>>>> github.com/apache/kafka/pull/4805>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> KAFKA-6657: Add StreamsConfig prefix
for different consumers
>>>> by
>>>>>>>>>>> abbccdda
>>>>>>>>>>>>> ·
>>>>>>>>>>>>>> Pull Request #4805 · apache/kafka<https://github.
>>>>>>>>>>>>>> com/apache/kafka/pull/4805>
>>>>>>>>>>>>>> github.com
>>>>>>>>>>>>>> This pull request is for jira 6657.
The KIP proposal is here
>>>>> Added
>>>>>>>>>> unit
>>>>>>>>>>>>>> tests for new getGlobalConsumerConfigs
API and make sure
>>>>> existing
>>>>>>>>>>> restore
>>>>>>>>>>>>>> consumer tests are passing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Boyang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -- Guozhang
>>>>>
>>>>
>>>
>>>
>>>
>>
>>
> 


Mime
View raw message