hudi-dev mailing list archives

From Vinoth Chandar <vin...@apache.org>
Subject Re: Possible ambiguity in HoodieKey
Date Tue, 11 Jun 2019 16:15:50 GMT
Thanks for the link. I was grabbing it in parallel as well :)

So, the KeyGenerator class works off a GenericRecord, and the JSON ->
GenericRecord conversion is already built into the DeltaStreamer.
I don't think this will add any particular performance overhead per se.

It may be worth pulling this class (tweaked to suit your needs) out of
the PR and merging it into master?



On Tue, Jun 11, 2019 at 9:11 AM Jaimin Shah <shahjaimin0395@gmail.com>
wrote:

> Forgot to add class link
>
> https://github.com/apache/incubator-hudi/blob/e916b21cc5989ab00791467fcc11a02bb0de093a/hoodie-bench/src/main/java/com/uber/hoodie/integrationsuite/generator/ComplexKeyGenerator.java
> This is the class I am referring to.
>
> On Tuesday, 11 June 2019, Jaimin Shah <shahjaimin0395@gmail.com> wrote:
>
> > Hi Vinoth,
> >
> >   Thanks for the prompt reply. This class was shared earlier on the
> > mailing list by someone to handle complex keys. I was thinking maybe we
> > can create a JSON object and then serialize it as a string to create the
> > key. That would be foolproof, because we don't control the characters in
> > the input data.
> >
> >   I am not sure about the performance implications of doing so; maybe
> > you can help there.
> >
> > Thanks,
> > Jaimin
> >
> > On Tuesday, 11 June 2019, Vinoth Chandar <vinoth@apache.org> wrote:
> >
> >> Hi Jaimin,
> >>
> >> True. Is this a custom class you have? If we separate the concatenated
> >> values with a standard special character, it should be fine? For
> >> example, CA#US vs. C#AUS?
> >>
> >> Thanks
> >> Vinoth
> >>
> >> On Mon, Jun 10, 2019 at 4:53 AM Jaimin Shah <shahjaimin0395@gmail.com>
> >> wrote:
> >>
> >> > Hi
> >> >   I was going through the ComplexKeyGenerator class. I found that the
> >> > class generates the key by concatenating all the key fields into a
> >> > compound key. But I am wondering whether some cases can arise later
> >> > that create problems.
> >> >
> >> > For example, our data has two attributes as the key:
> >> > key1  key2  data
> >> > CA    US    xyz
> >> > C     AUS   abc
> >> >
> >> > In this case the key for both rows will be the same (CAUS). Will it
> >> > cause any problem? Would keeping the keys as a map instead of a
> >> > string solve the problem?
> >> >
> >> > Thanks
> >> >
> >>
> >
>
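
The ambiguity discussed in this thread can be sketched in a few lines of
Java. This is only a minimal illustration, not the ComplexKeyGenerator from
the linked PR and not a Hudi API: the class CompositeKeySketch and its
methods are hypothetical names, the '#' delimiter is taken from the example
above, and the backslash escaping is an assumed alternative to the JSON
encoding proposed in the thread. Key parts are assumed to be non-null.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from Hudi.
public class CompositeKeySketch {

    // Plain concatenation: the boundary between fields is lost.
    static String concat(List<String> parts) {
        return String.join("", parts);
    }

    // Fixed delimiter: unambiguous only while no value contains '#'.
    static String delimited(List<String> parts) {
        return String.join("#", parts);
    }

    // Escape the escape character first, then the delimiter, so that an
    // unescaped '#' always marks a field boundary, whatever the input is.
    static String escaped(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append('#');
            }
            sb.append(parts.get(i).replace("\\", "\\\\").replace("#", "\\#"));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> row1 = Arrays.asList("CA", "US");
        List<String> row2 = Arrays.asList("C", "AUS");
        System.out.println("concat collides:    " + concat(row1).equals(concat(row2)));
        System.out.println("delimited collides: " + delimited(row1).equals(delimited(row2)));

        // Values that themselves contain the delimiter.
        List<String> row3 = Arrays.asList("A#", "B");
        List<String> row4 = Arrays.asList("A", "#B");
        System.out.println("delimited collides: " + delimited(row3).equals(delimited(row4)));
        System.out.println("escaped collides:   " + escaped(row3).equals(escaped(row4)));
    }
}
```

A plain delimiter resolves the CA/US vs. C/AUS case, but it can still
collide when the input data contains the delimiter itself, which is the
concern raised about not controlling the input characters. Escaping (or a
structured encoding such as JSON) keeps the field boundaries recoverable at
the cost of some extra string processing per record.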
