hudi-dev mailing list archives

From Vinoth Chandar <vin...@apache.org>
Subject Re: Hoodie dataset write without partition
Date Wed, 26 Jun 2019 13:16:32 GMT
Thanks for chipping in :) Keep it coming

On Tue, Jun 25, 2019 at 1:23 AM Netsanet Gebretsadkan <net22geb@gmail.com>
wrote:

> Amarnath,
>
> A few days ago, I was having the same problem. The hoodie modeled table
> could be created without any partition key, but the hive sync was failing
> when syncing without any partition.
> This was happening because the SlashEncodedDayPartitionValueExtractor class
> was hard-coded inside the DataSourceUtils class (
>
> https://github.com/apache/incubator-hudi/blob/master/hoodie-spark/src/main/java/com/uber/hoodie/DataSourceUtils.java#L237
> ),
> specifically in the buildHiveSyncConfig method, which enables us to
> configure the settings for hive sync. Even though you are passing the
> non-partitioned extractor class as a config in the properties file, it will
> not take effect. So you need to change that code to use the
> NonPartitionedExtractor class and compile the code again. Make sure to
> provide the following config in the properties file used by
> delta-streamer:
>
> hoodie.datasource.hive_sync.partition_extractor_class=com.uber.hoodie.hive.NonPartitionedExtractor
>
> It will definitely work for you.
> If you don't want it to be hard-coded, you can make further changes.
>
> Kind regards,
>
> On Tue, Jun 25, 2019 at 6:54 AM Vinoth Chandar <vinoth@apache.org> wrote:
>
> > Amarnath,
> >
> > Mind sending a PR with updated docs once you get it working? :) It might be
> > useful for others too. Non-partitioned tables have come up a few times now.
> >
> >
> >
> > On Mon, Jun 24, 2019 at 2:57 PM vbalaji@apache.org <vbalaji@apache.org>
> > wrote:
> >
> > >
> > > Hi Amarnath,
> > > Apart from changing the partition extractor class, you would need to
> > > change the keyGeneratorClass for a non-partitioned table.
> > > Use this param "--key-generator-class
> > > com.uber.hoodie.NonpartitionedKeyGenerator" as part of DeltaStreamer
> > > command-line execution.
> > > Also, ensure we have the following configs defined in the properties file
> > > used by delta-streamer:
> > >
> > > hoodie.datasource.write.keygenerator.class=com.uber.hoodie.NonpartitionedKeyGenerator
> > > hoodie.datasource.hive_sync.partition_extractor_class=com.uber.hoodie.hive.NonPartitionedExtractor
> > >
> > > We will eventually remove the DeltaStreamer CLI and rely on the properties
> > > config for uniform handling.
> > >
> > > Thanks,
> > > Balaji.V
> > >     On Monday, June 24, 2019, 1:55:51 PM PDT, Balaji Varadarajan
> > > <v.balaji@ymail.com.INVALID> wrote:
> > >
> > >   Hi Amarnath,
> > > I will look into it and reply back by EOD today.
> > > Balaji.V
> > >     On Sunday, June 23, 2019, 8:21:51 AM PDT, Amarnath Venkataswamy <
> > > amarnath.venkataswamy@gmail.com> wrote:
> > >
> > >  Hi
> > >
> > > Is there any option to write the hoodie dataset without any partition?
> > >
> > > I tried, but hive sync is failing when you sync up without any partition.
> > >
> > > Delta streamer creates the dataset with "default" as the partition when
> > > there is no partition column.
> > >
> > >
> > > Sent from my iPhone
> >
>
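
For reference, the settings discussed in this thread can be sketched as a small Python script. This is only an illustration under stated assumptions, not Hudi code: the two property keys and their values are quoted from the messages above, while the helper names (render_properties, parse_properties, resolve_extractor) are hypothetical, and the package of SlashEncodedDayPartitionValueExtractor is assumed by analogy with NonPartitionedExtractor. The last helper shows the idea behind "not hard-coded": prefer the configured extractor and fall back to a default only when the key is absent.

```python
# Sketch only. Keys and class names are quoted from the thread; every helper
# name here is hypothetical and is not part of Hudi or DeltaStreamer.

EXTRACTOR_KEY = "hoodie.datasource.hive_sync.partition_extractor_class"
KEYGEN_KEY = "hoodie.datasource.write.keygenerator.class"

# Assumption: the package is guessed by analogy with NonPartitionedExtractor.
SLASH_DAY_EXTRACTOR = "com.uber.hoodie.hive.SlashEncodedDayPartitionValueExtractor"

NON_PARTITIONED_PROPS = {
    KEYGEN_KEY: "com.uber.hoodie.NonpartitionedKeyGenerator",
    EXTRACTOR_KEY: "com.uber.hoodie.hive.NonPartitionedExtractor",
}


def render_properties(props):
    """Render a dict as key=value lines, one setting per line."""
    return "".join(f"{key}={value}\n" for key, value in props.items())


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve_extractor(props, default=SLASH_DAY_EXTRACTOR):
    """Use the configured extractor class; fall back to the default if unset."""
    return props.get(EXTRACTOR_KEY, default)


if __name__ == "__main__":
    # Prints the two lines to place in the delta-streamer properties file.
    print(render_properties(NON_PARTITIONED_PROPS), end="")
```

Each setting has to sit on its own line in the properties file; when the two are run together on one line, the second key ends up inside the first value.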
