From: Vinoth Chandar
Date: Wed, 26 Jun 2019 06:16:32 -0700
Subject: Re: Hoodie dataset write without partition
To: dev@hudi.apache.org
Reply-To: dev@hudi.apache.org
Mailing-List: contact dev-help@hudi.apache.org; run by ezmlm
References: <2091735623.835082.1561409735544@mail.yahoo.com> <1252751165.834264.1561413456543@mail.yahoo.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000532289058c39dbf0"

--000000000000532289058c39dbf0
Content-Type: text/plain; charset="UTF-8"

Thanks for chipping in :) Keep it coming

On Tue, Jun 25, 2019 at 1:23 AM Netsanet Gebretsadkan wrote:

> Amarnath,
>
> A few days ago, I had the same problem. The Hoodie table could be created
> without any partition key, but the Hive sync failed when syncing without
> any partition.
> This happened because the SlashEncodedDayPartitionValueExtractor class
> was hard-coded inside the DataSourceUtils class (
> https://github.com/apache/incubator-hudi/blob/master/hoodie-spark/src/main/java/com/uber/hoodie/DataSourceUtils.java#L237
> ), specifically in the buildHiveSyncConfig method, which lets us
> configure the settings for Hive sync. Even if you pass the
> non-partitioned extractor class as a config in the properties file, the
> setting is not picked up. So you need to change that code to use the
> NonPartitionedExtractor class and compile the code again. Make sure to
> provide the following config in the properties file used by
> delta-streamer:
>
> hoodie.datasource.hive_sync.partition_extractor_class=com.uber.hoodie.hive.NonPartitionedExtractor
>
> It will definitely work for you.
> If you don't want it to be hard-coded, you can make further changes.
>
> Kind regards,
>
> On Tue, Jun 25, 2019 at 6:54 AM Vinoth Chandar wrote:
>
> > Amarnath,
> >
> > Mind sending a PR with updated docs once you get it working? :) It might be
> > useful for others too. Non-partitioned tables have come up a few times now.
> >
> > On Mon, Jun 24, 2019 at 2:57 PM vbalaji@apache.org wrote:
> >
> > > Hi Amarnath,
> > > Apart from changing the partition extractor class, you would need to
> > > change the keyGeneratorClass for a non-partitioned table.
> > > Use this param "--key-generator-class
> > > com.uber.hoodie.NonpartitionedKeyGenerator" as part of DeltaStreamer
> > > command-line execution.
> > > Also, ensure we have the following configs defined in the properties file
> > > used by delta-streamer:
> > >
> > > hoodie.datasource.write.keygenerator.class=com.uber.hoodie.NonpartitionedKeyGenerator
> > > hoodie.datasource.hive_sync.partition_extractor_class=com.uber.hoodie.hive.NonPartitionedExtractor
> > >
> > > We will eventually remove the DeltaStreamer CLI and rely on the properties
> > > config for uniform handling.
> > >
> > > Thanks,
> > > Balaji.V
> > >
> > > On Monday, June 24, 2019, 1:55:51 PM PDT, Balaji Varadarajan
> > > wrote:
> > >
> > > Hi Amarnath,
> > > I will look into it and reply back by EOD today.
> > > Balaji.V
> > >
> > > On Sunday, June 23, 2019, 8:21:51 AM PDT, Amarnath Venkataswamy <
> > > amarnath.venkataswamy@gmail.com> wrote:
> > >
> > > Hi
> > >
> > > Is there any option to write the hoodie dataset without any partition?
> > >
> > > I tried, but Hive sync fails when syncing without any partition.
> > >
> > > Delta streamer creates the dataset with "default" as the partition when
> > > there is no partition column.

--000000000000532289058c39dbf0--
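
For anyone following this thread later: the two settings Balaji lists can be sanity-checked before launching delta-streamer. Below is a minimal sketch of such a check, not anything shipped with Hudi. The function names `parse_properties` and `check_nonpartitioned_config` are made up for illustration; only the property keys and class names come from the messages above. Note that it only inspects the properties text. Per Netsanet's message, the extractor hard-coded in DataSourceUtils may still need a code change for Hive sync to honor the setting.

```python
# Hypothetical pre-flight check; NOT part of Hudi.
# Property keys and class names are copied from the thread above.
REQUIRED_NONPARTITIONED = {
    "hoodie.datasource.write.keygenerator.class":
        "com.uber.hoodie.NonpartitionedKeyGenerator",
    "hoodie.datasource.hive_sync.partition_extractor_class":
        "com.uber.hoodie.hive.NonPartitionedExtractor",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' or '!' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that carry no '=' at all
            props[key.strip()] = value.strip()
    return props


def check_nonpartitioned_config(props):
    """Return a list of problems; an empty list means both settings match."""
    problems = []
    for key, expected in REQUIRED_NONPARTITIONED.items():
        actual = props.get(key)
        if actual is None:
            problems.append("missing: " + key)
        elif actual != expected:
            problems.append("unexpected value for %s: %s" % (key, actual))
    return problems


if __name__ == "__main__":
    sample = """
    hoodie.datasource.write.keygenerator.class=com.uber.hoodie.NonpartitionedKeyGenerator
    hoodie.datasource.hive_sync.partition_extractor_class=com.uber.hoodie.hive.NonPartitionedExtractor
    """
    print(check_nonpartitioned_config(parse_properties(sample)) or "ok")
```

The parser deliberately handles only plain `key=value` lines; a real Java properties file also allows `:` separators and line continuations, which this sketch ignores.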