From: leesf
Date: Sun, 1 Aug 2021 23:49:00 +0800
Reply-To: dev@hudi.apache.org
Subject: [ANNOUNCE] Hudi Community Update(2021-07-18 ~ 2021-08-01)
To: dev
Cc: users@hudi.apache.org

Dear community,

I'm pleased to share the Hudi community bi-weekly update for 2021-07-18 ~ 2021-08-01, covering features, bug fixes, and tests.

=======================================
Features

[Core] Adding support to disable meta columns with bulk insert operation [1]
[DeltaStreamer] Add INSERT_OVERWRITE and INSERT_OVERWRITE_TABLE support to DeltaStreamer [2]
[Spark Integration] MergeInto Support Partial Update For COW [3]
[Hive Integration] DeltaStreamer kafka source supports consuming from specified timestamp [4]
[Hive Integration] Adding support for HMS for running DDL queries in hive-sync [5]
[Docs] Automate the generation of configs webpage as configs are added to Hudi repo [6]
[Core] Adding virtual key support to COW table [7]
[Flink Integration] Add rateLimiter when Flink writes to hudi [8]
[Core] Integrate consumers with rocksDB and compression within External Spillable Map [9]
[Flink Integration] Add option 'hive_sync.mode' for flink writer [10]
[Spark Integration] Explicit parallelism for flink bulk insert [11]
[Hive Integration] Support setting hive sync partition extractor class based on flink configuration [12]

[1] https://issues.apache.org/jira/browse/HUDI-2161
[2] https://issues.apache.org/jira/browse/HUDI-1860
[3] https://issues.apache.org/jira/browse/HUDI-1884
[4] https://issues.apache.org/jira/browse/HUDI-1447
[5] https://issues.apache.org/jira/browse/HUDI-1848
[6] https://issues.apache.org/jira/browse/HUDI-1241
[7] https://issues.apache.org/jira/browse/HUDI-2176
[8] https://issues.apache.org/jira/browse/HUDI-2215
[9] https://issues.apache.org/jira/browse/HUDI-2044
[10] https://issues.apache.org/jira/browse/HUDI-2228
[11] https://issues.apache.org/jira/browse/HUDI-2241
[12] https://issues.apache.org/jira/browse/HUDI-2184

=======================================
Bugs

[Flink Integration] Remove state in BootstrapFunction [1]
[Flink Integration] Create new bucket when NewFileAssignState filled [2]
[Flink Integration] Clean and reset the bootstrap events for coordinator when task failover [3]
[Code Cleanup] Clean up Multiple versions of scala libraries detected Warning [4]
[Flink Integration] Add marker files for flink writer [5]
[Spark Integration] Sync Hive Failed When Execute CTAS In Spark2 And Spark3 [6]
[Core] Fix checkpoint blocked because the getLastPendingInstant() action runs after the restoreWriteMetadata() action [7]
[Flink Integration] Rollback inflight compaction for flink writer [8]
[Spark Integration] MergeInto MOR Table May Result Incorrect Result [9]
[Spark Integration] Missing PrimaryKey In Hoodie Properties For CTAS Table [10]
[Core] Residual temporary files after clustering are not cleaned up [11]
[Core] Fix NPE of HoodieConfig [12]
[Core] Fix no value present in incremental query on MOR [13]
[Spark Integration] Fix Alter Partitioned Table Failed [14]
[Flink Integration] Only sync hive meta on successful commit for flink batch writer [15]
[Core] Make codahale times transient to avoid serializable exceptions [16]
[Core] BucketAssigner generates the fileId evenly to avoid data skew [17]
[Hive Integration] Fix database alreadyExists exception while hive sync [18]
[Spark Integration] Performance loss with the additional hoodieRecords.isEmpty() in HoodieSparkSqlWriter#write [19]
[Spark Integration] Unpersist the input rdd after the commit is completed to save the memory space for inline compaction [20]
[Spark Integration] Fix Exception Cause By Table Name Case Sensitivity For Append Mode Write [21]
[Flink Integration] Default consumes from the latest instant for flink streaming reader [22]
[Flink Integration] Builtin sort operator for flink bulk insert [23]
[Core] Fix missing HoodieWriteStat in HoodieCreateHandle [24]

[1] https://issues.apache.org/jira/browse/HUDI-2193
[2] https://issues.apache.org/jira/browse/HUDI-2145
[3] https://issues.apache.org/jira/browse/HUDI-2198
[4] https://issues.apache.org/jira/browse/HUDI-2192
[5] https://issues.apache.org/jira/browse/HUDI-2204
[6] https://issues.apache.org/jira/browse/HUDI-2195
[7] https://issues.apache.org/jira/browse/HUDI-2206
[8] https://issues.apache.org/jira/browse/HUDI-2205
[9] https://issues.apache.org/jira/browse/HUDI-2139
[10] https://issues.apache.org/jira/browse/HUDI-2212
[11] https://issues.apache.org/jira/browse/HUDI-2214
[12] https://issues.apache.org/jira/browse/HUDI-2219
[13] https://issues.apache.org/jira/browse/HUDI-2217
[14] https://issues.apache.org/jira/browse/HUDI-2223
[15] https://issues.apache.org/jira/browse/HUDI-2227
[16] https://issues.apache.org/jira/browse/HUDI-2240
[17] https://issues.apache.org/jira/browse/HUDI-2245
[18] https://issues.apache.org/jira/browse/HUDI-2244
[19] https://issues.apache.org/jira/browse/HUDI-1425
[20] https://issues.apache.org/jira/browse/HUDI-2117
[21] https://issues.apache.org/jira/browse/HUDI-2251
[22] https://issues.apache.org/jira/browse/HUDI-2252
[23] https://issues.apache.org/jira/browse/HUDI-2254
[24] https://issues.apache.org/jira/browse/HUDI-2218

=======================================
Tests

[Tests] Fixing hudi_test_suite for spark nodes and adding spark bulk_insert node [1]
[Tests] Fix NullPointerException in TestHoodieConsoleMetrics [2]
[Tests] Refactoring few tests to reduce running time. DeltaStreamer and MultiDeltaStreamer tests. Bulk insert row writer tests [3]

[1] https://issues.apache.org/jira/browse/HUDI-2007
[2] https://issues.apache.org/jira/browse/HUDI-2211
[3] https://issues.apache.org/jira/browse/HUDI-2253

Best,
Leesf