hudi-dev mailing list archives

From Vinoth Chandar <vin...@apache.org>
Subject Re: Why hudi consider the Avro be the MOR's log format?
Date Tue, 15 Jun 2021 13:10:17 GMT
Hi,

We wanted a row-based format so that we could quickly log changes to the
base files and flexibly compact whichever file groups we chose. If we wrote
parquet, for example, we would incur the cost of writing parquet (which can
be as much as 10x) twice: once during ingest and once again during compaction.
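
To make the arithmetic concrete, here is a rough sketch of the trade-off.
The cost units, the batch and compaction counts, and the function name are
all hypothetical; only the "up to 10x" ratio comes from the message, and
treating a compaction as one parquet write unit is a simplification.

```python
# Hypothetical relative costs; only the "up to 10x" ratio is from the message.
ROW_LOG_WRITE_COST = 1.0    # assumed cost of appending one batch as row-based log blocks
PARQUET_WRITE_COST = 10.0   # assumed worst case: a parquet write costs 10x as much


def total_write_cost(num_ingest_batches, num_compactions, log_write_cost):
    """Cost of logging every ingest batch plus one parquet rewrite per compaction."""
    ingest_cost = num_ingest_batches * log_write_cost
    compaction_cost = num_compactions * PARQUET_WRITE_COST
    return ingest_cost + compaction_cost


# 100 ingest batches followed by a single compaction (made-up workload).
row_based_log = total_write_cost(100, 1, ROW_LOG_WRITE_COST)
parquet_log = total_write_cost(100, 1, PARQUET_WRITE_COST)

print("row-based log:", row_based_log)
print("parquet log:  ", parquet_log)
```

Under these assumed numbers, the parquet cost is paid only at compaction
with a row-based log, instead of on every ingest batch as well.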

Of course, this trades off query latency for ingest cost. There is also
ongoing work to flexibly keep log block data in parquet. See
InlineFileSystem/tests if interested.

Thanks
Vinoth

On Mon, Jun 14, 2021 at 1:54 AM LakeShen <shenleifighting@gmail.com> wrote:

> Hi community,
>
> I have a question: why does Hudi use Avro as the MOR log format?
>
>
> Best,
> LakeShen
>
