hudi-dev mailing list archives

From Vinoth Chandar <>
Subject Re: Why hudi consider the Avro be the MOR's log format?
Date Tue, 15 Jun 2021 13:10:17 GMT

We wanted a row-based format so that we could quickly log changes to the
base files and flexibly compact only the file groups we chose. If we wrote
Parquet instead, for example, we would incur the cost of writing Parquet
(which can be up to 10x higher) once during ingest and once again during
compaction.

Of course, this trades off query latency for ingest cost. There is also
ongoing work to flexibly keep log block data in parquet. See
InlineFileSystem/tests if interested.
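
To make the ingest-cost argument above concrete, here is a rough
back-of-the-envelope sketch. The numbers are illustrative assumptions, not
measurements: the Avro log write is taken as a baseline of 1 unit, and the
Parquet write is set to 10 units, the "up to 10x" upper bound mentioned
above read as relative to that baseline. The names are hypothetical and are
not part of Hudi.

```python
# Hypothetical relative write costs per record; NOT measured Hudi numbers.
AVRO_WRITE_COST = 1.0      # assumed baseline: row-based log append
PARQUET_WRITE_COST = 10.0  # assumed upper bound ("up to 10x") from the text


def total_write_cost(ingest_cost: float, compaction_cost: float) -> float:
    """Cost of writing a record once at ingest and once at compaction."""
    return ingest_cost + compaction_cost


# Log in Avro at ingest, write Parquet only when the file group is compacted.
avro_log_total = total_write_cost(AVRO_WRITE_COST, PARQUET_WRITE_COST)

# Log in Parquet at ingest, then write Parquet again at compaction.
parquet_log_total = total_write_cost(PARQUET_WRITE_COST, PARQUET_WRITE_COST)

print(avro_log_total, parquet_log_total)
```

Under these assumptions the Avro log path costs 11 units per record against
20 for the Parquet log path, and file groups that are never compacted pay
only the cheap log write.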


On Mon, Jun 14, 2021 at 1:54 AM LakeShen <> wrote:

> Hi community,
> I have a question: why does Hudi use Avro as the MOR log format?
> Best,
> LakeShen
