drill-dev mailing list archives

From François Méthot <fmetho...@gmail.com>
Subject log flooded by "date values definitively CORRECT"
Date Tue, 17 Oct 2017 17:35:11 GMT
Hi again,

  I am running into an issue with a query over about 760,000 Parquet files
stored in HDFS. We are using Drill 1.10 with 8 GB heap and 20 GB direct memory.
Drill runs with DEBUG logging enabled all the time.

The query is a standard select on 8 fields from hdfs.`/path` where this =
that ....

For about an hour I see this message on the foreman:

[pool-9-thread-##] DEBUG o.a.d.exec.store.parquet.Metadata - It is
determined from metadata that the date values are definitely CORRECT


[some UUID:foreman] INFO o.a.d.exec.store.parquet.Metadata - Fetch parquet
metadata : Executed 761659 out of 761659 using 16 threads. Time : 3022416ms

Then :
java.lang.OutOfMemoryError: Java heap space
   at java.util.Arrays.copyOf
   at java.io.PrintWriter.println(PrintWriter.java:757)
   at org.apache.calcite.rel.externalize.RelWriterImpl.explain
   at org.apache.calcite.rel.externalize.RelWriterImpl.done
   at org.apache.calcite.plan.RelOptUtil.toString (RelOptUtil.java:1927)
   at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:1050)
   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:281)

I think it might be caused by having too many files to query; chunking our
select into smaller pieces actually helped.
I also suspect that the DEBUG logging is taxing the poor node a bit too much.
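
To illustrate the chunking idea, here is a minimal sketch that emits one
smaller select per subdirectory instead of a single select over the whole
tree. It assumes the files are grouped under subdirectories of the queried
path; the function name, subdirectory paths, column names and predicate are
all hypothetical placeholders, not our real layout.

```python
from typing import List


def build_queries(subdirs: List[str], columns: List[str], predicate: str) -> List[str]:
    """Return one Drill SELECT per subdirectory (all names are hypothetical).

    Each query only touches the files under a single subdirectory, so the
    planner handles a fraction of the files at a time.
    """
    column_list = ", ".join(columns)
    return [
        f"SELECT {column_list} FROM hdfs.`{subdir}` WHERE {predicate}"
        for subdir in subdirs
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for query in build_queries(["/path/part1", "/path/part2"], ["col_a", "col_b"], "col_a = 'x'"):
        print(query)
```

Each generated statement would then be submitted separately and the results
combined on the client side.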

Do you think adding more memory would address the issue (I can't try this
right now), or do you think it is caused by a bug?

Thanks in advance for any advice,

