incubator-drill-user mailing list archives

From Jinfeng Ni <jinfengn...@gmail.com>
Subject Re: How to load data in Drill
Date Tue, 03 Dec 2013 06:21:09 GMT
Hi Jason,

Thanks for offering your help to look at this issue.

I did check whether the file PageReadStatus.java has been changed
recently.  The output of git log for that file shows the latest change was
on Sep 9, for "DRILL-221 Add license header to all files".  I thought the
binary distribution was made after the license header was added.  But you
are right, there might have been changes after the binary distribution was
made.

Thanks,

Jinfeng



On Mon, Dec 2, 2013 at 10:03 PM, Jason Altekruse
<altekrusejason@gmail.com> wrote:

> Hi Madhu,
>
> I would be happy to take a look at this as well. I wrote most of the code
> we are using to read parquet files, so I should be able to figure out why
> we are getting an NPE with the files you are reading. I took a look back at
> the previous thread where this issue was being discussed and noticed that
> you reported having installed Drill from binaries. Have you tried compiling
> Drill with a more recent version of the source from our repository?
>
> We ended up learning that Apache does not consider binary releases
> official. While we will obviously be providing them for users in future
> releases, we gave up on the binaries before we reached the end
> of the Apache approval process. As such, several bugs were fixed (not
> necessarily in the parquet reader) between this binary and our final m1
> source release. Since the release, there have also been code changes
> that may solve the issue you are having, so we can test it against the
> latest development code to see if changes still need to be made to solve
> the problem.
>
> Jinfeng,
> This could also mean that line 92 that you found in the source does not
> match what line 92 was at the time this release was built. Just something
> to keep in mind if you look at this again.
>
> Thanks,
> Jason Altekruse
>
>
> On Mon, Dec 2, 2013 at 11:38 PM, Jinfeng Ni <jinfengni99@gmail.com> wrote:
>
> > Hi Madhu,
> >
> > Yes, the log is helpful; I can see the NPE is raised in the storage engine
> > component ParquetRecordReader, not in the query execution component.
> >
> > Unfortunately, I cannot reproduce this parquet reader NPE using
> > either the sample data (nation.parquet, region.parquet) or other TPCH
> > parquet files. From the log, I can see the NPE is raised in the following
> > code:
> >
> >     currentPage = new Page(
> >         bytesIn,
> >         pageHeader.data_page_header.num_values,
> >         pageHeader.uncompressed_page_size,
> >         ParquetStorageEngine.parquetMetadataConverter.getEncoding(pageHeader.data_page_header.repetition_level_encoding),
> >         ParquetStorageEngine.parquetMetadataConverter.getEncoding(pageHeader.data_page_header.definition_level_encoding),
> >         ParquetStorageEngine.parquetMetadataConverter.getEncoding(pageHeader.data_page_header.encoding)
> >     );
> >
> > My guess is that either pageHeader or its member data_page_header is null.
> > But without the parquet file to recreate this NPE, I do not have a way to
> > verify.
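
One way to tell the two cases apart would be a small null check placed just before the Page is constructed. The following is only a sketch under stated assumptions: PageHeader and DataPageHeader here are hypothetical stand-ins for the Thrift-generated Parquet classes (only the fields needed for the check are modelled), and describeNull is an illustrative helper name, not part of Drill.

```java
public class PageHeaderCheck {

    // Hypothetical stand-in for the Thrift-generated data page header.
    static class DataPageHeader {
        int num_values;
    }

    // Hypothetical stand-in for the Thrift-generated page header.
    static class PageHeader {
        int uncompressed_page_size;
        // Assumption: this field may be left unset (null) for some pages.
        DataPageHeader data_page_header;
    }

    /**
     * Returns a message naming the first null reference on the path
     * pageHeader.data_page_header, or null if both are set.
     */
    static String describeNull(PageHeader pageHeader) {
        if (pageHeader == null) {
            return "pageHeader is null";
        }
        if (pageHeader.data_page_header == null) {
            return "pageHeader.data_page_header is null";
        }
        return null;
    }

    public static void main(String[] args) {
        PageHeader header = new PageHeader();
        System.out.println(describeNull(null));    // no header at all
        System.out.println(describeNull(header));  // header without data_page_header
        header.data_page_header = new DataPageHeader();
        System.out.println(describeNull(header));  // both references set
    }
}
```

Logging the returned message (when it is not null) before the constructor call would show which of the two references is missing, without needing the original parquet file in a debugger.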
> >
> > Is it possible for you to share your parquet file (after removing any
> > sensitive data), so that I can recreate this NPE and try to find a fix?
> >
> > Thanks!
> >
> >
> >
> >
> > On Mon, Dec 2, 2013 at 3:15 PM, Madhu Borkar <backmeupone@gmail.com>
> > wrote:
> >
> > > Hi Jinfeng,
> > >
> > > Here is the part of the log you are looking for
> > > 18:08:12.905 [WorkManager-2] DEBUG o.a.d.e.work.batch.IncomingBuffers - Came up with a list of 0 required fragments.  Fragments {}
> > > 18:08:16.181 [WorkManager Event Thread] DEBUG o.apache.drill.exec.work.WorkManager - Starting pending task org.apache.drill.exec.work.FragmentRunner@2122d9d0
> > > 18:08:16.184 [WorkManager-3] DEBUG o.a.drill.exec.work.FragmentRunner - Starting fragment runner. 0:0
> > > 18:08:16.188 [WorkManager-3] DEBUG o.a.d.e.w.f.RunningFragmentManager - New fragment status was provided to Foreman of memory_use: 0
> > > batches_completed: 0
> > > records_completed: 0
> > > state: RUNNING
> > > data_processed: 0
> > > handle {
> > >   query_id {
> > >     part1: -3386430666417617918
> > >     part2: -5241641154650077119
> > >   }
> > >   major_fragment_id: 0
> > >   minor_fragment_id: 0
> > > }
> > > running_time: 429655087179513
> > >
> > > 18:08:16.237 [WorkManager-3] DEBUG o.a.d.e.s.p.ParquetRecordReader - records to read in this pass: 4000
> > > 18:08:16.339 [WorkManager-3] DEBUG o.a.drill.exec.work.FragmentRunner - Caught exception while running fragment
> > > java.lang.NullPointerException: null
> > >         at org.apache.drill.exec.store.parquet.PageReadStatus.next(PageReadStatus.java:92) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at org.apache.drill.exec.store.parquet.VarLenBinaryReader.readFields(VarLenBinaryReader.java:124) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at org.apache.drill.exec.store.parquet.ParquetRecordReader.next(ParquetRecordReader.java:386) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:95) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.next(ScreenCreator.java:77) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at org.apache.drill.exec.work.FragmentRunner.run(FragmentRunner.java:79) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
> > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
> > >         at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> > > 18:08:16.341 [WorkManager-3] ERROR o.a.d.e.w.AbstractFragmentRunnerListener - Error b7fa738a-1d3a-4b06-acb6-226a9744dbb7: Failure while running fragment.
> > > java.lang.NullPointerException: null
> > >         at org.apache.drill.exec.store.parquet.PageReadStatus.next(PageReadStatus.java:92) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at org.apache.drill.exec.store.parquet.VarLenBinaryReader.readFields(VarLenBinaryReader.java:124) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at org.apache.drill.exec.store.parquet.ParquetRecordReader.next(ParquetRecordReader.java:386) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:95) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.next(ScreenCreator.java:77) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at org.apache.drill.exec.work.FragmentRunner.run(FragmentRunner.java:79) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
> > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
> > >         at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> > > 18:08:16.344 [WorkManager-3] DEBUG o.a.d.e.w.f.RunningFragmentManager - New fragment status was provided to Foreman of memory_use: 0
> > > batches_completed:
> > > records_completed: 0
> > > state: FAILED
> > > data_processed: 0
> > > handle {
> > >   query_id {
> > >     part1: -3386430666417617918
> > >     part2: -5241641154650077119
> > >   }
> > >   major_fragment_id: 0
> > >   minor_fragment_id: 0
> > > }
> > > error {
> > >   error_id: "b7fa738a-1d3a-4b06-acb6-226a9744dbb7"
> > >   endpoint {
> > >     address: "Node-0.etouch.net"
> > >     user_port: 31010
> > >     bit_port: 32011
> > >   }
> > >   error_type: 0
> > >   message: "Failure while running fragment. < NullPointerException"
> > > }
> > > running_time: 155084239
> > >
> > > 18:08:16.346 [WorkManager-3] DEBUG o.a.drill.exec.rpc.user.UserServer - Sending result to client with QueryWritableBatch [header=query_state: FAILED
> > > query_id {
> > >   part1: -3386430666417617918
> > >   part2: -5241641154650077119
> > > }
> > > error {
> > >   error_id: "b7fa738a-1d3a-4b06-acb6-226a9744dbb7"
> > >   endpoint {
> > >     address: "Node-0.etouch.net"
> > >     user_port: 31010
> > >     bit_port: 32011
> > >   }
> > >   error_type: 0
> > >   message: "Failure while running fragment. < NullPointerException"
> > > }
> > > , buffers=[]]
> > > 18:08:16.351 [WorkManager-3] DEBUG o.a.drill.exec.work.FragmentRunner - Fragment runner complete. 0:
> > >
> > > Please, let me know if this one helps!
> > >
> > >
> > > On Sun, Dec 1, 2013 at 10:34 PM, Jinfeng Ni <jinfengni99@gmail.com>
> > wrote:
> > >
> > > > Hi Tom and Madhu,
> > > >
> > > > Regarding the NullPointerException you encountered when you ran the
> > > > query in sqlline:
> > > >
> > > > SELECT * FROM some_parquet_file;
> > > >
> > > > Could you please post the debug output from sqlline's log?  If you use
> > > > sqlline from Drill's binary distribution, sqlline's log would be in
> > > > /var/log/drill/sqlline.log.  Please search for the keywords
> > > > "FragmentRunner" and "Caught exception".  The sqlline.log should contain
> > > > a call stack from when the NullPointerException was thrown.  For
> > > > instance, here is the log for an IndexOutOfBoundsException in my
> > > > sqlline.log:
> > > >
> > > > 21:44:40.984 [WorkManager-4] DEBUG o.a.drill.exec.work.FragmentRunner - Caught exception while running fragment
> > > > java.lang.IndexOutOfBoundsException: index: 31999268, length: 4 (expected: range(0, 4194244))
> > > >         at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1130) ~[netty-buffer-4.0.7.Final.jar:na]
> > > >         at io.netty.buffer.AbstractByteBuf.getInt(AbstractByteBuf.java:378) ~[netty-buffer-4.0.7.Final.jar:na]
> > > >         at org.apache.drill.exec.vector.UInt4Vector$Accessor.get(UInt4Vector.java:188) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > >         at org.apache.drill.exec.vector.VarBinaryVector$Mutator.setValueCount(VarBinaryVector.java:355) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > >         at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.doWork(RemovingRecordBatch.java:92) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.next(AbstractSingleRecordBatch.java:63) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.next(AbstractSingleRecordBatch.java:42) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.next(AbstractSingleRecordBatch.java:42) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > >         at org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.next(LimitRecordBatch.java:89) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.next(AbstractSingleRecordBatch.java:42) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > >         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.next(ScreenCreator.java:77) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > >         at org.apache.drill.exec.work.FragmentRunner.run(FragmentRunner.java:79) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
> > > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
> > > >         at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> > > > 21:44:40.990 [WorkManager-4] ERROR o.a.d.e.w.AbstractFragmentRunnerListener - Error c8efdbf1-9a6f-427c-ab90-ce16002904af: Failure while running fragment.
> > > >
> > > > I need the call stack from when the NPE is thrown, to see what went
> > > > wrong for your query.
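
For a large sqlline.log, pulling out just the relevant entries could be scripted. Below is a minimal sketch, assuming each log entry starts with a timestamp of the form HH:mm:ss.SSS (as in the excerpt above) and that stack-trace lines follow the entry without a timestamp. The class name CaughtExceptionFinder and the method extract are made up for illustration; only the log path and the two keywords come from this thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class CaughtExceptionFinder {

    // Assumption: a new log entry begins with a timestamp such as "21:44:40.984 ".
    private static final Pattern ENTRY_START =
            Pattern.compile("^\\d{2}:\\d{2}:\\d{2}\\.\\d{3} ");

    /**
     * Returns every FragmentRunner "Caught exception" entry together with the
     * untimestamped lines (the stack trace) that follow it.
     */
    static List<String> extract(List<String> lines) {
        List<String> blocks = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            if (ENTRY_START.matcher(line).find()) {
                // A new entry closes whatever block was being collected.
                if (current != null) {
                    blocks.add(current.toString());
                    current = null;
                }
                if (line.contains("FragmentRunner") && line.contains("Caught exception")) {
                    current = new StringBuilder(line);
                }
            } else if (current != null) {
                // Continuation line: part of the stack trace.
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            blocks.add(current.toString());
        }
        return blocks;
    }

    public static void main(String[] args) throws IOException {
        // Default path is the one mentioned for the binary distribution.
        String path = args.length > 0 ? args[0] : "/var/log/drill/sqlline.log";
        List<String> lines = Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8);
        for (String block : extract(lines)) {
            System.out.println(block);
            System.out.println();
        }
    }
}
```

Run with the log path as the only argument; each printed block is one "Caught exception" entry with its call stack, which is the part worth posting to the list.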
> > > >
> > > > The call stack that you posted (starting from
> > > > org.apache.drill.exec.rpc.user.QueryResultHandler.batchArrived(QueryResultHandler.java:72))
> > > > is from when the Query Result Listener detects that an exception has
> > > > been thrown.
> > > >
> > > > Thanks!
> > > >
> > > > Jinfeng
> > > >
> > >
> >
>
