incubator-drill-user mailing list archives

From Madhu Borkar <backmeup...@gmail.com>
Subject Re: How to load data in Drill
Date Tue, 03 Dec 2013 07:38:29 GMT
Hi Jason and Jinfeng,
Thank you both for taking the time to debug the problem. I have sent my
data to Jinfeng.
Other than Parquet files, can I put my data in HBase (or any other data
source) and query it through Drill?
Please let me know.



On Mon, Dec 2, 2013 at 10:21 PM, Jinfeng Ni <jinfengni99@gmail.com> wrote:

> Hi Jason,
>
> Thanks for offering your help to look at this issue.
>
> I did check whether the file PageReadStatus.java has been changed
> recently.  The output of git log for that file shows the latest change is
> from Sep 9, for "DRILL-221 Add license header to all files".  I thought the
> binary distribution was made after the license header was added.  But you
> are right, there might have been changes after the binary distribution.
>
> Thanks,
>
> Jinfeng
>
>
>
> On Mon, Dec 2, 2013 at 10:03 PM, Jason Altekruse <altekrusejason@gmail.com> wrote:
>
> > Hi Madhu,
> >
> > I would be happy to take a look at this as well. I wrote most of the code
> > we are using to read parquet files, so I should be able to figure out why
> > we are getting an NPE with the files you are reading. I took a look back
> > at the previous thread where this issue was being discussed and noticed
> > that you reported having installed Drill from binaries. Have you tried
> > compiling Drill with a more recent version of the source from our
> > repository?
> >
> > We ended up learning that Apache does not consider binary releases
> > official. While we will obviously be providing them for users in future
> > releases, we gave up on the binaries before we reached the end
> > of the Apache approval process. As such, several bugs were fixed (not
> > necessarily in the parquet reader) between this binary and our final m1
> > source release. Since the release, there have also been code changes made
> > that may solve the issue you are having, so we can test it against the
> > latest development code to see if changes still need to be made to solve
> > the problem.
> >
> > Jinfeng,
> > This also could mean that line 92 that you found in the source does not
> > match what 92 was at the time of building this release, just something to
> > keep in mind if you look at this again.
> >
> > Thanks,
> > Jason Altekruse
> >
> >
> > On Mon, Dec 2, 2013 at 11:38 PM, Jinfeng Ni <jinfengni99@gmail.com> wrote:
> >
> > > Hi Madhu,
> > >
> > > Yes, the log is helpful; I can see the NPE is raised in the storage
> > > engine component ParquetRecordReader, not in the query execution
> > > component.
> > >
> > > Unfortunately, I cannot reproduce this parquet reader NPE problem using
> > > either the sample data (nation.parquet, region.parquet) or other TPCH
> > > parquet files. From the log, I can see the NPE is raised in the
> > > following code:
> > >
> > >     currentPage = new Page(
> > >         bytesIn,
> > >         pageHeader.data_page_header.num_values,
> > >         pageHeader.uncompressed_page_size,
> > >         ParquetStorageEngine.parquetMetadataConverter.getEncoding(pageHeader.data_page_header.repetition_level_encoding),
> > >         ParquetStorageEngine.parquetMetadataConverter.getEncoding(pageHeader.data_page_header.definition_level_encoding),
> > >         ParquetStorageEngine.parquetMetadataConverter.getEncoding(pageHeader.data_page_header.encoding)
> > >     );
> > >
> > > My guess is that either pageHeader or its member data_page_header is
> > > null. But without the parquet file to recreate this NPE, I do not have
> > > a way to verify.
> > >
> > > Would it be possible for you to share your parquet file (after removing
> > > any sensitive data), so that I can recreate the NPE and try to find a
> > > fix for it?
> > >
> > > Thanks!
> > >
> > >
> > >
> > >
> > > On Mon, Dec 2, 2013 at 3:15 PM, Madhu Borkar <backmeupone@gmail.com> wrote:
> > >
> > > > Hi Jinfeng,
> > > >
> > > > Here is the part of the log you are looking for:
> > > > 18:08:12.905 [WorkManager-2] DEBUG o.a.d.e.work.batch.IncomingBuffers - Came up with a list of 0 required fragments.  Fragments {}
> > > > 18:08:16.181 [WorkManager Event Thread] DEBUG o.apache.drill.exec.work.WorkManager - Starting pending task org.apache.drill.exec.work.FragmentRunner@2122d9d0
> > > > 18:08:16.184 [WorkManager-3] DEBUG o.a.drill.exec.work.FragmentRunner - Starting fragment runner. 0:0
> > > > 18:08:16.188 [WorkManager-3] DEBUG o.a.d.e.w.f.RunningFragmentManager - New fragment status was provided to Foreman of memory_use: 0
> > > > batches_completed: 0
> > > > records_completed: 0
> > > > state: RUNNING
> > > > data_processed: 0
> > > > handle {
> > > >   query_id {
> > > >     part1: -3386430666417617918
> > > >     part2: -5241641154650077119
> > > >   }
> > > >   major_fragment_id: 0
> > > >   minor_fragment_id: 0
> > > > }
> > > > running_time: 429655087179513
> > > >
> > > > 18:08:16.237 [WorkManager-3] DEBUG o.a.d.e.s.p.ParquetRecordReader - records to read in this pass: 4000
> > > > 18:08:16.339 [WorkManager-3] DEBUG o.a.drill.exec.work.FragmentRunner - Caught exception while running fragment
> > > > java.lang.NullPointerException: null
> > > >         at org.apache.drill.exec.store.parquet.PageReadStatus.next(PageReadStatus.java:92) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at org.apache.drill.exec.store.parquet.VarLenBinaryReader.readFields(VarLenBinaryReader.java:124) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at org.apache.drill.exec.store.parquet.ParquetRecordReader.next(ParquetRecordReader.java:386) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:95) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.next(ScreenCreator.java:77) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at org.apache.drill.exec.work.FragmentRunner.run(FragmentRunner.java:79) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
> > > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
> > > >         at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> > > > 18:08:16.341 [WorkManager-3] ERROR o.a.d.e.w.AbstractFragmentRunnerListener - Error b7fa738a-1d3a-4b06-acb6-226a9744dbb7: Failure while running fragment.
> > > > java.lang.NullPointerException: null
> > > >         at org.apache.drill.exec.store.parquet.PageReadStatus.next(PageReadStatus.java:92) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at org.apache.drill.exec.store.parquet.VarLenBinaryReader.readFields(VarLenBinaryReader.java:124) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at org.apache.drill.exec.store.parquet.ParquetRecordReader.next(ParquetRecordReader.java:386) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:95) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.next(ScreenCreator.java:77) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at org.apache.drill.exec.work.FragmentRunner.run(FragmentRunner.java:79) ~[java-exec-1.0.0-m1-rebuffed.jar:1.0.0-m1]
> > > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
> > > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
> > > >         at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> > > > 18:08:16.344 [WorkManager-3] DEBUG o.a.d.e.w.f.RunningFragmentManager - New fragment status was provided to Foreman of memory_use: 0
> > > > batches_completed:
> > > > records_completed: 0
> > > > state: FAILED
> > > > data_processed: 0
> > > > handle {
> > > >   query_id {
> > > >     part1: -3386430666417617918
> > > >     part2: -5241641154650077119
> > > >   }
> > > >   major_fragment_id: 0
> > > >   minor_fragment_id: 0
> > > > }
> > > > error {
> > > >   error_id: "b7fa738a-1d3a-4b06-acb6-226a9744dbb7"
> > > >   endpoint {
> > > >     address: "Node-0.etouch.net"
> > > >     user_port: 31010
> > > >     bit_port: 32011
> > > >   }
> > > >   error_type: 0
> > > >   message: "Failure while running fragment. < NullPointerException"
> > > > }
> > > > running_time: 155084239
> > > >
> > > > 18:08:16.346 [WorkManager-3] DEBUG o.a.drill.exec.rpc.user.UserServer - Sending result to client with QueryWritableBatch [header=query_state: FAILED
> > > > query_id {
> > > >   part1: -3386430666417617918
> > > >   part2: -5241641154650077119
> > > > }
> > > > error {
> > > >   error_id: "b7fa738a-1d3a-4b06-acb6-226a9744dbb7"
> > > >   endpoint {
> > > >     address: "Node-0.etouch.net"
> > > >     user_port: 31010
> > > >     bit_port: 32011
> > > >   }
> > > >   error_type: 0
> > > >   message: "Failure while running fragment. < NullPointerException"
> > > > }
> > > > , buffers=[]]
> > > > 18:08:16.351 [WorkManager-3] DEBUG o.a.drill.exec.work.FragmentRunner - Fragment runner complete. 0:
> > > >
> > > > Please let me know if this one helps!
> > > >
> > > >
> > > > On Sun, Dec 1, 2013 at 10:34 PM, Jinfeng Ni <jinfengni99@gmail.com> wrote:
> > > >
> > > > > Hi Tom and Madhu,
> > > > >
> > > > > Regarding the NullPointerException you encountered when you ran the
> > > > > query in sqlline:
> > > > >
> > > > > SELECT * FROM some_parquet_file;
> > > > >
> > > > > Could you please post the debug output from sqlline's log?  If you
> > > > > use sqlline from Drill's binary distribution, its log is in
> > > > > /var/log/drill/sqlline.log.  Please search for the keywords
> > > > > "FragmentRunner" and "Caught exception".  The sqlline.log should
> > > > > contain the call stack from when the NullPointerException is thrown.
> > > > > For instance, here is the log for an IndexOutOfBoundsException in my
> > > > > sqlline.log:
> > > > >
> > > > > 21:44:40.984 [WorkManager-4] DEBUG o.a.drill.exec.work.FragmentRunner - Caught exception while running fragment
> > > > > java.lang.IndexOutOfBoundsException: index: 31999268, length: 4 (expected: range(0, 4194244))
> > > > >         at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1130) ~[netty-buffer-4.0.7.Final.jar:na]
> > > > >         at io.netty.buffer.AbstractByteBuf.getInt(AbstractByteBuf.java:378) ~[netty-buffer-4.0.7.Final.jar:na]
> > > > >         at org.apache.drill.exec.vector.UInt4Vector$Accessor.get(UInt4Vector.java:188) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > > >         at org.apache.drill.exec.vector.VarBinaryVector$Mutator.setValueCount(VarBinaryVector.java:355) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > > >         at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.doWork(RemovingRecordBatch.java:92) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.next(AbstractSingleRecordBatch.java:63) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.next(AbstractSingleRecordBatch.java:42) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.next(AbstractSingleRecordBatch.java:42) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > > >         at org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.next(LimitRecordBatch.java:89) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > > >         at org.apache.drill.exec.record.AbstractSingleRecordBatch.next(AbstractSingleRecordBatch.java:42) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > > >         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.next(ScreenCreator.java:77) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > > >         at org.apache.drill.exec.work.FragmentRunner.run(FragmentRunner.java:79) ~[drill-java-exec-1.0.0-m2-incubating-SNAPSHOT-rebuffed.jar:1.0.0-m2-incubating-SNAPSHOT]
> > > > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
> > > > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
> > > > >         at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> > > > > 21:44:40.990 [WorkManager-4] ERROR o.a.d.e.w.AbstractFragmentRunnerListener - Error c8efdbf1-9a6f-427c-ab90-ce16002904af: Failure while running fragment.
> > > > >
> > > > > I need the call stack from when the NPE is thrown, to see what went
> > > > > wrong for your query.
> > > > >
> > > > > The call stack that you posted (starting from
> > > > > org.apache.drill.exec.rpc.user.QueryResultHandler.batchArrived(QueryResultHandler.java:72))
> > > > > is from the point where the query result listener detects that an
> > > > > exception has been thrown.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Jinfeng
> > > > >
> > > >
> > >
> >
>
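The guess in the quoted message above, that either pageHeader or its member data_page_header is null, could be narrowed down with explicit null checks placed before the Page constructor call. The following is only a minimal sketch: PageHeader and DataPageHeader here are hypothetical stand-ins that carry just the field names quoted in the thread, not the real generated Parquet classes, and describeNull is an illustrative helper that does not exist in Drill.

```java
public class PageHeaderNullCheck {

    // Hypothetical stand-in for the data page header; only a field named in the thread.
    static class DataPageHeader {
        int num_values;
    }

    // Hypothetical stand-in for the page header read from the file.
    static class PageHeader {
        DataPageHeader data_page_header; // left null unless explicitly set
        int uncompressed_page_size;
    }

    // Illustrative helper: reports which reference would trigger the NPE, if any.
    static String describeNull(PageHeader pageHeader) {
        if (pageHeader == null) {
            return "pageHeader is null";
        }
        if (pageHeader.data_page_header == null) {
            return "pageHeader.data_page_header is null";
        }
        return "neither is null";
    }

    public static void main(String[] args) {
        // Case 1: the header itself is missing.
        System.out.println(describeNull(null));

        // Case 2: the header exists but carries no data page header.
        System.out.println(describeNull(new PageHeader()));

        // Case 3: both references are present.
        PageHeader complete = new PageHeader();
        complete.data_page_header = new DataPageHeader();
        System.out.println(describeNull(complete));
    }
}
```

Logging the result of such a check just before the failing statement would show which of the two references is null for a given file, without needing a debugger attached to the running process.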
