hadoop-hive-dev mailing list archives

From Zheng Shao <zsh...@gmail.com>
Subject Re: [VOTE] hive release candidate 0.4.1-rc0
Date Tue, 03 Nov 2009 08:57:20 GMT
Hi Min,

What is "zip"? Which codec does it use?
I think it's probably a problem with the codec.

Can you try GzipCodec? Most likely that will work fine.
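
For a single session, switching codecs is just a config change. A minimal
sketch (property name as in 0.20-era Hadoop; adjust for your version):

  hive> SET hive.exec.compress.output=true;
  hive> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;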

Zheng

On Mon, Nov 2, 2009 at 1:08 AM, Min Zhou <coderplay@gmail.com> wrote:

> If it returns more than 0 rows, that error will never happen.
>
> Thanks,
> Min
>
> On Mon, Nov 2, 2009 at 5:06 PM, Min Zhou <coderplay@gmail.com> wrote:
> > No, it's zip.
> >
> > On Mon, Nov 2, 2009 at 4:03 PM, Zheng Shao <zshao9@gmail.com> wrote:
> >> Do you mean gzip codec?
> >> I think an empty gzip file should be 20 bytes. There might be some
> >> problem with the gzip codec (or native gzip codec) on your cluster.
> >> Can you check the map task logs to see whether they contain a line
> >> saying "Successfully loaded native gzip lib"?
> >>
> >> You can try any query that produces empty results - it should go
> >> through the same code path.
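> >>
> >> For reference, the 20-byte figure is easy to confirm with plain
> >> java.util.zip (this uses the JDK gzip writer rather than Hadoop's
> >> GzipCodec, but the on-disk format is the same):
> >>
> >>   import java.io.ByteArrayOutputStream;
> >>   import java.util.zip.GZIPOutputStream;
> >>
> >>   public class EmptyGzipSize {
> >>     public static void main(String[] args) throws Exception {
> >>       ByteArrayOutputStream bytes = new ByteArrayOutputStream();
> >>       new GZIPOutputStream(bytes).close(); // write no data, just finish the stream
> >>       // 10-byte header + 2-byte empty deflate block + 8-byte CRC/size trailer
> >>       System.out.println(bytes.size());    // prints 20
> >>     }
> >>   }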
> >>
> >> Zheng
> >>
> >> On Sun, Nov 1, 2009 at 11:07 PM, Min Zhou <coderplay@gmail.com> wrote:
> >>> We use the zip codec by default.
> >>> Some repeated lines were omitted from the error stack I posted; they were all:
> >>> at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
> >>>
> >>>
> >>> Thanks,
> >>> Min
> >>>
> >>> On Mon, Nov 2, 2009 at 2:57 PM, Zheng Shao <zshao9@gmail.com> wrote:
> >>>> Min, can you check the default compression codec in your hadoop conf?
> >>>> The 8-byte file must be a compressed file, produced by that codec,
> >>>> representing a 0-length file.
> >>>>
> >>>> It seems that codec was not able to decompress the stream.
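> >>>>
> >>>> A quick, throwaway way to see what the job actually picks up (property
> >>>> names are the 0.20-era ones; adjust for your Hadoop version):
> >>>>
> >>>>   import org.apache.hadoop.mapred.JobConf;
> >>>>
> >>>>   public class ShowCodecConf {
> >>>>     public static void main(String[] args) {
> >>>>       // JobConf loads core-site.xml and mapred-site.xml from the classpath
> >>>>       JobConf conf = new JobConf();
> >>>>       System.out.println(conf.get("mapred.output.compression.codec"));
> >>>>       System.out.println(conf.get("io.compression.codecs"));
> >>>>     }
> >>>>   }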
> >>>>
> >>>> Zheng
> >>>>
> >>>> On Sun, Nov 1, 2009 at 10:49 PM, Min Zhou <coderplay@gmail.com> wrote:
> >>>>> I think there may still be a bug in this release.
> >>>>>
> >>>>> hive>select stuff_status from auctions where auction_id='2591238417'
> >>>>> and pt='20091027';
> >>>>>
> >>>>> auctions is a table partitioned by date; it is stored as a textfile
> >>>>> without compression. The query above should return 0 rows,
> >>>>> but when hive.exec.compress.output=true, hive crashes with a
> >>>>> StackOverflowError:
> >>>>>
> >>>>> java.lang.StackOverflowError
> >>>>>        at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
> >>>>>        at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
> >>>>>        at java.lang.ref.Finalizer.register(Finalizer.java:72)
> >>>>>        at java.lang.Object.<init>(Object.java:20)
> >>>>>        at java.net.SocketImpl.<init>(SocketImpl.java:27)
> >>>>>        at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
> >>>>>        at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
> >>>>>        at java.net.Socket.setImpl(Socket.java:434)
> >>>>>        at java.net.Socket.<init>(Socket.java:68)
> >>>>>        at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
> >>>>>        at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
> >>>>>        at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
> >>>>>        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
> >>>>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
> >>>>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
> >>>>>        at java.io.DataInputStream.read(DataInputStream.java:132)
> >>>>>        at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
> >>>>>        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
> >>>>>        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
> >>>>>        at java.io.InputStream.read(InputStream.java:85)
> >>>>>        at org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
> >>>>>        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
> >>>>>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
> >>>>>        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
> >>>>>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
> >>>>>        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
> >>>>>
> >>>>> Each mapper produces an 8-byte deflate file on HDFS (we set
> >>>>> hive.merge.mapfiles=false). Their hex representation is:
> >>>>>
> >>>>> 78 9C 03 00 00 00 00 01
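> >>>>>
> >>>>> Those 8 bytes are exactly what zlib emits for zero input bytes, which
> >>>>> can be checked with the JDK's Deflater, independent of the Hadoop
> >>>>> codec classes:
> >>>>>
> >>>>>   import java.io.ByteArrayOutputStream;
> >>>>>   import java.util.zip.DeflaterOutputStream;
> >>>>>
> >>>>>   public class EmptyDeflate {
> >>>>>     public static void main(String[] args) throws Exception {
> >>>>>       ByteArrayOutputStream bytes = new ByteArrayOutputStream();
> >>>>>       new DeflaterOutputStream(bytes).close(); // compress zero input bytes
> >>>>>       for (byte b : bytes.toByteArray())
> >>>>>         System.out.printf("%02X ", b); // prints: 78 9C 03 00 00 00 00 01
> >>>>>     }
> >>>>>   }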
> >>>>>
> >>>>> This is the reason FetchOperator.java:272 is called recursively,
> >>>>> which caused the stack overflow error.
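> >>>>>
> >>>>> The overflow pattern is easy to reproduce in isolation. A toy model
> >>>>> (a hypothetical sketch, not Hive's actual FetchOperator code): if
> >>>>> hitting an empty split makes the operator recurse to the next split
> >>>>> instead of looping, enough empty splits exhaust the stack:
> >>>>>
> >>>>>   public class RecursiveFetchToy {
> >>>>>     static final int EMPTY_SPLITS = 1000000; // many empty output files
> >>>>>     static String getNextRow(int split) {
> >>>>>       if (split >= EMPTY_SPLITS) return null; // no splits left
> >>>>>       return getNextRow(split + 1); // recurse past the empty split
> >>>>>     }
> >>>>>     public static void main(String[] args) {
> >>>>>       System.out.println(getNextRow(0)); // throws StackOverflowError
> >>>>>     }
> >>>>>   }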
> >>>>>
> >>>>> Regards,
> >>>>> Min
> >>>>>
> >>>>>
> >>>>> On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao <zshao9@gmail.com> wrote:
> >>>>>> I have made a release candidate 0.4.1-rc0.
> >>>>>>
> >>>>>> We've fixed several critical bugs in hive release 0.4.0. We need
> >>>>>> hive release 0.4.1 out ASAP.
> >>>>>>
> >>>>>> Here is the list of changes:
> >>>>>>
> >>>>>>    HIVE-884. Metastore Server should call System.exit() on error.
> >>>>>>    (Zheng Shao via pchakka)
> >>>>>>
> >>>>>>    HIVE-864. Fix map-join memory-leak.
> >>>>>>    (Namit Jain via zshao)
> >>>>>>
> >>>>>>    HIVE-878. Update the hash table entry before flushing in Group By
> >>>>>>    hash aggregation (Zheng Shao via namit)
> >>>>>>
> >>>>>>    HIVE-882. Create a new directory every time for scratch.
> >>>>>>    (Namit Jain via zshao)
> >>>>>>
> >>>>>>    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)
> >>>>>>
> >>>>>>    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via zshao)
> >>>>>>
> >>>>>>    HIVE-883. URISyntaxException when partition value contains special chars.
> >>>>>>    (Zheng Shao via namit)
> >>>>>>
> >>>>>>
> >>>>>> Please vote.
> >>>>>>
> >>>>>> --
> >>>>>> Yours,
> >>>>>> Zheng
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> My research interests are distributed systems, parallel computing and
> >>>>> bytecode based virtual machine.
> >>>>>
> >>>>> My profile:
> >>>>> http://www.linkedin.com/in/coderplay
> >>>>> My blog:
> >>>>> http://coderplay.javaeye.com
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Yours,
> >>>> Zheng
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> My research interests are distributed systems, parallel computing and
> >>> bytecode based virtual machine.
> >>>
> >>> My profile:
> >>> http://www.linkedin.com/in/coderplay
> >>> My blog:
> >>> http://coderplay.javaeye.com
> >>>
> >>
> >>
> >>
> >> --
> >> Yours,
> >> Zheng
> >>
> >
> >
> >
> > --
> > My research interests are distributed systems, parallel computing and
> > bytecode based virtual machine.
> >
> > My profile:
> > http://www.linkedin.com/in/coderplay
> > My blog:
> > http://coderplay.javaeye.com
> >
>
>
>
> --
> My research interests are distributed systems, parallel computing and
> bytecode based virtual machine.
>
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com
>



-- 
Yours,
Zheng
