hadoop-pig-dev mailing list archives

From jian yi <eyj...@gmail.com>
Subject Re: Will Pig support SQL?
Date Mon, 08 Feb 2010 10:02:52 GMT
Hi Jeff,

Thank you, Jeff.
I know Hive can handle skewed joins, but I think that is not enough:
1. It incurs the cost of sampling
2. It can't control the size of a task
3. It is not exact
4. It requires using Hive or Pig

I think adding a balance step between map and reduce is a fundamental
solution to the skew problem. Maybe I need to explain it in more detail.

Regards
Jian YI

2010/2/8 Jeff Hammerbacher <hammer@cloudera.com>

> Hey Jian,
>
> Hive supports arbitrary procedural languages through Hadoop Streaming; see
> http://wiki.apache.org/hadoop/Hive/LanguageManual/Transform for more.
>
> Also, both Hive and Pig have support for handling skewed joins if you use
> their higher-level interface. See
> https://issues.apache.org/jira/browse/HIVE-562 and
> http://wiki.apache.org/pig/PigSkewedJoinSpec.
>
> Thanks,
> Jeff
>
> On Sun, Feb 7, 2010 at 4:13 AM, jian yi <eyjian@gmail.com> wrote:
>
> > Hey Jeff,
> >
> > Thank you, Jeff.
> > By "procedure" I mean a procedural language, like Oracle PL/SQL, which is
> > very helpful for migrating old services. We want to build a data warehouse
> > based on the MapReduce engine. I plan to optimize MapReduce to solve the
> > skew problem by adding a balance between map and reduce. Please refer to
> > http://bbs.hadoopor.com/thread-521-1-1.html
> >
> > Regards,
> > Jian
> >
> > 2010/2/7 Jeff Hammerbacher <hammer@cloudera.com>
> >
> > > Hey Jian,
> > >
> > > I'm not sure what you mean by "Hive don't support procedure", but in any
> > > case, the Pig team has stated that they will support SQL over the Pig
> > > execution engine. See https://issues.apache.org/jira/browse/PIG-824.
> > >
> > > Regards,
> > > Jeff
> > >
> > > On Sat, Feb 6, 2010 at 6:16 PM, jian yi <eyjian@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > SQL is very helpful to develop data warehouse, but Hive don't support
> > > > procedure. If Pig supports SQL, it will be more powerful.
> > > >
> > >
> >
>
