hadoop-common-dev mailing list archives

From "lengwuqing (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-3601) Hive as a contrib project
Date Thu, 28 Aug 2008 07:09:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626454#action_12626454
] 

lengwuqing edited comment on HADOOP-3601 at 8/28/08 12:07 AM:
--------------------------------------------------------------

When I tried the Hive system, I found these issues:
   1. NullHiveObject.getFields always returns null, but some callers (such as NaiiveSerializer)
call .getSize() on the return value.    // As a workaround I return "new ArrayList<SerDeField>()",
but I am not sure this is OK.
   2. In joinOperator.close(), the l4j object is null.      // I added a null check in this
function.
   3. Sometimes (when Hadoop is under heavy load), even with the same data and the same Hive-QL,
the results differ.  Whenever the logic error happens, the output contains xxxx_r_000022_0
and another file with a different suffix: xxxx_r_000022_1.   // This issue is reproducible, and I
think it is a critical bug for me. I don't know in which cases the result files get the different
suffixes _0 and _1, but I guess this is a good hint for debugging the issue.

   Could anyone from Facebook please give me a hand: why does case #3 happen?
   "No one is there", nobody cares about me. My god, I suggest that FB pay us $ so that we can
improve Hive and move it forward faster.  Hahahahahahahahahahahah
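
To illustrate the workaround from issue 1, here is a minimal sketch of returning an empty list instead of null. The nested SerDeField and NullHiveObject types below are simplified stand-ins written only for this example, not the real Hive classes, and the caller uses List.size() where the message mentions .getSize().

```java
import java.util.ArrayList;
import java.util.List;

public class NullFieldsSketch {
    // Hypothetical stand-in for Hive's SerDeField; the real interface is not shown here.
    interface SerDeField {
        String getName();
    }

    // Hypothetical stand-in for NullHiveObject.
    static class NullHiveObject {
        // Returns an empty list rather than null, so callers can use the result directly.
        List<SerDeField> getFields() {
            return new ArrayList<SerDeField>();
        }
    }

    // Mimics a caller that uses the returned list without a null check.
    static int countFields(NullHiveObject obj) {
        return obj.getFields().size();
    }

    public static void main(String[] args) {
        // With a null return this call would throw a NullPointerException.
        System.out.println(countFields(new NullHiveObject()));
    }
}
```

Whether an empty field list is the right answer for every caller of getFields is exactly the open question in issue 1; the sketch only shows that it removes the null dereference.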
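
As a possible aid for narrowing down issue 3, the sketch below groups output file names by reduce task and reports the tasks that have more than one numbered file. It assumes the names follow the shape <prefix>_r_<six-digit task number>_<number>, as in the two names quoted above, and that the trailing number is the task attempt number, as in Hadoop task attempt IDs. If that assumption holds, a _1 file would suggest that reducer 22 ran a second time, for example after a failed or speculative attempt. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AttemptSuffixSketch {
    // Assumed name shape: <prefix>_r_<6-digit task number>_<attempt number>
    private static final Pattern REDUCE_ATTEMPT =
            Pattern.compile("^(.*_r_\\d{6})_(\\d+)$");

    // Groups names by task and keeps only the tasks that have more than one attempt.
    static Map<String, List<Integer>> tasksWithSeveralAttempts(List<String> names) {
        Map<String, List<Integer>> byTask = new TreeMap<>();
        for (String name : names) {
            Matcher m = REDUCE_ATTEMPT.matcher(name);
            if (!m.matches()) {
                continue; // ignore names that do not fit the assumed shape
            }
            String task = m.group(1);
            List<Integer> attempts = byTask.get(task);
            if (attempts == null) {
                attempts = new ArrayList<>();
                byTask.put(task, attempts);
            }
            attempts.add(Integer.parseInt(m.group(2)));
        }
        Map<String, List<Integer>> repeated = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : byTask.entrySet()) {
            if (e.getValue().size() > 1) {
                repeated.put(e.getKey(), e.getValue());
            }
        }
        return repeated;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "xxxx_r_000021_0", "xxxx_r_000022_0", "xxxx_r_000022_1");
        // prints {xxxx_r_000022=[0, 1]}
        System.out.println(tasksWithSeveralAttempts(names));
    }
}
```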



  
> Hive as a contrib project
> -------------------------
>
>                 Key: HADOOP-3601
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3601
>             Project: Hadoop Core
>          Issue Type: Wish
>    Affects Versions: 0.17.2
>         Environment: N/A
>            Reporter: Joydeep Sen Sarma
>            Priority: Minor
>         Attachments: hive.tgz, hive.tgz, HiveTutorial.pdf
>
>   Original Estimate: 1080h
>  Remaining Estimate: 1080h
>
> Hive is a data warehouse built on top of flat files (stored primarily in HDFS). It includes:
> - Data Organization into Tables with logical and hash partitioning
> - A Metastore to store metadata about Tables/Partitions etc
> - A SQL like query language over object data stored in Tables
> - DDL commands to define and load external data into tables
> Hive's query language is executed using Hadoop map-reduce as the execution engine. Queries
> can use either single stage or multi-stage map-reduce. Hive has a native format for tables
> - but can handle any data set (for example json/thrift/xml) using an IO library framework.
> Hive uses Antlr for query parsing, Apache JEXL for expression evaluation and may use
> Apache Derby as an embedded database for MetaStore. Antlr has a BSD license and should be
> compatible with Apache license.
> We are currently thinking of contributing to the 0.17 branch as a contrib project (since
> that is the version under which it will get tested internally) - but looking for advice on
> the best release path.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

