hadoop-hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-318) [Hive] union all queries broken - all kinds of problems
Date Wed, 18 Mar 2009 18:05:50 GMT

     [ https://issues.apache.org/jira/browse/HIVE-318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-318:
----------------------------

    Status: Open  (was: Patch Available)

> [Hive] union all queries broken - all kinds of problems
> -------------------------------------------------------
>
>                 Key: HIVE-318
>                 URL: https://issues.apache.org/jira/browse/HIVE-318
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>            Priority: Blocker
>         Attachments: hive.318.2.patch, hive.318.3.patch, hive.318.4.patch, hive.318.patch
>
>
> 1. Map-only job : same input
>    Hangs because the mapper tries to open the same input twice, and the Hadoop filesystem complains.
>    Fix: initialize only once - keep state at the Operator level to track this. Should do the same for Close.
> 2. Map-only job : different inputs
>    Loss of data due to rename.
>    Fix: change rename to move files to the directory.
> 3. Map-only job in subquery + RedSink: works currently
> 4. Two variables, so 4 sub-cases:
>    Number of sub-queries having map-reduce jobs (1 or 2)
>    Operator after Union (RS = ReduceSink or FS = FileSink)
>    
> a.   Number of sub-queries having map-reduce jobs: 1
>      Operator after Union: RS
>      Can be done in 2MR - really difficult with current infrastructure.
>      Should do with 3 MR jobs now - break on top of UNION. 
>      Future optimization: move operators between Union and RS before Union.
> b.   Number of sub-queries having map-reduce jobs: 2
>      Operator after Union: RS
>      Needs 3MR - Should do with 3 MR jobs - break on top of UNION. 
>      Future optimization: move operators between Union and RS before Union.
> c.   Number of sub-queries having map-reduce jobs: 1
>      Operator after Union: FS
>      Can be done in 1MR - really difficult with current infrastructure.
>      Can be easily done with 2 MR by removing UNION and cloning operators between Union and FS.
>      Should do with 3 MR jobs now - break on top of UNION.
>      Followup optimization: 2MR should be able to handle this.
> d.   Number of sub-queries having map-reduce jobs: 2
>      Operator after Union: FS
>      Can be easily done with 2 MR by removing UNION and cloning operators between Union and FS.
>      Should do with 3 MR jobs now - break on top of UNION.
>      Followup optimization: 2MR should be able to handle this.
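
The fix proposed in item 1 (initialize only once, with the state kept on the operator, and the same for Close) can be sketched roughly as below. This is only an illustration under assumptions: `OnceOnlyOperator` and its `events` list are hypothetical names, not Hive's actual Operator class or the code in the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical operator used only for illustration; not Hive's Operator class.
class OnceOnlyOperator {
    // State kept at the operator level, as item 1 suggests.
    private boolean initialized = false;
    private boolean closed = false;

    // Records what actually happened, so the behaviour is easy to inspect.
    final List<String> events = new ArrayList<>();

    // May be called once per parent branch; only the first call opens anything.
    void initialize() {
        if (initialized) {
            return;
        }
        initialized = true;
        events.add("open");
    }

    // Mirrors initialize(): closes at most once, and only if it was opened.
    void close() {
        if (!initialized || closed) {
            return;
        }
        closed = true;
        events.add("close");
    }
}
```

With two branches of a union feeding the same operator, both would call `initialize()` and `close()`, but the underlying resource would be opened and closed a single time.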

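The fix proposed in item 2 (move files into the target directory instead of renaming) might look something like the sketch below. It is a minimal illustration that assumes a local filesystem and uses `java.nio.file` rather than the Hadoop FileSystem API; the class name `MoveIntoDirectory`, the method `moveFiles`, and the choice to prefix each file with its source directory name are all assumptions made for the example, not what the patch does.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; uses the local filesystem, not HDFS.
class MoveIntoDirectory {
    // Moves every entry of srcDir into destDir, leaving entries already in destDir alone.
    static void moveFiles(Path srcDir, Path destDir) throws IOException {
        Files.createDirectories(destDir);

        // Snapshot the listing first so we do not move entries while iterating.
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(srcDir)) {
            for (Path p : stream) {
                entries.add(p);
            }
        }

        for (Path p : entries) {
            // Assumption: prefix with the source directory name so that two
            // inputs producing the same file name do not collide.
            Path target = destDir.resolve(srcDir.getFileName() + "_" + p.getFileName());
            // No REPLACE_EXISTING option: an unexpected collision fails loudly
            // instead of silently dropping data.
            Files.move(p, target);
        }
    }
}
```

Calling `moveFiles` once per input leaves the output of every input side by side in the destination, whereas renaming each input directory to the same destination name would keep only one of them.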
-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

