From: "Namit Jain (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Date: Tue, 12 Oct 2010 16:56:33 -0400 (EDT)
Subject: [jira] Created: (HIVE-1702) optimize JDBM to make mapjoin faster
Message-ID: <17421718.103641286916993463.JavaMail.jira@thor>

optimize JDBM to make mapjoin faster
------------------------------------

                 Key: HIVE-1702
                 URL: https://issues.apache.org/jira/browse/HIVE-1702
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Namit Jain
            Assignee: Liyin Tang

HTree.get() accounts for 70% of the total time. A Bloom filter would help a lot here: if we know for sure that a given key is not in JDBM, we can skip the unneeded get(). (We can generate the Bloom filter when doing the JDBM sink, and read it into memory when doing the read.)

Copied from https://issues.apache.org/jira/browse/HIVE-1700
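
As a rough illustration of the idea, here is a minimal sketch of a Bloom filter that is filled on the sink side, serialized, and then consulted before each lookup on the read side. Everything in it is an assumption made for illustration: the class KeyBloomFilter, its methods, the sizing (65536 bits, 3 hashes), and the use of a plain java.util.Map as a stand-in for the JDBM HTree. It is not Hive's or JDBM's actual code.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Illustrative Bloom filter (hypothetical class, not part of Hive or JDBM). */
final class KeyBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    KeyBloomFilter(int numBits, int numHashes) {
        this(new BitSet(numBits), numBits, numHashes);
    }

    private KeyBloomFilter(BitSet bits, int numBits, int numHashes) {
        this.bits = bits;
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** i-th bit position for a key, derived from hashCode() by double hashing. */
    private int position(Object key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1; // odd second hash
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /** Sink side: record every key that is written to the table. */
    void add(Object key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    /** false: the key was definitely never added; true: it may have been. */
    boolean mightContain(Object key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    /** Serialize the bit array so it can be stored next to the table. */
    byte[] toBytes() {
        return bits.toByteArray();
    }

    /** Read side: rebuild the filter in memory from the stored bytes. */
    static KeyBloomFilter fromBytes(byte[] data, int numBits, int numHashes) {
        return new KeyBloomFilter(BitSet.valueOf(data), numBits, numHashes);
    }

    /** Guarded lookup: only touch the (expensive) table if the filter says "maybe". */
    static <K, V> V lookup(KeyBloomFilter filter, Map<K, V> table, K key) {
        if (!filter.mightContain(key)) {
            return null; // definitely absent, skip the get()
        }
        return table.get(key);
    }

    public static void main(String[] args) {
        // The Map stands in for the JDBM HTree (assumption for this sketch).
        Map<String, String> table = new HashMap<>();
        KeyBloomFilter sinkFilter = new KeyBloomFilter(1 << 16, 3);
        for (int k = 0; k < 1000; k++) {
            table.put("key" + k, "row" + k);
            sinkFilter.add("key" + k);
        }
        KeyBloomFilter readFilter = fromBytes(sinkFilter.toBytes(), 1 << 16, 3);
        System.out.println(lookup(readFilter, table, "key42"));       // row42
        System.out.println(lookup(readFilter, table, "no-such-key")); // null
    }
}
```

A Bloom filter never reports a stored key as absent, so the guarded lookup returns the same results as an unguarded get(). A false positive only costs one extra get(). How much time this saves depends on how many probe keys miss the table, which the issue does not quantify.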