Date: Mon, 2 Mar 2015 21:36:06 +0000 (UTC)
From: "Colin Patrick McCabe (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-7845) Compress block reports

[ https://issues.apache.org/jira/browse/HDFS-7845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343805#comment-14343805 ]

Colin Patrick McCabe commented on HDFS-7845:
--------------------------------------------

As [~arpitagarwal] pointed out, we're not dealing with a series of ints, but with a series of protobuf vints (variable-length ints). [~clamb] did some tests with a block report and got around 50% (if I'm remembering correctly?) [~clamb], can you comment on whether those tests were done with vints or regular integers?

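To make the vint versus regular integer distinction concrete, here is a minimal sketch of protobuf-style base-128 varint encoding, compared against a fixed 8-byte integer. The function names are hypothetical and this is not the Hadoop or protobuf implementation; it handles only non-negative values.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (protobuf style)."""
    if value < 0:
        raise ValueError("this sketch handles only non-negative values")
    out = bytearray()
    while True:
        low_bits = value & 0x7F      # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)         # high bit clear: last byte
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


if __name__ == "__main__":
    # 300 encodes to the two bytes AC 02, the example used in the protobuf docs.
    for value in (1, 300, 1 << 30, (1 << 63) - 1):
        encoded = encode_varint(value)
        assert decode_varint(encoded) == (value, len(encoded))
        print(f"{value}: varint={len(encoded)} bytes, fixed 64-bit=8 bytes")
```

Because small values already occupy fewer bytes as varints, a compression ratio measured on fixed-width integers may not carry over to the varint-encoded data.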
We should probably make sure we're doing the compression test with what we're actually sending, which is going to be a 3-tuple of [ block_id, genstamp, length ], all encoded as protobuf vints. Sorting is an interesting idea, but I wonder if the effectiveness diminishes when you interleave the 3 numbers? Of course we could separate them, but then our L1 / L2 cache hit rates plummet when actually processing the blocks.

> Compress block reports
> ----------------------
>
>                 Key: HDFS-7845
>                 URL: https://issues.apache.org/jira/browse/HDFS-7845
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7836
>            Reporter: Colin Patrick McCabe
>            Assignee: Charles Lamb
>
> We should optionally compress block reports using a low-cpu codec such as lz4 or snappy.
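The question of whether interleaving the three numbers hurts compression could be checked with a small harness along these lines. This is only a sketch under stated assumptions: the block values are synthetic and hypothetical, the helper names are made up for illustration, and zlib at level 1 stands in for lz4 or snappy because neither is in the Python standard library. The numbers it prints say nothing about real block reports.

```python
import random
import zlib


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (protobuf style)."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def make_blocks(count: int, seed: int = 0):
    """Generate synthetic (block_id, genstamp, length) tuples. Hypothetical data."""
    rng = random.Random(seed)
    return [
        (
            rng.randrange(1 << 30, 1 << 31),          # assumed block id range
            rng.randrange(1000, 2000),                # assumed genstamp range
            rng.randrange(1, 128 * 1024 * 1024 + 1),  # assumed max block length
        )
        for _ in range(count)
    ]


def encode_interleaved(blocks) -> bytes:
    """Layout: id0 gs0 len0 id1 gs1 len1 ... (one tuple after another)."""
    out = bytearray()
    for block in blocks:
        for value in block:
            out += encode_varint(value)
    return bytes(out)


def encode_columnar(blocks) -> bytes:
    """Layout: all ids, then all genstamps, then all lengths."""
    out = bytearray()
    for column in zip(*blocks):
        for value in column:
            out += encode_varint(value)
    return bytes(out)


def compressed_size(payload: bytes) -> int:
    """Size after zlib level 1, used here as a stand-in for a low-CPU codec."""
    return len(zlib.compress(payload, 1))


if __name__ == "__main__":
    blocks = make_blocks(10000)
    by_id = sorted(blocks)  # tuples sort by block_id first
    for label, data in [
        ("interleaved, unsorted", encode_interleaved(blocks)),
        ("interleaved, sorted", encode_interleaved(by_id)),
        ("columnar, unsorted", encode_columnar(blocks)),
        ("columnar, sorted", encode_columnar(by_id)),
    ]:
        print(f"{label}: raw={len(data)} compressed={compressed_size(data)}")
```

Running the same harness on the varint-encoded payload of a real block report, with the actual codec, would answer the question more reliably than synthetic data can.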