From: Russell Jurney
Date: Tue, 22 May 2012 14:43:26 -0700
Subject: Re: Can serialized Avro records be efficiently compared without deserializing?
To: user@avro.apache.org

+1 I need this kind of access too, to roll back Avro records that fail to finish writing when Python dies from a UTF error.

Russell Jurney
http://datasyndrome.com

On May 22, 2012, at 1:22 PM, Jonathan Coveney wrote:

> Imagine I use Avro to serialize an object (without loss of generality, let's say an array of longs). I'm curious whether it is possible to compare those arrays without deserializing, i.e. look at the bytes in memory or on disk and do the comparison based on those bytes (i.e. the raw comparison that Hadoop does in the shuffle sort).
>
> I poked around the documentation but wasn't sure where to look.
>
> Thanks for your help!
> Jon