From: Keith Turner
To: user@accumulo.apache.org
Date: Wed, 15 Apr 2015 10:20:58 -0400
Subject: Re: Unexpected aliasing from RFile getTopValue()

On Tue, Apr 14, 2015 at 9:43 PM, Christopher wrote:

> Well, it depends on the behavior of the iterator beneath it. Most
> iterators would probably not want to copy, for performance reasons, but an
> iterator could choose to break that convention (or may need to). It is kind
> of unintuitive that a developer would need to understand the behavior of
> the iterator beneath it, so a best practice is probably to just copy inside
> your iterator, if you need a copy.
>
> I know we've discussed trying to revamp the iterator API before, so if you
> think there's something we could do to make this a bit more intuitive,
> that'd be very welcome and appreciated.

Random thought on revamp. Immutable key values with enough primitives to
make most operations efficient (avoid constant alloc/copy) might be
something to consider for the iterator API.

> On Tue, Apr 14, 2015 at 8:57 PM Dylan Hutchison wrote:
>
>> While debugging a custom iterator today to find the source of a logical
>> error, I discovered something an iterator developer may not expect. The
>> getTopValue() of RFile returns a reference to the RFile's internal Value
>> private variable. The private variable is modified inside RFile via
>>
>>     val.readFields(currBlock);
>>
>> which means that if an iterator stores the reference from getTopValue(),
>> that is, without copying the Value to a new Object, then the value will be
>> updated in the iterator when the RFile's next() method is called.
>>
>> Here is an example snippet to demonstrate:
>>
>>     Value v1 = source.getTopValue();
>>     source.next(); // v1 is modified!
>>
>> The following code would not have a problem:
>>
>>     Value v1 = new Value(source.getTopValue());
>>     source.next();
>>
>> I bet this is done for performance reasons. Is this expected?
>>
>> Regards, Dylan
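For anyone who wants to see the aliasing from the thread without an Accumulo install, here is a minimal, self-contained sketch. Note that MutableValue and ReusingSource are hypothetical stand-ins written for this illustration. They are not Accumulo's Value or RFile classes, and they only mimic the pattern Dylan describes: a source that hands out a reference to one internal value object and overwrites it on next().

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class AliasingDemo {

    // Hypothetical stand-in for a mutable value type (NOT Accumulo's Value).
    static final class MutableValue {
        private byte[] data = new byte[0];

        MutableValue() {}

        // Copy constructor: takes a private copy of the other value's bytes.
        MutableValue(MutableValue other) {
            this.data = Arrays.copyOf(other.data, other.data.length);
        }

        void set(byte[] bytes) {
            this.data = bytes;
        }

        @Override
        public String toString() {
            return new String(data, StandardCharsets.UTF_8);
        }
    }

    // Hypothetical source (NOT RFile) that reuses a single MutableValue
    // instance and overwrites its contents on every call to next().
    static final class ReusingSource {
        private final Iterator<String> entries;
        private final MutableValue top = new MutableValue();

        ReusingSource(List<String> values) {
            this.entries = values.iterator();
            next(); // position on the first entry
        }

        // Returns a reference to the internal object, not a copy.
        MutableValue getTopValue() {
            return top;
        }

        void next() {
            if (entries.hasNext()) {
                top.set(entries.next().getBytes(StandardCharsets.UTF_8));
            }
        }
    }

    // Holds on to the reference: the caller sees the value change underneath it.
    static String readAliased() {
        ReusingSource source = new ReusingSource(Arrays.asList("first", "second"));
        MutableValue v1 = source.getTopValue();
        source.next(); // v1 now reflects the second entry
        return v1.toString();
    }

    // Copies before advancing: the caller keeps the value it originally read.
    static String readCopied() {
        ReusingSource source = new ReusingSource(Arrays.asList("first", "second"));
        MutableValue v1 = new MutableValue(source.getTopValue());
        source.next(); // v1 is unaffected
        return v1.toString();
    }

    public static void main(String[] args) {
        System.out.println("aliased: " + readAliased()); // prints "aliased: second"
        System.out.println("copied:  " + readCopied());  // prints "copied:  first"
    }
}
```

The copy in readCopied() costs one allocation per entry, which is presumably the overhead the reuse avoids, so it seems reasonable to copy only in iterators that actually keep a value across a call to next().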