From: Geoffry Roberts
To: Accumulo
Reply-To: user@accumulo.apache.org
Date: Fri, 25 Apr 2014 10:53:35 -0400
Subject: Re: Embedded Mutations: Is this kind of thing done?
In-Reply-To: <535A6E7B.9040900@gmail.com>

Ok Josh, you have me worried.

I am storing the object's name in the colfam, e.g. "patientId"; the
object's data type goes in the colq, e.g. "org.hl7.v3.II"; then the value
goes in the colval. I think the largest graph I'm likely to have is < 5k
and you say I should have memory problems. This is a good topic. How then
can I estimate?

On Fri, Apr 25, 2014 at 10:17 AM, Josh Elser wrote:

> Not necessarily. If you are storing just the type in the colq and have one
> value and type per document/row, you won't have a problem. If you have more
> than one value in a type per document/row, the last one you inserted will
> be what sticks (which is likely undesirable).
>
> Of course, this is also assuming there isn't some other uniquely
> identifying attribute in the colfam.
>
>
> On 4/25/14, 9:55 AM, Geoffry Roberts wrote:
>
>> Thanks for the comments.
>>
>> I'm using the qualifier to tell me the type of the value. Sounds like
>> I'm misusing it.
>>
>> My EMF documents are running no more than 5k so I gather a row will fit
>> into memory well enough.
>>
>>
>> On Fri, Apr 25, 2014 at 9:29 AM, Mike Drob wrote:
>>
>>     Large rows are only an issue if you are going to try to put the
>>     entire row in memory at once. As long as you have small enough
>>     entries in the row, and can treat them individually, you should be
>>     fine.
>>
>>     The qualifier is anything that you want to use to determine
>>     uniqueness across keys. So yes, this sounds fine, although possibly
>>     not fine-grained enough.
>>
>>     Mike
>>
>>
>>     On Fri, Apr 25, 2014 at 9:11 AM, Geoffry Roberts wrote:
>>
>>         Interesting, multiple mutations that is. Are we talking
>>         multiples on the same row id?
>>
>>         Upon reflection, I realized the embedded thing is nothing
>>         special. I think I'll keep adding columns to a single mutation.
>>         This will make for a wide row, but I'm not seeing that as a
>>         problem. Am I being naive?
>>
>>         Another question if I may. As I walk my graph, I must keep
>>         track of the type of the value being persisted. I am using the
>>         qualifier for this, putting in it a URI that indicates the type.
>>         Is this a proper use for the qualifier?
>>
>>         Thanks for the discussion
>>
>>
>>         On Thu, Apr 24, 2014 at 11:23 PM, William Slacum wrote:
>>
>>             Depending on your table schema, you'll probably want to
>>             translate an object graph into multiple mutations.
>>
>>
>>             On Thu, Apr 24, 2014 at 8:40 PM, David Medinets wrote:
>>
>>                 If the sub-document changes, you'll need to search the
>>                 values of every Accumulo entry?
>>
>>
>>                 On Thu, Apr 24, 2014 at 5:31 PM, Geoffry Roberts wrote:
>>
>>                     The use case is, I am walking a complex object graph
>>                     and persisting what I find there. Said object graph
>>                     in my case is always EMF (Eclipse Modeling
>>                     Framework) compliant. An EMF graph can have in it
>>                     references to--brace yourself--a non-cross document
>>                     containment reference. When using Mongo, these were
>>                     persisted as a DBObject embedded into a containing
>>                     DBObject. I'm trying to decide whether I want to
>>                     follow suit.
>>
>>                     Any thoughts?
>>
>>
>>                     On Thu, Apr 24, 2014 at 4:03 PM, Sean Busbey wrote:
>>
>>                         Can you describe the use case more? Do you know
>>                         what the purpose of the embedded changes is?
>>
>>
>>                         On Thu, Apr 24, 2014 at 2:59 PM, Geoffry Roberts
>>                         wrote:
>>
>>                             All,
>>
>>                             I am in the throes of converting someone
>>                             else's code from MongoDB to Accumulo.
>>                             I am seeing a situation where one DBObject
>>                             is being embedded into another DBObject. I
>>                             see that Mutation supports a method called
>>                             getRow() that returns a byte array. I
>>                             gather I can use this to achieve a similar
>>                             result if I were so inclined.
>>
>>                             Am I so inclined? i.e. Is this the way we
>>                             do things in Accumulo?
>>
>>                             DBObject, roughly speaking, is Mongo's
>>                             counterpart to Mutation.
>>
>>                             Thanks mucho
>>
>>                             --
>>                             There are ways and there are ways,
>>
>>                             Geoffry Roberts
>>
>>
>>                         --
>>                         Sean
>>
>

--
There are ways and there are ways,

Geoffry Roberts
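
For anyone reading this thread later, here is a minimal sketch of the two points raised above: the qualifier's role in key uniqueness ("the last one you inserted will be what sticks") and a rough way to estimate how big a row is. It is a plain-Java toy model, not the Accumulo client API. The RowModel class, its method names, and the "#0"/"#1" qualifier suffixes are hypothetical, and the model assumes the default behaviour where only the newest version of a key is kept; timestamps and visibility labels are ignored.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy in-memory model of a single row (hypothetical, NOT the Accumulo API).
 * It mimics one rule only: a later write to the same
 * (column family, column qualifier) pair replaces the earlier value.
 */
class RowModel {
    // Separator used only inside this model to build a composite map key.
    private static final char SEP = '\u0000';

    private final String rowId;
    // Sorted by "colfam SEP colq", loosely mirroring sorted key order.
    private final TreeMap<String, String> columns = new TreeMap<>();

    RowModel(String rowId) {
        this.rowId = rowId;
    }

    /** Writes one cell; an existing cell with the same colfam/colq is overwritten. */
    void put(String colFam, String colQual, String value) {
        columns.put(colFam + SEP + colQual, value);
    }

    String get(String colFam, String colQual) {
        return columns.get(colFam + SEP + colQual);
    }

    int cellCount() {
        return columns.size();
    }

    /**
     * Rough lower bound on the bytes needed to hold the whole row at once:
     * row id + colfam + colq + value, summed over every cell. Ignores
     * timestamps, visibility labels, and JVM object overhead.
     */
    long estimateBytes() {
        long total = 0;
        int rowLen = utf8Len(rowId);
        for (Map.Entry<String, String> cell : columns.entrySet()) {
            // Subtract 1 for the separator byte, which exists only in this model.
            total += rowLen + (utf8Len(cell.getKey()) - 1) + utf8Len(cell.getValue());
        }
        return total;
    }

    private static int utf8Len(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Qualifier holds only the type: two values of the same type collide.
        RowModel typeOnly = new RowModel("doc-1");
        typeOnly.put("patientId", "org.hl7.v3.II", "first");
        typeOnly.put("patientId", "org.hl7.v3.II", "second");
        System.out.println("type-only cells: " + typeOnly.cellCount());

        // Qualifier holds the type plus a distinguishing suffix: both survive.
        RowModel indexed = new RowModel("doc-1");
        indexed.put("patientId", "org.hl7.v3.II#0", "first");
        indexed.put("patientId", "org.hl7.v3.II#1", "second");
        System.out.println("indexed cells: " + indexed.cellCount());
        System.out.println("approx bytes: " + indexed.estimateBytes());
    }
}
```

If each name/type pair occurs at most once per document, the type-only qualifier is enough. Otherwise something unique has to go into the colfam or colq. The byte estimate is only a floor; a real row also carries per-entry timestamp and visibility data.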