Date: Mon, 9 Apr 2012 17:34:11 -0700
Subject: Re: Schema Design when migrating data from relational into document
From: Andrew Woodcock
To: user@couchdb.apache.org

Bear in mind as well how often you will be updating certain information.
Each update creates a new document revision, so a large document in which a
couple of fields (or even just one) are frequently updated can increase
storage requirements. It also affects replication: the whole large document
is replicated frequently even though only a small part of it is actually
changing. In a scenario like that, it may well be better to keep the
frequently updated field(s) in separate documents.

Regards,

Andrew

On 9 April 2012 17:26, Mohammad Prabowo wrote:

> Thanks! I had read somewhere that there is a tradeoff between embedding
> the data (example 1) and a more normalized document (example 2). It's more
> of a choice between data locality, disk space, and querying flexibility. I
> guess since every query must go through views, the speed benefits of data
> locality are therefore reduced.
>
> On Mon, Apr 9, 2012 at 9:01 PM, Keith Gable wrote:
>
> > I'd go the first route, but salaries and titles should be arrays of
> > hashes:
> >
> > "titles": [
> >   { "name": "xxx", "from": "xxx", "to": "xxx" }
> > ]
> >
> > If you want to decouple the data, like if you wanted a list of all
> > titles, you'd use CouchDB views.
> > On Apr 9, 2012 6:07 AM, "Mohammad Prabowo" wrote:
> >
> > > Hi, suppose I have a relational DB with a schema like this:
> > >
> > > employees-schema
> > > <http://dev.mysql.com/doc/employee/en/images/employees-schema.png>
> > >
> > > I want to try converting it into documents. I have two questions:
> > >
> > > 1. The main strength of a document is that it is 'self contained',
> > > meaning we don't need to do JOINs, and all the data that is needed is
> > > contained within the document. So, should I choose to use nested
> > > documents like this:
> > >
> > > {
> > >   "emp_no": "...",
> > >   "birth_date": "...",
> > >   "first_name": "..",
> > >   "last_name": "..",
> > >   "gender": "..",
> > >   "hire_date": "..",
> > >   "titles": {
> > >     "title": "...",
> > >     "from_date": "...",
> > >     "to_date": "..."
> > >   },
> > >   "salaries": {
> > >     "salary": "...",
> > >     "from_date": "...",
> > >     "to_date": "..."
> > >   }
> > > }
> > >
> > > or use different documents like this:
> > >
> > > [
> > >   {
> > >     "doc_name": "employees",
> > >     "emp_no": "...",
> > >     "birth_date": "...",
> > >     "first_name": "..",
> > >     "last_name": "..",
> > >     "gender": "..",
> > >     "hire_date": ".."
> > >   },
> > >   {
> > >     "doc_name": "titles",
> > >     "from_date": "...",
> > >     "to_date": "..."
> > >   },
> > >   {
> > >     "doc_name": "salaries",
> > >     "salary": "...",
> > >     "from_date": "...",
> > >     "to_date": "..."
> > >   }
> > > ]
> > >
> > > 2. I want to benchmark MySQL and CouchDB with YCSB. Is there a DB
> > > layer that has been built for CouchDB?
> > >
> > > Thanks in advance
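
The trade-off discussed in this thread can be sketched in plain Python, with
no CouchDB calls involved. The sketch starts from the embedded layout (with
titles and salaries as arrays of hashes) and moves the frequently updated
field into its own documents, so that a salary change would not create a new
revision of the whole employee document. Everything below that is not in the
thread is an assumption made for illustration: the sample values, the "_id"
naming scheme, the choice of "salaries" as the volatile field, and the helper
names split_employee and group_by_employee.

```python
from collections import defaultdict

# Embedded layout: one self-contained document per employee.
# All values here are made-up placeholders, not data from the thread.
embedded = {
    "_id": "employees:10001",  # hypothetical ID scheme
    "doc_name": "employees",
    "emp_no": 10001,
    "first_name": "Ann",
    "last_name": "Example",
    "titles": [
        {"title": "Engineer", "from_date": "2010-01-01", "to_date": "2011-01-01"},
    ],
    "salaries": [
        {"salary": 50000, "from_date": "2010-01-01", "to_date": "2011-01-01"},
        {"salary": 55000, "from_date": "2011-01-01", "to_date": "2012-01-01"},
    ],
}


def split_employee(doc, volatile_fields=("salaries",)):
    """Return (core_doc, extra_docs), moving volatile arrays out of doc.

    Each entry of a volatile array becomes its own small document that
    points back to the employee through emp_no.
    """
    core = {k: v for k, v in doc.items() if k not in volatile_fields}
    extra = []
    for field in volatile_fields:
        for i, entry in enumerate(doc.get(field, [])):
            extra.append({
                "_id": f"{field}:{doc['emp_no']}:{i}",  # hypothetical ID scheme
                "doc_name": field,
                "emp_no": doc["emp_no"],
                **entry,
            })
    return core, extra


def group_by_employee(docs):
    """Plain-Python stand-in for a view keyed on emp_no."""
    grouped = defaultdict(list)
    for d in docs:
        grouped[d["emp_no"]].append(d)
    return dict(grouped)


core, salary_docs = split_employee(embedded)
by_employee = group_by_employee(salary_docs)
```

In CouchDB itself the regrouping step would be done by a view, as Keith
suggests above; the helper here only imitates that idea in memory. Which
fields count as "frequently updated" depends on the application, so the
split shown is one possible choice rather than a recommendation.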