Return-Path: X-Original-To: apmail-couchdb-user-archive@www.apache.org Delivered-To: apmail-couchdb-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id CB084108F1 for ; Thu, 21 Nov 2013 20:03:36 +0000 (UTC) Received: (qmail 53385 invoked by uid 500); 21 Nov 2013 20:03:35 -0000 Delivered-To: apmail-couchdb-user-archive@couchdb.apache.org Received: (qmail 53353 invoked by uid 500); 21 Nov 2013 20:03:35 -0000 Mailing-List: contact user-help@couchdb.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@couchdb.apache.org Delivered-To: mailing list user@couchdb.apache.org Received: (qmail 53345 invoked by uid 99); 21 Nov 2013 20:03:35 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 21 Nov 2013 20:03:35 +0000 X-ASF-Spam-Status: No, hits=-0.0 required=5.0 tests=SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of tim@alwaysreformed.com designates 74.55.86.74 as permitted sender) Received: from [74.55.86.74] (HELO smtp.webfaction.com) (74.55.86.74) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 21 Nov 2013 20:03:30 +0000 Received: from [192.168.1.3] (unknown [108.67.115.10]) by smtp.webfaction.com (Postfix) with ESMTP id 2AB8D26ED230 for ; Thu, 21 Nov 2013 20:03:08 +0000 (UTC) Message-ID: <528E66FB.6040501@alwaysreformed.com> Date: Thu, 21 Nov 2013 14:03:07 -0600 From: Tim Black User-Agent: Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20100101 Thunderbird/24.1.1 MIME-Version: 1.0 To: user@couchdb.apache.org Subject: Re: couchdb data structure References: <528D0D58.9090604@alwaysreformed.com> In-Reply-To: <528D0D58.9090604@alwaysreformed.com> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org I should note that instead of using the common convention Jens mentioned of a document attribute named "type," my code below uses a document attribute named "collection" since that is what backbone-relational requires. So to follow the convention of an attribute named "type" you'll have to delete my code that currently handles a "type" attribute (which has a different meaning in my code), and replace the word "collection" with "type". Tim On 11/20/2013 01:28 PM, Tim Black wrote: > > Here are two files I use to migrate data from sqlite to CouchDB, which I > offer here as an example for any who are doing similar work: > > csv2json.py: > > ---------------- > #!/usr/bin/env python > > import csv, sys, json > > # Open the file passed as a command line argument > f = open(sys.argv[1], 'r') > reader = csv.DictReader(f) > rows = [] > for row in reader: > for key in row.keys(): > # Remove underscore from beginning of attribute names > if key.startswith('_'): > new_key = key.lstrip('_') > row[new_key] = row[key] > del row[key] > # Insert document collection column, which equals the sqlite > table name > row['collection'] = sys.argv[2] > # Convert id column to namespaced id to avoid conflicts > if key == 'id': > row['_id'] = sys.argv[2] + '.' + row['id'] > del row['id'] > if key == 'user_id': > row['_id'] = sys.argv[2] + '.' + row['user_id'] > del row['user_id'] > if key == 'type': > row['job'] = row['type'] > del row['type'] > rows.append(row) > # Wrap in CouchDB _bulk_docs JSON format > out = '{"docs":%s}' % json.dumps(rows) > > print(out) > ----------------- > > sqlite2csv2couchdb.sh > > ------------------ > #!/bin/bash > > # Get the database from the production site > scp remote_host:path/to/sqlite.db . > > DB="http://username:password@localhost:5984/projects" > > # TODO: Use filtered replication to save the design docs > # Delete old copy of database > curl -X DELETE $DB > # Wait a second to let CouchDB delete the old database. > sleep 1 > # Create new copy of database > curl -X PUT $DB > > # TODO: Set permissions on couchdb database > # Create list of tables > tables=`sqlite3 devdata.db 'SELECT tbl_name FROM sqlite_master WHERE > type="table"'` > > while read -r line; > > do > # Filter out the visits tables > if [ "$line" != "visit" ] && [ "$line" != "visit_identity" ] > then > # Get table of data > rows=$(sqlite3 -csv -header sqlite.db "SELECT * FROM $line") > > echo "$rows" > tmp.csv > rows=$( python csv2json.py tmp.csv $line ) > > # write JSON to file to avoid curl error of having too many > command line arguments > echo "$rows" > tmp.json > > # Insert table into couchdb > > curl -d @tmp.json -H "Content-Type:application/json" -X POST > $DB/_bulk_docs &> /dev/null > > fi > done <<< "$tables" > > rm tmp.json > rm tmp.csv > rm devdata.db > ---------------------