From: Jason Harvey
To: user@cassandra.apache.org
Date: Sun, 13 Mar 2011 03:53:02 -0700 (PDT)
Subject: Re: json2sstable hanging on large sstable2json-generated JSON file
Message-ID: <63c4089f-b8c8-43c1-8db0-d1ae77d29506@d12g2000prj.googlegroups.com>

It eventually died with an OOM error. Guess the table was just too big :(

Created an improvement request ticket:
https://issues.apache.org/jira/browse/CASSANDRA-2322

Jason

On Mar 12, 10:50 pm, Jason Harvey wrote:
> Trying to import a 3GB JSON file which was exported from sstable2json.
> I let it run for over an hour and saw zero IO activity. The last thing
> it logs is the following:
>
> DEBUG 23:19:32,638 collecting 0 of 2147483647: Avro/Schema:false:2042@1298067089267
> DEBUG 23:19:32,638 collecting 1 of 2147483647: reddit:false:2502@1298067089267
>
> Considering I saw zero reads on my disk when I ran it, I don't think
> it is even reading the JSON file.
>
> I shrunk the file down to a handful of keys, and it worked fine. Is
> there an issue with json2sstable loading large JSON files? Does it try
> to read it into memory?
>
> Also as a note, this data is unsorted. I did generate it via
> sstable2json, but my sstables were broken and had unsorted data, which
> is the whole reason I am doing this.
>
> Thanks!
> Jason