From: Mike Carey
Date: Wed, 14 Sep 2016 10:29:58 -0700
Subject: Re: Creating RTree: no space left
To: dev@asterixdb.apache.org

☺!

On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" wrote:

> To be exact
> I have 2,255,091,590 records and 10,391 points :-)
>
> On Wed, Sep 14, 2016 at 10:46 AM, Mike Carey wrote:
>
> > Thx! I knew I'd meant to "activate" the thought somehow, but couldn't
> > remember having done it for sure. Oops! Scattered from VLDB, I guess...!
> >
> > On 9/13/16 9:58 PM, Taewoo Kim wrote:
> >
> >> @Mike: You filed an issue -
> >> https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)
> >>
> >> Best,
> >> Taewoo
> >>
> >> On Tue, Sep 13, 2016 at 9:28 PM, Mike Carey wrote:
> >>
> >>> I can't remember (slight jetlag? :-)) if I shared back to this list one
> >>> theory that came up in India when Wail and I talked F2F - his data has a
> >>> lot of duplicate points, so maybe something goes awry in that case. I
> >>> wonder if we've sufficiently tested that case? (E.g., what if there are
> >>> gazillions of records originating from a small handful of points?)
> >>>
> >>> On 8/26/16 9:55 AM, Taewoo Kim wrote:
> >>>
> >>>> Based on a rough calculation, per partition, each point field takes 3.6GB
> >>>> (16 bytes * 2887453794 records / 12 partitions). To sort 3.6GB, we are
> >>>> generating 625 files (96MB or 128MB each) = 157GB. Since Wail mentioned
> >>>> that there was no issue when creating a B+ tree index, we need to check
> >>>> what SORT process is required by R-Tree index.
> >>>>
> >>>> Best,
> >>>> Taewoo
> >>>>
> >>>> On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia wrote:
> >>>>
> >>>>> If all of the file names start with “ExternalSortRunGenerator”, then
> >>>>> they are the first round files which can not be GCed.
> >>>>> Could you provide the query plan as well?
> >>>>>
> >>>>> On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet wrote:
> >>>>>
> >>>>>> Hi Ian and Pouria,
> >>>>>>
> >>>>>> The name of the files along with the sizes (there were 625 of
> >>>>>> those before crashing):
> >>>>>>
> >>>>>> size   name
> >>>>>> 96MB   ExternalSortRunGenerator8917133039835449370.waf
> >>>>>> 128MB  ExternalSortRunGenerator8948724728025392343.waf
> >>>>>>
> >>>>>> no files were generated beyond runs.
> >>>>>> compiler.sortmemory = 64MB
> >>>>>>
> >>>>>> Here are the full logs
> >>>>>> 25_07%3A34%3A52_AST_2016.zip?dl=0>
> >>>>>>
> >>>>>> On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <pouria.pirzadeh@gmail.com> wrote:
> >>>>>>
> >>>>>>> We previously had issues with huge spilled sort temp files when
> >>>>>>> creating inverted index for fuzzy queries, but NOT R-Trees.
> >>>>>>> I also recall that Yingyi fixed the issue of delaying clean-up for
> >>>>>>> intermediate temp files until the end of the query execution.
> >>>>>>> If you can share names of a couple of temp files (and their sizes
> >>>>>>> along with the sort memory setting you have in
> >>>>>>> asterix-configuration.xml) we may be able to have a better guess as
> >>>>>>> to whether the sort is really going into a two-level merge or not.
> >>>>>>>
> >>>>>>> Pouria
> >>>>>>>
> >>>>>>> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon wrote:
> >>>>>>>
> >>>>>>>> I think that exception ("No space left on device") is just cast
> >>>>>>>> from the native IOException.
> >>>>>>>> Therefore I would be inclined to believe it's genuinely out of
> >>>>>>>> space. I suppose the question is why the external sort is so huge.
> >>>>>>>> What is the query plan? Maybe that will shed light on a possible
> >>>>>>>> cause.
> >>>>>>>>
> >>>>>>>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <wael.y.k@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> I was monitoring Inodes ... it didn't go beyond 1%.
> >>>>>>>>>
> >>>>>>>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <wael.y.k@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Chris and Mike,
> >>>>>>>>>>
> >>>>>>>>>> Actually I was monitoring it to see what's going on:
> >>>>>>>>>>
> >>>>>>>>>>  - The size of each partition is about 40GB (80GB in total per
> >>>>>>>>>>    iodevice).
> >>>>>>>>>>  - The runs took 157GB per iodevice (about 2x of the dataset
> >>>>>>>>>>    size). Each run takes either 128MB or 96MB of storage.
> >>>>>>>>>>  - At a certain time, there were 522 runs.
> >>>>>>>>>>
> >>>>>>>>>> I even tried to create a BTree Index to see if that happens as
> >>>>>>>>>> well. I created two BTree indexes, one for the *location* and one
> >>>>>>>>>> for the *caller*, and they were created successfully. The sizes of
> >>>>>>>>>> the runs didn't take anywhere near that.
> >>>>>>>>>>
> >>>>>>>>>> Logs are attached.
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey wrote:
> >>>>>>>>>>
> >>>>>>>>>>> I think we might have "file GC issues" - I vaguely remember that
> >>>>>>>>>>> we don't (or at least didn't once upon a time) proactively remove
> >>>>>>>>>>> unnecessary run files - removing all of them at end-of-job
> >>>>>>>>>>> instead of at the end of the execution phase that uses their
> >>>>>>>>>>> contents. We may also have an "Amdahl problem" right now with our
> >>>>>>>>>>> sort since we serialize phase two of parallel sorts - though this
> >>>>>>>>>>> is not a query, it's index build, so that shouldn't be it. It
> >>>>>>>>>>> would be interesting to put a df/sleep script on each of the
> >>>>>>>>>>> nodes when this is happening - actually a script that monitors
> >>>>>>>>>>> the temp file directory - and watch the lifecycle happen and the
> >>>>>>>>>>> sizes change....
> >>>>>>>>>>>
> >>>>>>>>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> When you get the "disk full" warning, do a quick "df -i" on the
> >>>>>>>>>>>> device - possibly you've run out of inodes even if the space
> >>>>>>>>>>>> isn't all used up. It's unlikely because I don't think AsterixDB
> >>>>>>>>>>>> creates a bunch of small files, but worth checking.
> >>>>>>>>>>>>
> >>>>>>>>>>>> If that's not it, then can you share the full exception and
> >>>>>>>>>>>> stack trace?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Ceej
> >>>>>>>>>>>> aka Chris Hillery
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <wael.y.k@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> I just cleared the hard drives to get 80% free space. I still
> >>>>>>>>>>>>> get the same issue.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The data contains:
> >>>>>>>>>>>>> 1- 2887453794 records.
> >>>>>>>>>>>>> 2- Schema:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> create type CDRType as {
> >>>>>>>>>>>>>   id:uuid,
> >>>>>>>>>>>>>   'date':string,
> >>>>>>>>>>>>>   'time':string,
> >>>>>>>>>>>>>   'duration':int64,
> >>>>>>>>>>>>>   'caller':int64,
> >>>>>>>>>>>>>   'callee':int64,
> >>>>>>>>>>>>>   location:point?
> >>>>>>>>>>>>> }
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <wael.y.k@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Dears,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I have a dataset of size 290GB loaded in 3 NCs, each of which
> >>>>>>>>>>>>>> has 2x500GB SSD.
> >>>>>>>>>>>>>> Each NC has two IODevices (partitions) in each hard drive
> >>>>>>>>>>>>>> (i.e., the total is 4 iodevices per NC). After loading the
> >>>>>>>>>>>>>> data, each Asterix partition occupied 31GB.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The cluster has about 50% free space in each hard drive
> >>>>>>>>>>>>>> (approximately 250GB free space in each hard drive).
However, when I > >>>>>>>>>> tried > >>>>>>>>>> > >>>>>>>>>>> to > >>>>>>>>>>>>> > >>>>>>>>>>>> create > >>>>>>>>> > >>>>>>>>>> an index of type RTree, I got an exception that no space left = in > >>>>>>>>>>>>> the > >>>>>>>>>>>>> > >>>>>>>>>>>> hard > >>>>>>>>> > >>>>>>>>>> drive during the External Sort phase. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Is that normal ? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> -- > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> *Regards,* > >>>>>>>>>>>>>> Wail Alkowaileet > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> -- > >>>>>>>>>>>>>> > >>>>>>>>>>>>> *Regards,* > >>>>>>>>>>>>> Wail Alkowaileet > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> -- > >>>>>>>>>> > >>>>>>>>>> *Regards,* > >>>>>>>>>> Wail Alkowaileet > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> -- > >>>>>>>>> > >>>>>>>>> *Regards,* > >>>>>>>>> Wail Alkowaileet > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -- > >>>>>> > >>>>>> *Regards,* > >>>>>> Wail Alkowaileet > >>>>>> > >>>>>> > >>>>> Best, > >>>>> > >>>>> Jianfeng Jia > >>>>> PhD Candidate of Computer Science > >>>>> University of California, Irvine > >>>>> > >>>>> > >>>>> > >>>>> > > > > > -- > > *Regards,* > Wail Alkowaileet > --001a113d38d2276d04053c7b1442--