asterixdb-dev mailing list archives

From Wail Alkowaileet <wael....@gmail.com>
Subject Re: Creating RTree: no space left
Date Thu, 25 Aug 2016 05:02:55 GMT
Hi Ian and Pouria,

The names of the files along with their sizes (there were 625 of them
before the crash):

size        name
96MB     ExternalSortRunGenerator8917133039835449370.waf
128MB   ExternalSortRunGenerator8948724728025392343.waf

No files were generated beyond the runs.
compiler.sortmemory = 64MB
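
For what it's worth, here is a rough sanity check on whether 625 runs should force a two-level merge. It is only a sketch under assumptions: a textbook k-way merge that gives each input run one frame and reserves one frame for output, and a frame size of 32KB, which I am assuming and did not verify against this cluster's configuration. The function name is made up for illustration and is not AsterixDB code.

```python
import math

MB = 1024 * 1024


def merge_passes(num_runs, sort_memory_bytes, frame_size_bytes):
    """Estimate merge passes for a textbook external sort.

    Assumption: one frame per input run plus one output frame,
    so the merge fan-in is (number of frames - 1).
    """
    frames = sort_memory_bytes // frame_size_bytes
    fan_in = frames - 1
    if fan_in < 2:
        raise ValueError("need at least 3 frames to merge")
    passes = 0
    runs = num_runs
    while runs > 1:
        # Each pass merges up to fan_in runs into one longer run.
        runs = math.ceil(runs / fan_in)
        passes += 1
    return passes


runs_seen = 625                 # from the temp directory before the crash
sort_memory = 64 * MB           # compiler.sortmemory
assumed_frame_size = 32 * 1024  # ASSUMPTION: not taken from this cluster

print("estimated merge passes:",
      merge_passes(runs_seen, sort_memory, assumed_frame_size))

# Bounds on the space held by the runs, using the two observed run sizes.
print("run space (MB): between", runs_seen * 96, "and", runs_seen * 128)
```

If those assumptions hold, the fan-in is far larger than 625, so a single merge pass would be enough. A much larger frame size or a smaller effective sort budget would change that.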

Here are the full logs:
<https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_25_07%3A34%3A52_AST_2016.zip?dl=0>
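
Regarding the df/sleep script Mike suggested for watching the temp file directory, something along these lines could be run on each node. It is a minimal sketch: the directory path is a placeholder, the `.waf` suffix is taken from the file names above, and the function names are mine.

```python
import os
import shutil
import time


def snapshot(temp_dir, suffix=".waf"):
    """Return (file_count, total_bytes) for run files in temp_dir."""
    count = 0
    total = 0
    with os.scandir(temp_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(suffix):
                try:
                    total += entry.stat().st_size
                    count += 1
                except FileNotFoundError:
                    # The file was deleted between listing and stat.
                    pass
    return count, total


def monitor(temp_dir, interval_s=10.0, iterations=None):
    """Print run-file count, their total size, and free disk space.

    Loops forever unless iterations is given.
    """
    done = 0
    while iterations is None or done < iterations:
        count, total = snapshot(temp_dir)
        free = shutil.disk_usage(temp_dir).free
        print(f"{time.strftime('%H:%M:%S')} runs={count} "
              f"run_bytes={total} free_bytes={free}")
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval_s)


# Example call (hypothetical path, adjust to the iodevice temp directory):
# monitor("/path/to/iodevice/temp", interval_s=10)
```

Logging the count and total size over time should show whether run files are removed when the merge that consumes them finishes, or only at the end of the job.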

On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <pouria.pirzadeh@gmail.com>
wrote:

> We previously had issues with huge spilled sort temp files when creating
> inverted index for fuzzy queries, but NOT R-Trees.
> I also recall that Yingyi fixed the issue of delaying clean-up for
> intermediate temp files until the end of the query execution.
> If you can share names of a couple of temp files (and their sizes along
> with the sort memory setting you have in asterix-configuration.xml) we may
> be able to make a better guess as to whether the sort is really going into
> a two-level merge.
>
> Pouria
>
> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <imaxon@uci.edu> wrote:
>
> > I think that exception ("No space left on device") is just cast from the
> > native IOException. Therefore I would be inclined to believe it's genuinely
> > out of space. I suppose the question is why the external sort is so huge.
> > What is the query plan? Maybe that will shed light on a possible cause.
> >
> > On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <wael.y.k@gmail.com>
> > wrote:
> >
> > > I was monitoring Inodes ... it didn't go beyond 1%.
> > >
> > > On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <wael.y.k@gmail.com>
> > > wrote:
> > >
> > > > Hi Chris and Mike,
> > > >
> > > > Actually I was monitoring it to see what's going on:
> > > >
> > > >    - The size of each partition is about 40GB (80GB in total per
> > > >    iodevice).
> > > >    - The runs took 157GB per iodevice (about 2x of the dataset size).
> > > >    Each run takes either 128MB or 96MB of storage.
> > > >    - At a certain time, there were 522 runs.
> > > >
> > > > I even tried to create a BTree Index to see if that happens as well. I
> > > > created two BTree indexes, one for the *location* and one for the
> > > > *caller*, and they were created successfully. The sizes of the runs
> > > > didn't get anywhere near that.
> > > >
> > > > Logs are attached.
> > > >
> > > > On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <dtabass@gmail.com> wrote:
> > > >
> > > > >> I think we might have "file GC issues" - I vaguely remember that we don't
> > > > >> (or at least didn't once upon a time) proactively remove unnecessary run
> > > > >> files - removing all of them at end-of-job instead of at the end of the
> > > > >> execution phase that uses their contents.  We may also have an "Amdahl
> > > > >> problem" right now with our sort since we serialize phase two of parallel
> > > > >> sorts - though this is not a query, it's index build, so that shouldn't be
> > > > >> it.  It would be interesting to put a df/sleep script on each of the nodes
> > > > >> when this is happening - actually a script that monitors the temp file
> > > > >> directory - and watch the lifecycle happen and the sizes change....
> > > >>
> > > >>
> > > >>
> > > >> On 8/23/16 2:06 AM, Chris Hillery wrote:
> > > >>
> > > > >>> When you get the "disk full" warning, do a quick "df -i" on the device -
> > > > >>> possibly you've run out of inodes even if the space isn't all used up.
> > > > >>> It's unlikely because I don't think AsterixDB creates a bunch of small
> > > > >>> files, but worth checking.
> > > > >>>
> > > > >>> If that's not it, then can you share the full exception and stack
> > > > >>> trace?
> > > >>>
> > > >>> Ceej
> > > >>> aka Chris Hillery
> > > >>>
> > > > >>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <wael.y.k@gmail.com>
> > > > >>> wrote:
> > > >>>
> > > > >>>> I just cleared the hard drives to get 80% free space. I still get the
> > > > >>>> same issue.
> > > >>>>
> > > >>>> The data contains:
> > > >>>> 1- 2887453794 records.
> > > >>>> 2- Schema:
> > > >>>>
> > > > >>>> create type CDRType as {
> > > > >>>>   id:uuid,
> > > > >>>>   'date':string,
> > > > >>>>   'time':string,
> > > > >>>>   'duration':int64,
> > > > >>>>   'caller':int64,
> > > > >>>>   'callee':int64,
> > > > >>>>   location:point?
> > > > >>>> }
> > > >>>>
> > > >>>>
> > > > >>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <wael.y.k@gmail.com>
> > > > >>>> wrote:
> > > >>>>
> > > > >>>>> Dears,
> > > > >>>>>
> > > > >>>>> I have a dataset of size 290GB loaded in 3 NCs, each of which has
> > > > >>>>> 2x500GB SSDs.
> > > > >>>>>
> > > > >>>>> Each NC has two IODevices (partitions) on each hard drive (i.e. the
> > > > >>>>> total is 4 iodevices per NC). After loading the data, each Asterix
> > > > >>>>> partition occupied 31GB.
> > > > >>>>>
> > > > >>>>> The cluster has about 50% free space on each hard drive (approximately
> > > > >>>>> 250GB free on each hard drive). However, when I tried to create an
> > > > >>>>> index of type RTree, I got an exception that no space was left on the
> > > > >>>>> hard drive during the External Sort phase.
> > > > >>>>>
> > > > >>>>> Is that normal?
> > > >>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>>
> > > >>>>> *Regards,*
> > > >>>>> Wail Alkowaileet
> > > >>>>>
> > > >>>>>
> > > >>>>
> > > >>>> --
> > > >>>>
> > > >>>> *Regards,*
> > > >>>> Wail Alkowaileet
> > > >>>>
> > > >>>>
> > > >>
> > > >
> > > >
> > > > --
> > > >
> > > > *Regards,*
> > > > Wail Alkowaileet
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > *Regards,*
> > > Wail Alkowaileet
> > >
> >
>



-- 

*Regards,*
Wail Alkowaileet
