asterixdb-dev mailing list archives

From Taewoo Kim <wangs...@gmail.com>
Subject Re: Creating RTree: no space left
Date Fri, 26 Aug 2016 16:55:20 GMT
Based on a rough calculation, per partition, each point field takes 3.6GB
(16 bytes * 2887453794 records / 12 partitions). To sort 3.6GB, we are
generating 625 files (96MB or 128MB each) = 157GB. Since Wail mentioned
that there was no issue when creating a B+ tree index, we need to check
what sort process the R-Tree index build requires.
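
A quick back-of-the-envelope check of the per-partition figure above (the
record count, the 16-byte point size, and the 12 partitions are taken from
the thread; this is plain arithmetic, not AsterixDB code):

records = 2_887_453_794   # record count reported by Wail
point_bytes = 16          # assumed serialized size of one point value
partitions = 12           # 3 NCs x 4 iodevices

total = records * point_bytes
per_partition = total / partitions
print(f"total point data : {total / 2**30:.1f} GiB")          # ~43.0 GiB
print(f"per partition    : {per_partition / 2**30:.2f} GiB")  # ~3.59 GiB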

Best,
Taewoo

On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <jianfeng.jia@gmail.com>
wrote:

> If all of the file names start with “ExternalSortRunGenerator”, then they
> are the first-round files, which cannot be GC'ed.
> Could you provide the query plan as well?
>
> > On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <wael.y.k@gmail.com>
> wrote:
> >
> > Hi Ian and Pouria,
> >
> > The names of the files along with their sizes (there were 625 of those
> > before crashing):
> >
> > size      name
> > 96MB      ExternalSortRunGenerator8917133039835449370.waf
> > 128MB     ExternalSortRunGenerator8948724728025392343.waf
> >
> > No files were generated beyond the runs.
> > compiler.sortmemory = 64MB
> >
> > Here are the full logs
> > <https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_25_07%3A34%3A52_AST_2016.zip?dl=0>
> >
> > On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <pouria.pirzadeh@gmail.com>
> > wrote:
> >
> >> We previously had issues with huge spilled sort temp files when creating
> >> inverted indexes for fuzzy queries, but NOT R-Trees.
> >> I also recall that Yingyi fixed the issue of delaying clean-up of
> >> intermediate temp files until the end of the query execution.
> >> If you can share the names of a couple of temp files (and their sizes,
> >> along with the sort memory setting you have in asterix-configuration.xml),
> >> we may be able to make a better guess as to whether the sort is really
> >> going into a two-level merge or not.
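
A minimal sketch of the kind of estimate Pouria is describing. The sort
memory comes from the thread (compiler.sortmemory = 64MB); the 32KB frame
size and the fan-in formula (sort memory / frame size) are assumptions for
illustration, not the actual Hyracks merge policy:

import math

sort_memory = 64 * 2**20              # compiler.sortmemory = 64MB
frame_size = 32 * 2**10               # assumed frame size
data_per_partition = 3.6 * 2**30      # ~3.6GB of point data per partition

runs = math.ceil(data_per_partition / sort_memory)  # initial run count, ~58
fan_in = sort_memory // frame_size                  # runs merged per pass, ~2048
levels = math.ceil(math.log(runs, fan_in)) if runs > 1 else 0

print(f"runs={runs}, fan-in={fan_in}, merge levels={levels}")  # -> 1 level

Under these assumptions a single merge pass would suffice, so the spilled
runs would be expected to stay in the same ballpark as the input size rather
than grow far beyond it.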
> >>
> >> Pouria
> >>
> >> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <imaxon@uci.edu> wrote:
> >>
> >>> I think that exception ("No space left on device") is just cast from the
> >>> native IOException. Therefore I would be inclined to believe it's
> >>> genuinely out of space. I suppose the question is why the external sort
> >>> is so huge. What is the query plan? Maybe that will shed light on a
> >>> possible cause.
> >>>
> >>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <wael.y.k@gmail.com>
> >>> wrote:
> >>>
> >>>> I was monitoring inodes ... usage didn't go beyond 1%.
> >>>>
> >>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <wael.y.k@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi Chris and Mike,
> >>>>>
> >>>>> Actually I was monitoring it to see what's going on:
> >>>>>
> >>>>>   - The size of each partition is about 40GB (80GB in total per
> >>>>>   iodevice).
> >>>>>   - The runs took 157GB per iodevice (about 2x the dataset size).
> >>>>>   Each run takes either 128MB or 96MB of storage.
> >>>>>   - At a certain time, there were 522 runs.
> >>>>>
> >>>>> I even tried creating a BTree index to see if that happens as well. I
> >>>>> created two BTree indexes, one for *location* and one for *caller*,
> >>>>> and they were created successfully. The sizes of their runs didn't
> >>>>> come anywhere near that.
> >>>>>
> >>>>> Logs are attached.
> >>>>>
> >>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <dtabass@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> I think we might have "file GC issues" - I vaguely remember that we
> >>>>>> don't (or at least didn't once upon a time) proactively remove
> >>>>>> unnecessary run files - removing all of them at end-of-job instead of
> >>>>>> at the end of the execution phase that uses their contents.  We may
> >>>>>> also have an "Amdahl problem" right now with our sort since we
> >>>>>> serialize phase two of parallel sorts - though this is not a query,
> >>>>>> it's an index build, so that shouldn't be it.  It would be interesting
> >>>>>> to put a df/sleep script on each of the nodes when this is happening -
> >>>>>> actually a script that monitors the temp file directory - and watch
> >>>>>> the lifecycle happen and the sizes change....
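
A minimal sketch, in Python, of the df/sleep-style monitor Mike describes:
poll the temp directory, report the number and total size of run files plus
the free space on the device, and sleep. The directory path and interval are
placeholders; point it at the iodevice temp directory on each node:

import os
import time

TEMP_DIR = "/path/to/iodevice/tmp"   # placeholder temp-file directory
INTERVAL = 10                        # seconds between samples

while True:
    names = os.listdir(TEMP_DIR)
    sizes = [os.path.getsize(os.path.join(TEMP_DIR, n))
             for n in names
             if os.path.isfile(os.path.join(TEMP_DIR, n))]
    st = os.statvfs(TEMP_DIR)
    free_gib = st.f_bavail * st.f_frsize / 2**30
    print(f"{time.strftime('%H:%M:%S')} files={len(sizes)} "
          f"run_data={sum(sizes) / 2**30:.1f}GiB free={free_gib:.1f}GiB")
    time.sleep(INTERVAL)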
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
> >>>>>>
> >>>>>>> When you get the "disk full" warning, do a quick "df -i" on the
> >>>>>>> device - possibly you've run out of inodes even if the space isn't
> >>>>>>> all used up. It's unlikely because I don't think AsterixDB creates a
> >>>>>>> bunch of small files, but worth checking.
> >>>>>>>
> >>>>>>> If that's not it, then can you share the full exception and stack
> >>>>>>> trace?
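
For the record, the inode check Chris suggests ("df -i") can also be done
from Python via statvfs; the mount point below is a placeholder:

import os

path = "/path/to/iodevice"      # placeholder for the device's mount point
st = os.statvfs(path)
used = st.f_files - st.f_ffree
pct = 100.0 * used / st.f_files if st.f_files else 0.0
print(f"inodes used: {used}/{st.f_files} ({pct:.1f}%)")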
> >>>>>>>
> >>>>>>> Ceej
> >>>>>>> aka Chris Hillery
> >>>>>>>
> >>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <wael.y.k@gmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> I just cleared the hard drives to get 80% free space. I still get
> >>>>>>>> the same issue.
> >>>>>>>>
> >>>>>>>> The data contains:
> >>>>>>>> 1- 2887453794 records.
> >>>>>>>> 2- Schema:
> >>>>>>>>
> >>>>>>>> create type CDRType as {
> >>>>>>>>   id:uuid,
> >>>>>>>>   'date':string,
> >>>>>>>>   'time':string,
> >>>>>>>>   'duration':int64,
> >>>>>>>>   'caller':int64,
> >>>>>>>>   'callee':int64,
> >>>>>>>>   location:point?
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <wael.y.k@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Dears,
> >>>>>>>>>
> >>>>>>>>> I have a dataset of size 290GB loaded into 3 NCs, each of which
> >>>>>>>>> has 2x500GB SSDs.
> >>>>>>>>>
> >>>>>>>>> Each NC has two iodevices (partitions) on each hard drive (i.e.
> >>>>>>>>> the total is 4 iodevices per NC). After loading the data, each
> >>>>>>>>> Asterix partition occupied 31GB.
> >>>>>>>>>
> >>>>>>>>> The cluster has about 50% free space on each hard drive
> >>>>>>>>> (approximately 250GB free on each drive). However, when I tried to
> >>>>>>>>> create an index of type RTree, I got an exception that there was
> >>>>>>>>> no space left on the hard drive during the External Sort phase.
> >>>>>>>>>
> >>>>>>>>> Is that normal?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>>
> >>>>>>>>> *Regards,*
> >>>>>>>>> Wail Alkowaileet
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>>
> >>>>>>>> *Regards,*
> >>>>>>>> Wail Alkowaileet
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>>
> >>>>> *Regards,*
> >>>>> Wail Alkowaileet
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>>
> >>>> *Regards,*
> >>>> Wail Alkowaileet
> >>>>
> >>>
> >>
> >
> >
> >
> > --
> >
> > *Regards,*
> > Wail Alkowaileet
>
>
>
> Best,
>
> Jianfeng Jia
> PhD Candidate of Computer Science
> University of California, Irvine
>
>
