asterixdb-dev mailing list archives

From Wail Alkowaileet <wael....@gmail.com>
Subject Re: Creating RTree: no space left
Date Fri, 26 Aug 2016 22:57:13 GMT
@Jianfeng: Sorry for the stupid question, but it seems that neither the logs
nor the WebUI show the plan. Is there a flag for that?

@Taewoo: I'll look into it and see what's going on. AFAIK, the comparator
is Hilbert.
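
For reference, Taewoo's rough estimate quoted below can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the record count, the 16-byte point size, the 12 partitions, and the 625 run files of 96MB or 128MB all come from the thread, while the function names and the reading of "MB" and "GB" as binary units (MiB, GiB) are assumptions. The thread does not say whether the 625 files were counted per partition or per iodevice, so the sketch only computes bounds for a given file count.

```python
# Back-of-the-envelope check of the numbers quoted in the thread.
# Assumption: "MB"/"GB" in the thread are treated as MiB/GiB here.
MIB = 1024 ** 2
GIB = 1024 ** 3

RECORDS = 2_887_453_794   # record count reported by Wail
POINT_BYTES = 16          # two 8-byte doubles per point, as in Taewoo's estimate
PARTITIONS = 12           # partition count used in Taewoo's estimate


def point_bytes_per_partition(records=RECORDS, point_bytes=POINT_BYTES,
                              partitions=PARTITIONS):
    """Raw size of the point field alone in one partition, in bytes."""
    return records * point_bytes / partitions


def run_file_total_bounds(num_runs, min_mb=96, max_mb=128):
    """Lower and upper bound, in bytes, for num_runs run files."""
    return num_runs * min_mb * MIB, num_runs * max_mb * MIB


per_partition = point_bytes_per_partition()
low, high = run_file_total_bounds(625)
print(f"point data per partition: {per_partition / GIB:.2f} GiB")
print(f"625 run files: {low / GIB:.1f} to {high / GIB:.1f} GiB")
```

Under these assumptions the run files are more than an order of magnitude larger than the raw point data in a partition, which is the gap the quoted estimate points at.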

On Fri, Aug 26, 2016 at 7:55 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:

> Based on a rough calculation, per partition, each point field takes 3.6GB
> (16 bytes * 2887453794 records / 12 partitions). To sort 3.6GB, we are
> generating 625 files (96MB or 128MB each) = 157GB. Since Wail mentioned
> that there was no issue when creating a B+ tree index, we need to check
> what SORT process is required by an R-Tree index.
>
> Best,
> Taewoo
>
> On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <jianfeng.jia@gmail.com>
> wrote:
>
> > If all of the file names start with “ExternalSortRunGenerator”, then they
> > are the first round files which can not be GCed.
> > Could you provide the query plan as well?
> >
> > > On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <wael.y.k@gmail.com>
> > wrote:
> > >
> > > Hi Ian and Pouria,
> > >
> > > The names of the files along with their sizes (there were 625 of these
> > > before the crash):
> > >
> > > size        name
> > > 96MB     ExternalSortRunGenerator8917133039835449370.waf
> > > 128MB   ExternalSortRunGenerator8948724728025392343.waf
> > >
> > > no files were generated beyond runs.
> > > compiler.sortmemory = 64MB
> > >
> > > Here is the full logs
> > > <https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_25_07%3A34%3A52_AST_2016.zip?dl=0>
> > >
> > > On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <
> > pouria.pirzadeh@gmail.com>
> > > wrote:
> > >
> > >> We previously had issues with huge spilled sort temp files when creating
> > >> an inverted index for fuzzy queries, but NOT R-Trees.
> > >> I also recall that Yingyi fixed the issue of delaying clean-up for
> > >> intermediate temp files until the end of the query execution.
> > >> If you can share names of a couple of temp files (and their sizes along
> > >> with the sort memory setting you have in asterix-configuration.xml) we may
> > >> be able to make a better guess as to whether the sort is really going into
> > >> a two-level merge or not.
> > >>
> > >> Pouria
> > >>
> > >> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <imaxon@uci.edu> wrote:
> > >>
> > >>> I think that exception ("No space left on device") is just cast from the
> > >>> native IOException. Therefore I would be inclined to believe it's genuinely
> > >>> out of space. I suppose the question is why the external sort is so huge.
> > >>> What is the query plan? Maybe that will shed light on a possible cause.
> > >>>
> > >>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <
> wael.y.k@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> I was monitoring Inodes ... it didn't go beyond 1%.
> > >>>>
> > >>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <
> wael.y.k@gmail.com
> > >
> > >>>> wrote:
> > >>>>
> > >>>>> Hi Chris and Mike,
> > >>>>>
> > >>>>> Actually I was monitoring it to see what's going on:
> > >>>>>
> > >>>>>   - The size of each partition is about 40GB (80GB in total per
> > >>>>>   iodevice).
> > >>>>>   - The runs took 157GB per iodevice (about 2x the dataset size).
> > >>>>>   Each run takes either 128MB or 96MB of storage.
> > >>>>>   - At a certain time, there were 522 runs.
> > >>>>>
> > >>>>> I even tried to create a BTree Index to see if that happens as well. I
> > >>>>> created two BTree indexes, one for the *location* and one for the *caller*,
> > >>>>> and they were created successfully. The sizes of the runs didn't come
> > >>>>> anywhere near that.
> > >>>>>
> > >>>>> Logs are attached.
> > >>>>>
> > >>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <dtabass@gmail.com>
> > >> wrote:
> > >>>>>
> > >>>>>> I think we might have "file GC issues" - I vaguely remember that we don't
> > >>>>>> (or at least didn't once upon a time) proactively remove unnecessary run
> > >>>>>> files - removing all of them at end-of-job instead of at the end of the
> > >>>>>> execution phase that uses their contents.  We may also have an "Amdahl
> > >>>>>> problem" right now with our sort since we serialize phase two of parallel
> > >>>>>> sorts - though this is not a query, it's index build, so that shouldn't be
> > >>>>>> it.  It would be interesting to put a df/sleep script on each of the nodes
> > >>>>>> when this is happening - actually a script that monitors the temp file
> > >>>>>> directory - and watch the lifecycle happen and the sizes change....
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
> > >>>>>>
> > >>>>>>> When you get the "disk full" warning, do a quick "df -i" on the device -
> > >>>>>>> possibly you've run out of inodes even if the space isn't all used up.
> > >>>>>>> It's unlikely because I don't think AsterixDB creates a bunch of small
> > >>>>>>> files, but worth checking.
> > >>>>>>>
> > >>>>>>> If that's not it, then can you share the full exception and stack trace?
> > >>>>>>>
> > >>>>>>> Ceej
> > >>>>>>> aka Chris Hillery
> > >>>>>>>
> > >>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <
> > >>> wael.y.k@gmail.com>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> I just cleared the hard drives to get 80% free space. I still get the
> > >>>>>>>> same issue.
> > >>>>>>>>
> > >>>>>>>> The data contains:
> > >>>>>>>> 1- 2887453794 records.
> > >>>>>>>> 2- Schema:
> > >>>>>>>>
> > >>>>>>>> create type CDRType as {
> > >>>>>>>>   id:uuid,
> > >>>>>>>>   'date':string,
> > >>>>>>>>   'time':string,
> > >>>>>>>>   'duration':int64,
> > >>>>>>>>   'caller':int64,
> > >>>>>>>>   'callee':int64,
> > >>>>>>>>   location:point?
> > >>>>>>>> }
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet
<
> > >>> wael.y.k@gmail.com
> > >>>>>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Dears,
> > >>>>>>>>>
> > >>>>>>>>> I have a 290GB dataset loaded on 3 NCs, each of which has 2x500GB SSDs.
> > >>>>>>>>>
> > >>>>>>>>> Each NC has two IODevices (partitions) on each hard drive (i.e., 4
> > >>>>>>>>> iodevices per NC in total). After loading the data, each Asterix
> > >>>>>>>>> partition occupied 31GB.
> > >>>>>>>>>
> > >>>>>>>>> The cluster has about 50% free space on each hard drive (approximately
> > >>>>>>>>> 250GB free per drive). However, when I tried to create an index of type
> > >>>>>>>>> RTree, I got an exception saying that no space was left on the hard
> > >>>>>>>>> drive during the External Sort phase.
> > >>>>>>>>>
> > >>>>>>>>> Is that normal?
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> --
> > >>>>>>>>>
> > >>>>>>>>> *Regards,*
> > >>>>>>>>> Wail Alkowaileet
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>> --
> > >>>>>>>>
> > >>>>>>>> *Regards,*
> > >>>>>>>> Wail Alkowaileet
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>>
> > >>>>> *Regards,*
> > >>>>> Wail Alkowaileet
> > >>>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>>
> > >>>> *Regards,*
> > >>>> Wail Alkowaileet
> > >>>>
> > >>>
> > >>
> > >
> > >
> > >
> > > --
> > >
> > > *Regards,*
> > > Wail Alkowaileet
> >
> >
> >
> > Best,
> >
> > Jianfeng Jia
> > PhD Candidate of Computer Science
> > University of California, Irvine
> >
> >
>



-- 

*Regards,*
Wail Alkowaileet
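
The df/sleep script Mike suggests in the quoted thread, one that watches the temp file directory while the index build runs, could look roughly like the Python sketch below. Only the `ExternalSortRunGenerator*.waf` file-name pattern is taken from the listing in the thread. The function names, the polling interval, and the directory passed in are hypothetical, and nothing here is an AsterixDB API or setting.

```python
import glob
import os
import shutil
import time


def snapshot(temp_dir, pattern="ExternalSortRunGenerator*.waf"):
    """Return (run file count, total run bytes, free bytes) for temp_dir."""
    sizes = []
    for path in glob.glob(os.path.join(temp_dir, pattern)):
        try:
            sizes.append(os.path.getsize(path))
        except OSError:
            # The file was deleted between listing and stat; skip it.
            pass
    free = shutil.disk_usage(temp_dir).free
    return len(sizes), sum(sizes), free


def monitor(temp_dir, interval_s=10.0, iterations=None):
    """Print one snapshot per interval; iterations=None polls until interrupted."""
    done = 0
    while iterations is None or done < iterations:
        count, total, free = snapshot(temp_dir)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} runs={count} run_bytes={total} free_bytes={free}",
              flush=True)
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval_s)
```

Calling `monitor("/path/to/iodevice/temp", interval_s=5)` on each node (the path is a placeholder for wherever the run files are written) would show whether run files accumulate until end-of-job or are removed once a merge phase has consumed them.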
