asterixdb-dev mailing list archives

From Chen Li <che...@gmail.com>
Subject Re: Creating RTree: no space left
Date Thu, 15 Sep 2016 19:50:56 GMT
@Wail: as a use case related to selectivity, our current Cloudberry
prototype doesn't benefit from R-tree when the user is analyzing the data
for the entire US.  But we expect to have R-tree benefits when a user zooms
into a small region.

On Thu, Sep 15, 2016 at 8:25 AM, Wail Alkowaileet <wael.y.k@gmail.com>
wrote:

> Hi Ahmed and Mike,
>
> @Ahmed
> I actually did a small experiment where I loaded about 1/5 of the data (so
> I can index it), and it seems that the R-Tree was really useful for querying
> small regions or neighborhoods.
> I also tried the B-Tree and it was slower than a full scan.
>
> @Mike
> Unfortunately, I still cannot, even after anonymization :-)
>
>
> On Wed, Sep 14, 2016 at 11:29 PM, Mike Carey <dtabass@gmail.com> wrote:
>
> > Interesting point, so to speak.  @Wail, any chance you could post a Google
> > maps screenshot showing a visualization of the points in this dataset on
> > the underlying geographic region?  (If the dataset is shareable in that
> > anonymized form?)  I would think an R-tree would still be good for
> > small-region geo queries - possibly shrinking the candidate object set by a
> > factor of 10,000 - so still useful - and we also do index-AND-ing now, so
> > we would also combine that shrinkage with other index-provided shrinkage on
> > any other index-amenable predicates.  I think the queries are still spatial
> > in nature, and the only AsterixDB choice for that is an R-tree.  (We did
> > experiments with things like Hilbert B-trees, but the results led to the
> > conclusion that the code base only needs R-trees for spatial data for the
> > foreseeable future - they just work too well and in a no-tuning-required
> > fashion.... :-))
> >
> >
> >
> > On 9/14/16 12:49 PM, Ahmed Eldawy wrote:
> >
> >> Looks like an interesting case. Just a small question. Are you sure a
> >> spatial index is the right one to use here? The spatial attribute looks
> >> more like a categorization and a hash or B-tree index could be more
> >> suitable. As far as I know, the spatial index in AsterixDB is a secondary
> >> R-tree index which, like any other secondary index, is only good for
> >> retrieving a small number of records. For this dataset, it seems that any
> >> small range would still return a huge number of records.
> >>
> >> It is still interesting to further investigate and fix the sort issue, but
> >> I mentioned the usage issue to offer a different perspective.
> >>
> >> Thanks
> >> Ahmed
> >>
> >> On Wed, Sep 14, 2016 at 10:30 AM Mike Carey <dtabass@gmail.com> wrote:
> >>
> >> ☺!
> >>>
> >>> On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" <wael.y.k@gmail.com>
> wrote:
> >>>
> >>> To be exact
> >>>> I have 2,255,091,590 records and 10,391 points :-)
> >>>>
> >>>> On Wed, Sep 14, 2016 at 10:46 AM, Mike Carey <dtabass@gmail.com>
> wrote:
> >>>>
> >>>>> Thx!  I knew I'd meant to "activate" the thought somehow, but couldn't
> >>>>> remember having done it for sure.  Oops! Scattered from VLDB, I guess...!
> >>>>>
> >>>>> On 9/13/16 9:58 PM, Taewoo Kim wrote:
> >>>>>
> >>>>> @Mike: You filed an issue -
> >>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)
> >>>>>>
> >>>>>> Best,
> >>>>>> Taewoo
> >>>>>>
> >>>>>> On Tue, Sep 13, 2016 at 9:28 PM, Mike Carey <dtabass@gmail.com> wrote:
> >>>>>>
> >>>>>>> I can't remember (slight jetlag? :-)) if I shared back to this list one
> >>>>>>> theory that came up in India when Wail and I talked F2F - his data has a
> >>>>>>> lot of duplicate points, so maybe something goes awry in that case.  I
> >>>>>>> wonder if we've sufficiently tested that case?  (E.g., what if there are
> >>>>>>> gazillions of records originating from a small handful of points?)
> >>>>>>>
> >>>>>>>
> >>>>>>> On 8/26/16 9:55 AM, Taewoo Kim wrote:
> >>>>>>>
> >>>>>>>> Based on a rough calculation, per partition, each point field takes
> >>>>>>>> 3.6GB (16 bytes * 2887453794 records / 12 partitions). To sort 3.6GB, we
> >>>>>>>> are generating 625 files (96MB or 128MB each) = 157GB. Since Wail
> >>>>>>>> mentioned that there was no issue when creating a B+ tree index, we need
> >>>>>>>> to check what SORT process is required by the R-Tree index.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Taewoo
> >>>>>>>>
> >>>>>>>> On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <jianfeng.jia@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> If all of the file names start with “ExternalSortRunGenerator”, then
> >>>>>>>>> they are the first-round files, which cannot be GCed.
> >>>>>>>>> Could you provide the query plan as well?
> >>>>>>>>>
> >>>>>>>>> On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <wael.y.k@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Ian and Pouria,
> >>>>>>>>>>
> >>>>>>>>>> The names of the files along with their sizes (there were 625 of
> >>>>>>>>>> those before crashing):
> >>>>>>>>>>
> >>>>>>>>>> size     name
> >>>>>>>>>> 96MB     ExternalSortRunGenerator8917133039835449370.waf
> >>>>>>>>>> 128MB    ExternalSortRunGenerator8948724728025392343.waf
> >>>>>>>>>>
> >>>>>>>>>> No files were generated beyond runs.
> >>>>>>>>>> compiler.sortmemory = 64MB
> >>>>>>>>>>
> >>>>>>>>>> Here are the full logs:
> >>>>>>>>>> <https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_25_07%3A34%3A52_AST_2016.zip?dl=0>
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh
> >>>>>>>>>> <pouria.pirzadeh@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> We previously had issues with huge spilled sort temp files when
> >>>>>>>>>>> creating inverted index for fuzzy queries, but NOT R-Trees.
> >>>>>>>>>>> I also recall that Yingyi fixed the issue of delaying clean-up for
> >>>>>>>>>>> intermediate temp files until the end of the query execution.
> >>>>>>>>>>> If you can share names of a couple of temp files (and their sizes
> >>>>>>>>>>> along with the sort memory setting you have in
> >>>>>>>>>>> asterix-configuration.xml) we may be able to have a better guess as
> >>>>>>>>>>> to whether the sort is really going into a two-level merge or not.
> >>>>>>>>>>>
> >>>>>>>>>>> Pouria
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <imaxon@uci.edu> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> I think that exception ("No space left on device") is just cast from
> >>>>>>>>>>>> the native IOException. Therefore I would be inclined to believe it's
> >>>>>>>>>>>> genuinely out of space. I suppose the question is why the external
> >>>>>>>>>>>> sort is so huge.
> >>>>>>>>>>>> What is the query plan? Maybe that will shed light on a possible
> >>>>>>>>>>>> cause.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet
> >>>>>>>>>>>> <wael.y.k@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> I was monitoring inodes ... it didn't go beyond 1%.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet
> >>>>>>>>>>>>> <wael.y.k@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Chris and Mike,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Actually I was monitoring it to see what's going on:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>      - The size of each partition is about 40GB (80GB in total per
> >>>>>>>>>>>>>>      iodevice).
> >>>>>>>>>>>>>>      - The runs took 157GB per iodevice (about 2x of the dataset
> >>>>>>>>>>>>>>      size).
> >>>>>>>>>>>>>>      Each run takes either 128MB or 96MB of storage.
> >>>>>>>>>>>>>>      - At a certain time, there were 522 runs.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I even tried to create a BTree index to see if that happens as well.
> >>>>>>>>>>>>>> I created two BTree indexes, one for the *location* and one for the
> >>>>>>>>>>>>>> *caller*, and they were created successfully. The sizes of the runs
> >>>>>>>>>>>>>> didn't come anywhere near that.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Logs are attached.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <dtabass@gmail.com>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think we might have "file GC issues" - I vaguely remember that we
> >>>>>>>>>>>>>>> don't (or at least didn't once upon a time) proactively remove
> >>>>>>>>>>>>>>> unnecessary run files - removing all of them at end-of-job instead
> >>>>>>>>>>>>>>> of at the end of the execution phase that uses their contents.  We
> >>>>>>>>>>>>>>> may also have an "Amdahl problem" right now with our sort since we
> >>>>>>>>>>>>>>> serialize phase two of parallel sorts - though this is not a query,
> >>>>>>>>>>>>>>> it's index build, so that shouldn't be it.  It would be interesting
> >>>>>>>>>>>>>>> to put a df/sleep script on each of the nodes when this is
> >>>>>>>>>>>>>>> happening - actually a script that monitors the temp file
> >>>>>>>>>>>>>>> directory - and watch the lifecycle happen and the sizes change....
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> When you get the "disk full" warning, do a quick "df -i" on the
> >>>>>>>>>>>>>>>> device - possibly you've run out of inodes even if the space isn't
> >>>>>>>>>>>>>>>> all used up. It's unlikely because I don't think AsterixDB creates
> >>>>>>>>>>>>>>>> a bunch of small files, but worth checking.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> If that's not it, then can you share the full exception and stack
> >>>>>>>>>>>>>>>> trace?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Ceej
> >>>>>>>>>>>>>>>> aka Chris Hillery
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet
> >>>>>>>>>>>>>>>> <wael.y.k@gmail.com> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I just cleared the hard drives to get 80% free space. I still get
> >>>>>>>>>>>>>>>>> the same issue.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The data contains:
> >>>>>>>>>>>>>>>>> 1- 2887453794 records.
> >>>>>>>>>>>>>>>>> 2- Schema:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> create type CDRType as {
> >>>>>>>>>>>>>>>>> id:uuid,
> >>>>>>>>>>>>>>>>> 'date':string,
> >>>>>>>>>>>>>>>>> 'time':string,
> >>>>>>>>>>>>>>>>> 'duration':int64,
> >>>>>>>>>>>>>>>>> 'caller':int64,
> >>>>>>>>>>>>>>>>> 'callee':int64,
> >>>>>>>>>>>>>>>>> location:point?
> >>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet
> >>>>>>>>>>>>>>>>> <wael.y.k@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Dears,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I have a dataset of size 290GB loaded into 3 NCs, each of which
> >>>>>>>>>>>>>>>>>> has 2x500GB SSDs. Each NC has two IODevices (partitions) in each
> >>>>>>>>>>>>>>>>>> hard drive (i.e. the total is 4 iodevices per NC). After loading
> >>>>>>>>>>>>>>>>>> the data, each Asterix partition occupied 31GB.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The cluster has about 50% free space in each hard drive
> >>>>>>>>>>>>>>>>>> (approximately 250GB free space in each hard drive). However,
> >>>>>>>>>>>>>>>>>> when I tried to create an index of type RTree, I got an exception
> >>>>>>>>>>>>>>>>>> that no space was left in the hard drive during the External Sort
> >>>>>>>>>>>>>>>>>> phase.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Is that normal?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> *Regards,*
> >>>>>>>>>>>>>>>>>> Wail Alkowaileet
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>
> >>>>>>>>> Jianfeng Jia
> >>>>>>>>> PhD Candidate of Computer Science
> >>>>>>>>> University of California, Irvine
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>> --
> >>>>
> >>>> *Regards,*
> >>>> Wail Alkowaileet
> >>>>
> >>>>
> >
>
>
> --
>
> *Regards,*
> Wail Alkowaileet
>
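
Taewoo's rough calculation in the quoted thread can be reproduced with a few lines of arithmetic. The sketch below is only a sanity check on the figures quoted there (16 bytes per point, 2887453794 records, 12 partitions, 625 run files of 96MB or 128MB each). The function names are made up for illustration, and the "GB" figures are assumed to be binary units (GiB).

```python
# Figures quoted in the thread; nothing here is measured.
POINT_BYTES = 16            # size of one point field, per Taewoo's estimate
RECORDS = 2_887_453_794     # record count reported by Wail
PARTITIONS = 12             # partition count used in Taewoo's estimate
RUN_FILES = 625             # run files seen before the crash
RUN_SIZES_MB = (96, 128)    # each run file was 96MB or 128MB

GIB = 2 ** 30


def point_gib_per_partition(records=RECORDS, point_bytes=POINT_BYTES,
                            partitions=PARTITIONS):
    """Size of the point field per partition, in GiB."""
    return records * point_bytes / partitions / GIB


def run_total_bounds_gib(run_files=RUN_FILES, sizes_mb=RUN_SIZES_MB):
    """Lower and upper bound on the total size of the run files, in GiB.

    The lower bound assumes every run has the smallest size, the upper
    bound assumes every run has the largest size.
    """
    low = run_files * min(sizes_mb) / 1024
    high = run_files * max(sizes_mb) / 1024
    return low, high


if __name__ == "__main__":
    print(f"point data per partition: {point_gib_per_partition():.2f} GiB")
    low, high = run_total_bounds_gib()
    print(f"625 run files total: {low:.1f} to {high:.1f} GiB")
```

Under these assumptions the point data comes to about 3.6 GiB per partition, which matches the estimate in the thread, while 625 runs of 96MB to 128MB add up to roughly 59 to 78 GiB. The thread does not spell out how the 625 files map onto the reported 157GB, so that part of the comparison is left open here.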

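Mike's suggestion in the quoted thread, a df/sleep-style script that watches the temp file directory while the index build runs, could look roughly like the sketch below. It is a minimal illustration using only the Python standard library, not an AsterixDB tool: the `snapshot` and `watch` names are hypothetical, the directory to watch has to be supplied by whoever runs it, and the only detail taken from the thread is the `ExternalSortRunGenerator` file-name prefix.

```python
import os
import shutil
import time

# Prefix of the run files reported in the thread.
RUN_PREFIX = "ExternalSortRunGenerator"


def snapshot(temp_dir, prefix=RUN_PREFIX):
    """Return (run file count, total run bytes, free bytes on the device)."""
    count = 0
    total = 0
    with os.scandir(temp_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.startswith(prefix):
                try:
                    total += entry.stat().st_size
                    count += 1
                except FileNotFoundError:
                    # The file was deleted between listing and stat.
                    pass
    free = shutil.disk_usage(temp_dir).free
    return count, total, free


def watch(temp_dir, interval_s=5.0, iterations=None):
    """Print one snapshot line every interval_s seconds.

    Runs until interrupted when iterations is None, otherwise stops after
    that many snapshots.
    """
    done = 0
    while iterations is None or done < iterations:
        count, total, free = snapshot(temp_dir)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} runs={count} run_bytes={total} free_bytes={free}")
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval_s)
```

Running `watch` on each node during the index build would show whether the run files keep accumulating until the device fills up or are removed as phases finish.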