asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Creating RTree: no space left
Date Wed, 14 Sep 2016 20:29:49 GMT
Interesting point, so to speak.  @Wail, any chance you could post a 
Google maps screenshot showing a visualization of the points in this 
dataset on the underlying geographic region?  (If the dataset is 
shareable in that anonymized form?)  I would think an R-tree would still 
be good for small-region geo queries - possibly shrinking the candidate 
object set by a factor of 10,000 - so still useful - and we also do 
index-AND-ing now, so we would combine that shrinkage with other 
index-provided shrinkage from any other index-amenable predicates.  I 
think the queries are still spatial in nature, and the only AsterixDB 
choice for that is an R-tree.  (We did experiments with things like 
Hilbert B-trees, but the results led to the conclusion that the code 
base only needs R-trees for spatial data for the foreseeable future - 
they just work too well and in a no-tuning-required fashion.... :-))
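[Editor's note: Mike's index-ANDing point can be illustrated with a toy sketch. This is plain Python over sets, not AsterixDB internals: each secondary index yields a candidate set of primary keys, and ANDing intersects those sets before any records are fetched, so the shrinkage factors compound.]

```python
# Toy illustration (not AsterixDB internals): index-ANDing intersects the
# candidate primary-key sets produced by each applicable secondary index,
# so each index's selectivity multiplies before records are fetched.

def index_and(*candidate_sets):
    """Intersect candidate primary-key sets from several secondary indexes."""
    result = set(candidate_sets[0])
    for s in candidate_sets[1:]:
        result &= set(s)
    return result

# e.g. an R-tree narrows the keys to the spatial matches, a B-tree on
# 'caller' narrows them to the caller matches; ANDing leaves only keys
# satisfying both predicates.
rtree_hits = set(range(0, 1000, 10))   # keys matching the spatial predicate
btree_hits = set(range(0, 1000, 25))   # keys matching the caller predicate
both = index_and(rtree_hits, btree_hits)
print(len(both))  # multiples of 50 below 1000 -> prints 20
```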


On 9/14/16 12:49 PM, Ahmed Eldawy wrote:
> Looks like an interesting case. Just a small question. Are you sure a
> spatial index is the right one to use here? The spatial attribute looks
> more like a categorization and a hash or B-tree index could be more
> suitable. As far as I know, the spatial index in AsterixDB is a secondary
> R-tree index which, like any other secondary index, is only good for
> retrieving a small number of records. For this dataset, it seems that any
> small range would still return a huge number of records.
>
> It is still interesting to further investigate and fix the sort issue, but I
> mentioned the usage issue to offer a different perspective.
>
> Thanks
> Ahmed
>
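[Editor's note: Ahmed's selectivity concern can be put in numbers using the counts Wail gives later in the thread (2,255,091,590 records over 10,391 distinct points). A back-of-envelope check:]

```python
# Back-of-envelope check of Ahmed's concern, using the counts Wail reports
# later in the thread: with so few distinct points, even the most selective
# possible spatial predicate (a single exact point) matches a huge number
# of records, which is a poor fit for any secondary index.
records = 2_255_091_590        # records in Wail's dataset
distinct_points = 10_391       # distinct location points

avg_per_point = records / distinct_points
print(f"{avg_per_point:,.0f} records per distinct point on average")
# -> about 217,000 records even for an exact-point lookup
```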
> On Wed, Sep 14, 2016 at 10:30 AM Mike Carey <dtabass@gmail.com> wrote:
>
>> ☺!
>>
>> On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" <wael.y.k@gmail.com> wrote:
>>
>>> To be exact
>>> I have 2,255,091,590 records and 10,391 points :-)
>>>
>>> On Wed, Sep 14, 2016 at 10:46 AM, Mike Carey <dtabass@gmail.com> wrote:
>>>
>>>> Thx!  I knew I'd meant to "activate" the thought somehow, but couldn't
>>>> remember having done it for sure.  Oops!  Scattered from VLDB, I guess...!
>>>>
>>>>
>>>> On 9/13/16 9:58 PM, Taewoo Kim wrote:
>>>>
>>>>> @Mike: You filed an issue -
>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)
>>>>>
>>>>> Best,
>>>>> Taewoo
>>>>>
>>>>> On Tue, Sep 13, 2016 at 9:28 PM, Mike Carey <dtabass@gmail.com> wrote:
>>>>>
>>>>>> I can't remember (slight jetlag? :-)) if I shared back to this list one
>>>>>> theory that came up in India when Wail and I talked F2F - his data has a
>>>>>> lot of duplicate points, so maybe something goes awry in that case.  I
>>>>>> wonder if we've sufficiently tested that case?  (E.g., what if there are
>>>>>> gazillions of records originating from a small handful of points?)
>>>>>>
>>>>>>
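[Editor's note: as a reproduction aid for the duplicate-point case Mike describes, here is a hypothetical Python generator for skewed test data. The function name and the point-literal format are assumptions; only the idea - gazillions of records over a small handful of points - comes from the thread.]

```python
import random

# Hypothetical generator for the skewed case Mike describes: many records
# drawn from a tiny set of distinct points.  Field names mirror the CDRType
# schema shown later in the thread; the output format is an assumption.
def skewed_points(num_records, num_distinct_points, seed=42):
    rng = random.Random(seed)
    points = [(rng.uniform(-180, 180), rng.uniform(-90, 90))
              for _ in range(num_distinct_points)]
    for i in range(num_records):
        x, y = rng.choice(points)
        yield {"caller": i, "location": f"point(\"{x:.6f},{y:.6f}\")"}

records = list(skewed_points(num_records=1000, num_distinct_points=5))
distinct = {r["location"] for r in records}
print(len(records), len(distinct))  # 1000 records over at most 5 points
```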
>>>>>> On 8/26/16 9:55 AM, Taewoo Kim wrote:
>>>>>>
>>>>>>> Based on a rough calculation, per partition, each point field takes
>>>>>>> 3.6GB (16 bytes * 2887453794 records / 12 partitions).  To sort 3.6GB,
>>>>>>> we are generating 625 files (96MB or 128MB each) = 157GB.  Since Wail
>>>>>>> mentioned that there was no issue when creating a B+ tree index, we
>>>>>>> need to check what SORT process is required by the R-Tree index.
>>>>>>>
>>>>>>> Best,
>>>>>>> Taewoo
>>>>>>>
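[Editor's note: Taewoo's arithmetic checks out; redoing it also shows the observed run volume is an order of magnitude larger than the key-only input, which would be consistent with the sort spilling whole records rather than just the 16-byte points. That inference is the editor's, not stated in the thread.]

```python
# Redoing Taewoo's arithmetic.  16-byte points * record count / partitions
# gives the key-only sort input per partition; the observed run totals are
# far larger, suggesting the sort spills more than just the point field.
point_bytes = 16
records = 2_887_453_794
partitions = 12

key_volume_gb = point_bytes * records / partitions / 2**30
print(f"key-only sort input per partition: {key_volume_gb:.2f} GB")  # ~3.59 GB

# 625 run files of 96-128MB each:
print(f"runs span {625 * 96 / 1024:.0f}-{625 * 128 / 1024:.0f} GB")  # ~59-78 GB
```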
>>>>>>> On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <jianfeng.jia@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> If all of the file names start with “ExternalSortRunGenerator”, then
>>>>>>>> they are the first-round files, which cannot be GCed.
>>>>>>>> Could you provide the query plan as well?
>>>>>>>>
>>>>>>>> On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Ian and Pouria,
>>>>>>>>>
>>>>>>>>> The names of the files along with their sizes (there were 625 of
>>>>>>>>> these before crashing):
>>>>>>>>>
>>>>>>>>> size      name
>>>>>>>>> 96MB      ExternalSortRunGenerator8917133039835449370.waf
>>>>>>>>> 128MB     ExternalSortRunGenerator8948724728025392343.waf
>>>>>>>>>
>>>>>>>>> No files were generated beyond runs.
>>>>>>>>> compiler.sortmemory = 64MB
>>>>>>>>>
>>>>>>>>> Here are the full logs:
>>>>>>>>> <https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_25_07%3A34%3A52_AST_2016.zip?dl=0>
>>>>>>>>>
>>>>>>>>> On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <pouria.pirzadeh@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> We previously had issues with huge spilled sort temp files when
>>>>>>>>>> creating inverted indexes for fuzzy queries, but NOT R-Trees.
>>>>>>>>>> I also recall that Yingyi fixed the issue of delaying clean-up of
>>>>>>>>>> intermediate temp files until the end of the query execution.
>>>>>>>>>> If you can share the names of a couple of temp files (and their
>>>>>>>>>> sizes, along with the sort memory setting you have in
>>>>>>>>>> asterix-configuration.xml), we may be able to make a better guess
>>>>>>>>>> as to whether the sort is really going into a two-level merge or not.
>>>>>>>>>>
>>>>>>>>>> Pouria
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <imaxon@uci.edu> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think that exception ("No space left on device") is just cast
>>>>>>>>>>> from the native IOException.  Therefore I would be inclined to
>>>>>>>>>>> believe it's genuinely out of space.  I suppose the question is
>>>>>>>>>>> why the external sort is so huge.  What is the query plan?  Maybe
>>>>>>>>>>> that will shed light on a possible cause.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I was monitoring inodes ... it didn't go beyond 1%.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Chris and Mike,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Actually I was monitoring it to see what's going on:
>>>>>>>>>>>>>
>>>>>>>>>>>>>    - The size of each partition is about 40GB (80GB in total per
>>>>>>>>>>>>>      iodevice).
>>>>>>>>>>>>>    - The runs took 157GB per iodevice (about 2x of the dataset
>>>>>>>>>>>>>      size).  Each run takes either 128MB or 96MB of storage.
>>>>>>>>>>>>>    - At a certain time, there were 522 runs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I even tried to create a BTree index to see if that happens as
>>>>>>>>>>>>> well.  I created two BTree indexes, one on *location* and one on
>>>>>>>>>>>>> *caller*, and they were created successfully.  The sizes of the
>>>>>>>>>>>>> runs didn't come anywhere near that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Logs are attached.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <dtabass@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think we might have "file GC issues" - I vaguely remember
>>>>>>>>>>>>>> that we don't (or at least didn't once upon a time) proactively
>>>>>>>>>>>>>> remove unnecessary run files - removing all of them at
>>>>>>>>>>>>>> end-of-job instead of at the end of the execution phase that
>>>>>>>>>>>>>> uses their contents.  We may also have an "Amdahl problem"
>>>>>>>>>>>>>> right now with our sort, since we serialize phase two of
>>>>>>>>>>>>>> parallel sorts - though this is not a query, it's an index
>>>>>>>>>>>>>> build, so that shouldn't be it.  It would be interesting to
>>>>>>>>>>>>>> put a df/sleep script on each of the nodes when this is
>>>>>>>>>>>>>> happening - actually a script that monitors the temp file
>>>>>>>>>>>>>> directory - and watch the lifecycle happen and the sizes
>>>>>>>>>>>>>> change....
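[Editor's note: a minimal Python version of the monitor Mike suggests. The directory path is an assumption; the "ExternalSortRunGenerator" prefix matches the run-file names reported earlier in the thread.]

```python
import os
import shutil

# Minimal sketch of the monitor Mike suggests: sample free space and the
# total size of sort-run files in an iodevice temp directory.  Call it in
# a loop (e.g. every few seconds) while the index build runs to watch the
# run files' lifecycle; the directory path is an assumption.
def sample(tmp_dir):
    free_bytes = shutil.disk_usage(tmp_dir).free
    run_bytes = sum(
        entry.stat().st_size
        for entry in os.scandir(tmp_dir)
        if entry.is_file() and entry.name.startswith("ExternalSortRunGenerator")
    )
    return free_bytes, run_bytes

free, runs = sample("/tmp")
print(f"free: {free / 2**30:.1f} GB, sort runs: {runs / 2**20:.1f} MB")
```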
>>>>>>>>>>>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When you get the "disk full" warning, do a quick "df -i" on
>>>>>>>>>>>>>>> the device - possibly you've run out of inodes even if the
>>>>>>>>>>>>>>> space isn't all used up.  It's unlikely, because I don't
>>>>>>>>>>>>>>> think AsterixDB creates a bunch of small files, but it's
>>>>>>>>>>>>>>> worth checking.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If that's not it, then can you share the full exception and
>>>>>>>>>>>>>>> stack trace?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ceej
>>>>>>>>>>>>>>> aka Chris Hillery
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
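[Editor's note: Chris's "df -i" check, sketched in Python via `os.statvfs` (POSIX-only; the path is an example).]

```python
import os

# Sketch of Chris's "df -i" check: os.statvfs exposes inode counts, so
# inode exhaustion can be spotted even when byte space remains (POSIX only).
def inode_usage(path):
    st = os.statvfs(path)
    if st.f_files == 0:          # some filesystems don't report inode counts
        return 0.0
    used = st.f_files - st.f_ffree
    return 100.0 * used / st.f_files

print(f"inode usage on /tmp: {inode_usage('/tmp'):.1f}%")
```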
>>>>>>>>>>>>>>>> I just cleared the hard drives to get 80% free space.  I
>>>>>>>>>>>>>>>> still get the same issue.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The data contains:
>>>>>>>>>>>>>>>> 1- 2887453794 records.
>>>>>>>>>>>>>>>> 2- Schema:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> create type CDRType as {
>>>>>>>>>>>>>>>>   id: uuid,
>>>>>>>>>>>>>>>>   'date': string,
>>>>>>>>>>>>>>>>   'time': string,
>>>>>>>>>>>>>>>>   'duration': int64,
>>>>>>>>>>>>>>>>   'caller': int64,
>>>>>>>>>>>>>>>>   'callee': int64,
>>>>>>>>>>>>>>>>   location: point?
>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Dears,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have a dataset of size 290GB loaded into 3 NCs, each of
>>>>>>>>>>>>>>>>> which has 2x500GB SSDs.  Each NC has two iodevices
>>>>>>>>>>>>>>>>> (partitions) on each hard drive (i.e., the total is 4
>>>>>>>>>>>>>>>>> iodevices per NC).  After loading the data, each Asterix
>>>>>>>>>>>>>>>>> partition occupied 31GB.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The cluster has about 50% free space on each hard drive
>>>>>>>>>>>>>>>>> (approximately 250GB free on each).  However, when I tried
>>>>>>>>>>>>>>>>> to create an index of type RTree, I got an exception that
>>>>>>>>>>>>>>>>> no space was left on the hard drive during the External
>>>>>>>>>>>>>>>>> Sort phase.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is that normal?
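[Editor's note: a quick reconciliation suggests the answer. This arithmetic is the editor's, combining the 157GB-per-iodevice run total Wail reports elsewhere in the thread with the free-space figures given here: two iodevices share each drive, so the spilled runs alone can exceed the free space.]

```python
# Back-of-envelope answer to "Is that normal?": with two iodevices per
# drive, the observed 157GB of sort runs per iodevice needs ~314GB per
# drive, exceeding the ~250GB free -- enough on its own to hit ENOSPC.
runs_per_iodevice_gb = 157
iodevices_per_drive = 2
free_per_drive_gb = 250

needed = runs_per_iodevice_gb * iodevices_per_drive
print(f"needed {needed} GB vs {free_per_drive_gb} GB free -> "
      f"{'fits' if needed <= free_per_drive_gb else 'out of space'}")
```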
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> *Regards,*
>>>>>>>>>>>>>>>>> Wail Alkowaileet
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Jianfeng Jia
>>>>>>>> PhD Candidate of Computer Science
>>>>>>>> University of California, Irvine
>>>>>>>>
>>>
>>> --
>>>
>>> *Regards,*
>>> Wail Alkowaileet
>>>

