asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Creating RTree: no space left
Date Wed, 14 Sep 2016 07:46:21 GMT
Thx!  I knew I'd meant to "activate" the thought somehow, but couldn't 
remember having done it for sure.  Oops! Scattered from VLDB, I guess...!


On 9/13/16 9:58 PM, Taewoo Kim wrote:
> @Mike: You filed an issue -
> https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)
>
> Best,
> Taewoo
>
> On Tue, Sep 13, 2016 at 9:28 PM, Mike Carey <dtabass@gmail.com> wrote:
>
>> I can't remember (slight jetlag? :-)) if I shared back to this list one
>> theory that came up in India when Wail and I talked F2F - his data has a
>> lot of duplicate points, so maybe something goes awry in that case.  I
>> wonder if we've sufficiently tested that case?  (E.g., what if there are
>> gazillions of records originating from a small handful of points?)
>>
>>
>> On 8/26/16 9:55 AM, Taewoo Kim wrote:
>>
>>> Based on a rough calculation, per partition, each point field takes 3.6GB
>>> (16 bytes * 2887453794 records / 12 partition). To sort 3.6GB, we are
>>> generating 625 files (96MB or 128MB each) = 157GB. Since Wail mentioned
>>> that there was no issue when creating a B+ tree index, we need to check
>>> what SORT process is required by R-Tree index.
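
The rough calculation above can be checked with a few lines of Python. This is only a sketch: the 16 bytes per point and the 12 partitions are the figures used in the message, the binary units (GiB/MiB) are an assumption about how the sizes were reported, and the helper names are made up for illustration. Under those assumptions the point data comes to about 3.6 GiB per partition, and 625 runs of 96MB to 128MB come to roughly 59 to 78 GiB. Doubling the upper figure for two partitions sharing a drive would land near the 157GB reported per iodevice, though that reading is a guess.

```python
# Back-of-the-envelope check of the sizes quoted in the thread.
GIB = 1024 ** 3
MIB = 1024 ** 2

RECORDS = 2_887_453_794   # record count reported for the dataset
POINT_BYTES = 16          # bytes per point, as used in the message
PARTITIONS = 12           # partition count used in the message


def per_partition_bytes(records, bytes_per_record, partitions):
    """Bytes of one field held by each partition, assuming an even spread."""
    return records * bytes_per_record / partitions


def total_run_bytes(num_files, file_mib):
    """Total size of num_files run files of file_mib MiB each."""
    return num_files * file_mib * MIB


point_data = per_partition_bytes(RECORDS, POINT_BYTES, PARTITIONS)
print(f"point data per partition: {point_data / GIB:.2f} GiB")

# 625 run files were reported, each either 96MB or 128MB.
low = total_run_bytes(625, 96)
high = total_run_bytes(625, 128)
print(f"625 runs: between {low / GIB:.1f} and {high / GIB:.1f} GiB")
print(f"blow-up vs. point data: {low / point_data:.0f}x to {high / point_data:.0f}x")
```

The last line makes the size of the gap explicit: the runs are more than an order of magnitude larger than the point field alone, which is consistent with the suggestion to look at what the R-Tree sort is actually writing.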
>>>
>>> Best,
>>> Taewoo
>>>
>>> On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <jianfeng.jia@gmail.com>
>>> wrote:
>>>
>>>> If all of the file names start with “ExternalSortRunGenerator”, then they
>>>> are the first-round files, which cannot be GCed.
>>>> Could you provide the query plan as well?
>>>>
>>>> On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Ian and Pouria,
>>>>>
>>>>> The names of the files along with their sizes (there were 625 of those
>>>>> before crashing):
>>>>>
>>>>> size        name
>>>>> 96MB     ExternalSortRunGenerator8917133039835449370.waf
>>>>> 128MB   ExternalSortRunGenerator8948724728025392343.waf
>>>>>
>>>>> No files were generated beyond the runs.
>>>>> compiler.sortmemory = 64MB
>>>>>
>>>>> Here are the full logs:
>>>>> <https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_25_07%3A34%3A52_AST_2016.zip?dl=0>
>>>>
>>>>> On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh <pouria.pirzadeh@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> We previously had issues with huge spilled sort temp files when creating
>>>>>> inverted indexes for fuzzy queries, but NOT R-Trees.
>>>>>> I also recall that Yingyi fixed the issue of delaying clean-up for
>>>>>> intermediate temp files until the end of the query execution.
>>>>>> If you can share the names of a couple of temp files (and their sizes,
>>>>>> along with the sort memory setting you have in asterix-configuration.xml),
>>>>>> we may be able to make a better guess as to whether the sort is really
>>>>>> going into a two-level merge or not.
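
One way to reason about the two-level merge question is the textbook k-way merge model, sketched below. It is not taken from the Hyracks source: the fan-in rule (one memory frame per input run, one frame reserved for output) is the standard external-sort assumption, the 32KB frame size is an assumed value that the thread does not state, and `merge_passes` is a hypothetical helper. The 64MB sort memory and the 625 runs are the figures reported elsewhere in this thread.

```python
import math


def merge_passes(num_runs, sort_memory_bytes, frame_bytes):
    """Number of merge passes needed under a textbook k-way merge model.

    Assumes each input run needs one frame and one frame is kept for output,
    so the fan-in is (frames in sort memory) - 1.
    """
    fan_in = sort_memory_bytes // frame_bytes - 1
    if fan_in < 2:
        raise ValueError("sort memory too small to merge at least two runs")
    passes = 0
    runs = num_runs
    while runs > 1:
        runs = math.ceil(runs / fan_in)  # each pass merges fan_in runs into one
        passes += 1
    return passes


SORT_MEMORY = 64 * 1024 ** 2   # compiler.sortmemory = 64MB, from the thread
FRAME_SIZE = 32 * 1024         # ASSUMPTION: frame size is not given in the thread

print(merge_passes(625, SORT_MEMORY, FRAME_SIZE))
```

Under these assumptions 625 runs fit within a single merge pass, which would agree with the report that no files were generated beyond the first-round runs. A much larger frame size or smaller sort memory would change that answer.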
>>>>>>
>>>>>> Pouria
>>>>>>
>>>>>> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <imaxon@uci.edu> wrote:
>>>>>>
>>>>>>> I think that exception ("No space left on device") is just cast from the
>>>>>>> native IOException. Therefore I would be inclined to believe it's
>>>>>>> genuinely out of space. I suppose the question is why the external sort
>>>>>>> is so huge. What is the query plan? Maybe that will shed light on a
>>>>>>> possible cause.
>>>>>>>
>>>>>>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I was monitoring inodes ... usage didn't go beyond 1%.
>>>>>>>>
>>>>>>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Chris and Mike,
>>>>>>>>>
>>>>>>>>> Actually I was monitoring it to see what's going on:
>>>>>>>>>
>>>>>>>>>     - The size of each partition is about 40GB (80GB in total per
>>>>>>>>>     iodevice).
>>>>>>>>>     - The runs took 157GB per iodevice (about 2x the dataset size).
>>>>>>>>>     Each run takes either 128MB or 96MB of storage.
>>>>>>>>>     - At a certain time, there were 522 runs.
>>>>>>>>>
>>>>>>>>> I even tried to create a BTree index to see if that happens as well. I
>>>>>>>>> created two BTree indexes, one for *location* and one for *caller*, and
>>>>>>>>> they were created successfully. The sizes of the runs didn't come
>>>>>>>>> anywhere near that.
>>>>>>>>>
>>>>>>>>> Logs are attached.
>>>>>>>>>
>>>>>>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <dtabass@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I think we might have "file GC issues" - I vaguely remember that we don't
>>>>>>>>>> (or at least didn't once upon a time) proactively remove unnecessary run
>>>>>>>>>> files - removing all of them at end-of-job instead of at the end of the
>>>>>>>>>> execution phase that uses their contents.  We may also have an "Amdahl
>>>>>>>>>> problem" right now with our sort since we serialize phase two of parallel
>>>>>>>>>> sorts - though this is not a query, it's index build, so that shouldn't be
>>>>>>>>>> it.  It would be interesting to put a df/sleep script on each of the
>>>>>>>>>> nodes when this is happening - actually a script that monitors the temp
>>>>>>>>>> file directory - and watch the lifecycle happen and the sizes change....
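
A minimal sketch of the kind of monitoring script suggested above, in Python rather than a shell df/sleep loop. The `ExternalSortRunGenerator` file-name prefix is the one reported in this thread; the directory path in the usage note, the function names, and the polling interval are assumptions made for illustration.

```python
import os
import shutil
import time

MIB = 1024 ** 2


def snapshot(temp_dir, prefix="ExternalSortRunGenerator"):
    """Return (file_count, total_bytes) for run files currently in temp_dir."""
    count = 0
    total = 0
    with os.scandir(temp_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.startswith(prefix):
                try:
                    total += entry.stat().st_size
                    count += 1
                except FileNotFoundError:
                    pass  # file was removed between listing and stat
    return count, total


def monitor(temp_dir, interval_s=10.0, iterations=None):
    """Print run-file count, run-file size, and free space every interval_s.

    Runs forever when iterations is None; stop it with Ctrl-C.
    """
    done = 0
    while iterations is None or done < iterations:
        count, total = snapshot(temp_dir)
        free = shutil.disk_usage(temp_dir).free
        print(f"{time.strftime('%H:%M:%S')} runs={count} "
              f"run_size={total / MIB:.0f}MiB free={free / MIB:.0f}MiB")
        done += 1
        time.sleep(interval_s)
```

To use it, call `monitor("/path/to/iodevice/temp")` on each node while the index build is running, replacing the placeholder path with the actual temp file directory of the iodevice. Watching whether the run count only grows, or also shrinks as runs are merged and deleted, would show when files are being cleaned up.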
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
>>>>>>>>>>
>>>>>>>>>>> When you get the "disk full" warning, do a quick "df -i" on the device -
>>>>>>>>>>> possibly you've run out of inodes even if the space isn't all used up.
>>>>>>>>>>> It's unlikely because I don't think AsterixDB creates a bunch of small
>>>>>>>>>>> files, but worth checking.
>>>>>>>>>>>
>>>>>>>>>>> If that's not it, then can you share the full exception and stack
>>>>>>>>>>> trace?
>>>>>>>>>>>
>>>>>>>>>>> Ceej
>>>>>>>>>>> aka Chris Hillery
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I just cleared the hard drives to get 80% free space. I still get the
>>>>>>>>>>>> same issue.
>>>>>>>>>>>>
>>>>>>>>>>>> The data contains:
>>>>>>>>>>>> 1- 2887453794 records.
>>>>>>>>>>>> 2- Schema:
>>>>>>>>>>>>
>>>>>>>>>>>> create type CDRType as {
>>>>>>>>>>>>
>>>>>>>>>>>> id:uuid,
>>>>>>>>>>>>
>>>>>>>>>>>> 'date':string,
>>>>>>>>>>>>
>>>>>>>>>>>> 'time':string,
>>>>>>>>>>>>
>>>>>>>>>>>> 'duration':int64,
>>>>>>>>>>>>
>>>>>>>>>>>> 'caller':int64,
>>>>>>>>>>>>
>>>>>>>>>>>> 'callee':int64,
>>>>>>>>>>>>
>>>>>>>>>>>> location:point?
>>>>>>>>>>>>
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dears,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have a dataset of size 290GB loaded in 3 NCs, each of which has
>>>>>>>>>>>>> 2x500GB SSDs. Each NC has two IODevices (partitions) in each hard
>>>>>>>>>>>>> drive (i.e., the total is 4 iodevices per NC). After loading the data,
>>>>>>>>>>>>> each Asterix partition occupied 31GB.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The cluster has about 50% free space in each hard drive (approximately
>>>>>>>>>>>>> 250GB free space in each hard drive). However, when I tried to create
>>>>>>>>>>>>> an index of type RTree, I got an exception that no space was left on
>>>>>>>>>>>>> the hard drive during the External Sort phase.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is that normal?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Regards,*
>>>>>>>>>>>>> Wail Alkowaileet
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>> *Regards,*
>>>>>>>>>>>> Wail Alkowaileet
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> *Regards,*
>>>>>>>>> Wail Alkowaileet
>>>>>>>>>
>>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> *Regards,*
>>>>>>>> Wail Alkowaileet
>>>>>>>>
>>>>>>>>
>>>>> --
>>>>>
>>>>> *Regards,*
>>>>> Wail Alkowaileet
>>>>>
>>>>
>>>> Best,
>>>>
>>>> Jianfeng Jia
>>>> PhD Candidate of Computer Science
>>>> University of California, Irvine
>>>>
>>>>
>>>>

