asterixdb-dev mailing list archives

From Wail Alkowaileet <wael....@gmail.com>
Subject Re: Creating RTree: no space left
Date Thu, 15 Sep 2016 15:25:22 GMT
Hi Ahmed and Mike,

@Ahmed
I actually did a small experiment where I loaded about 1/5 of the data (so
that I could index it), and it seems that the R-Tree was really useful for
querying small regions or neighborhoods.
I also tried the B-Tree, and it was slower than a full scan.

@Mike
Unfortunately, I still cannot, even after anonymization :-)
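
For reference, a back-of-the-envelope sketch of the figures quoted further down in this thread. It only redoes the arithmetic from the numbers posted here (16 bytes per point, 12 partitions, the two record counts, 10,391 distinct points); the function and constant names are made up for illustration and are not AsterixDB APIs.

```python
# Rough arithmetic using only numbers quoted in this thread.
# All names below are illustrative, not part of AsterixDB.

POINT_BYTES = 16                    # per-point size used in the quoted estimate
RECORDS_AT_BUILD = 2_887_453_794    # record count from the original report
PARTITIONS = 12                     # 3 NCs x 4 iodevices, as described in the thread
RECORDS_EXACT = 2_255_091_590       # record count quoted together with the point count
DISTINCT_POINTS = 10_391            # distinct points quoted in the thread


def point_bytes_per_partition(records, partitions, point_bytes=POINT_BYTES):
    """Bytes taken by the point field alone in one partition."""
    return records * point_bytes / partitions


def records_per_point(records, distinct_points):
    """Average number of records that share one distinct point."""
    return records / distinct_points


per_partition = point_bytes_per_partition(RECORDS_AT_BUILD, PARTITIONS)
avg_per_point = records_per_point(RECORDS_EXACT, DISTINCT_POINTS)

print(f"point field per partition: {per_partition / 2**30:.2f} GiB")
print(f"average records per distinct point: {avg_per_point:,.0f}")
```

The first figure comes out at roughly the 3.6GB quoted below; the second shows that each distinct point is shared by a bit over two hundred thousand records on average, which is the duplicate-heavy case discussed in the thread.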


On Wed, Sep 14, 2016 at 11:29 PM, Mike Carey <dtabass@gmail.com> wrote:

> Interesting point, so to speak.  @Wail, any chance you could post a Google
> maps screenshot showing a visualization of the points in this dataset on
> the underlying geographic region?  (If the dataset is shareable in that
> anonymized form?)  I would think an R-tree would still be good for
> small-region geo queries - possibly shrinking the candidate object set by a
> factor of 10,000 - so still useful - and we also do index-AND-ing now, so
> we would also combine that shrinkage with other index-provided shrinkage on
> any other index-amenable predicates.  I think the queries are still spatial
> in nature, and the only AsterixDB choices for that are R-tree.  (We did
> experiments with things like Hilbert B-trees, but the results led to the
> conclusion that the code base only needs R-trees for spatial data for the
> foreseeable future - they just work too well and in a no-tuning-required
> fashion.... :-))
>
>
>
> On 9/14/16 12:49 PM, Ahmed Eldawy wrote:
>
>> Looks like an interesting case. Just a small question. Are you sure a
>> spatial index is the right one to use here? The spatial attribute looks
>> more like a categorization and a hash or B-tree index could be more
>> suitable. As far as I know, the spatial index in AsterixDB is a secondary
>> R-tree index which, like any other secondary index, is only good for
>> retrieving a small number of records. For this dataset, it seems that any
>> small range would still return a huge number of records.
>>
>> It is still interesting to further investigate and fix the sort issue but I
>> mentioned the usage issue for a different perspective.
>>
>> Thanks
>> Ahmed
>>
>> On Wed, Sep 14, 2016 at 10:30 AM Mike Carey <dtabass@gmail.com> wrote:
>>
>>> ☺!
>>>
>>> On Sep 14, 2016 1:11 AM, "Wail Alkowaileet" <wael.y.k@gmail.com> wrote:
>>>
>>>> To be exact
>>>> I have 2,255,091,590 records and 10,391 points :-)
>>>>
>>>> On Wed, Sep 14, 2016 at 10:46 AM, Mike Carey <dtabass@gmail.com> wrote:
>>>>
>>>>> Thx!  I knew I'd meant to "activate" the thought somehow, but couldn't
>>>>> remember having done it for sure.  Oops! Scattered from VLDB, I
>>>>> guess...!
>>>>>
>>>>> On 9/13/16 9:58 PM, Taewoo Kim wrote:
>>>>>
>>>>>> @Mike: You filed an issue -
>>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1639. :-)
>>>>>>
>>>>>> Best,
>>>>>> Taewoo
>>>>>>
>>>>>> On Tue, Sep 13, 2016 at 9:28 PM, Mike Carey <dtabass@gmail.com> wrote:
>>>>>>
>>>>>>> I can't remember (slight jetlag? :-)) if I shared back to this list one
>>>>>>> theory that came up in India when Wail and I talked F2F - his data has a
>>>>>>> lot of duplicate points, so maybe something goes awry in that case.  I
>>>>>>> wonder if we've sufficiently tested that case?  (E.g., what if there are
>>>>>>> gazillions of records originating from a small handful of points?)
>>>>>>>
>>>>>>> On 8/26/16 9:55 AM, Taewoo Kim wrote:
>>>>>>>
>>>>>>>> Based on a rough calculation, per partition, each point field takes
>>>>>>>> 3.6GB (16 bytes * 2887453794 records / 12 partitions). To sort 3.6GB, we
>>>>>>>> are generating 625 files (96MB or 128MB each) = 157GB. Since Wail
>>>>>>>> mentioned that there was no issue when creating a B+ tree index, we need
>>>>>>>> to check what SORT process is required by R-Tree index.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Taewoo
>>>>>>>>
>>>>>>>> On Fri, Aug 26, 2016 at 7:52 AM, Jianfeng Jia <jianfeng.jia@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> If all of the file names start with “ExternalSortRunGenerator”, then
>>>>>>>>> they are the first round files which can not be GCed.
>>>>>>>>> Could you provide the query plan as well?
>>>>>>>>>
>>>>>>>>> On Aug 24, 2016, at 10:02 PM, Wail Alkowaileet <wael.y.k@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Ian and Pouria,
>>>>>>>>>>
>>>>>>>>>> The names of the files along with the sizes (there were 625 of
>>>>>>>>>> those before crashing):
>>>>>>>>>>
>>>>>>>>>> size     name
>>>>>>>>>> 96MB     ExternalSortRunGenerator8917133039835449370.waf
>>>>>>>>>> 128MB    ExternalSortRunGenerator8948724728025392343.waf
>>>>>>>>>>
>>>>>>>>>> No files were generated beyond runs.
>>>>>>>>>> compiler.sortmemory = 64MB
>>>>>>>>>>
>>>>>>>>>> Here are the full logs
>>>>>>>>>> <https://www.dropbox.com/s/k2qbo3wybc8mnnk/log_Thu_Aug_25_07%3A34%3A52_AST_2016.zip?dl=0>
>>>>>>>>>>
>>>>>>>>>> On Tue, Aug 23, 2016 at 9:29 PM, Pouria Pirzadeh
>>>>>>>>>> <pouria.pirzadeh@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> We previously had issues with huge spilled sort temp files when
>>>>>>>>>>> creating inverted index for fuzzy queries, but NOT R-Trees.
>>>>>>>>>>> I also recall that Yingyi fixed the issue of delaying clean-up for
>>>>>>>>>>> intermediate temp files until the end of the query execution.
>>>>>>>>>>> If you can share names of a couple of temp files (and their sizes
>>>>>>>>>>> along with the sort memory setting you have in
>>>>>>>>>>> asterix-configuration.xml) we may be able to have a better guess as
>>>>>>>>>>> if the sort is really going into a two-level merge or not.
>>>>>>>>>>>
>>>>>>>>>>> Pouria
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Aug 23, 2016 at 11:09 AM, Ian Maxon <imaxon@uci.edu> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I think that exception ("No space left on device") is just cast
>>>>>>>>>>>> from the native IOException. Therefore I would be inclined to
>>>>>>>>>>>> believe it's genuinely out of space. I suppose the question is why
>>>>>>>>>>>> the external sort is so huge.
>>>>>>>>>>>> What is the query plan? Maybe that will shed light on a possible
>>>>>>>>>>>> cause.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:59 AM, Wail Alkowaileet
>>>>>>>>>>>> <wael.y.k@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I was monitoring Inodes ... it didn't go beyond 1%.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:58 PM, Wail Alkowaileet
>>>>>>>>>>>>> <wael.y.k@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Chris and Mike,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Actually I was monitoring it to see what's going on:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>      - The size of each partition is about 40GB (80GB in total per
>>>>>>>>>>>>>>      iodevice).
>>>>>>>>>>>>>>      - The runs took 157GB per iodevice (about 2x of the dataset
>>>>>>>>>>>>>>      size).
>>>>>>>>>>>>>>      Each run takes either 128MB or 96MB of storage.
>>>>>>>>>>>>>>      - At a certain time, there were 522 runs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I even tried to create a BTree Index to see if that happens as
>>>>>>>>>>>>>> well. I created two BTree indexes, one for the *location* and one
>>>>>>>>>>>>>> for the *caller*, and they were created successfully. The sizes of
>>>>>>>>>>>>>> the runs didn't come anywhere near that.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Logs are attached.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 7:19 PM, Mike Carey <dtabass@gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think we might have "file GC issues" - I vaguely remember that we
>>>>>>>>>>>>>>> don't (or at least didn't once upon a time) proactively remove
>>>>>>>>>>>>>>> unnecessary run files - removing all of them at end-of-job instead
>>>>>>>>>>>>>>> of at the end of the execution phase that uses their contents.  We
>>>>>>>>>>>>>>> may also have an "Amdahl problem" right now with our sort since we
>>>>>>>>>>>>>>> serialize phase two of parallel sorts - though this is not a query,
>>>>>>>>>>>>>>> it's index build, so that shouldn't be it.  It would be interesting
>>>>>>>>>>>>>>> to put a df/sleep script on each of the nodes when this is
>>>>>>>>>>>>>>> happening - actually a script that monitors the temp file
>>>>>>>>>>>>>>> directory - and watch the lifecycle happen and the sizes change....
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 8/23/16 2:06 AM, Chris Hillery wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When you get the "disk full" warning, do a quick "df -i" on the
>>>>>>>>>>>>>>>> device - possibly you've run out of inodes even if the space isn't
>>>>>>>>>>>>>>>> all used up. It's unlikely because I don't think AsterixDB creates
>>>>>>>>>>>>>>>> a bunch of small files, but worth checking.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If that's not it, then can you share the full exception and stack
>>>>>>>>>>>>>>>> trace?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ceej
>>>>>>>>>>>>>>>> aka Chris Hillery
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 1:59 AM, Wail Alkowaileet
>>>>>>>>>>>>>>>> <wael.y.k@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I just cleared the hard drives to get 80% free space. I still get
>>>>>>>>>>>>>>>>> the same issue.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The data contains:
>>>>>>>>>>>>>>>>> 1- 2887453794 records.
>>>>>>>>>>>>>>>>> 2- Schema:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> create type CDRType as {
>>>>>>>>>>>>>>>>> id:uuid,
>>>>>>>>>>>>>>>>> 'date':string,
>>>>>>>>>>>>>>>>> 'time':string,
>>>>>>>>>>>>>>>>> 'duration':int64,
>>>>>>>>>>>>>>>>> 'caller':int64,
>>>>>>>>>>>>>>>>> 'callee':int64,
>>>>>>>>>>>>>>>>> location:point?
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Aug 23, 2016 at 9:06 AM, Wail Alkowaileet
>>>>>>>>>>>>>>>>> <wael.y.k@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Dears,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I have a dataset of size 290GB loaded in 3 NCs, each of which
>>>>>>>>>>>>>>>>>> has 2x500GB SSD.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Each NC has two IODevices (partitions) in each hard drive
>>>>>>>>>>>>>>>>>> (i.e. the total is 4 iodevices per NC). After loading the data,
>>>>>>>>>>>>>>>>>> each Asterix partition occupied 31GB.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The cluster has about 50% free space in each hard drive
>>>>>>>>>>>>>>>>>> (approximately 250GB free space in each hard drive).
>>>>>>>>>>>>>>>>>> However, when I tried to create an index of type RTree, I got an
>>>>>>>>>>>>>>>>>> exception saying that no space was left on the hard drive during
>>>>>>>>>>>>>>>>>> the External Sort phase.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is that normal?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> *Regards,*
>>>>>>>>>>>>>>>>>> Wail Alkowaileet
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> *Regards,*
>>>>>>>>>>>>>>>>> Wail Alkowaileet
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *Regards,*
>>>>>>>>>>>>>> Wail Alkowaileet
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>
>>>>>>>>>>>>> *Regards,*
>>>>>>>>>>>>> Wail Alkowaileet
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>> *Regards,*
>>>>>>>>>> Wail Alkowaileet
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Jianfeng Jia
>>>>>>>>> PhD Candidate of Computer Science
>>>>>>>>> University of California, Irvine
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>> --
>>>>
>>>> *Regards,*
>>>> Wail Alkowaileet
>>>>
>>>>
>


-- 

*Regards,*
Wail Alkowaileet
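
The thread above suggests putting a "df/sleep" style script on each node to watch the temp file directory while the index build runs. A minimal sketch of that idea follows, assuming the sort run files sit directly inside one temp directory and keep the "ExternalSortRunGenerator" name prefix reported in the thread. The function names and the directory path are hypothetical, and this is not a tool that ships with AsterixDB. It tracks bytes only; inode usage still needs "df -i" as suggested above.

```python
import os
import shutil
import time

# Prefix of the sort run files named in the thread.
RUN_PREFIX = "ExternalSortRunGenerator"


def summarize_runs(directory, prefix=RUN_PREFIX):
    """Return (file_count, total_bytes) for run files directly inside directory."""
    count = 0
    total = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.startswith(prefix):
                try:
                    total += entry.stat().st_size
                    count += 1
                except FileNotFoundError:
                    # The file was deleted between listing and stat; skip it.
                    pass
    return count, total


def watch(directory, interval_s=10.0, iterations=None):
    """Print run-file totals and free space; loop forever if iterations is None."""
    done = 0
    while iterations is None or done < iterations:
        count, total = summarize_runs(directory)
        free = shutil.disk_usage(directory).free
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} runs={count} run_bytes={total / 2**20:.0f}MiB "
              f"free={free / 2**30:.1f}GiB")
        done += 1
        if iterations is None or done < iterations:
            time.sleep(interval_s)
```

Calling something like watch("/path/to/iodevice/temp-dir", interval_s=5) on each node (the path is a placeholder for wherever the run files actually land) would show whether the number and total size of run files keep growing until the disk fills, or whether files are removed as the sort progresses.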
