asterixdb-users mailing list archives

From "Malarout, Namrata (398M-Affiliate)" <Namrata.Malar...@jpl.nasa.gov>
Subject Re: Few Questions
Date Tue, 22 Sep 2015 22:26:36 GMT
Hi Ian,
Thanks for your answers. The details you provided helped me understand
things better. I would like to know when the Beta version of 0.8.7 will be
available.
Thanks,
Namrata

On 9/22/15, 1:12 PM, "Ian Maxon" <imaxon@uci.edu> wrote:

>Hi Namrata,
>Let me try and address your questions inline...
>
>>Can I insert more data into a dataset by importing a .adm file? In the
>>documentation I saw the insert statement but I wanted to know if I could
>>insert it using a file and the syntax for it. I tried to do it but I got
>>syntax errors.
>Yes, but it's a little more complex than that. The 'load dataset
>Foo...' syntax performs a bulk load, which can only be done once, on an
>empty dataset. We do this because building a BTree from a large run of
>sorted data is much faster than doing individual inserts, with the
>caveat that the tree must be empty. This is more of a technical detail,
>however. I believe the eventual hope is to have 'load' work on
>previously loaded datasets, but it will be slower because it cannot
>use the aforementioned trick.
>
>The way to do this today is to create an external dataset (e.g.
>create external dataset Bar using localfs...) on the new data, and
>then insert every new record from it into the existing dataset (e.g.
>insert into dataset Foo (for $x in dataset Bar return $x) ). However,
>as a word of caution, this may not work very well with the default
>parameters, as they aren't tuned for ingesting data. The best thing to
>do is to increase the in-memory component budget and max heap size in
>asterix-configuration.xml before trying this
>(storage.memorycomponent.numpages and
>storage.memorycomponent.pagesize, as well as the -Xmx parameter in
>nc.java.opts).
>
>>I would like to know if there is a way I can increase the amount of data
>>Hyracks can handle. Currently I'm using version 0.8.6. The error I get
>>when I try to load my data is: Field accessor is not defined for values
>>of type null [AlgebricksException]. In 0.8.7 Snapshot the error is:
>>Unable to allocate frame larger than:255 bytes [HyracksDataException]. I
>>am sure that the errors are because of the size of the array-based data,
>>because when I deleted a lot of values the load operation worked. If
>>anyone is interested in looking at the data, here's a sample record:
>>https://drive.google.com/file/d/0B6wmo4_-H0P2UUhJcFpVdWdZT1k/view?usp=sharing.
>>I saw the conf/asterix-configuration.xml file and was wondering if
>>changing some values would make a difference.
>
>The things you are seeing are unfortunately bugs. Given the type of
>data, in version 0.8.6 I am not surprised by that behavior, because
>back then there were many limitations regarding large records/objects
>that have since been corrected.
>However, on the new version, I am surprised that you got the error that
>you did. I will try out the sample data and try to see what the issue
>might be, but in general Hyracks now has support for transporting
>records which are larger than the default frame size. The main
>limitations we have come across lately in this area have been from the
>object model (e.g. 65k limit on string size) or from the storage layer
>(objects cannot be bigger than half a page).
>
>>What is a field accessor? I couldn't understand what the error "Field
>>accessor is not defined for values of type null" means. Any idea why it
>>was triggered? I am assuming that it has something to do with the size
>>of data only because I didn't get the error on reducing it. But from the
>>wording I couldn't really understand it.
>
>A field accessor is basically what deserializes the data from bytes
>within a frame (which is all Hyracks sees) to whatever AsterixDB needs
>to see (String, int, double, collection, etc.). However, this is an
>implementation detail; at a user level it shouldn't be of concern.
>
>Hopefully that helps! If anything is unclear please feel free to ask.
>Also, if you'd like to Skype or chat over IRC to discuss this in real
>time, I'm available for that as well.
>
>Thanks,
>-Ian
>
>On Tue, Sep 22, 2015 at 11:26 AM, Malarout, Namrata (398M-Affiliate)
><Namrata.Malarout@jpl.nasa.gov> wrote:
>> Hi,
>> I have a few questions:
>>
>> Can I insert more data into a dataset by importing a .adm file? In the
>> documentation I saw the insert statement but I wanted to know if I could
>> insert it using a file and the syntax for it. I tried to do it but I got
>> syntax errors.
>>
>> I would like to know if there is a way I can increase the amount of data
>> Hyracks can handle. Currently I'm using version 0.8.6. The error I get
>> when I try to load my data is: Field accessor is not defined for values
>> of type null [AlgebricksException]. In 0.8.7 Snapshot the error is:
>> Unable to allocate frame larger than:255 bytes [HyracksDataException]. I
>> am sure that the errors are because of the size of the array-based data,
>> because when I deleted a lot of values the load operation worked. If
>> anyone is interested in looking at the data, here's a sample record:
>> https://drive.google.com/file/d/0B6wmo4_-H0P2UUhJcFpVdWdZT1k/view?usp=sharing.
>> I saw the conf/asterix-configuration.xml file and was wondering if
>> changing some values would make a difference.
>>
>> What is a field accessor? I couldn't understand what the error "Field
>> accessor is not defined for values of type null" means. Any idea why it
>> was triggered? I am assuming that it has something to do with the size
>> of data only because I didn't get the error on reducing it. But from the
>> wording I couldn't really understand it.
>>
>> Thanks for the help.
>> Regards,
>> Namrata
>>
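
The workaround Ian describes in his reply (an external dataset over the new file, followed by an insert into the existing dataset) might look roughly like the AQL below. This is only a sketch under assumptions: Foo and Bar are the dataset names from Ian's example, while the dataverse name MyDataverse, the type name FooType, and the file path are hypothetical placeholders that do not appear in the thread. It also assumes that Foo was created with FooType and that the .adm file is readable on the node at 127.0.0.1.

```aql
/* Hypothetical names: MyDataverse, FooType and the path are placeholders. */
use dataverse MyDataverse;

/* Expose the new .adm file as an external dataset with the same type as Foo. */
create external dataset Bar(FooType) using localfs
(("path"="127.0.0.1:///path/to/new_data.adm"),("format"="adm"));

/* Append every record of Bar to the already-loaded dataset Foo. */
insert into dataset Foo (for $x in dataset Bar return $x);
```

Once the insert has finished, the external dataset is no longer needed and can be removed with "drop dataset Bar;".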

