asterixdb-users mailing list archives

From Ian Maxon <ima...@uci.edu>
Subject Re: Few Questions
Date Tue, 22 Sep 2015 20:12:38 GMT
Hi Namrata,
Let me try and address your questions inline...

> Can I insert more data into a dataset by importing a .adm file? In the documentation I
> saw the insert statement but I wanted to know if I could insert it using a file and the syntax
> for it. I tried to do it but I got syntax errors.

Yes, but it's a little more complex than that. The 'load dataset
Foo...' syntax performs a bulk load, which can only be done once, on
an empty dataset. We do it this way because building a BTree from a
large run of sorted data is very fast compared to doing individual
inserts, with the caveat that the tree must be empty. This is more of
a technical detail, however. I believe the eventual hope is to have
'load' work on previously loaded datasets, but it will just be slower
because it cannot use the aforementioned trick.

The way to do this today is to create an external dataset (e.g.
create external dataset Bar using localfs...) on the new data, and
then insert every new record from that into the existing dataset (e.g.
insert into dataset Foo (for $x in dataset Bar return $x) ). However,
as a word of caution, this may not work very well with the default
parameters, as they aren't tuned for ingesting data. The best thing to
do is to increase the in-memory component budget and max heap size in
asterix-configuration.xml before trying this
(storage.memorycomponent.numpages and
storage.memorycomponent.pagesize, as well as the -Xmx parameter in
nc.java.opts).
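
As a rough sketch of those two steps in AQL: the record type name
FooType, the host, and the file path below are placeholders I made up,
and the dataverse is assumed to be selected already. Adjust them to
match your setup.

```aql
/* Hypothetical names: FooType is assumed to be the record type that Foo
   was created with. The host and path are placeholders for the node
   controller and location that actually hold the new .adm file. */
create external dataset Bar(FooType) using localfs
(("path"="127.0.0.1:///path/to/new_data.adm"),("format"="adm"));

/* Copy every record from the external dataset into the existing one. */
insert into dataset Foo (for $x in dataset Bar return $x);
```

Once the insert has finished, Bar can be removed with 'drop dataset
Bar;' since it only points at the file and holds no data of its own.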

> I would like to know if there is a way I can increase the amount of data Hyracks can handle.
> Currently I’m using version 0.8.6. The error I get when I try to load my data is: Field
> accessor is not defined for values of type null [AlgebricksException].  In 0.8.7 Snapshot
> the error is: Unable to allocate frame larger than:255 bytes [HyracksDataException]. I am
> sure that the error are because of the size of array-based data because when I deleted a lot
> of values the load operation works. If anyone is interested in looking at the data, here’s
> a sample record: https://drive.google.com/file/d/0B6wmo4_-H0P2UUhJcFpVdWdZT1k/view?usp=sharing.
> I saw the conf/asterix-configuration.xml file and was wondering if changing some values would
> make a difference.

The things you are seeing are unfortunately bugs. Given the type of
data, I am not surprised by that behavior in version 0.8.6, because
back then there were many limitations regarding large records/objects
that have since been corrected.
On the new version, however, I am surprised that you got the error
that you did. I will try out the sample data to see what the issue
might be, but in general Hyracks now has support for transporting
records which are larger than the default frame size. The main
limitations we have come across lately in this area have come from the
object model (e.g. 65k limit on string size) or from the storage layer
(objects cannot be bigger than half a page).

> What is a field accessor? I couldn’t understand what the error “Field accessor is not
> defined for values of type null” means. Any idea why it was triggered? I am assuming that
> it has something to do with the size of data only because I didn’t get the error on reducing
> it. But from the wording I couldn’t really understand it.

A field accessor is basically what deserializes the data from bytes
within a frame (which is all Hyracks sees) to whatever AsterixDB needs
to see (String, int, double, collection, etc.). However, this is an
implementation detail; at a user level it shouldn't be of concern.

Hopefully that helps! If anything is unclear please feel free to ask.
Also, if you'd like to Skype or chat over IRC to discuss this in
closer to real time, I'm available for that as well.

Thanks,
-Ian

On Tue, Sep 22, 2015 at 11:26 AM, Malarout, Namrata (398M-Affiliate)
<Namrata.Malarout@jpl.nasa.gov> wrote:
> Hi,
> I have a few questions:
>
> Can I insert more data into a dataset by importing a .adm file? In the
> documentation I saw the insert statement but I wanted to know if I could
> insert it using a file and the syntax for it. I tried to do it but I got
> syntax errors.
> I would like to know if there is a way I can increase the amount of data
> Hyracks can handle. Currently I’m using version 0.8.6. The error I get when
> I try to load my data is: Field accessor is not defined for values of type
> null [AlgebricksException].  In 0.8.7 Snapshot the error is: Unable to
> allocate frame larger than:255 bytes [HyracksDataException]. I am sure that
> the error are because of the size of array-based data because when I deleted
> a lot of values the load operation works. If anyone is interested in looking
> at the data, here’s a sample record:
> https://drive.google.com/file/d/0B6wmo4_-H0P2UUhJcFpVdWdZT1k/view?usp=sharing.
> I saw the conf/asterix-configuration.xml file and was wondering if changing
> some values would make a difference.
> What is a field accessor? I couldn’t understand what the error “Field
> accessor is not defined for values of type null” means. Any idea why it was
> triggered? I am assuming that it has something to do with the size of data
> only because I didn’t get the error on reducing it. But from the wording I
> couldn’t really understand it.
>
> Thanks for the help.
> Regards,
> Namrata
>
