asterixdb-users mailing list archives

From Michael Carey <mjca...@ics.uci.edu>
Subject Re: Internal error [NegativeArraySizeException]
Date Fri, 09 Oct 2015 06:27:56 GMT
One thought might be for a few of us in AsterixDB-land to make a road 
trip up to JPL and give an overview talk to any/all interested data 
folks there - and then get in a conference room with you and your mentor 
and take a more top-down and in-person look at what you're wanting to do 
(especially for the largeness-inducing array fields)?

On 10/7/15 2:29 PM, Ian Maxon wrote:
> Hm, I see, that is interesting. What is the 'mask' field then? It 
> looked like some sort of array from my first glance, but I bet it's 
> more than that. Is it something that could be split up in some way? 
> One thought is to have the metadata in one dataset, and the masks that 
> are split in some way in another.
>
> -Ian
>
> On Wed, Oct 7, 2015 at 2:16 PM, Malarout, Namrata (398M-Affiliate) 
> <Namrata.Malarout@jpl.nasa.gov <mailto:Namrata.Malarout@jpl.nasa.gov>> 
> wrote:
>
>     Hi Ian,
>
>     Thanks for getting back about this so quickly. The data I provided
>     was a subset of the records that we have. Similar to mask, we have
>     about 4 or 5 other fields which are even bigger. Unfortunately we
>     can't filter them out. The data that you see after filtering
>     out mask is just the metadata of the file. I have ingested just
>     the metadata when I was familiarizing myself with AsterixDB and as
>     you said, it works just fine. But, the actual data on which we
>     will be querying is stored in these large objects.
>
>     Regards,
>     Namrata
>     ------------------------------------------------------------------------
>     *From:* Ian Maxon [imaxon@uci.edu <mailto:imaxon@uci.edu>]
>     *Sent:* Tuesday, October 06, 2015 7:36 PM
>     *To:* users@asterixdb.incubator.apache.org
>     <mailto:users@asterixdb.incubator.apache.org>
>     *Subject:* Re: Internal error [NegativeArraySizeException]
>
>     Hi Namrata,
>     First, I think the behavior you are experiencing is a bug, so
>     we'll look into that. The load fails because each row is really
>     large, about 3MB, and somehow the sort operator doesn't deal with
>     this well.
>     However it may be good that we ran into this, because, while huge
>     objects like this should eventually be handled more gracefully in
>     AsterixDB, they're viewed as being exceptional rather than the
>     norm. Hence the performance will not be as good when these types
>     of big objects/fields are accessed while mixed in with
>     comparatively tiny data.
>     The field I see taking up almost all of the space in the object is
>     the "mask" field. Is this something that is actually needed? Or
>     can it be filtered/projected out?
>
>     I've attached a version of the sample data where I cut out the
>     "mask" field; this one seems to load in just fine using the
>     provided DDL.
>
>     new_nomask.adm
>     <https://drive.google.com/a/uci.edu/file/d/0B9fobkjZFASia2xaZ054T25nUFU/view?usp=drive_web>
>
>     Thanks,
>     -Ian
>
>     On Tue, Oct 6, 2015 at 10:37 AM, Ian Maxon <imaxon@uci.edu
>     <mailto:imaxon@uci.edu>> wrote:
>     > Awesome, Thanks Namrata. I'll give this a close look later today.
>     >
>     > -Ian
>     >
>     > On Tue, Oct 6, 2015 at 10:24 AM, Malarout, Namrata (398M-Affiliate)
>     > <Namrata.Malarout@jpl.nasa.gov
>     <mailto:Namrata.Malarout@jpl.nasa.gov>> wrote:
>     >> Hi Ian,
>     >> I just realized I didn't provide the DDL. Sorry about that.
>     I've kept it
>     >> really simple:
>     >>
>     >> drop dataverse TestL4 if exists;
>     >> create dataverse TestL4;
>     >> use dataverse TestL4;
>     >>
>     >>
>     >> create type GlobL4Type as open {
>     >> fid: string,
>     >> }
>     >>
>     >>
>     >> create dataset GlobL4(GlobL4Type)
>     >> primary key fid;
>     >>
>     >> Please let me know if you have any questions.
>     >> Thanks,
>     >> Namrata
>     >>
>     >>
>     >>
>     >> On 10/1/15, 5:33 PM, "Ian Maxon" <imaxon@uci.edu
>     <mailto:imaxon@uci.edu>> wrote:
>     >>
>     >>>P.S., if you have the data/DDL/so on that caused this error to
>     happen,
>     >>>I can try to reproduce here locally if the exception/logs may have
>     >>>gotten lost somewhere.
>     >>>
>     >>>-Ian
>     >>>
>     >>>On Thu, Oct 1, 2015 at 5:19 PM, Ian Maxon <imaxon@uci.edu
>     <mailto:imaxon@uci.edu>> wrote:
>     >>>> Hey Namrata,
>     >>>> Those logs are not logs in the diagnostic sense, but rather
>     >>>> write-ahead logs, so a log of the transactions that are
>     occurring in
>     >>>> the instance. If you were using the single-machine package I
>     gave you,
>     >>>> the error's stack trace should actually be on the console.
>     >>>>
>     >>>> Thanks,
>     >>>> -Ian
>     >>>>
>     >>>> On Thu, Oct 1, 2015 at 5:13 PM, Malarout, Namrata
>     (398M-Affiliate)
>     >>>> <Namrata.Malarout@jpl.nasa.gov
>     <mailto:Namrata.Malarout@jpl.nasa.gov>> wrote:
>     >>>>> Hi,
>     >>>>> I got an error while trying to ingest data.
>     >>>>> Internal error. Please check instance logs for further details.
>     >>>>> [NegativeArraySizeException]
>     >>>>>
>     >>>>> I've attached the logs. When I open them it's unreadable.
>     The logs in
>     >>>>> ClusterControllerService are empty (screenshot attached).
>     >>>>>
>     >>>>> I have had errors when I was using version 0.8.6 ingesting
>     data due to
>     >>>>>the
>     >>>>> size of the data. Has anyone encountered this error before?
>     >>>>> Thanks in advance for the help.
>     >>>>>
>     >>>>> Regards,
>     >>>>> Namrata
>     >>
>
>
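
Ian's suggestion in the thread above (metadata in one dataset, the masks split up in another) could be prototyped as a preprocessing step before loading. The following is only a minimal sketch under stated assumptions: each record has already been parsed into a Python dict, "mask" is a flat array, and the names split_record, chunk_id, seq, and chunk_size are hypothetical, made up for illustration. It does not use any AsterixDB API, and it does not address the NegativeArraySizeException itself.

```python
def split_record(record, big_fields=("mask",), key="fid", chunk_size=1000):
    """Split one record into a small metadata record plus chunk records.

    Assumptions (not from the thread): every field named in big_fields is a
    flat list, and the primary key field is present in the record.
    """
    # Everything except the large fields stays in the metadata record.
    metadata = {k: v for k, v in record.items() if k not in big_fields}

    chunks = []
    for field in big_fields:
        values = record.get(field)
        if values is None:
            continue
        # Cut the large array into fixed-size slices, each keyed back to
        # the parent record so the pieces can be joined again at query time.
        for seq, start in enumerate(range(0, len(values), chunk_size)):
            chunks.append({
                "chunk_id": f"{record[key]}:{field}:{seq}",
                key: record[key],
                "field": field,
                "seq": seq,
                "values": values[start:start + chunk_size],
            })
    return metadata, chunks


if __name__ == "__main__":
    sample = {"fid": "f1", "name": "example", "mask": list(range(2500))}
    meta, parts = split_record(sample)
    print(meta)
    print(len(parts), [len(p["values"]) for p in parts])
```

The metadata records would go into a dataset like the GlobL4 one from the DDL above, and the chunk records into a second dataset keyed on chunk_id. Whether a chunk size exists that keeps rows small enough for the sort operator is something that would have to be tested against the real data.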

