asterixdb-users mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Error loading data
Date Fri, 29 Apr 2016 13:40:05 GMT
That's bizarre ... not at all what others see for load behavior. I wonder
what could be going wrong here...?
On Apr 29, 2016 6:27 AM, "Magnus Kongshem" <kongshem@online.ntnu.no> wrote:

> Well, it works, but it takes forever. Initializing my collection with 20
> GB of data takes about 1.5 hours. Adding 5 GB of new data to the collection
> "never" completes. It had been running for 48 hours and had only managed to
> insert 1/3 of the 5 GB. I had to abort it to see what had actually been
> inserted.
>
> BG,
>
> Magnus
>
> On Tue, Apr 19, 2016 at 8:01 PM, Magnus Kongshem <kongshem@online.ntnu.no>
> wrote:
>
>>
>> Ah, I see, should have realized that myself. I will test it first thing
>> in the morning.
>>
>> --
>> Best regards,
>> Magnus Kongshem
>> 41565906
>>
>> On Apr 19, 2016, at 19:11, "Ildar Absalyamov" <ildar.absalyamov@gmail.com> wrote:
>> >
>> > Magnus,
>> >
>> > Since you are using an autogenerated key, the inserted record should not
>> have that field. To do that you need to change the record constructor in the
>> return clause:
>> >
>> > insert into dataset posdata(
>> >   for $x in dataset posdata_temp return {
>> >     "campus": $x.campus,
>> >     "building": $x.building,
>> >     "floor": $x.floor,
>> >     "timestamp": $x.timestamp,
>> >     "dayOfWeek": $x.dayOfWeek,
>> >     "hourOfDay": $x.hourOfDay,
>> >     "latitude": $x.latitude,
>> >     "salt_timestamp": $x.salt_timestamp,
>> >     "longitude": $x.longitude,
>> >     "id": $x.id,
>> >     "accuracy": $x.accuracy
>> >   }
>> > )
>> >
>> >> On Apr 19, 2016, at 04:54, Magnus Kongshem <kongshem@stud.ntnu.no>
>> wrote:
>> >>
>> >> Your suggestion does not work because you get duplicate fields.
>> >>
>> >> Exception: Duplicate field "uuid" encountered [AlgebricksException]
>> >>
>> >> Any other suggestions? This is a major issue in my view, and as Mike
>> Carey said: It should be easy and seamless to add more data to the dataset.
>> >>
>> >> BG,
>> >> Magnus
>> >>
>> >> On Thu, Apr 14, 2016 at 6:34 PM, Ildar Absalyamov <
>> ildar.absalyamov@gmail.com> wrote:
>> >>>
>> >>> Magnus,
>> >>>
>> >>> You can still add data to a non-empty dataset via inserts.
>> >>> The easiest way to do that, granted you have the data you want to
>> insert in files, is to bulk load the data into a new temp dataset and then
>> insert it into the desired dataset:
>> >>>
>> >>> create dataset posdata_temp(table) primary key uid autogenerated;
>> >>> load dataset posdata_temp using localfs
>> (("path"="localhost:///data/path/to/file/file.adm,localhost:///data/path/to/file/file2.adm,localhost:///data/path/to/file/file3.adm"),("format"="adm"));
>> >>> insert into dataset posdata(
>> >>>   for $x in dataset posdata_temp return $x
>> >>> )
>> >>>
>> >>>> On Apr 14, 2016, at 07:41, Magnus Kongshem <kongshem@stud.ntnu.no>
>> wrote:
>> >>>>
>> >>>> Does this mean that adding additional data to an instance and
>> dataverse is not supported?
>> >>>>
>> >>>> Magnus
>> >>>>
>> >>>> On Wed, Mar 30, 2016 at 8:11 PM, Ian Maxon <imaxon@uci.edu>
wrote:
>> >>>>>
>> >>>>> It should just be a quoted string with commas inside separating the
>> URL-ish paths, like:
>> >>>>>
>> >>>>> load dataset foo using localfs
>> (("path"="localhost:///data/path/to/file/file.adm,localhost:///data/path/to/file/file2.adm,localhost:///data/path/to/file/file3.adm"),("format"="adm"));
>> >>>>>
>> >>>>> On Wed, Mar 30, 2016 at 6:24 AM, Magnus Kongshem <
>> kongshem@stud.ntnu.no> wrote:
>> >>>>>>
>> >>>>>> Yes I am.
>> >>>>>>
>> >>>>>> So, combining the files and running the command once will solve
>> it, or do I have to input the AQL for each file like below?
>> >>>>>>
>> >>>>>> use dataverse bigd;
>> >>>>>> load dataset posdata using localfs
>> >>>>>>
>> (("path"="localhost:///data/path/to/file/file.adm"),("format"="adm"));
>> >>>>>>
>> (("path"="localhost:///data/path/to/file/file2.adm"),("format"="adm"));
>> >>>>>>
>> (("path"="localhost:///data/path/to/file/file3.adm"),("format"="adm"));
>> >>>>>>
>> >>>>>>
>> >>>>>> BG
>> >>>>>> Magnus
>> >>>>>>
>> >>>>>>
>> >>>>>> On Wed, Mar 30, 2016 at 3:21 PM, Wail Alkowaileet <
>> wael.y.k@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>> Are you trying to load each file separately?
>> >>>>>>> That AFAIK is not supported.
>> >>>>>>>
>> >>>>>>> On Mar 30, 2016 16:13, "Magnus Kongshem" <kongshem@stud.ntnu.no>
>> wrote:
>> >>>>>>>>
>> >>>>>>>> I will be loading 12 files.
>> >>>>>>>>
>> >>>>>>>> AQL below:
>> >>>>>>>>
>> >>>>>>>> use dataverse bigd;
>> >>>>>>>> load dataset posdata using localfs
>> >>>>>>>>
>> (("path"="localhost:///data/path/to/file/file.adm"),("format"="adm"));
>> >>>>>>>>
>> >>>>>>>> Will it be solved if I concatenate the files and do the dataset
>> loading only once?
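Assuming each .adm file is a newline-terminated sequence of ADM records (the file names below are placeholders for illustration), the concatenation itself is a plain shell one-liner:

```shell
# Combine several newline-terminated ADM files into one so that a
# single `load dataset` statement can ingest the result.
# File names are placeholders for illustration.
cat file.adm file2.adm file3.adm > combined.adm
```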
>> >>>>>>>>
>> >>>>>>>> Magnus
>> >>>>>>>>
>> >>>>>>>> On Wed, Mar 30, 2016 at 3:06 PM, Wail Alkowaileet
<
>> wael.y.k@gmail.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>> How many files are you loading?
>> >>>>>>>>> Can you send the loading AQL?
>> >>>>>>>>>
>> >>>>>>>>> On Mar 30, 2016 16:01, "Magnus Kongshem" <kongshem@stud.ntnu.no>
>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> Using asterixdb v0.8.8.
>> >>>>>>>>>>
>> >>>>>>>>>> I am loading data into my asterixDB instance.
>> >>>>>>>>>>
>> >>>>>>>>>> Loading the first file is successful. But when I try to load
>> another file, I get an "Internal error. Please check instance logs for
>> further details. [NullPointerException]"
>> >>>>>>>>>>
>> >>>>>>>>>> The files are of type ADM and roughly equal in size (3 GB).
>> >>>>>>>>>>
>> >>>>>>>>>> My instance was initialized with these commands:
>> >>>>>>>>>>
>> >>>>>>>>>> drop dataverse bigd if exists;
>> >>>>>>>>>> create dataverse bigd;
>> >>>>>>>>>> use dataverse bigd;
>> >>>>>>>>>>
>> >>>>>>>>>> create type table as open {
>> >>>>>>>>>>     uid: uuid,
>> >>>>>>>>>>     campus: string,
>> >>>>>>>>>>     building: string,
>> >>>>>>>>>>     floor: string,
>> >>>>>>>>>>     timestamp: int32,
>> >>>>>>>>>>     dayOfWeek: int32,
>> >>>>>>>>>>     hourOfDay: int32,
>> >>>>>>>>>>     latitude: double,
>> >>>>>>>>>>     salt_timestamp: int32,
>> >>>>>>>>>>     longitude: double,
>> >>>>>>>>>>     id: string,
>> >>>>>>>>>>     accuracy: double
>> >>>>>>>>>> };
>> >>>>>>>>>> create dataset posdata(table) primary key uid autogenerated;
>> >>>>>>>>>> create index stamp on posdata(timestamp);
>> >>>>>>>>>> create index hour on posdata(hourOfDay);
>> >>>>>>>>>> create index day on posdata(dayOfWeek);
>> >>>>>>>>>>
>> >>>>>>>>>> My log file is attached.
>> >>>>>>>>>>
>> >>>>>>>>>> Any help?
>> >>>>>>>>>>
>> >>>>>>>>>> --
>> >>>>>>>>>>
>> >>>>>>>>>> Best regards,
>> >>>>>>>>>>
>> >>>>>>>>>> Magnus Kongshem
>> >>>>>>>>>>
>> >>>>>>>>>> NTNU
>> >>>>>>>>>> +47 415 65 906
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>>
>> >>>>>>>> Best regards,
>> >>>>>>>>
>> >>>>>>>> Magnus Alderslyst Kongshem
>> >>>>>>>> Head of the senior committee
>> >>>>>>>> Online, linjeforeningen for informatikk
>> >>>>>>>> +47 415 65 906
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>>
>> >>>>>> Best regards,
>> >>>>>>
>> >>>>>> Magnus Alderslyst Kongshem
>> >>>>>> Head of the senior committee
>> >>>>>> Online, linjeforeningen for informatikk
>> >>>>>> +47 415 65 906
>> >>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>>
>> >>>> Best regards,
>> >>>>
>> >>>> Magnus Alderslyst Kongshem
>> >>>> Head of the senior committee
>> >>>> Online, linjeforeningen for informatikk
>> >>>> +47 415 65 906
>> >>>
>> >>>
>> >>> Best regards,
>> >>> Ildar
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >>
>> >> Best regards,
>> >>
>> >> Magnus Alderslyst Kongshem
>> >> Head of the senior committee
>> >> Online, linjeforeningen for informatikk
>> >> +47 415 65 906
>> >
>> >
>> > Best regards,
>> > Ildar
>> >
>>
>
>
>
> --
>
> Best regards,
>
> Magnus Alderslyst Kongshem
> The senior committee
> Online, linjeforeningen for informatikk
> +47 415 65 906
>
