asterixdb-users mailing list archives

From Magnus Kongshem <kongs...@online.ntnu.no>
Subject Re: Error loading data
Date Fri, 29 Apr 2016 13:26:30 GMT
Well, it works, but it takes forever. Initializing my collection with 20 GB
of data takes about 1.5 hours. Adding 5 GB of new data to the collection
"never" completes: it had been running for 48 hours and had only managed to
insert a third of the 5 GB. I had to abort it to see what had actually been
inserted.

BG,

Magnus

On Tue, Apr 19, 2016 at 8:01 PM, Magnus Kongshem <kongshem@online.ntnu.no>
wrote:

>
> Ah, I see, should have realized that myself. I will test it first thing in
> the morning.
>
> --
> Best regards,
> Magnus Kongshem
> 41565906
>
> On 19 Apr 2016 19:11, "Ildar Absalyamov" <ildar.absalyamov@gmail.com> wrote:
> >
> > Magnus,
> >
> > Since you are using an autogenerated key, the inserted record should not
> have that field. To do that, you need to change the record format in the
> return clause:
> >
> > insert into dataset posdata(
> >   for $x in dataset posdata_temp return {
> >     "campus": $x.campus,
> >     "building": $x.building,
> >     "floor": $x.floor,
> >     "timestamp": $x.timestamp,
> >     "dayOfWeek": $x.day,
> >     "hourOfDay": $x.hour,
> >     "latitude": $x.latitude,
> >     "salt_timestamp": $x.salt,
> >     "longitude": $x.longitude,
> >     "id": $x.id,
> >     "accuracy": $x.accuracy
> >   }
> > );
> >
> >> On Apr 19, 2016, at 04:54, Magnus Kongshem <kongshem@stud.ntnu.no>
> wrote:
> >>
> >> Your suggestion does not work because it produces duplicate fields.
> >>
> >> Exception: Duplicate field "uuid" encountered [AlgebricksException]
> >>
> >> Any other suggestions? This is a major issue in my view, and as Mike
> Carey said: It should be easy and seamless to add more data to the dataset.
> >>
> >> BG,
> >> Magnus
> >>
> >> On Thu, Apr 14, 2016 at 6:34 PM, Ildar Absalyamov <
> ildar.absalyamov@gmail.com> wrote:
> >>>
> >>> Magnus,
> >>>
> >>> You can still add data to a non-empty dataset via inserts.
> >>> The easiest way to do that, provided you have the data you want to
> insert in files, is to bulk load the data into a new temporary dataset and
> then insert it into the desired dataset:
> >>>
> >>> create dataset posdata_temp(table) primary key uid autogenerated;
> >>> load dataset posdata_temp using localfs
> (("path"="localhost:///data/path/to/file/file.adm,localhost:///data/path/to/file/file2.adm,localhost:///data/path/to/file/file3.adm"),("format"="adm"));
> >>> insert into dataset posdata(
> >>>   for $x in dataset posdata_temp return $x
> >>> );
> >>>
> >>>> On Apr 14, 2016, at 07:41, Magnus Kongshem <kongshem@stud.ntnu.no>
> wrote:
> >>>>
> >>>> Does this mean that adding additional data to an instance and
> dataverse is not supported?
> >>>>
> >>>> Magnus
> >>>>
> >>>> On Wed, Mar 30, 2016 at 8:11 PM, Ian Maxon <imaxon@uci.edu> wrote:
> >>>>>
> >>>>> It should just be a quoted string with commas inside separating the
> URL-ish paths, like so:
> >>>>>
> >>>>> load dataset foo using localfs
> (("path"="localhost:///data/path/to/file/file.adm,localhost:///data/path/to/file/file2.adm,localhost:///data/path/to/file/file3.adm"),("format"="adm"));
> >>>>>
> >>>>> On Wed, Mar 30, 2016 at 6:24 AM, Magnus Kongshem <
> kongshem@stud.ntnu.no> wrote:
> >>>>>>
> >>>>>> Yes I am.
> >>>>>>
> >>>>>> So, will combining the files and running the command once solve
> it, or do I have to input the AQL for each file, as below?
> >>>>>>
> >>>>>> use dataverse bigd;
> >>>>>> load dataset posdata using localfs
> >>>>>> (("path"="localhost:///data/path/to/file/file.adm"),("format"="adm"));
> >>>>>> load dataset posdata using localfs
> >>>>>> (("path"="localhost:///data/path/to/file/file2.adm"),("format"="adm"));
> >>>>>> load dataset posdata using localfs
> >>>>>> (("path"="localhost:///data/path/to/file/file3.adm"),("format"="adm"));
> >>>>>>
> >>>>>>
> >>>>>> BG
> >>>>>> Magnus
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Mar 30, 2016 at 3:21 PM, Wail Alkowaileet <
> wael.y.k@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Are you trying to load each file separately?
> >>>>>>> That, AFAIK, is not supported.
> >>>>>>>
> >>>>>>> On Mar 30, 2016 16:13, "Magnus Kongshem" <kongshem@stud.ntnu.no>
> wrote:
> >>>>>>>>
> >>>>>>>> I will be loading 12 files.
> >>>>>>>>
> >>>>>>>> AQL below:
> >>>>>>>>
> >>>>>>>> use dataverse bigd;
> >>>>>>>> load dataset posdata using localfs
> >>>>>>>>
> (("path"="localhost:///data/path/to/file/file.adm"),("format"="adm"));
> >>>>>>>>
> >>>>>>>> Will it be solved if I concatenate the files and do the dataset
> loading only once?
> >>>>>>>>
> >>>>>>>> Magnus
> >>>>>>>>
> >>>>>>>> On Wed, Mar 30, 2016 at 3:06 PM, Wail Alkowaileet <
> wael.y.k@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> How many files are you loading?
> >>>>>>>>> Can you send the loading AQL?
> >>>>>>>>>
> >>>>>>>>> On Mar 30, 2016 16:01, "Magnus Kongshem" <kongshem@stud.ntnu.no>
> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Using asterixdb v0.8.8.
> >>>>>>>>>>
> >>>>>>>>>> I am loading data into my asterixDB instance.
> >>>>>>>>>>
> >>>>>>>>>> Loading the first file is successful, but when I try to load
> another file, I get an "Internal error. Please check instance logs for
> further details. [NullPointerException]"
> >>>>>>>>>>
> >>>>>>>>>> The files are of type ADM and roughly equal in size (about 3
> GB each).
> >>>>>>>>>>
> >>>>>>>>>> My instance was initialized with these commands:
> >>>>>>>>>>
> >>>>>>>>>> drop dataverse bigd if exists;
> >>>>>>>>>> create dataverse bigd;
> >>>>>>>>>> use dataverse bigd;
> >>>>>>>>>>
> >>>>>>>>>> create type table as open {
> >>>>>>>>>>     uid: uuid,
> >>>>>>>>>>     campus: string,
> >>>>>>>>>>     building: string,
> >>>>>>>>>>     floor: string,
> >>>>>>>>>>     timestamp: int32,
> >>>>>>>>>>     dayOfWeek: int32,
> >>>>>>>>>>     hourOfDay: int32,
> >>>>>>>>>>     latitude: double,
> >>>>>>>>>>     salt_timestamp: int32,
> >>>>>>>>>>     longitude: double,
> >>>>>>>>>>     id: string,
> >>>>>>>>>>     accuracy: double
> >>>>>>>>>> };
> >>>>>>>>>> create dataset posdata(table)
> >>>>>>>>>>     primary key uid autogenerated;
> >>>>>>>>>> create index stamp on posdata(timestamp);
> >>>>>>>>>> create index hour on posdata(hourOfDay);
> >>>>>>>>>> create index day on posdata(dayOfWeek);
> >>>>>>>>>>
> >>>>>>>>>> My log file is attached.
> >>>>>>>>>>
> >>>>>>>>>> Any help?
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>>
> >>>>>>>>>> Best regards,
> >>>>>>>>>>
> >>>>>>>>>> Magnus Kongshem
> >>>>>>>>>>
> >>>>>>>>>> NTNU
> >>>>>>>>>> +47 415 65 906
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>>
> >>>>>>>> Best regards,
> >>>>>>>>
> >>>>>>>> Magnus Alderslyst Kongshem
> >>>>>>>> Head of the Senior Committee
> >>>>>>>> Online, linjeforeningen for informatikk
> >>>>>>>> +47 415 65 906
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>>
> >>>>>> Best regards,
> >>>>>>
> >>>>>> Magnus Alderslyst Kongshem
> >>>>>> Head of the Senior Committee
> >>>>>> Online, linjeforeningen for informatikk
> >>>>>> +47 415 65 906
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>>
> >>>> Best regards,
> >>>>
> >>>> Magnus Alderslyst Kongshem
> >>>> Head of the Senior Committee
> >>>> Online, linjeforeningen for informatikk
> >>>> +47 415 65 906
> >>>
> >>>
> >>> Best regards,
> >>> Ildar
> >>>
> >>
> >>
> >>
> >> --
> >>
> >> Best regards,
> >>
> >> Magnus Alderslyst Kongshem
> >> Head of the Senior Committee
> >> Online, linjeforeningen for informatikk
> >> +47 415 65 906
> >
> >
> > Best regards,
> > Ildar
> >
>



-- 

Best regards,

Magnus Alderslyst Kongshem
The Senior Committee
Online, linjeforeningen for informatikk
+47 415 65 906
