asterixdb-dev mailing list archives

From mingda li <limingda1...@gmail.com>
Subject Re: Load HDFS data to AsterixDB
Date Wed, 16 Nov 2016 19:15:42 GMT
Yeah, do you think it's easy to compile against 1.0.4? I'd prefer that
option if it's easy. Otherwise I can load the data from HDFS to local
storage and transfer it to AsterixDB. (We are currently experimenting with
the efficiency of multi-way joins in different orders, with test data
ranging from 1 GB to 1000 GB, so loading it locally takes time.)

Thanks

On Wed, Nov 16, 2016 at 10:55 AM, abdullah alamoudi <bamousaa@gmail.com>
wrote:

> Mingda,
> I think you're right. Currently, the adapter only supports 2.2.0 going
> forward. Hadoop 2.2.0 introduced fundamental changes that made it
> difficult to support versions both before and after it, so we
> collectively decided to support only 2.2.0+.
>
> The choices I see for you, Mingda, are:
> 1. Migrate to newer Hadoop version.
> 2. Compile AsterixDB against 1.0.4.
>
> The second option might require small code changes.
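> For reference, the version change itself is usually just an override in
> the Maven build; a rough sketch, assuming the parent pom exposes a
> hadoop.version property (the actual property name in the AsterixDB pom
> may differ, and the older client may still require code changes):
>
> ```xml
> <!-- Hypothetical override; check the AsterixDB parent pom for the real
>      property name before relying on this. -->
> <properties>
>   <hadoop.version>1.0.4</hadoop.version>
> </properties>
> ```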
>
> Hope this helps,
> ~Abdullah.
>
> On Wed, Nov 16, 2016 at 10:02 AM, mingda li <limingda1993@gmail.com>
> wrote:
>
> > @ Abdullah,
> > I checked the versions of HDFS and Hadoop; they are both 1.0.4 (they
> > come together). So maybe the error is caused by the mismatch with the
> > adapter?
> > @Till,
> > Sure, I have filed an issue at
> > https://issues.apache.org/jira/browse/ASTERIXDB-1735
> > Bests,
> > Mingda
> >
> > On Tue, Nov 15, 2016 at 11:08 PM, Till Westmann <tillw@apache.org>
> wrote:
> >
> > > Hi Mingda,
> > >
> > > It would be good to have an issue with the different scenarios and the
> > > different exceptions that you received. I think that the aim of the
> > > issue should be to report useful errors that indicate the different
> > > error conditions (version mismatch, missing parameter, something
> > > else?).
> > >
> > > Could you file that?
> > >
> > > Cheers,
> > > Till
> > >
> > >
> > > On 15 Nov 2016, at 21:32, abdullah alamoudi wrote:
> > >
> > > Mingda,
> > >> The issue is caused by a Hadoop version mismatch between the running
> > >> HDFS instance and the version the adapter was compiled against.
> > >> Can you get version information about the running HDFS instance?
> > >>
> > >> Cheers,
> > >> Abdullah.
> > >>
> > >> On Tue, Nov 15, 2016 at 9:27 PM, mingda li <limingda1993@gmail.com>
> > >> wrote:
> > >>
> > >> Hi Abdullah,
> > >>>
> > >>> Thanks for your quick reply. Actually, I already tried adding
> > >>> ("input-format"="text-input-format"), and the error becomes:
> > >>>
> > >>> Message
> > >>>
> > >>> Internal error. Please check instance logs for further details.
> > >>> [EOFException]
> > >>>
> > >>> The log file for it is:
> > >>>
> > >>> SEVERE: Unable to create adapter
> > >>>
> > >>> org.apache.hyracks.algebricks.common.exceptions.AlgebricksException: Unable to create adapter
> > >>>   at org.apache.asterix.metadata.declared.AqlMetadataProvider.getConfiguredAdapterFactory(AqlMetadataProvider.java:990)
> > >>>   at org.apache.asterix.metadata.declared.LoadableDataSource.buildDatasourceScanRuntime(LoadableDataSource.java:141)
> > >>> Caused by: org.apache.asterix.common.exceptions.AsterixException: java.io.IOException: Failed on local exception: java.io.EOFException; Host Details : local host is: "SCAI01/131.179.64.20"; destination host is: "SCAI01.CS.UCLA.EDU":9000;
> > >>>   at org.apache.asterix.external.input.HDFSDataSourceFactory.configure(HDFSDataSourceFactory.java:112)
> > >>>   at org.apache.asterix.external.adapter.factory.GenericAdapterFactory.configure(GenericAdapterFactory.java:139)
> > >>> Caused by: java.io.IOException: Failed on local exception: java.io.EOFException; Host Details : local host is: "SCAI01/131.179.64.20"; destination host is: "SCAI01.CS.UCLA.EDU":9000;
> > >>>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> > >>>   at org.apache.hadoop.ipc.Client.call(Client.java:1351)
> > >>>   at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> > >>>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> > >>>   at com.sun.proxy.$Proxy18.getFileInfo(Unknown Source)
> > >>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >>> Caused by: java.io.EOFException
> > >>>   at java.io.DataInputStream.readInt(DataInputStream.java:392)
> > >>>   at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:995)
> > >>>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:891)
> > >>> The details are in the attachment cc3.log, and cc2.log is for the
> > >>> query without ("input-format"="text-input-format").
> > >>>
> > >>> Thanks
> > >>>
> > >>>
> > >>> On Tue, Nov 15, 2016 at 9:16 PM, abdullah alamoudi <bamousaa@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> There is also a missing parameter: ("input-format"="text-input-format").
> > >>>> In HDFS, the file containing the data can have one of many formats
> > >>>> (Text, Sequence, RC, etc.), so the adapter needs to know the file's
> > >>>> input format in order to access it.
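> > >>>> A minimal sketch of the full load statement with that hint added
> > >>>> (dataset name, endpoint, and path taken from the query earlier in
> > >>>> this thread; adjust them to your setup):
> > >>>>
> > >>>> ```
> > >>>> use dataverse tpcds3;
> > >>>> load dataset inventory
> > >>>> using hdfs(("hdfs"="hdfs://SCAI01.CS.UCLA.EDU:9000"),
> > >>>>     ("path"="/clash/datasets/tpcds/10/inventory"),
> > >>>>     ("format"="delimited-text"),("delimiter"="|"),
> > >>>>     ("input-format"="text-input-format"));
> > >>>> ```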
> > >>>>
> > >>>> Cheers,
> > >>>> ~Abdullah.
> > >>>>
> > >>>> On Tue, Nov 15, 2016 at 9:13 PM, abdullah alamoudi <bamousaa@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>> Where is the attachment?
> > >>>>>
> > >>>>> On Tue, Nov 15, 2016 at 9:11 PM, mingda li <limingda1993@gmail.com>
> > >>>>> wrote:
> > >>>>>
> > >>>>> Hi,
> > >>>>>>
> > >>>>>> Has anyone loaded data from HDFS? I met a problem when loading data
> > >>>>>> using the following query:
> > >>>>>>
> > >>>>>> use dataverse tpcds3;
> > >>>>>>
> > >>>>>> load dataset inventory
> > >>>>>>
> > >>>>>> using hdfs(("hdfs"="hdfs://SCAI01.CS.UCLA.EDU:9000"),
> > >>>>>> ("path"="/clash/datasets/tpcds/10/inventory"),
> > >>>>>> ("format"="delimited-text"),("delimiter"="|"));
> > >>>>>>
> > >>>>>>
> > >>>>>> The error in the web interface is:
> > >>>>>>
> > >>>>>> Internal error. Please check instance logs for further details.
> > >>>>>> [NullPointerException]
> > >>>>>>
> > >>>>>> I checked the cc.log (cluster controller) and found the following
> > >>>>>> problem:
> > >>>>>>
> > >>>>>> SEVERE: Unable to create adapter
> > >>>>>>
> > >>>>>> org.apache.hyracks.algebricks.common.exceptions.AlgebricksException:
> > >>>>>> Unable to create adapter
> > >>>>>>   at org.apache.asterix.metadata.declared.AqlMetadataProvider.getConfiguredAdapterFactory(AqlMetadataProvider.java:990)
> > >>>>>>   at org.apache.asterix.metadata.declared.LoadableDataSource.buildDatasourceScanRuntime(LoadableDataSource.java:141)
> > >>>>>>
> > >>>>>> More about the log is in the attachment.
> > >>>>>>
> > >>>>>>
> > >>>>>> I think there is no problem with the syntax of the query. Does
> > >>>>>> anyone have an idea about this?
> > >>>>>>
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>>
> > >>>>>> Mingda
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> >
>
