nifi-users mailing list archives

From Andre <andre-li...@fucs.org>
Subject Re: GetHDFS from Azure Blob
Date Tue, 28 Mar 2017 19:34:33 GMT
Austin,

Perhaps that wasn't explicit, but the settings don't need to be system-wide.
Instead, the defaultFS may be changed just for a particular processor, while
the other processors keep using their own configurations.

The *HDFS processor documentation mentions it allows you to set particular
Hadoop configurations:

" A file or comma separated list of files which contains the Hadoop file
system configuration. Without this, Hadoop will search the classpath for a
'core-site.xml' and 'hdfs-site.xml' file or will revert to a default
configuration"

Have you tried using this field to point to a file as described by Bryan?
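
For example, something along these lines (just a sketch, with a hypothetical
file name, account, container and key) could be saved as its own file and
referenced only from that one processor's 'Hadoop Configuration Resources'
property, leaving the rest of the flow on the cluster defaults:

<!-- hypothetical wasb-core-site.xml, referenced by a single processor -->
<configuration>

     <property>
       <name>fs.defaultFS</name>
       <value>wasb://YOUR_CONTAINER@YOUR_ACCOUNT.blob.core.windows.net/</value>
     </property>

     <property>
       <name>fs.azure.account.key.YOUR_ACCOUNT.blob.core.windows.net</name>
       <value>YOUR_KEY</value>
     </property>

</configuration>

The remaining properties from Bryan's example below (fs.wasb.impl,
fs.AbstractFileSystem.wasb.impl, fs.azure.skip.metrics) would presumably go
in the same file.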

Cheers

On 29 Mar 2017 05:21, "Austin Heyne" <aheyne@ccri.com> wrote:

Thanks Bryan,

Working with the configuration you sent, what I needed to change was to set
fs.defaultFS to the wasb URL that we're working from. Unfortunately this is a
less than ideal solution, since we'll be pulling files from multiple wasb URLs
and ingesting them into an Accumulo datastore. I'm pretty certain that
changing the defaultFS would mess with our local HDFS/Accumulo install. In
addition, we're trying to maintain all of this configuration with Ambari,
which from what I can tell only supports one core-site configuration file.

Is the only solution here to maintain multiple core-site.xml files, or is
there another way we can configure this?
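
For example, would something like this work (a rough sketch, with hypothetical
paths and account names, assuming files listed later in 'Hadoop Configuration
Resources' override earlier ones and that fs.defaultFS isn't marked final in
the Ambari-managed file)?

Hadoop Configuration Resources:
    /etc/hadoop/conf/core-site.xml,/etc/nifi/wasb/account1-override.xml

<!-- account1-override.xml: only the per-account overrides -->
<configuration>

     <property>
       <name>fs.defaultFS</name>
       <value>wasb://CONTAINER@account1.blob.core.windows.net/</value>
     </property>

     <property>
       <name>fs.azure.account.key.account1.blob.core.windows.net</name>
       <value>ACCOUNT1_KEY</value>
     </property>

</configuration>

That would let Ambari keep owning the base core-site.xml while each *HDFS
processor picks up its own wasb account from a small override file.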

Thanks,

Austin



On 03/28/2017 01:41 PM, Bryan Bende wrote:

> Austin,
>
> Can you provide the full error message and stacktrace for the
> IllegalArgumentException from nifi-app.log?
>
> When you start the processor it creates a FileSystem instance based on
> the config files provided to the processor, which in turn causes all
> of the corresponding classes to load.
>
> I'm not that familiar with Azure, but if "Azure blob store" is WASB,
> then I have successfully done the following...
>
> In core-site.xml:
>
> <configuration>
>
>      <property>
>        <name>fs.defaultFS</name>
>        <value>wasb://YOUR_USER@YOUR_HOST/</value>
>      </property>
>
>      <property>
>        <name>fs.azure.account.key.nifi.blob.core.windows.net</name>
>        <value>YOUR_KEY</value>
>      </property>
>
>      <property>
>        <name>fs.AbstractFileSystem.wasb.impl</name>
>        <value>org.apache.hadoop.fs.azure.Wasb</value>
>      </property>
>
>      <property>
>        <name>fs.wasb.impl</name>
>        <value>org.apache.hadoop.fs.azure.NativeAzureFileSystem</value>
>      </property>
>
>      <property>
>        <name>fs.azure.skip.metrics</name>
>        <value>true</value>
>      </property>
>
> </configuration>
>
> In Additional Resources property of an HDFS processor, point to a
> directory with:
>
> azure-storage-2.0.0.jar
> commons-codec-1.6.jar
> commons-lang3-3.3.2.jar
> commons-logging-1.1.1.jar
> guava-11.0.2.jar
> hadoop-azure-2.7.3.jar
> httpclient-4.2.5.jar
> httpcore-4.2.4.jar
> jackson-core-2.2.3.jar
> jsr305-1.3.9.jar
> slf4j-api-1.7.5.jar
>
>
> Thanks,
>
> Bryan
>
>
> On Tue, Mar 28, 2017 at 1:15 PM, Austin Heyne <aheyne@ccri.com> wrote:
>
>> Hi all,
>>
>> Thanks for all the help you've given me so far. Today I'm trying to pull
>> files from an Azure blob store. I've done some reading on this, and from
>> previous tickets [1] and guides [2] it seems the recommended approach is
>> to place the required jars (to use the HDFS Azure protocol) in 'Additional
>> Classpath Resources' and the Hadoop core-site and hdfs-site configs in the
>> 'Hadoop Configuration Resources'. I have my local HDFS properly configured
>> to access wasb urls: I'm able to ls, copy to and from, etc. without
>> problem. Using the same HDFS config files, and trying both all the jars in
>> my hadoop-client/lib directory (HDP) and the jars recommended in [1], I'm
>> still seeing the "java.lang.IllegalArgumentException: Wrong FS: " error in
>> my NiFi logs and am unable to pull files from Azure blob storage.
>>
>> Interestingly, it seems the processor is spinning up way too fast: the
>> errors appear in the log as soon as I start the processor. I'm not sure
>> how it could be loading all of those jars that quickly.
>>
>> Does anyone have any experience with this or recommendations to try?
>>
>> Thanks,
>> Austin
>>
>> [1] https://issues.apache.org/jira/browse/NIFI-1922
>> [2] https://community.hortonworks.com/articles/71916/connecting-to-azure-data-lake-from-a-nifi-dataflow.html
>>
>>
>>
