incubator-s4-user mailing list archives

From Matthieu Morel <mmo...@apache.org>
Subject Re: pluggable s4r fetching strategies
Date Tue, 16 Apr 2013 11:08:08 GMT

Hello,

A key difference from S4 0.5 is that in S4 0.6 custom modules are specified in the application configuration,
and can be loaded remotely with the modulesURIs parameter.

However, that does not work when fetching the remote modules itself requires a special module: the fetcher would have to be loaded before it could fetch anything.

In S4ApplicationMaster for S4 0.5 we specified the hdfs fetcher for the S4 node as:
// add module for fetching from hdfs
extraModulesClasses.add(HdfsFetcherModule.class.getName());

However, in S4 0.6 we don't have that parameter when starting the node.

Right now there is no clean way to configure a remote file fetcher when starting a node. That
is something we need to improve, thanks for pointing this out!


Ideally we should give the option to pass custom modules when starting the node. It's actually
just a few lines of code to add: add a parameter for the extra module classes, instantiate
these modules by reflection, and pass them to the injector at line 80 of S4Node. We can help
if you can't make it work.
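
As a rough illustration of the reflection step only, here is a minimal sketch. It is not the actual S4Node code: NodeModule is a stand-in for the Guice Module interface so that the example compiles without Guice, and DemoFetcherModule and ExtraModulesSketch are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-in for com.google.inject.Module (assumption, keeps the sketch Guice-free). */
interface NodeModule {
    String describe();
}

/** Hypothetical module; in S4 this role would be played by something like HdfsFetcherModule. */
class DemoFetcherModule implements NodeModule {
    public String describe() {
        return "demo-fetcher";
    }
}

final class ExtraModulesSketch {

    /** Instantiates each named class through its no-argument constructor. */
    static List<NodeModule> instantiate(List<String> classNames) {
        List<NodeModule> modules = new ArrayList<>();
        for (String name : classNames) {
            try {
                Class<?> clazz = Class.forName(name);
                if (!NodeModule.class.isAssignableFrom(clazz)) {
                    throw new IllegalArgumentException(name + " is not a module class");
                }
                modules.add((NodeModule) clazz.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot instantiate module " + name, e);
            }
        }
        return modules;
    }

    public static void main(String[] args) {
        // Hypothetical: class names as they would arrive through an "extra modules" node parameter.
        List<String> names = Arrays.asList(DemoFetcherModule.class.getName());
        // In the node, the resulting instances would be passed to the injector
        // together with the base modules.
        for (NodeModule module : instantiate(names)) {
            System.out.println("loaded module: " + module.describe());
        }
    }
}
```

With Guice on the classpath, the list element type would be the real Module interface and the instances would go into the injector creation call.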

Otherwise you may modify the BaseModule to include the dependency on the hdfs fetcher (you'll
need the hdfs fetcher in the classpath when compiling).
Or make the application available through another protocol such as http.
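
To see why these two workarounds help, here is a simplified sketch of scheme-based dispatch. It is not the real RemoteFileFetcher: SchemeFetcher, FixedSchemeFetcher and FetcherDispatchSketch are hypothetical stand-ins, and fetch returns a string instead of an InputStream. The point is that a URI can only be served if a fetcher for its scheme is already registered when the node starts.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

/** Simplified stand-in for an archive fetcher interface (assumption). */
interface SchemeFetcher {
    boolean handlesProtocol(URI uri);

    String fetch(URI uri);
}

/** Hypothetical fetcher that accepts exactly one URI scheme. */
final class FixedSchemeFetcher implements SchemeFetcher {
    private final String scheme;

    FixedSchemeFetcher(String scheme) {
        this.scheme = scheme;
    }

    public boolean handlesProtocol(URI uri) {
        return scheme.equalsIgnoreCase(uri.getScheme());
    }

    public String fetch(URI uri) {
        return scheme + " fetcher handled " + uri;
    }
}

final class FetcherDispatchSketch {

    /** Delegates to the first registered fetcher that accepts the URI scheme. */
    static String fetch(Set<SchemeFetcher> fetchers, URI uri) {
        for (SchemeFetcher fetcher : fetchers) {
            if (fetcher.handlesProtocol(uri)) {
                return fetcher.fetch(uri);
            }
        }
        throw new IllegalStateException("Unsupported protocol " + uri.getScheme());
    }

    public static void main(String[] args) {
        // Fetchers available at node start: file and http only.
        Set<SchemeFetcher> fetchers = new LinkedHashSet<>();
        fetchers.add(new FixedSchemeFetcher("file"));
        fetchers.add(new FixedSchemeFetcher("http"));

        URI hdfsUri = URI.create("hdfs://localhost:9000/user/root/s4data/SimpleApp/203/s4app.s4r");
        try {
            fetch(fetchers, hdfsUri);
        } catch (IllegalStateException e) {
            System.out.println("before registration: " + e.getMessage());
        }

        // Registering an hdfs fetcher up front corresponds to the BaseModule workaround.
        fetchers.add(new FixedSchemeFetcher("hdfs"));
        System.out.println("after registration: " + fetch(fetchers, hdfsUri));
    }
}
```

Publishing the application over http instead avoids the problem because the node already has a fetcher for that scheme.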


Hope this helps,

Matthieu


On Apr 16, 2013, at 12:08 , JiHyoun Park wrote:

> Hi,
> 
> I am trying to apply 'pluggable s4r fetching strategies' of S4-25 to S4 0.6.0.
> 
> Changes that I made are
> 
> 1) org.apache.s4.core.BaseModule.java
>     protected void configure() {
>         ...
>         // added this codes
>         Multibinder<ArchiveFetcher> archiveFetcherMultibinder = Multibinder.newSetBinder(binder(), ArchiveFetcher.class);
>         archiveFetcherMultibinder.addBinding().to(FileSystemArchiveFetcher.class);
>         archiveFetcherMultibinder.addBinding().to(HttpArchiveFetcher.class);
> }
> 
> 2) org.apache.s4.core.util.ArchiveFetcher.java
>     // added this codes
>     boolean handlesProtocol(URI uri);
> 
> 3) org.apache.s4.deploy.FileSystemArchiveFetcher.java
>     // added this codes
>     @Override
>     public boolean handlesProtocol(URI uri) {
>         return "file".equalsIgnoreCase(uri.getScheme());
>     }
> 
> 4) org.apache.s4.core.util.HttpArchiveFetcher.java
>     // added this codes
>     @Override
>     public boolean handlesProtocol(URI uri) {
>         return ("http".equalsIgnoreCase(uri.getScheme()) || "https".equalsIgnoreCase(uri.getScheme()));
>     }
> 
> 5) org.apache.s4.core.util.RemoteFileFetcher.java
>     // added this codes
>     private final Set<ArchiveFetcher> archiveFetchers;
> 
>     // added this codes
>     @Inject
>     public RemoteFileFetcher(Set<ArchiveFetcher> archiveFetchers) {
>         this.archiveFetchers = archiveFetchers;
>     }
> 
>     public InputStream fetch(URI uri) throws ArchiveFetchException {
>         ....
>         /* removed this codes
>         if ("file".equalsIgnoreCase(scheme)) {
>             return new FileSystemArchiveFetcher().fetch(uri);
>         }
>         if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
>             return new HttpArchiveFetcher().fetch(uri);
>         }
>         */
> 
>         // added this codes
>         for (ArchiveFetcher archiveFetcher : archiveFetchers) {
>             if (archiveFetcher.handlesProtocol(uri)) {
>                 return archiveFetcher.fetch(uri);
>             }
>         }
>     }
> 
> 6) I passed the "-modulesClasses=my.HdfsFetcherModule" argument when I deploy my s4r file. The HdfsFetcherModule is the same HdfsFetcherModule from S4-25.
> 
> 
> But, when I run it, I got this error.
> 17:46:24.151 [S4 platform loader] ERROR org.apache.s4.core.S4Bootstrap - Cannot start S4 node
> org.apache.s4.deploy.DeploymentFailedException: Cannot deploy application [SimpleApp] from URI [hdfs://localhost:9000/user/root/s4data/SimpleApp/203/s4app.s4r]
> 	at org.apache.s4.core.S4Bootstrap.loadApp(S4Bootstrap.java:219) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
> 	at org.apache.s4.core.S4Bootstrap.startS4App(S4Bootstrap.java:149) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
> 	at org.apache.s4.core.S4Bootstrap.access$000(S4Bootstrap.java:80) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
> 	at org.apache.s4.core.S4Bootstrap$1.run(S4Bootstrap.java:139) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
> 	at java.lang.Thread.run(Thread.java:679) [na:1.6.0_22]
> Caused by: org.apache.s4.core.util.ArchiveFetchException: Unsupported protocol hdfs
> 	at org.apache.s4.core.util.RemoteFileFetcher.fetch(RemoteFileFetcher.java:63) ~[s4-core-0.6.0-incubating.jar:0.6.0-incubating]
> 	at org.apache.s4.core.S4Bootstrap.loadApp(S4Bootstrap.java:214) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
> 	... 4 common frames omitted
> 
> I think there are some missing links to make S4 recognize the custom module.
> Is it related with the new parameter "-modulesURIs"? How can I use it?
> I found some description for this parameter at org.apache.s4.tools.Deploy.java
>         @Parameter(names = { "-modulesURIs", "-mu" }, description = "URIs for fetching code of custom modules")
>         List<String> modulesURIs = new ArrayList<String>();
> 
> But I have no idea about how to use it. My custom module will be located in HDFS with the s4r file that I want to deploy.
> Or are there any other things that I have to take into consideration in the above implementation?
> 
> Best Regards
> Jihyoun.
> 

