incubator-s4-user mailing list archives

From JiHyoun Park <apr...@gmail.com>
Subject Re: pluggable s4r fetching strategies
Date Wed, 17 Apr 2013 09:42:46 GMT
Hi,

When I specify both -s4r and -mu, I found that -mu overrides -s4r. Am I using
it correctly?

./s4 deploy -s4r=uri/to/app.s4r -c=cluster1 -appName=myApp \
-emc=my.project.FancyKeyValueStoreBackendCheckpointingModule \
-mu=uri/to/fancyKeyValueStoreCheckpointingModule.jar

(without the -mu option)

org.apache.s4.deploy.DeploymentFailedException: Cannot deploy application [App] from URI [hdfs://localhost:9000/user/root/s4/App/213/app.s4r]

(with the -mu option)

org.apache.s4.deploy.DeploymentFailedException: Cannot deploy application [App] from URI [./HdfsFetcherModule.jar]


Anyway, I still couldn't make the HdfsFetcherModule visible to S4.

However, I followed your previous suggestion and added a few lines of code to
S4Node.java to accept custom modules when starting a node.
Here are my changes.

===========================================================
[org.apache.s4.core.S4Node.java]

// added
        @Parameter(names = { "-extraModulesClasses", "-emc" },
                description = "Comma-separated list of additional configuration modules "
                        + "(they will be instantiated through their constructor without arguments).",
                required = false, hidden = false)
        List<String> extraModulesClasses = new ArrayList<String>();

/* replaced
        Injector injector = Guice.createInjector(Modules.override(
                new BaseModule(Resources.getResource("default.s4.base.properties").openStream(),
                        nodeArgs.clusterName))
                .with(new ParametersInjectionModule(inlineParameters)));
*/

// with
        List<com.google.inject.Module> extraModules = new ArrayList<com.google.inject.Module>();
        for (String moduleClass : nodeArgs.extraModulesClasses) {
            extraModules.add((Module) Class.forName(moduleClass).newInstance());
        }
        Module combinedModule = Modules.combine(new BaseModule(
                Resources.getResource("default.s4.base.properties").openStream(),
                nodeArgs.clusterName));
        if (extraModules.size() > 0) {
            OverriddenModuleBuilder overriddenModuleBuilder = Modules.override(combinedModule);
            combinedModule = overriddenModuleBuilder.with(extraModules);
        }

        combinedModule = Modules.override(combinedModule)
                .with(new ParametersInjectionModule(inlineParameters));

        Injector injector = Guice.createInjector(combinedModule);
===========================================================
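[Editor's note] For readers following along, a minimal, standalone illustration of the reflection step in the patch above. The Module interface and the HdfsFetcherModule class here are hypothetical stand-ins (Guice is not on the classpath in this sketch); only the Class.forName-then-no-arg-constructor pattern mirrors the real change.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins: a plain Module interface instead of Guice's,
// and a dummy module class. Only the reflection pattern mirrors the patch.
public class ModuleLoaderSketch {

    public interface Module {
        String name();
    }

    public static class HdfsFetcherModule implements Module {
        public String name() { return "hdfs-fetcher"; }
    }

    // Mirrors the loop added to S4Node: each -emc class name is resolved
    // and instantiated through its no-argument constructor.
    public static List<Module> instantiate(List<String> moduleClasses) throws Exception {
        List<Module> modules = new ArrayList<Module>();
        for (String moduleClass : moduleClasses) {
            modules.add((Module) Class.forName(moduleClass)
                    .getDeclaredConstructor().newInstance());
        }
        return modules;
    }
}
```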

Now I can successfully run S4 applications based on s4-0.6.0 on Hadoop.

Could you please include these changes in the official s4-0.6.0 package?
It would enhance S4's ability to integrate with external systems more
flexibly at runtime.

Best Regards
Jihyoun



On Wed, Apr 17, 2013 at 5:11 PM, Matthieu Morel <mmorel@apache.org> wrote:

>
> On Apr 17, 2013, at 09:50 , JiHyoun Park wrote:
>
> Hi
>
> Suppose that I have my custom module on my local machine,
>
> How can I set the application configuration for custom modules? Where?
> What is the correct string format of the modulesURI parameter?
> Can you give me an example?
>
>
> For specifying custom modules in the application configuration, see here:
> http://incubator.apache.org/s4/doc/0.6.0/configuration/ ("overriding
> modules" section)
>
> Regards,
>
> Matthieu
>
>
>
> Best Regards
> Jihyoun
>
>
> On Tue, Apr 16, 2013 at 7:08 PM, Matthieu Morel <mmorel@apache.org> wrote:
>
>>
>> Hello,
>>
>> a key difference from S4 0.5 is that custom modules are specified in the
>> application configuration, and can be loaded remotely with the modulesURI
>> parameter.
>>
>> However, if you need a special module to fetch the remote modules
>> themselves, that does not work.
>>
>> In S4ApplicationMaster for S4 0.5 we specified the hdfs fetcher for the
>> S4 node as :
>> // add module for fetching from hdfs
>> extraModulesClasses.add(HdfsFetcherModule.class.getName());
>>
>> However, in S4 0.6 we don't have that parameter when starting the node.
>>
>> Right now there is no clean way to configure a remote file fetcher when
>> starting a node. That is something we need to improve, thanks for pointing
>> this out!
>>
>>
>> Ideally we should give the option to pass custom modules when starting
>> the node. It's actually just a few lines of code to add: add a parameter
>> for extra module classes, instantiate these modules by reflection, and
>> pass them to the injector at line 80 of S4Node (we can help if you can't
>> make it work).
>>
>> Otherwise you may modify the BaseModule to include the dependency on the
>> hdfs fetcher (you'll need the hdfs fetcher on the classpath when
>> compiling).
>> Or make the application available through another protocol such as http.
>>
>>
>> Hope this helps,
>>
>> Matthieu
>>
>>
>> On Apr 16, 2013, at 12:08 , JiHyoun Park wrote:
>>
>> Hi,
>>
>> I am trying to apply 'pluggable s4r fetching strategies' of S4-25 to S4
>> 0.6.0.
>>
>> Changes that I made are
>>
>> 1) org.apache.s4.core.BaseModule.java
>>     protected void configure() {
>>         ...
>>         // added this code
>>         Multibinder<ArchiveFetcher> archiveFetcherMultibinder =
>>                 Multibinder.newSetBinder(binder(), ArchiveFetcher.class);
>>         archiveFetcherMultibinder.addBinding().to(FileSystemArchiveFetcher.class);
>>         archiveFetcherMultibinder.addBinding().to(HttpArchiveFetcher.class);
>>     }
>>
>> 2) org.apache.s4.core.util.ArchiveFetcher.java
>>     // added this code
>>     boolean handlesProtocol(URI uri);
>>
>> 3) org.apache.s4.deploy.FileSystemArchiveFetcher.java
>>     // added this code
>>     @Override
>>     public boolean handlesProtocol(URI uri) {
>>         return "file".equalsIgnoreCase(uri.getScheme());
>>     }
>>
>> 4) org.apache.s4.core.util.HttpArchiveFetcher.java
>>     // added this code
>>     @Override
>>     public boolean handlesProtocol(URI uri) {
>>         return "http".equalsIgnoreCase(uri.getScheme())
>>                 || "https".equalsIgnoreCase(uri.getScheme());
>>     }
>>
>> 5) org.apache.s4.core.util.RemoteFileFetcher.java
>>     // added this code
>>     private final Set<ArchiveFetcher> archiveFetchers;
>>
>>     // added this code
>>     @Inject
>>     public RemoteFileFetcher(Set<ArchiveFetcher> archiveFetchers) {
>>         this.archiveFetchers = archiveFetchers;
>>     }
>>
>>     public InputStream fetch(URI uri) throws ArchiveFetchException {
>>         ....
>>         /* removed this code
>>         if ("file".equalsIgnoreCase(scheme)) {
>>             return new FileSystemArchiveFetcher().fetch(uri);
>>         }
>>         if ("http".equalsIgnoreCase(scheme) ||
>>                 "https".equalsIgnoreCase(scheme)) {
>>             return new HttpArchiveFetcher().fetch(uri);
>>         }
>>         */
>>
>>         // added this code
>>         for (ArchiveFetcher archiveFetcher : archiveFetchers) {
>>             if (archiveFetcher.handlesProtocol(uri)) {
>>                 return archiveFetcher.fetch(uri);
>>             }
>>         }
>>     }
>>
>> 6) I passed the "-modulesClasses=my.HdfsFetcherModule" argument when
>> deploying my s4r file. The HdfsFetcherModule is the same HdfsFetcherModule
>> as in S4-25.
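[Editor's note] Taken together, changes 2)-5) implement a small strategy dispatch: each fetcher declares which URI schemes it handles, and fetch() walks the set until one claims the scheme. A standalone sketch of that pattern, with simplified, hypothetical names (a String label stands in for the real InputStream, and the Guice multibinding is omitted):

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for the S4 types discussed in the thread.
public class FetcherDispatchSketch {

    public interface ArchiveFetcher {
        boolean handlesProtocol(URI uri);
        String fetch(URI uri); // a label instead of an InputStream, for brevity
    }

    public static class FileFetcher implements ArchiveFetcher {
        public boolean handlesProtocol(URI uri) {
            return "file".equalsIgnoreCase(uri.getScheme());
        }
        public String fetch(URI uri) { return "file:" + uri.getPath(); }
    }

    public static class HttpFetcher implements ArchiveFetcher {
        public boolean handlesProtocol(URI uri) {
            return "http".equalsIgnoreCase(uri.getScheme())
                    || "https".equalsIgnoreCase(uri.getScheme());
        }
        public String fetch(URI uri) { return "http:" + uri.getPath(); }
    }

    // Mirrors the rewritten RemoteFileFetcher.fetch(): the first fetcher that
    // claims the scheme wins; any unclaimed scheme fails with "Unsupported
    // protocol", which is the error reported when no hdfs fetcher is bound.
    public static String dispatch(List<ArchiveFetcher> fetchers, URI uri) {
        for (ArchiveFetcher fetcher : fetchers) {
            if (fetcher.handlesProtocol(uri)) {
                return fetcher.fetch(uri);
            }
        }
        throw new IllegalStateException("Unsupported protocol " + uri.getScheme());
    }

    public static List<ArchiveFetcher> defaults() {
        return Arrays.asList(new FileFetcher(), new HttpFetcher());
    }
}
```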
>>
>>
>> But when I run it, I got this error:
>>
>> 17:46:24.151 [S4 platform loader] ERROR org.apache.s4.core.S4Bootstrap - Cannot start S4 node
>> org.apache.s4.deploy.DeploymentFailedException: Cannot deploy application [SimpleApp] from URI [hdfs://localhost:9000/user/root/s4data/SimpleApp/203/s4app.s4r]
>> 	at org.apache.s4.core.S4Bootstrap.loadApp(S4Bootstrap.java:219) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>> 	at org.apache.s4.core.S4Bootstrap.startS4App(S4Bootstrap.java:149) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>> 	at org.apache.s4.core.S4Bootstrap.access$000(S4Bootstrap.java:80) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>> 	at org.apache.s4.core.S4Bootstrap$1.run(S4Bootstrap.java:139) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>> 	at java.lang.Thread.run(Thread.java:679) [na:1.6.0_22]
>> Caused by: org.apache.s4.core.util.ArchiveFetchException: Unsupported protocol hdfs
>> 	at org.apache.s4.core.util.RemoteFileFetcher.fetch(RemoteFileFetcher.java:63) ~[s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>> 	at org.apache.s4.core.S4Bootstrap.loadApp(S4Bootstrap.java:214) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>> 	... 4 common frames omitted
>>
>>
>> I think there are some missing links needed to make S4 recognize the
>> custom module.
>> Is it related to the new parameter "-modulesURIs"? How can I use it?
>> I found a description of this parameter in org.apache.s4.tools.Deploy.java:
>>         @Parameter(names = { "-modulesURIs", "-mu" },
>>                 description = "URIs for fetching code of custom modules")
>>         List<String> modulesURIs = new ArrayList<String>();
>>
>> But I have no idea how to use it. My custom module will be located
>> in HDFS together with the s4r file that I want to deploy.
>> Or are there any other things that I should take into consideration in
>> the above implementation?
>>
>> Best Regards
>> Jihyoun.
>>
>>
>>
>
>
