incubator-s4-user mailing list archives

From Matthieu Morel <mmo...@apache.org>
Subject Re: pluggable s4r fetching strategies
Date Wed, 17 Apr 2013 16:34:22 GMT
On Apr 17, 2013, at 11:42 , JiHyoun Park wrote:

> Hi,
> 
> When I specify both -s4r and -mu, I find that -mu overrides -s4r. Am I using it correctly?
> ./s4 deploy -s4r=uri/to/app.s4r -c=cluster1 -appName=myApp \
> -emc=my.project.FancyKeyValueStoreBackendCheckpointingModule \
> -mu=uri/to/fancyKeyValueStoreCheckpointingModule.jar
> 
> (without -mu option)
> org.apache.s4.deploy.DeploymentFailedException: Cannot deploy application [App] from URI [hdfs://localhost:9000/user/root/s4/App/213/app.s4r]
> (with -mu option)
> org.apache.s4.deploy.DeploymentFailedException: Cannot deploy application [App] from URI [./HdfsFetcherModule.jar]

I can't reproduce that. Could it be related to some of the changes you introduced?

Here is how I configure an app to use the VerboseFileSystemStorageModule from the custom-modules.jar
specified in the -mu parameter, while the app classes are specified through the -s4r parameter:

./s4 deploy -s4r=http://$LOCALHOST:8080/test-apps/twitter-counter/build/libs/counterArchive.s4r \
  -c=cluster1 -appName=twitter-counter \
  -p=s4.checkpointing.filesystem.storageRootPath2=/tmp/toto3,s4.metrics.config=console:10:SECONDS \
  -emc=org.apache.s4.storage.VerboseFileSystemStorageModule \
  -mu=file://`pwd`/test-apps/custom-modules/build/libs/app/custom-modules.jar

> 
> Anyway, I still couldn't make the HdfsFetcherModule visible to S4.
> 
> However, I followed your previous suggestion and added a few lines of code to S4Node.java so that it accepts custom modules when starting a node.
> Here are my changes.
> 
> ===========================================================
> [org.apache.s4.core.S4Node.java]
> 
> // added
>         @Parameter(names = { "-extraModulesClasses", "-emc" }, description = "Comma-separated list of additional configuration modules (they will be instantiated through their constructor without arguments).", required = false, hidden = false)
>         List<String> extraModulesClasses = new ArrayList<String>();
> 
> /* replaced
>         Injector injector = Guice.createInjector(Modules.override(
>                 new BaseModule(Resources.getResource("default.s4.base.properties").openStream(), nodeArgs.clusterName))
>                 .with(new ParametersInjectionModule(inlineParameters)));
> */
> 
> // with
>         List<com.google.inject.Module> extraModules = new ArrayList<com.google.inject.Module>();
>         for (String moduleClass : nodeArgs.extraModulesClasses) {
>             extraModules.add((Module) Class.forName(moduleClass).newInstance());
>         }
>         Module combinedModule = Modules.combine(new BaseModule(
>                 Resources.getResource("default.s4.base.properties").openStream(), nodeArgs.clusterName));
>         if (extraModules.size() > 0) {
>             OverriddenModuleBuilder overridenModuleBuilder = Modules.override(combinedModule);
>             combinedModule = overridenModuleBuilder.with(extraModules);
>         }
> 
>         combinedModule = Modules.override(combinedModule)
>                 .with(new ParametersInjectionModule(inlineParameters));
>                 
>         Injector injector = Guice.createInjector(combinedModule);
> ===========================================================
> 
> Now I can successfully run s4 applications based on s4-0.6.0 on Hadoop.
> 
> Could you please include these changes in the official s4-0.6.0 package?
> This would let S4 integrate with external systems more flexibly at runtime.

Great to know that this works!
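
For readers of the archive, the reflective loading step in the patch above can be sketched in plain Java. This is only an illustration under stated assumptions: `ConfigModule`, `FetcherModuleStub`, and `ModuleLoader` are hypothetical names, and `ConfigModule` stands in for com.google.inject.Module so that the sketch compiles without the Guice jar. It does not reproduce the actual S4Node code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for com.google.inject.Module (assumption, keeps the sketch Guice-free).
interface ConfigModule {
    String describe();
}

// Hypothetical module with the public no-argument constructor that reflection needs.
class FetcherModuleStub implements ConfigModule {
    public FetcherModuleStub() {
    }

    @Override
    public String describe() {
        return "fetcher module stub";
    }
}

class ModuleLoader {
    // Instantiates each named class through its no-argument constructor.
    static List<ConfigModule> load(List<String> classNames) throws ReflectiveOperationException {
        List<ConfigModule> modules = new ArrayList<>();
        for (String name : classNames) {
            // asSubclass throws ClassCastException early if the class is not a ConfigModule.
            Class<? extends ConfigModule> type = Class.forName(name).asSubclass(ConfigModule.class);
            modules.add(type.getDeclaredConstructor().newInstance());
        }
        return modules;
    }

    public static void main(String[] args) throws Exception {
        // In the patch, the class names come from the -emc command-line parameter.
        List<ConfigModule> modules = load(Arrays.asList(FetcherModuleStub.class.getName()));
        System.out.println(modules.get(0).describe());
    }
}
```

The quoted patch calls Class.newInstance(), which has been deprecated since Java 9; the sketch uses getDeclaredConstructor().newInstance() instead. In both cases the named classes must be on the node's classpath when it starts.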

Right now we are already voting on S4 0.6 RC4 but we can include these changes in the dev
branch for the next release.

The most efficient way to get these changes included is:
1/ create a JIRA ticket with the request (that's really important)
2/ attach a well-formatted git patch (generated with "git format-patch") to the ticket


Thanks!

Matthieu
 


> 
> Best Regards
> Jihyoun
> 
> 
> 
> On Wed, Apr 17, 2013 at 5:11 PM, Matthieu Morel <mmorel@apache.org> wrote:
> 
> On Apr 17, 2013, at 09:50 , JiHyoun Park wrote:
> 
>> Hi
>> 
>> Suppose that I have my custom module on my local machine.
>> 
>> How and where can I set the application configuration for custom modules?
>> What is the correct string format of the modulesURI parameter?
>> Can you give me an example?
> 
> For specifying custom modules in the application configuration, see here: http://incubator.apache.org/s4/doc/0.6.0/configuration/ ("overriding modules" section)
> 
> Regards,
> 
> Matthieu
> 
> 
>> 
>> Best Regards
>> Jihyoun
>> 
>> 
>> On Tue, Apr 16, 2013 at 7:08 PM, Matthieu Morel <mmorel@apache.org> wrote:
>> 
>> Hello,
>> 
>> A key difference from S4 0.5 is that custom modules are specified in the application configuration, and can be loaded remotely with the modulesURI parameter.
>> 
>> However, that does not work if fetching the remote modules itself requires a special module.
>> 
>> In S4ApplicationMaster for S4 0.5, we specified the hdfs fetcher for the S4 node as follows:
>> // add module for fetching from hdfs
>> extraModulesClasses.add(HdfsFetcherModule.class.getName());
>> 
>> However, in S4 0.6 we don't have that parameter when starting the node.
>> 
>> Right now there is no clean way to configure a remote file fetcher when starting a node. That is something we need to improve, thanks for pointing this out!
>> 
>> 
>> Ideally we should give the option to pass custom modules when starting the node. It's actually just a few lines of code to add: add a parameter for the extra module classes, instantiate these modules by reflection, and pass them to the injector at line 80 of S4Node (we can help if you can't make it work).
>> 
>> Otherwise you may modify the BaseModule to include a dependency on the hdfs fetcher (you'll need the hdfs fetcher in the classpath when compiling).
>> Or make the application available through another protocol such as http.
>> 
>> 
>> Hope this helps,
>> 
>> Matthieu
>> 
>> 
>> On Apr 16, 2013, at 12:08 , JiHyoun Park wrote:
>> 
>>> Hi,
>>> 
>>> I am trying to apply 'pluggable s4r fetching strategies' of S4-25 to S4 0.6.0.
>>> 
>>> Changes that I made are
>>> 
>>> 1) org.apache.s4.core.BaseModule.java
>>>     protected void configure() {
>>>         ...
>>>         // added this code
>>>         Multibinder<ArchiveFetcher> archiveFetcherMultibinder = Multibinder.newSetBinder(binder(), ArchiveFetcher.class);
>>>         archiveFetcherMultibinder.addBinding().to(FileSystemArchiveFetcher.class);
>>>         archiveFetcherMultibinder.addBinding().to(HttpArchiveFetcher.class);
>>>     }
>>> 
>>> 2) org.apache.s4.core.util.ArchiveFetcher.java
>>>     // added this code
>>>     boolean handlesProtocol(URI uri);
>>> 
>>> 3) org.apache.s4.deploy.FileSystemArchiveFetcher.java
>>>     // added this code
>>>     @Override
>>>     public boolean handlesProtocol(URI uri) {
>>>         return "file".equalsIgnoreCase(uri.getScheme());
>>>     }
>>> 
>>> 4) org.apache.s4.core.util.HttpArchiveFetcher.java
>>>     // added this code
>>>     @Override
>>>     public boolean handlesProtocol(URI uri) {
>>>         return ("http".equalsIgnoreCase(uri.getScheme()) || "https".equalsIgnoreCase(uri.getScheme()));
>>>     }
>>> 
>>> 5) org.apache.s4.core.util.RemoteFileFetcher.java
>>>     // added this code
>>>     private final Set<ArchiveFetcher> archiveFetchers;
>>> 
>>>     // added this code
>>>     @Inject
>>>     public RemoteFileFetcher(Set<ArchiveFetcher> archiveFetchers) {
>>>         this.archiveFetchers = archiveFetchers;
>>>     }
>>> 
>>>     public InputStream fetch(URI uri) throws ArchiveFetchException {
>>>         ....
>>>         /* removed this code
>>>         if ("file".equalsIgnoreCase(scheme)) {
>>>             return new FileSystemArchiveFetcher().fetch(uri);
>>>         }
>>>         if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
>>>             return new HttpArchiveFetcher().fetch(uri);
>>>         }
>>>         */
>>> 
>>>         // added this code
>>>         for (ArchiveFetcher archiveFetcher : archiveFetchers) {
>>>             if (archiveFetcher.handlesProtocol(uri)) {
>>>                 return archiveFetcher.fetch(uri);
>>>             }
>>>         }
>>>     }
>>> 
>>> 6) I passed the "-modulesClasses=my.HdfsFetcherModule" argument when deploying my s4r file. The HdfsFetcherModule is the same as the HdfsFetcherModule from S4-25.
>>> 
>>> 
>>> But, when I run it, I got this error.
>>> 17:46:24.151 [S4 platform loader] ERROR org.apache.s4.core.S4Bootstrap - Cannot start S4 node
>>> org.apache.s4.deploy.DeploymentFailedException: Cannot deploy application [SimpleApp] from URI [hdfs://localhost:9000/user/root/s4data/SimpleApp/203/s4app.s4r]
>>> 	at org.apache.s4.core.S4Bootstrap.loadApp(S4Bootstrap.java:219) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>>> 	at org.apache.s4.core.S4Bootstrap.startS4App(S4Bootstrap.java:149) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>>> 	at org.apache.s4.core.S4Bootstrap.access$000(S4Bootstrap.java:80) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>>> 	at org.apache.s4.core.S4Bootstrap$1.run(S4Bootstrap.java:139) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>>> 	at java.lang.Thread.run(Thread.java:679) [na:1.6.0_22]
>>> Caused by: org.apache.s4.core.util.ArchiveFetchException: Unsupported protocol hdfs
>>> 	at org.apache.s4.core.util.RemoteFileFetcher.fetch(RemoteFileFetcher.java:63) ~[s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>>> 	at org.apache.s4.core.S4Bootstrap.loadApp(S4Bootstrap.java:214) [s4-core-0.6.0-incubating.jar:0.6.0-incubating]
>>> 	... 4 common frames omitted
>>> 
>>> I think some links are missing for S4 to recognize the custom module.
>>> Is it related to the new parameter "-modulesURIs"? How can I use it?
>>> I found a description of this parameter in org.apache.s4.tools.Deploy.java:
>>>         @Parameter(names = { "-modulesURIs", "-mu" }, description = "URIs for fetching code of custom modules")
>>>         List<String> modulesURIs = new ArrayList<String>();
>>> 
>>> But I have no idea how to use it. My custom module will be located in HDFS along with the s4r file that I want to deploy.
>>> Or is there anything else that I have to take into consideration in the above implementation?
>>> 
>>> Best Regards
>>> Jihyoun.
>>> 
>> 
>> 
> 
> 
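
For reference, the fetcher selection described in changes 2 to 5 of the original message can be sketched in plain Java. This is a minimal sketch, not the S4 code: `Fetcher`, `FakeHdfsFetcher`, `FetcherRegistry`, and `FetcherDemo` are hypothetical names, the fake fetcher returns canned bytes instead of reading from HDFS, and IOException is used where S4 throws ArchiveFetchException. It illustrates why "Unsupported protocol hdfs" appears when no fetcher for that scheme has been registered.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical, simplified counterpart of the ArchiveFetcher interface quoted in the thread.
interface Fetcher {
    boolean handlesProtocol(URI uri);

    InputStream fetch(URI uri) throws IOException;
}

// Hypothetical fetcher: claims the hdfs scheme but only returns canned bytes.
class FakeHdfsFetcher implements Fetcher {
    @Override
    public boolean handlesProtocol(URI uri) {
        return "hdfs".equalsIgnoreCase(uri.getScheme());
    }

    @Override
    public InputStream fetch(URI uri) {
        return new ByteArrayInputStream(("archive from " + uri).getBytes(StandardCharsets.UTF_8));
    }
}

// Delegates to the first registered fetcher that accepts the URI scheme.
class FetcherRegistry {
    private final Set<Fetcher> fetchers;

    FetcherRegistry(Set<Fetcher> fetchers) {
        this.fetchers = fetchers;
    }

    InputStream fetch(URI uri) throws IOException {
        for (Fetcher fetcher : fetchers) {
            if (fetcher.handlesProtocol(uri)) {
                return fetcher.fetch(uri);
            }
        }
        // No registered fetcher matched, which is the failure seen in the stack trace above.
        throw new IOException("Unsupported protocol " + uri.getScheme());
    }
}

class FetcherDemo {
    public static void main(String[] args) throws IOException {
        Set<Fetcher> fetchers = new LinkedHashSet<>();
        fetchers.add(new FakeHdfsFetcher());
        FetcherRegistry registry = new FetcherRegistry(fetchers);
        try (InputStream in = registry.fetch(URI.create("hdfs://localhost:9000/app.s4r"))) {
            System.out.println("fetched " + in.available() + " bytes");
        }
    }
}
```

The registry only knows the fetchers placed in its set, so a module that contributes an additional fetcher has to be loaded before the first fetch is attempted.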

