incubator-any23-dev mailing list archives

From "Simone Tripodi (Updated) (JIRA)" <>
Subject [jira] [Updated] (ANY23-71) improve the current CLI engine
Date Sun, 01 Apr 2012 18:42:28 GMT


Simone Tripodi updated ANY23-71:

    Attachment: enhanced_cli.patch

The attached patch contains a proposed implementation which replaces commons-cli (and custom
args parsing) with JCommander. I also reduced the generated sh binaries to a single /any23 script,
which is the single entry point able to launch all any23 commands that are plugged in via plugins.
The {{Version}} command has been replaced with the {{-v|--version}} option.

Sample usage output:

$ sh any23 -h
Usage: any23 [options] [command] [command options]
    -h, --help          Display help information.
                        Default: false
    -p, --plugins-dir   The Any23 plugins directory.
                        Default: /Users/simonetripodi/.any23/plugins
    -X, --verbose       Produce execution verbose output.
                        Default: false
    -v, --version       Display version information.
                        Default: false
    extractor      Utility for obtaining documentation about metadata extractors.
      Usage: extractor [options] Extractor name      
          -a, --all     shows a report about all available extractors
                        Default: false
          -i, --input   shows example input for the given extractor
                        Default: false
          -l, --list    shows the names of all available extractors
                        Default: false
          -o, --outut   shows example output for the given extractor
                        Default: false

    microdata      Commandline Tool for extracting Microdata from file/HTTP source.
      Usage: microdata [options] Input document URL, {http://path/to/resource.html|file:/path/to/local.file}
    mimes      MIME Type Detector Tool.
      Usage: mimes [options] Input document URL, {http://path/to/resource.html|file:///path/to/local.file|inline://some inline content}
    verify      Utility for plugin management verification.
      Usage: verify [options] plugins-dir
    rover      Any23 Command Line Tool.
      Usage: rover [options] input URIs {<url>|<file>}+      
          -d, --defaultns    Override the default namespace used to produce
          -e, --extractors   a comma-separated list of extractors, e.g.
                             Default: []
          -f, --format       the output format
                             Default: turtle
          -l, --log          Produce log within a file.
          -n, --nesting      Disable production of nesting triples.
                             Default: false
          -t, --notrivial    Filter trivial statements (e.g. CSS related ones).
                             Default: false
          -o, --output       Specify Output file (defaults to standard output)
          -p, --pedantic     Validate and fixes HTML content detecting commons
                             Default: false
          -s, --stats        Print out extraction statistics.
                             Default: false

    vocab      Prints out the RDF Schema of the vocabularies used by Any23.
      Usage: vocab [options]      
          -f, --format   Vocabulary output format
                         Default: NQuads

Display the version:

$ sh any23 -v
Apache Any23 0.7.0-incubating-SNAPSHOT (trunk@r1304362; 2012-04-01 19:03:41+0200)
Java version: 1.6.0_29, vendor: Apple Inc.
Java home: /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
Default locale: en_US, platform encoding: MacRoman
OS name: "Mac OS X", version: "10.7.3", arch: "x86_64", family: "mac"

Execute a command:

$ sh any23 mimes

Apache Any23 :: mimes


Apache Any23 SUCCESS
Total time: 4s
Finished at: Sun Apr 01 20:39:29 CEST 2012
Final Memory: 16M/493M

Unfortunately, while rearranging things I ran into an issue in the tests:

Test set: org.apache.any23.cli.RoverTest
Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 8.055 sec <<< FAILURE!
testRunMultiFiles(org.apache.any23.cli.RoverTest)  Time elapsed: 0.155 sec  <<< ERROR!
Unexpected end of line. [line 27, column 304]
    at org.apache.any23.rdf.RDFUtils.parseRDF(
    at org.apache.any23.rdf.RDFUtils.parseRDF(
    at org.apache.any23.rdf.RDFUtils.parseRDF(
    at org.apache.any23.cli.RoverTest.runWithMultiSourcesAndVerify(
    at org.apache.any23.cli.RoverTest.testRunMultiFiles(

Could one of the original Any23 developers take a look at it, if interested in the proposal?

Many thanks in advance, all the best!
> improve the current CLI engine
> ------------------------------
>                 Key: ANY23-71
>                 URL:
>             Project: Apache Any23
>          Issue Type: Improvement
>    Affects Versions: 0.7.0
>            Reporter: Simone Tripodi
>             Fix For: 0.7.0
>         Attachments: enhanced_cli.patch
> The current CLI - even if it works nicely - can be improved in terms of both internal architecture
and user interface.
> I see two main "issues" with the current CLI:
>  * on the UI side, the CLI exposes internal details, since the {{ToolRunner}} requires the classname
of the tool that has to be executed;
>  * on the internals side, each Tool has to parse its own chunk of the command line, which can
be automated.
> So my proposal is to automate, via the already working plugin discovery, the CLI argument
parsing AND to implement an svn/git-like, command-based interface.
> My preferred choice for that is [JCommander|] because:
>  * it allows binding CLI arguments to Java properties via Annotations - no more manual
>  * it already supports a complex syntax to implement [commands|];
>  * command aliases can be expressed via annotations - no more need to expose internals;
> Patch with proposal is coming
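
To illustrate the idea of binding CLI arguments to Java fields via annotations, here is a minimal
sketch using only the JDK. It is not JCommander's API and not the code in the attached patch: the
{{Option}} annotation, the {{bind}} helper and the {{RoverOptions}} class are hypothetical names
made up for this example. Only the {{-f|--format}} and {{-s|--stats}} options are taken from the
rover usage shown above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AnnotatedCliSketch {

    /** Hypothetical annotation: marks a field as a CLI option with one or more names. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Option {
        String[] names();
        String description() default "";
    }

    /** Hypothetical options holder modelled on two of the rover options listed above. */
    static class RoverOptions {
        @Option(names = {"-f", "--format"}, description = "the output format")
        String format = "turtle";

        @Option(names = {"-s", "--stats"}, description = "Print out extraction statistics.")
        boolean stats = false;
    }

    /**
     * Fills the annotated fields of target from args and returns the arguments
     * that did not match any option name (for example the input URIs).
     * Boolean fields act as flags; every other annotated field is assumed to be
     * a String that takes the next argument as its value.
     */
    static List<String> bind(Object target, String... args) {
        // Index every annotated field by each of its option names.
        Map<String, Field> fieldsByName = new HashMap<>();
        for (Field field : target.getClass().getDeclaredFields()) {
            Option option = field.getAnnotation(Option.class);
            if (option == null) {
                continue;
            }
            field.setAccessible(true);
            for (String name : option.names()) {
                fieldsByName.put(name, field);
            }
        }

        List<String> positional = new ArrayList<>();
        try {
            for (int i = 0; i < args.length; i++) {
                Field field = fieldsByName.get(args[i]);
                if (field == null) {
                    positional.add(args[i]);
                } else if (field.getType() == boolean.class) {
                    field.setBoolean(target, true);
                } else {
                    if (i + 1 >= args.length) {
                        throw new IllegalArgumentException("Missing value for " + args[i]);
                    }
                    field.set(target, args[++i]);
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Cannot write option field", e);
        }
        return positional;
    }

    public static void main(String[] args) {
        RoverOptions options = new RoverOptions();
        List<String> inputs =
                bind(options, "-f", "nquads", "--stats", "http://example.org/page.html");
        // prints: nquads true [http://example.org/page.html]
        System.out.println(options.format + " " + options.stats + " " + inputs);
    }
}
```

With this kind of binding, a tool only declares its options as annotated fields, and the generic
parsing loop lives in one place instead of being repeated in every Tool.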
