incubator-any23-dev mailing list archives

From Simone Tripodi <simonetrip...@apache.org>
Subject Re: svn commit: r1229627 [1/5] - in /incubator/any23/trunk: ./ any23-core/ any23-core/bin/ any23-core/src/main/java/org/deri/any23/ any23-core/src/main/java/org/deri/any23/cli/ any23-core/src/main/java/org/deri/any23/eval/ any23-core/src/main/java/or
Date Wed, 11 Jan 2012 07:19:35 GMT
Hi Mic,

happy new year to you too indeed :P

Please shout if you need any help on reorganizing stuff, I would be
more than glad to provide my help!

TIA!
-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/



On Tue, Jan 10, 2012 at 6:13 PM, Michele Mostarda
<michele.mostarda@gmail.com> wrote:
> On 10 January 2012 18:08, Simone Tripodi <simonetripodi@apache.org> wrote:
>
>> Hi Mic,
>>
>
> Hi Simo, happy new year !
>
>> this is something great, thanks for the hard work of merging!
>> next step is renaming the packages in org.apache.any23 :)
>>
>
> Sure :) It is the next critical issue scheduled on Jira.
> Then we can start discussing the release.
>
> Ciao
>
> Mic
>
>
>>
>> All the best, have a nice day!
>> -Simo
>>
>> http://people.apache.org/~simonetripodi/
>> http://simonetripodi.livejournal.com/
>> http://twitter.com/simonetripodi
>> http://www.99soft.org/
>>
>>
>>
>> On Tue, Jan 10, 2012 at 5:32 PM,  <mostarda@apache.org> wrote:
>> > Author: mostarda
>> > Date: Tue Jan 10 16:32:28 2012
>> > New Revision: 1229627
>> >
>> > URL: http://svn.apache.org/viewvc?rev=1229627&view=rev
>> > Log:
>> > This commit synchronizes the retired Any23 Google Code SVN repo [1]
>> > with the current Apache Any23 SVN repo, including the issues
>> > developed during the initial import transition phase.
>> > These issues were tracked on the original Any23 Google Code Issue
>> > Tracker [2].
>> > Below is an extract of the original repository commit log.
>> >
>> > This commit is related to issue ANY23-27.
>> >
>> > [1] http://any23.googlecode.com/svn/trunk/
>> > [2] http://code.google.com/p/any23/issues/list
>> >
>> > ==== BEGIN: Original Log ====
>> >
>> > ------------------------------------------------------------------------
>> > r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) |
>> 1 line
>> >
>> > Improved numeric datatype assignment. This commit fixes issue #208.
>> > ------------------------------------------------------------------------
>> > hardest-mac:gcode-svn hardest$ svn log -r 1548:HEAD
>> > ------------------------------------------------------------------------
>> > r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) |
>> 1 line
>> >
>> > Improved numeric datatype assignment. This commit fixes issue #208.
>> > ------------------------------------------------------------------------
>> > r1549 | michele.mostarda | 2011-11-26 13:48:29 +0100(Sab, 26 Nov 2011) |
>> 1 line
>> >
>> > Changed SINDICE vocab namespace to 'http://vocab.sindice.net/any23#'.
>> > Fixed HTMLMetaExtractorTest.java to match this new namespace. Discovered
>> > and fixed an issue in the SINDICE.java vocabulary: NS was declared as a
>> > resource instead of as a URI. Fixed RDFSchemaUtilsTest.java, whose sizes
>> > were wrong due to the wrong NS declaration. This commit is related to
>> > issue #203.
>> > ------------------------------------------------------------------------
>> > r1550 | michele.mostarda | 2011-11-26 15:37:32 +0100(Sab, 26 Nov 2011) |
>> 1 line
>> >
>> > Improved glossary in Vocab.java, replaced 'Resource' with 'Class'. Found
>> > wrong declaration of Class(Resource) in the WO.java vocab. Fixed and
>> > updated RDFSchemaUtils.java test. This commit is related to issue #198.
>> > ------------------------------------------------------------------------
>> > r1551 | michele.mostarda | 2011-11-26 18:36:11 +0100(Sab, 26 Nov 2011) |
>> 1 line
>> >
>> > Added utility method.
>> > ------------------------------------------------------------------------
>> > r1552 | michele.mostarda | 2011-11-26 18:39:46 +0100(Sab, 26 Nov 2011) |
>> 1 line
>> >
>> > Improved Vocabulary.java class: added support for comments to any
>> resource. Improved RDFSchemaUtils.java serialization
>> > support, added separators to RDFXML serialization. This commit is
>> related to issue #198.
>> > ------------------------------------------------------------------------
>> > r1553 | michele.mostarda | 2011-11-27 20:03:17 +0100(Dom, 27 Nov 2011) |
>> 1 line
>> >
>> > Added new OGP vocabulary (Open Graph Protocol http://ogp.me ). Improved
>> > prefix declaration parsing in RDFa11Parser; this new parser is more
>> > tolerant of RDFa 1.0 and RDFa 1.1 prefix declarations. Fixed support for
>> > prefix mapping resolution in RDFa11Parser; this allows correct support
>> > for the structured properties introduced by the latest version of the
>> > Open Graph Protocol (http://ogp.me/#structured). Updated
>> > RDFSchemaUtilsTest to the new output of vocabularies serialization.
>> > Updated Any23PluginManagerTest to include a new class. This commit is
>> > related to issue #206.
>> > ------------------------------------------------------------------------
>> > r1554 | michele.mostarda | 2011-11-27 20:55:46 +0100(Dom, 27 Nov 2011) |
>> 1 line
>> >
>> > Restricted scope of testGetClassesFromClasspath to avoid updating it
>> every time a new class is added.
>> > ------------------------------------------------------------------------
>> > r1555 | michele.mostarda | 2011-11-28 20:12:27 +0100(Lun, 28 Nov 2011) |
>> 1 line
>> >
>> > Improved validation mode support. Improved descriptions of Validation
>> and Report fields. This commit is related to issue
>> > #209.
>> > ------------------------------------------------------------------------
>> > r1556 | michele.mostarda | 2011-11-28 21:22:49 +0100(Lun, 28 Nov 2011) |
>> 1 line
>> >
>> > Improved Any23 Service XML Report format documentation.
>> > ------------------------------------------------------------------------
>> > r1557 | michele.mostarda | 2011-11-28 23:28:37 +0100(Lun, 28 Nov 2011) |
>> 1 line
>> >
>> > Added URL encoding to the source location path. This commit fixes issue
>> > #205. Chose not to write a formal test, since it would require the
>> > creation of folders with spaces.
>> > ------------------------------------------------------------------------
>> > r1558 | michele.mostarda | 2011-11-28 23:38:48 +0100(Lun, 28 Nov 2011) |
>> 1 line
>> >
>> > Removed obsolete section.
>> > ------------------------------------------------------------------------
>> > r1559 | michele.mostarda | 2011-12-09 17:32:32 +0100(Ven, 09 Dic 2011) |
>> 1 line
>> >
>> > Improved Any23 facade, added method createDocumentSource() to simplify
>> the extraction setup.
>> > ------------------------------------------------------------------------
>> > r1560 | michele.mostarda | 2011-12-09 17:38:57 +0100(Ven, 09 Dic 2011) |
>> 1 line
>> >
>> > Refactored Rover CLI class to make it extensible from other CLI
>> > implementations.
>> > ------------------------------------------------------------------------
>> > r1561 | michele.mostarda | 2011-12-10 14:23:54 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1562 | michele.mostarda | 2011-12-10 14:32:41 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1563 | michele.mostarda | 2011-12-10 14:37:52 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1564 | michele.mostarda | 2011-12-10 14:38:28 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1565 | michele.mostarda | 2011-12-10 14:44:13 +0100(Sab, 10 Dic 2011) |
>> 3 lines
>> >
>> > Removed wrong artifact name.
>> >
>> >
>> > ------------------------------------------------------------------------
>> > r1566 | michele.mostarda | 2011-12-10 14:44:45 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1567 | michele.mostarda | 2011-12-10 14:45:21 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Upload by wagon-svn
>> > ------------------------------------------------------------------------
>> > r1568 | michele.mostarda | 2011-12-10 16:24:09 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Removed no longer used jspf lib. Added crawler4j dependencies. Added
>> README. This commit is related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1569 | michele.mostarda | 2011-12-10 16:26:47 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Changed attributes visibility to facilitate the class extensibility.
>> > ------------------------------------------------------------------------
>> > r1570 | michele.mostarda | 2011-12-10 16:28:26 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Added helper methods to extract file lines as list of strings. Improved
>> javadoc.
>> > ------------------------------------------------------------------------
>> > r1571 | michele.mostarda | 2011-12-10 16:47:03 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Added first version of basic-crawler plugin. This commit is related to
>> issue #211.
>> > ------------------------------------------------------------------------
>> > r1572 | michele.mostarda | 2011-12-10 16:48:51 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Added plugins README.
>> > ------------------------------------------------------------------------
>> > r1573 | michele.mostarda | 2011-12-10 16:54:01 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Updated main README, added references to plugin and lib.
>> > ------------------------------------------------------------------------
>> > r1574 | michele.mostarda | 2011-12-10 16:57:04 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Fixed assembly name.
>> > ------------------------------------------------------------------------
>> > r1575 | michele.mostarda | 2011-12-10 18:21:57 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Fixed Tool signature. This commit is related to #211.
>> > ------------------------------------------------------------------------
>> > r1576 | michele.mostarda | 2011-12-10 18:26:46 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Improved logging.
>> > ------------------------------------------------------------------------
>> > r1577 | michele.mostarda | 2011-12-10 18:31:54 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Included plugin basic-crawler in reactor. Improved ToolRunner and
>> > Any23PluginManager tests to be compliant with the new
>> > plugin classes. This commit is related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1578 | michele.mostarda | 2011-12-10 18:41:24 +0100(Sab, 10 Dic 2011) |
>> 1 line
>> >
>> > Fixed Crawler4j group id. Related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1579 | michele.mostarda | 2011-12-11 15:25:43 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Improved plugin documentation. Introduced Office Scraper specific page.
>> This commit is related to issue #213.
>> > ------------------------------------------------------------------------
>> > r1580 | michele.mostarda | 2011-12-11 15:26:32 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Fixed POST method documentation. Related to issue #213.
>> > ------------------------------------------------------------------------
>> > r1581 | michele.mostarda | 2011-12-11 15:43:34 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Fixed code snippets, prettified, added missing finalization logic. See
>> issue #187.
>> > ------------------------------------------------------------------------
>> > r1582 | michele.mostarda | 2011-12-11 16:08:39 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Fixed var name. See #187.
>> > ------------------------------------------------------------------------
>> > r1583 | michele.mostarda | 2011-12-11 16:09:34 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Updated code snippets and tutorial, added explicit TripleHandler
>> closure. This commit is related to issue #187.
>> > ------------------------------------------------------------------------
>> > r1584 | michele.mostarda | 2011-12-11 16:34:48 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Fixed data type handling management in NQuadsParser. This commit is
>> related to issue #210.
>> > ------------------------------------------------------------------------
>> > r1585 | michele.mostarda | 2011-12-11 17:03:34 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Added missing JSON output format. See #214.
>> > ------------------------------------------------------------------------
>> > r1586 | michele.mostarda | 2011-12-11 23:43:39 +0100(Dom, 11 Dic 2011) |
>> 1 line
>> >
>> > Added Sesame RIO TriX dependency. Added TriXWriter. Added TriX output
>> format support to Rover. This commit is related to
>> > issue #215.
>> > ------------------------------------------------------------------------
>> > r1587 | michele.mostarda | 2011-12-12 00:00:10 +0100(Lun, 12 Dic 2011) |
>> 1 line
>> >
>> > Added Sesame TriX IO dependency. This commit is related to #215.
>> > ------------------------------------------------------------------------
>> > r1588 | michele.mostarda | 2011-12-12 00:17:35 +0100(Lun, 12 Dic 2011) |
>> 1 line
>> >
>> > Some suppressed tests have been reactivated as Ignored.
>> > ------------------------------------------------------------------------
>> > r1589 | michele.mostarda | 2011-12-12 00:37:41 +0100(Lun, 12 Dic 2011) |
>> 1 line
>> >
>> > Added TriX output format to the Any23 Service. Commit related to issue
>> #215.
>> > ------------------------------------------------------------------------
>> > r1590 | michele.mostarda | 2011-12-12 23:35:48 +0100(Lun, 12 Dic 2011) |
>> 1 line
>> >
>> > Improved FormatWriter management, added WriterRegistry. Improved Writer
>> format management in Rover and WebResponder.
>> > This commit is related to issues #215 and #216.
>> > ------------------------------------------------------------------------
>> > r1591 | michele.mostarda | 2011-12-13 23:50:01 +0100(Mar, 13 Dic 2011) |
>> 6 lines
>> >
>> > Added TriXExtractor and textual example (example-trix.trx), added trix
>> support in RDFParserFactory.
>> > Registered TriXExtractor to the ExtractorRegistry.
>> > Added TriX mimetype support in TikaMIMETypeDetector (through
>> mimetypes.xml) and added specific test.
>> > Added support and doc to TriX format in Any23 Service web page
>> (form.html).
>> > This commit is related to issue #215.
>> >
>> > ------------------------------------------------------------------------
>> > r1592 | michele.mostarda | 2011-12-14 11:37:37 +0100(Mer, 14 Dic 2011) |
>> 1 line
>> >
>> > Fixed number of extractors (+1 after adding TriXExtractor). Commit
>> related to issue #215.
>> > ------------------------------------------------------------------------
>> > r1593 | michele.mostarda | 2011-12-17 14:21:59 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Added method getExtractorType() .
>> > ------------------------------------------------------------------------
>> > r1594 | michele.mostarda | 2011-12-17 14:24:14 +0100(Sab, 17 Dic 2011) |
>> 4 lines
>> >
>> > Improved ExtractorDocumentation support, added missing format examples.
>> > Improved output layout. This commit is related to issue #194.
>> >
>> >
>> > ------------------------------------------------------------------------
>> > r1595 | michele.mostarda | 2011-12-17 15:52:53 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Improved classpath management in Any23PluginManager. Renamed
>> getClasses\* in loadClasses\* . This commit is related to
>> > issue #212.
>> > ------------------------------------------------------------------------
>> > r1596 | michele.mostarda | 2011-12-17 17:29:27 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Separated log messages from specific output data.
>> > ------------------------------------------------------------------------
>> > r1597 | michele.mostarda | 2011-12-17 17:31:06 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Added human readable report printing support in ReportingTripleHandler
>> and Rover.
>> > ------------------------------------------------------------------------
>> > r1598 | michele.mostarda | 2011-12-17 17:38:03 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Fixed major issue in output generation, added final activity report,
>> help prettification. This commit is related to
>> > issue #211.
>> > ------------------------------------------------------------------------
>> > r1599 | michele.mostarda | 2011-12-17 17:56:01 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Upgraded to Sesame 2.6.1 See issue #217.
>> > ------------------------------------------------------------------------
>> > r1600 | michele.mostarda | 2011-12-17 18:03:10 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Moved org.deri.any23.LogUtil to org.deri.any23.util.LogUtils . See issue
>> #216
>> > ------------------------------------------------------------------------
>> > r1601 | michele.mostarda | 2011-12-17 18:13:49 +0100(Sab, 17 Dic 2011) |
>> 1 line
>> >
>> > Moved org.deri.any23.parser to org.deri.any23.io.nquads . See issue #216.
>> > ------------------------------------------------------------------------
>> > r1602 | michele.mostarda | 2011-12-18 13:55:23 +0100(Dom, 18 Dic 2011) |
>> 1 line
>> >
>> > Added specific Crawler CLI documentation. Updated general CLI
>> documentation. This commit is related to issue #211.
>> > ------------------------------------------------------------------------
>> > r1603 | michele.mostarda | 2011-12-18 14:34:07 +0100(Dom, 18 Dic 2011) |
>> 4 lines
>> >
>> > The Eval CLI Tool has been removed as well as the org.deri.any23.eval
>> package classes related to it.
>> > Updated tests verifying CLI tool detection.
>> > This commit is related to issue #218.
>> >
>> > ------------------------------------------------------------------------
>> > r1604 | michele.mostarda | 2011-12-18 17:11:24 +0100(Dom, 18 Dic 2011) |
>> 5 lines
>> >
>> > Added MimeDetector CLI Tool and test case, removed main() from
>> > TikaMIMETypeDetector. Updated ToolRunnerTest to verify this new tool.
>> > Updated CLI doc.
>> > This commit is related to issue #219.
>> >
>> > ------------------------------------------------------------------------
>> > r1605 | michele.mostarda | 2012-01-06 10:33:04 +0100(Ven, 06 Gen 2012) |
>> 1 line
>> >
>> > Added support for comment serialization. Related to issue #158.
>> > ------------------------------------------------------------------------
>> > r1606 | michele.mostarda | 2012-01-06 10:35:26 +0100(Ven, 06 Gen 2012) |
>> 1 line
>> >
>> > Add support for annotation writing in FormatWriter implementations. This
>> commit is related to issue #158.
>> > ------------------------------------------------------------------------
>> > r1607 | michele.mostarda | 2012-01-06 10:43:41 +0100(Ven, 06 Gen 2012) |
>> 1 line
>> >
>> > Added support for 'annotate' flag in Any23 Service.
>> > ------------------------------------------------------------------------
>> >
>> > ==== END  : Original Log ====
>> >
>> >
>> > Added:
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/TriXExtractor.java
>> >    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuads.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsParser.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/package-info.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/LogUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/OGP.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TriXWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/Writer.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/WriterRegistry.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/example-csv.csv
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-head-link.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-icbm.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-adr.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-geo.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcalendar.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcard.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hlisting.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hrecipe.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hresume.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hreview.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-license.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-species.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-xfn.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-script-turtle.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/example-microdata.html
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdf/example-trix.trx
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdfa/example-rdfa11.html
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/MimeDetectorTest.java
>> >    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsParserTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsWriterTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/VocabularyTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/writer/WriterRegistryTest.java
>> >    incubator/any23/trunk/any23-core/src/test/resources/application/trix/
>> >
>>  incubator/any23/trunk/any23-core/src/test/resources/application/trix/test1.trx
>> >
>>  incubator/any23/trunk/any23-core/src/test/resources/html/rdfa/opengraph-structured-properties.html
>> >
>>  incubator/any23/trunk/any23-core/src/test/resources/org/deri/any23/extractor/csv/test-type.csv
>> >    incubator/any23/trunk/lib/README.txt
>> >    incubator/any23/trunk/plugins/README.txt
>> >    incubator/any23/trunk/plugins/basic-crawler/
>> >    incubator/any23/trunk/plugins/basic-crawler/pom.xml
>> >    incubator/any23/trunk/plugins/basic-crawler/src/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/main/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/main/java/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/Crawler.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/CrawlerListener.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/DefaultWebCrawler.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SharedData.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SiteCrawler.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/package-info.java
>> >    incubator/any23/trunk/plugins/basic-crawler/src/test/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/test/java/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/
>> >    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/Any23OnlineTestBase.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/CrawlerTest.java
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/
>> >
>>  incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/SiteCrawlerTest.java
>> >    incubator/any23/trunk/src/site/apt/plugin-office-scraper.apt
>> > Removed:
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/LogUtil.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Eval.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/Count.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/LogEvaluator.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/package-info.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuads.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsParser.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/package-info.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsParserTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsWriterTest.java
>> > Modified:
>> >    incubator/any23/trunk/README.txt
>> >    incubator/any23/trunk/any23-core/bin/any23
>> >    incubator/any23/trunk/any23-core/bin/any23tools
>> >    incubator/any23/trunk/any23-core/pom.xml
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Extractor.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Parser.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/mime/TikaMIMETypeDetector.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/plugin/Any23PluginManager.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/rdf/RDFUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/FileUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StreamUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StringUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/DOAC.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/FOAF.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/GEO.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HLISTING.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HRECIPE.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/ICAL.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/RDFSchemaUtils.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/SINDICE.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/Vocabulary.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/WO.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/FormatWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/JSONWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NQuadsWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NTriplesWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFWriterTripleHandler.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFXMLWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/ReportingTripleHandler.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TurtleWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/URIListWriter.java
>> >
>>  incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/mime/mimetypes.xml
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/Any23Test.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ExtractorDocumentationTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ToolRunnerTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/csv/CSVExtractorTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/AbstractExtractorTestCase.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/HTMLMetaExtractorTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/microdata/MicrodataExtractorTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ExtractorTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ParserTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/mime/TikaMIMETypeDetectorTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/plugin/Any23PluginManagerTest.java
>> >
>>  incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/RDFSchemaUtilsTest.java
>> >
>>  incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/Servlet.java
>> >
>>  incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/WebResponder.java
>> >
>>  incubator/any23/trunk/any23-service/src/main/webapp/resources/form.html
>> >
>>  incubator/any23/trunk/any23-service/src/test/java/org/deri/any23/servlet/ServletTest.java
>> >    incubator/any23/trunk/lib/install-deps.sh
>> >
>>  incubator/any23/trunk/plugins/integration-test/src/test/java/org/deri/any23/plugin/PluginIT.java
>> >    incubator/any23/trunk/pom.xml
>> >    incubator/any23/trunk/src/site/apt/any23-plugins.apt
>> >    incubator/any23/trunk/src/site/apt/dev-data-conversion.apt
>> >    incubator/any23/trunk/src/site/apt/dev-data-extraction.apt
>> >    incubator/any23/trunk/src/site/apt/getting-started.apt
>> >    incubator/any23/trunk/src/site/apt/plugin-html-scraper.apt
>> >    incubator/any23/trunk/src/site/apt/service.apt
>> >    incubator/any23/trunk/src/site/apt/supported-formats.apt
>> >
>> > Modified: incubator/any23/trunk/README.txt
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/README.txt?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > --- incubator/any23/trunk/README.txt (original)
>> > +++ incubator/any23/trunk/README.txt Tue Jan 10 16:32:28 2012
>> > @@ -20,7 +20,8 @@ Distribution Content
>> >
>> >  any23-core           The library core codebase.
>> >  any23-service        The library HTTP service codebase.
>> > -plugins              Library plugins codebase.
>> > +lib                  Contains the Any23 the external deps (read lib/README.txt for further details).
>> > +plugins              Library plugins codebase (read plugins/README.txt for further details).
>> >  RELEASE-NOTES.txt    File reporting main release notes for every
>> version.
>> >  LICENSE.txt          Applicable project license.
>> >  README.txt           This file.
>> > @@ -240,15 +241,14 @@ Upload the produced packages in download
>> >
>> >    http://code.google.com/p/any23/downloads/list
>> >
>> > +--------------------
>> > +Manage External Deps
>> > +--------------------
>> >
>> > -Fix Release Procedure
>> > ----------------------
>> > -
>> > -   Currently the *plugins/integration-test* module is excluded from the
>> parent
>> > -   reactor.
>> > -   To fix it in tag follow procedure as described at issue #171:
>> > -
>> > -        http://code.google.com/p/any23/issues/detail?id=171
>> > +::Developers interest only.::
>> >
>> > +External Deps are libraries used by some Any23 modules which are
>> > +not available in public Maven repositories. Such libraries are
>> > +managed within the 'lib' dir.
>> >
>> >  EOF
>> >
>> > Modified: incubator/any23/trunk/any23-core/bin/any23
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > --- incubator/any23/trunk/any23-core/bin/any23 (original)
>> > +++ incubator/any23/trunk/any23-core/bin/any23 Tue Jan 10 16:32:28 2012
>> > @@ -9,12 +9,12 @@
>> >  ANY23_ROOT="$(cd "$(dirname "$0")"; pwd -P)/.."
>> >
>> >  if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
>> > -    echo "Generating executable JAR..."
>> > -    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > +    echo "Generating executable JAR..." >&2
>> > +    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> >         ||\
>> > -    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > +    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> >        ||\
>> > -    { echo "Error while generating commandline assembly."; exit 1; }
>> > +    { echo "Error while generating commandline assembly."  >&2; exit 1; }
>> >  fi
>> >
>> >  SEP=':'
>> >
>> > Modified: incubator/any23/trunk/any23-core/bin/any23tools
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23tools?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > --- incubator/any23/trunk/any23-core/bin/any23tools (original)
>> > +++ incubator/any23/trunk/any23-core/bin/any23tools Tue Jan 10 16:32:28
>> 2012
>> > @@ -11,12 +11,12 @@ ANY23_ROOT="$(cd "$(dirname "$0")"; pwd
>> >  PLUGINS_DIR=plugins
>> >
>> >  if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then
>> > -    echo "Generating executable JAR..."
>> > -    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > +    echo "Generating executable JAR..." >&2
>> > +    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> >         ||\
>> > -    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
>> > +    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
>> >        ||\
>> > -    { echo "Error while generating commandline assembly."; exit 1; }
>> > +    { echo "Error while generating commandline assembly." >&2; exit 1; }
>> >  fi
>> >
>> >  SEP=':'
>> > @@ -30,6 +30,7 @@ done
>> >  # Plugins classpath.
>> >  for jar in $(find $ANY23_ROOT/../$PLUGINS_DIR/*/target -name "*-plugin.jar" -depth 1)
>> >  do
>> > +  echo Detected plugin $(basename $jar) [$(dirname $jar)] >&2
>> >   if [ ! -e "$jar" ]; then continue; fi
>> >   CP="$CP$SEP$jar"
>> >  done
>> >
>> > Modified: incubator/any23/trunk/any23-core/pom.xml
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/pom.xml?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > --- incubator/any23/trunk/any23-core/pom.xml (original)
>> > +++ incubator/any23/trunk/any23-core/pom.xml Tue Jan 10 16:32:28 2012
>> > @@ -92,6 +92,10 @@
>> >         </dependency>
>> >         <dependency>
>> >             <groupId>org.openrdf.sesame</groupId>
>> > +            <artifactId>sesame-rio-trix</artifactId>
>> > +        </dependency>
>> > +        <dependency>
>> > +            <groupId>org.openrdf.sesame</groupId>
>> >             <artifactId>sesame-repository-sail</artifactId>
>> >         </dependency>
>> >         <dependency>
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -258,6 +258,28 @@ public class Any23 {
>> >     }
>> >
>> >     /**
>> > +     * Returns the most appropriate {@link DocumentSource} for the given <code>documentURI</code>.
>> > +     *
>> > +     * @param documentURI the document <i>URI</i>.
>> > +     * @return a new instance of DocumentSource.
>> > +     * @throws URISyntaxException if an error occurs while parsing the
>> <code>documentURI</code> as a <i>URI</i>.
>> > +     * @throws IOException if an error occurs while initializing the
>> internal {@link HTTPClient}.
>> > +     */
>> > +    public DocumentSource createDocumentSource(String documentURI)
>> throws URISyntaxException, IOException {
>> > +        if(documentURI == null) throw new
>> NullPointerException("documentURI cannot be null.");
>> > +        if (documentURI.toLowerCase().startsWith("file:")) {
>> > +            return new FileDocumentSource( new File(new
>> URI(documentURI)) );
>> > +        }
>> > +        if (documentURI.toLowerCase().startsWith("http:") ||
>> documentURI.toLowerCase().startsWith("https:")) {
>> > +            return new HTTPDocumentSource(getHTTPClient(), documentURI);
>> > +        }
>> > +        throw new IllegalArgumentException(
>> > +                String.format("Unsupported protocol for document URI:
>> '%s' .", documentURI)
>> > +        );
>> > +    }
>> > +
>> > +
>> > +    /**
>> >      * Performs metadata extraction from the content of the given
>> >      * <code>in</code> document source, sending the generated events
>> >      * to the specified <code>outputHandler</code>.
>> > @@ -363,13 +385,7 @@ public class Any23 {
>> >     public ExtractionReport extract(ExtractionParameters eps, String
>> documentURI, TripleHandler outputHandler)
>> >     throws IOException, ExtractionException {
>> >         try {
>> > -            if (documentURI.toLowerCase().startsWith("file:")) {
>> > -                return extract(eps, new FileDocumentSource(new File(new
>> URI(documentURI))), outputHandler);
>> > -            }
>> > -            if (documentURI.toLowerCase().startsWith("http:") ||
>> documentURI.toLowerCase().startsWith("https:")) {
>> > -                return extract(eps, new
>> HTTPDocumentSource(getHTTPClient(), documentURI), outputHandler);
>> > -            }
>> > -            throw new ExtractionException("Not a valid absolute URI: "
>> + documentURI);
>> > +            return extract(eps, createDocumentSource(documentURI),
>> outputHandler);
>> >         } catch (URISyntaxException ex) {
>> >             throw new ExtractionException("Error while extracting data
>> from document URI.", ex);
>> >         }
>> >
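A side note on the createDocumentSource() refactoring quoted above: the following is only a minimal sketch of the same scheme dispatch, written without any Any23 classes so it can be run on its own. DocumentSourceSketch, Kind and classify() are hypothetical names used for illustration and are not part of the commit. The sketch lowercases with Locale.ROOT, which is an assumption of mine rather than what the commit does; String.toLowerCase() without an argument uses the JVM default locale.

```java
import java.util.Locale;

// Hypothetical stand-in for the dispatch in Any23.createDocumentSource();
// it only reports which kind of source would be created.
final class DocumentSourceSketch {

    enum Kind { FILE, HTTP }

    static Kind classify(String documentURI) {
        if (documentURI == null) {
            throw new NullPointerException("documentURI cannot be null.");
        }
        // Locale.ROOT keeps the prefix check independent of the default locale.
        final String lower = documentURI.toLowerCase(Locale.ROOT);
        if (lower.startsWith("file:")) {
            return Kind.FILE;
        }
        if (lower.startsWith("http:") || lower.startsWith("https:")) {
            return Kind.HTTP;
        }
        // Same failure mode as the committed method: unsupported scheme.
        throw new IllegalArgumentException(
                String.format("Unsupported protocol for document URI: '%s' .", documentURI)
        );
    }

    public static void main(String[] args) {
        System.out.println(classify("file:///tmp/page.html"));
        System.out.println(classify("HTTPS://example.org/"));
    }
}
```

Note that in the quoted diff an unsupported scheme now surfaces as an IllegalArgumentException from createDocumentSource(), where extract() previously threw an ExtractionException for the same input.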
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -16,7 +16,7 @@
>> >
>> >  package org.deri.any23.cli;
>> >
>> > -import org.deri.any23.LogUtil;
>> > +import org.deri.any23.util.LogUtils;
>> >  import org.deri.any23.extractor.ExampleInputOutput;
>> >  import org.deri.any23.extractor.ExtractionException;
>> >  import org.deri.any23.extractor.Extractor;
>> > @@ -60,7 +60,7 @@ public class ExtractorDocumentation impl
>> >     }
>> >
>> >     public int run(String[] args) {
>> > -        LogUtil.setDefaultLogging();
>> > +        LogUtils.setDefaultLogging();
>> >         try {
>> >             if (args.length == 0) {
>> >                 printUsage();
>> > @@ -145,8 +145,8 @@ public class ExtractorDocumentation impl
>> >      * Prints the list of all the available extractors.
>> >      */
>> >     public void printExtractorList() {
>> > -        for (String extractorName :
>> ExtractorRegistry.getInstance().getAllNames()) {
>> > -            System.out.println(extractorName);
>> > +        for(ExtractorFactory factory :
>> ExtractorRegistry.getInstance().getExtractorGroup()) {
>> > +            System.out.println( String.format("%25s [%15s]",
>> factory.getExtractorName(), factory.getExtractorType()));
>> >         }
>> >     }
>> >
>> > @@ -194,16 +194,20 @@ public class ExtractorDocumentation impl
>> >             ExtractorFactory<?> factory =
>> ExtractorRegistry.getInstance().getFactory(extractorName);
>> >             ExampleInputOutput example = new ExampleInputOutput(factory);
>> >             System.out.println("Extractor: " + extractorName);
>> > -            System.out.println("  type: " + getType(factory));
>> > -            String output = example.getExampleOutput();
>> > -            if (output == null) {
>> > -                System.out.println("(no example output)");
>> > +            System.out.println("\ttype: " + getType(factory));
>> > +            System.out.println();
>> > +            final String exampleInput = example.getExampleInput();
>> > +            if(exampleInput == null) {
>> > +                System.out.println("(No Example Available)");
>> >             } else {
>> > -                System.out.println("-------- example output --------");
>> > -                System.out.println(output);
>> > +                System.out.println("-------- Example Input  --------");
>> > +                System.out.println(exampleInput);
>> > +                System.out.println("-------- Example Output --------");
>> > +                String output = example.getExampleOutput();
>> > +                System.out.println(output == null ||
>> output.trim().length() == 0 ? "(No Output Generated)" : output);
>> >             }
>> > -            System.out.println();
>> >             System.out.println("================================");
>> > +            System.out.println();
>> >         }
>> >     }
>> >
>> >
>> > Added:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java?rev=1229627&view=auto
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>> (added)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -0,0 +1,113 @@
>> > +/*
>> > + * Copyright 2008-2010 Digital Enterprise Research Institute (DERI)
>> > + *
>> > + * Licensed under the Apache License, Version 2.0 (the "License");
>> > + * you may not use this file except in compliance with the License.
>> > + * You may obtain a copy of the License at
>> > + *
>> > + *          http://www.apache.org/licenses/LICENSE-2.0
>> > + *
>> > + * Unless required by applicable law or agreed to in writing, software
>> > + * distributed under the License is distributed on an "AS IS" BASIS,
>> > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
>> implied.
>> > + * See the License for the specific language governing permissions and
>> > + * limitations under the License.
>> > + */
>> > +
>> > +package org.deri.any23.cli;
>> > +
>> > +import org.deri.any23.configuration.DefaultConfiguration;
>> > +import org.deri.any23.http.DefaultHTTPClient;
>> > +import org.deri.any23.http.HTTPClient;
>> > +import org.deri.any23.http.HTTPClientConfiguration;
>> > +import org.deri.any23.mime.MIMEType;
>> > +import org.deri.any23.mime.MIMETypeDetector;
>> > +import org.deri.any23.mime.TikaMIMETypeDetector;
>> > +import org.deri.any23.source.DocumentSource;
>> > +import org.deri.any23.source.FileDocumentSource;
>> > +import org.deri.any23.source.HTTPDocumentSource;
>> > +import org.deri.any23.source.StringDocumentSource;
>> > +
>> > +import java.io.File;
>> > +import java.net.URISyntaxException;
>> > +
>> > +/**
>> > + * Commandline tool to detect <b>MIME Type</b>s from
>> > + * file, HTTP and direct input sources.
>> > + * The implementation of this tool is based on {@link
>> TikaMIMETypeDetector}.
>> > + *
>> > + * @author Michele Mostarda (mostarda@fbk.eu)
>> > + */
>> > +@ToolRunner.Description("MIME Type Detector Tool.")
>> > +public class MimeDetector implements Tool{
>> > +
>> > +    public static final String FILE_DOCUMENT_PREFIX   = "file://";
>> > +    public static final String INLINE_DOCUMENT_PREFIX = "inline://";
>> > +    public static final String URL_DOCUMENT_RE        = "^https?://.*";
>> > +
>> > +    public static void main(String[] args) {
>> > +        System.exit( new MimeDetector().run(args) );
>> > +    }
>> > +
>> > +    @Override
>> > +    public int run(String[] args) {
>> > +          if(args.length != 1) {
>> > +            System.err.println("USAGE: {http://path/to/resource.html|file:///path/to/local.file|inline:// some inline content}");
>> > +            return 1;
>> > +        }
>> > +
>> > +        final String document = args[0];
>> > +        try {
>> > +            final DocumentSource documentSource =
>> createDocumentSource(document);
>> > +            final MIMETypeDetector detector = new
>> TikaMIMETypeDetector();
>> > +            final MIMEType mimeType = detector.guessMIMEType(
>> > +                    documentSource.getDocumentURI(),
>> > +                    documentSource.openInputStream(),
>> > +                    MIMEType.parse(documentSource.getContentType())
>> > +            );
>> > +            System.out.println(mimeType);
>> > +            return 0;
>> > +        } catch (Exception e) {
>> > +            System.err.print("Error while detecting MIME Type.");
>> > +            e.printStackTrace(System.err);
>> > +            return 1;
>> > +        }
>> > +    }
>> > +
>> > +    private DocumentSource createDocumentSource(String document) throws
>> URISyntaxException {
>> > +        if(document.startsWith(FILE_DOCUMENT_PREFIX)) {
>> > +            return new FileDocumentSource(
>> > +                    new File(
>> > +                            document.substring(FILE_DOCUMENT_PREFIX.length())
>> > +                    )
>> > +            );
>> > +        }
>> > +        if(document.startsWith(INLINE_DOCUMENT_PREFIX)) {
>> > +            return new StringDocumentSource(
>> > +                    document.substring(INLINE_DOCUMENT_PREFIX.length()),
>> > +                    ""
>> > +            );
>> > +        }
>> > +        if(document.matches(URL_DOCUMENT_RE)) {
>> > +            final HTTPClient client = new DefaultHTTPClient();
>> > +            // TODO: anonymous config class also used in Any23. centralize.
>> > +            client.init(new HTTPClientConfiguration() {
>> > +                public String getUserAgent() {
>> > +                    return
>> DefaultConfiguration.singleton().getPropertyOrFail("any23.http.user.agent.default");
>> > +                }
>> > +                public String getAcceptHeader() {
>> > +                    return "";
>> > +                }
>> > +                public int getDefaultTimeout() {
>> > +                    return
>> DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.timeout");
>> > +                }
>> > +                public int getMaxConnections() {
>> > +                    return
>> DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.max.connections");
>> > +                }
>> > +            });
>> > +            return new HTTPDocumentSource(client, document);
>> > +        }
>> > +        throw new IllegalArgumentException("Unsupported protocol for
>> document " + document);
>> > +    }
>> > +
>> > +}
>> >
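A side note on the three argument forms accepted by the MimeDetector tool quoted above: this is a rough, standalone sketch of how the prefixes split an argument into a kind and a payload. InputSpecSketch, its fields and parse() are hypothetical names for illustration only; the three literals are copied from the constants in the diff. Unlike Any23.createDocumentSource(), which builds the file through new File(new URI(...)), the tool simply strips the "file://" prefix and uses the remainder as a path.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the prefix handling in MimeDetector.createDocumentSource().
final class InputSpecSketch {

    // Same literals as the constants declared in the quoted MimeDetector class.
    static final String FILE_DOCUMENT_PREFIX   = "file://";
    static final String INLINE_DOCUMENT_PREFIX = "inline://";
    static final Pattern URL_DOCUMENT_RE       = Pattern.compile("^https?://.*");

    final String kind;    // "file", "inline" or "url"
    final String payload; // path, inline content, or the full URL

    private InputSpecSketch(String kind, String payload) {
        this.kind = kind;
        this.payload = payload;
    }

    static InputSpecSketch parse(String document) {
        if (document.startsWith(FILE_DOCUMENT_PREFIX)) {
            // The remainder after the prefix is used directly as a file path.
            return new InputSpecSketch("file", document.substring(FILE_DOCUMENT_PREFIX.length()));
        }
        if (document.startsWith(INLINE_DOCUMENT_PREFIX)) {
            // The remainder is the document content itself.
            return new InputSpecSketch("inline", document.substring(INLINE_DOCUMENT_PREFIX.length()));
        }
        if (URL_DOCUMENT_RE.matcher(document).matches()) {
            return new InputSpecSketch("url", document);
        }
        throw new IllegalArgumentException("Unsupported protocol for document " + document);
    }

    public static void main(String[] args) {
        final InputSpecSketch spec = parse("file:///tmp/page.html");
        System.out.println(spec.kind + " -> " + spec.payload); // prints "file -> /tmp/page.html"
    }
}
```

The prefix checks here are case-sensitive, as in the quoted tool, so an argument such as "HTTP://..." would be rejected by this sketch.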
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -23,7 +23,7 @@ import org.apache.commons.cli.Option;
>> >  import org.apache.commons.cli.Options;
>> >  import org.apache.commons.cli.PosixParser;
>> >  import org.deri.any23.Any23;
>> > -import org.deri.any23.LogUtil;
>> > +import org.deri.any23.util.LogUtils;
>> >  import org.deri.any23.configuration.Configuration;
>> >  import org.deri.any23.configuration.DefaultConfiguration;
>> >  import org.deri.any23.extractor.ExtractionException;
>> > @@ -31,16 +31,13 @@ import org.deri.any23.extractor.Extracti
>> >  import org.deri.any23.extractor.SingleDocumentExtraction;
>> >  import org.deri.any23.filter.IgnoreAccidentalRDFa;
>> >  import org.deri.any23.filter.IgnoreTitlesOfEmptyDocuments;
>> > +import org.deri.any23.source.DocumentSource;
>> >  import org.deri.any23.writer.BenchmarkTripleHandler;
>> >  import org.deri.any23.writer.LoggingTripleHandler;
>> > -import org.deri.any23.writer.NQuadsWriter;
>> > -import org.deri.any23.writer.NTriplesWriter;
>> > -import org.deri.any23.writer.RDFXMLWriter;
>> >  import org.deri.any23.writer.ReportingTripleHandler;
>> >  import org.deri.any23.writer.TripleHandler;
>> >  import org.deri.any23.writer.TripleHandlerException;
>> > -import org.deri.any23.writer.TurtleWriter;
>> > -import org.deri.any23.writer.URIListWriter;
>> > +import org.deri.any23.writer.WriterRegistry;
>> >  import org.slf4j.Logger;
>> >  import org.slf4j.LoggerFactory;
>> >
>> > @@ -51,6 +48,7 @@ import java.io.OutputStream;
>> >  import java.io.PrintStream;
>> >  import java.io.PrintWriter;
>> >  import java.net.MalformedURLException;
>> > +import java.net.URISyntaxException;
>> >  import java.net.URL;
>> >
>> >  import static
>> org.deri.any23.extractor.ExtractionParameters.ValidationMode;
>> > @@ -59,107 +57,106 @@ import static org.deri.any23.extractor.E
>> >  * A default rover implementation. Goes and fetches a URL using an hint
>> >  * as to what format should require, then tries to convert it to RDF.
>> >  *
>> > - * @author Gabriele Renzi
>> > - * @author Richard Cyganiak (richard@cyganiak.de)
>> >  * @author Michele Mostarda (mostarda@fbk.eu)
>> > + * @author Richard Cyganiak (richard@cyganiak.de)
>> > + * @author Gabriele Renzi
>> >  */
>> >  @ToolRunner.Description("Any23 Command Line Tool.")
>> >  public class Rover implements Tool {
>> >
>> > -    // Supported formats.
>> > -    private static final String TURTLE_FORMAT  = "turtle";
>> > -    private static final String NTRIPLE_FORMAT = "ntriples";
>> > -    private static final String RDFXML_FORMAT  = "rdfxml";
>> > -    private static final String NQUADS_FORMAT  = "nquads";
>> > -    private static final String URIS_FORMAT    = "uris";
>> > -
>> > -    private static final String DEFAULT_FORMAT = TURTLE_FORMAT;
>> > +    private static final String[] FORMATS =
>> WriterRegistry.getInstance().getIdentifiers();
>> > +    private static final int DEFAULT_FORMAT_INDEX = 0;
>> >
>> >     private static final Logger logger =
>> LoggerFactory.getLogger(Rover.class);
>> >
>> > -    private static Options options;
>> > +    private Options options;
>> >
>> > -    public static void main(String[] args) {
>> > -        System.exit( new Rover().run(args) );
>> > -    }
>> > +    private CommandLine commandLine;
>> >
>> > -    public int run(String[] args) {
>> > -        final CommandLineParser parser = new PosixParser();
>> > -        final CommandLine commandLine;
>> > +    private boolean verbose = false;
>> >
>> > -        boolean verbose = false;
>> > -        try {
>> > -            options = createOptions();
>> > -            commandLine = parser.parse(options, args);
>> > +    private PrintStream outputStream;
>> > +    private TripleHandler tripleHandler;
>> > +    private ReportingTripleHandler reportingTripleHandler;
>> > +    private BenchmarkTripleHandler benchmarkTripleHandler;
>> >
>> > -            if (commandLine.hasOption("h")) {
>> > -                printHelp();
>> > -                return 0;
>> > -            }
>> > +    private ExtractionParameters eps;
>> > +    private Any23 any23;
>> >
>> > -            if (commandLine.hasOption('v')) {
>> > -                verbose = true;
>> > -                LogUtil.setVerboseLogging();
>> > -            } else {
>> > -                LogUtil.setDefaultLogging();
>> > -            }
>> > -
>> > -            if (commandLine.getArgs().length < 1) {
>> > -                printHelp();
>> > -                throw new IllegalArgumentException("Expected at least 1
>> argument.");
>> > -            }
>> > +    protected boolean isVerbose() {
>> > +        return verbose;
>> > +    }
>> >
>> > -            final String[] inputURIs      =
>> argumentsToURIs(commandLine.getArgs());
>> > -            final String[] extractorNames = getExtractors(commandLine);
>> > +    public static void main(String[] args) {
>> > +        System.exit( new Rover().run(args) );
>> > +    }
>> >
>> > -            PrintStream outputStream    = null;
>> > -            TripleHandler tripleHandler = null;
>> > -            try {
>> > -                outputStream  = getOutputStream(commandLine);
>> > +    public int run(String[] args) {
>> > +        try {
>> > +            final String[] uris = configure(args);
>> > +            performExtraction(uris);
>> > +            return 0;
>> > +        } catch (Exception e) {
>> > +            System.err.println( e.getMessage() );
>> > +            final int exitCode = e instanceof ExitCodeException ?
>> ((ExitCodeException) e).exitCode : 1;
>> > +            if(verbose) e.printStackTrace(System.err);
>> > +            return exitCode;
>> > +        }
>> > +    }
>> >
>> > -                tripleHandler = getTripleHandler(commandLine,
>> outputStream);
>> > +    protected CommandLine getCommandLine() {
>> > +        if(commandLine == null) throw new IllegalStateException("Rover
>> must be configured first.");
>> > +        return commandLine;
>> > +    }
>> >
>> > -                tripleHandler = decorateWithLogHandler(commandLine,
>> tripleHandler);
>> > +    protected String[] configure(String[] args) throws Exception {
>> > +        final CommandLineParser parser = new PosixParser();
>> > +        options = createOptions();
>> > +        commandLine = parser.parse(options, args);
>> >
>> > -                tripleHandler =
>> decorateWithStatisticsHandler(commandLine, tripleHandler);
>> > -                final BenchmarkTripleHandler benchmarkTripleHandler =
>> > -                        tripleHandler instanceof BenchmarkTripleHandler
>> ? (BenchmarkTripleHandler) tripleHandler : null;
>> > +        if (commandLine.hasOption("h")) {
>> > +            printHelp();
>> > +            throw new ExitCodeException(0);
>> > +        }
>> >
>> > -                tripleHandler =
>> decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>> > +        if (commandLine.hasOption('v')) {
>> > +            verbose = true;
>> > +            LogUtils.setVerboseLogging();
>> > +        } else {
>> > +            LogUtils.setDefaultLogging();
>> > +        }
>> >
>> > -                final ReportingTripleHandler reportingTripleHandler =
>> new ReportingTripleHandler(tripleHandler);
>> > +        if (commandLine.getArgs().length < 1) {
>> > +            printHelp();
>> > +            throw new IllegalArgumentException("Expected at least 1
>> argument.");
>> > +        }
>> >
>> > -                final ExtractionParameters eps =
>> getExtractionParameters(commandLine);
>> > +        final String[] inputURIs =
>> argumentsToURIs(commandLine.getArgs());
>> > +        final String[] extractorNames = getExtractors(commandLine);
>> >
>> > -                final Any23 any23 = createAny23(extractorNames);
>> > +        try {
>> > +            outputStream  = getOutputStream(commandLine);
>> > +            tripleHandler = getTripleHandler(commandLine, outputStream);
>> > +            tripleHandler = decorateWithLogHandler(commandLine,
>> tripleHandler);
>> > +            tripleHandler = decorateWithStatisticsHandler(commandLine,
>> tripleHandler);
>> >
>> > -                final long start = System.currentTimeMillis();
>> > -                for(String inputURI : inputURIs) {
>> > -                    performExtraction(any23, eps, inputURI,
>> reportingTripleHandler);
>> > -                }
>> > -                final long elapsed = System.currentTimeMillis() - start;
>> > +            benchmarkTripleHandler =
>> > +                    tripleHandler instanceof BenchmarkTripleHandler ?
>> (BenchmarkTripleHandler) tripleHandler : null;
>> >
>> > -                closeAll(tripleHandler, outputStream);
>> > +            tripleHandler =
>> decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
>> >
>> > -                if (benchmarkTripleHandler != null) {
>> > -                    System.err.println( benchmarkTripleHandler.report()
>> );
>> > -                }
>> > +            reportingTripleHandler = new
>> ReportingTripleHandler(tripleHandler);
>> > +            eps = getExtractionParameters(commandLine);
>> > +            any23 = createAny23(extractorNames);
>> >
>> > -                logger.info("Extractors used: " +
>> reportingTripleHandler.getExtractorNames());
>> > -                logger.info(reportingTripleHandler.getTotalTriples() +
>> " triples, " + elapsed + "ms");
>> > -            } finally {
>> > -                closeAll(tripleHandler, outputStream);
>> > -            }
>> > +            return inputURIs;
>> >         } catch (Exception e) {
>> > -            System.err.println(e.getMessage());
>> > -            final int exitCode = e instanceof SpecificExitException ?
>> ((SpecificExitException) e).exitCode : 1;
>> > -            if(verbose) e.printStackTrace(System.err);
>> > -            return exitCode;
>> > +            closeStreams();
>> > +            throw e;
>> >         }
>> > -        return 0;
>> >     }
>> >
>> > -    private Options createOptions() {
>> > +    protected Options createOptions() {
>> >         final Options options = new Options();
>> >         options.addOption(
>> >                 new Option("v", "verbose", false, "Show debug and
>> progress information.")
>> > @@ -178,13 +175,7 @@ public class Rover implements Tool {
>> >                         "f",
>> >                         "Output format",
>> >                         true,
>> > -                        "[" +
>> > -                                TURTLE_FORMAT  + " (default), " +
>> > -                                NTRIPLE_FORMAT + ", " +
>> > -                                RDFXML_FORMAT  + ", " +
>> > -                                NQUADS_FORMAT  + ", " +
>> > -                                URIS_FORMAT    +
>> > -                        "]"
>> > +                        "[" +  printFormats(FORMATS,
>> DEFAULT_FORMAT_INDEX) + "]"
>> >                 )
>> >         );
>> >         options.addOption(
>> > @@ -208,11 +199,51 @@ public class Rover implements Tool {
>> >         return options;
>> >     }
>> >
>> > +    protected void performExtraction(DocumentSource documentSource) {
>> > +        performExtraction(any23, eps, documentSource,
>> reportingTripleHandler);
>> > +    }
>> > +
>> > +    protected void performExtraction(String[] inputURIs) throws
>> URISyntaxException, IOException {
>> > +        try {
>> > +            final long start = System.currentTimeMillis();
>> > +            for (String inputURI : inputURIs) {
>> > +                performExtraction( any23.createDocumentSource(inputURI)
>> );
>> > +            }
>> > +            final long elapsed = System.currentTimeMillis() - start;
>> > +
>> > +            if (benchmarkTripleHandler != null) {
>> > +                System.err.println(benchmarkTripleHandler.report());
>> > +            }
>> > +
>> > +            logger.info("Extractors used: " +
>> reportingTripleHandler.getExtractorNames());
>> > +            logger.info(reportingTripleHandler.getTotalTriples() + "
>> triples, " + elapsed + "ms");
>> > +        } finally {
>> > +            closeStreams();
>> > +        }
>> > +    }
>> > +
>> > +    protected String printReports() {
>> > +        final StringBuilder sb = new StringBuilder();
>> > +        if(benchmarkTripleHandler != null) sb.append(
>> benchmarkTripleHandler.report() ).append('\n');
>> > +        if(reportingTripleHandler != null) sb.append(
>> reportingTripleHandler.printReport() ).append('\n');
>> > +        return sb.toString();
>> > +    }
>> > +
>> >     private void printHelp() {
>> >         HelpFormatter formatter = new HelpFormatter();
>> >         formatter.printHelp("[{<url>|<file>}]+", options, true);
>> >     }
>> >
>> > +    private String printFormats(String[] formats, int defaultIndex) {
>> > +        final StringBuilder sb = new StringBuilder();
>> > +        for (int i = 0; i < formats.length; i++) {
>> > +            sb.append(formats[i]);
>> > +            if(i == defaultIndex) sb.append(" (default)");
>> > +            if(i < formats.length - 1) sb.append(", ");
>> > +        }
>> > +        return sb.toString();
>> > +    }
>> > +
>> >     private String argumentToURI(String uri) {
>> >         uri = uri.trim();
>> >         if (uri.toLowerCase().startsWith("http:") ||
>> uri.toLowerCase().startsWith("https:")) {
>> > @@ -268,27 +299,17 @@ public class Rover implements Tool {
>> >
>> >     private TripleHandler getTripleHandler(CommandLine cl, OutputStream
>> os) {
>> >         final String FORMAT_OPTION = "f";
>> > -        String format = DEFAULT_FORMAT;
>> > +        String format = FORMATS[DEFAULT_FORMAT_INDEX];
>> >         if (cl.hasOption(FORMAT_OPTION)) {
>> > -            format = cl.getOptionValue(FORMAT_OPTION);
>> > +            format = cl.getOptionValue(FORMAT_OPTION).toLowerCase();
>> >         }
>> > -        final TripleHandler outputHandler;
>> > -        if (TURTLE_FORMAT.equalsIgnoreCase(format)) {
>> > -            outputHandler = new TurtleWriter(os);
>> > -        } else if (NTRIPLE_FORMAT.equalsIgnoreCase(format)) {
>> > -            outputHandler = new NTriplesWriter(os);
>> > -        } else if (RDFXML_FORMAT.equalsIgnoreCase(format)) {
>> > -            outputHandler = new RDFXMLWriter(os);
>> > -        } else if (NQUADS_FORMAT.equalsIgnoreCase(format)) {
>> > -            outputHandler = new NQuadsWriter(os);
>> > -        } else if (URIS_FORMAT.equalsIgnoreCase(format)) {
>> > -            outputHandler = new URIListWriter(os);
>> > -        } else {
>> > +        try {
>> > +            return
>> WriterRegistry.getInstance().getWriterInstanceByIdentifier(format, os);
>> > +        } catch (Exception e) {
>> >             throw new IllegalArgumentException(
>> >                     String.format("Invalid option value '%s' for option
>> %s", format, FORMAT_OPTION)
>> >             );
>> >         }
>> > -        return outputHandler;
>> >     }
>> >
>> >     private TripleHandler
>> decorateWithAccidentalTriplesFilter(CommandLine cl, TripleHandler in) {
>> > @@ -346,44 +367,54 @@ public class Rover implements Tool {
>> >         return any23;
>> >     }
>> >
>> > -    private void performExtraction(Any23 any23, ExtractionParameters
>> eps, String documentURI, TripleHandler th) {
>> > +    private void performExtraction(
>> > +            Any23 any23, ExtractionParameters eps, DocumentSource
>> documentSource, TripleHandler th
>> > +    ) {
>> >         try {
>> > -            if (! any23.extract(eps, documentURI,
>> th).hasMatchingExtractors()) {
>> > -                throw new SpecificExitException("No suitable extractors
>> found.", 2);
>> > +            if (! any23.extract(eps, documentSource,
>> th).hasMatchingExtractors()) {
>> > +                throw new ExitCodeException("No suitable extractors
>> found.", 2);
>> >             }
>> >         } catch (ExtractionException ex) {
>> > -            throw new SpecificExitException("Exception while extracting
>> metadata.", ex, 3);
>> > +            throw new ExitCodeException("Exception while extracting
>> metadata.", ex, 3);
>> >         } catch (IOException ex) {
>> > -            throw new SpecificExitException("Exception while producing
>> output.", ex, 4);
>> > +            throw new ExitCodeException("Exception while producing
>> output.", ex, 4);
>> >         }
>> >     }
>> >
>> > -    private void closeHandler(TripleHandler th) {
>> > -        if(th == null) return;
>> > +    private void closeHandler() {
>> > +        if(tripleHandler == null) return;
>> >         try {
>> > -            th.close();
>> > +            tripleHandler.close();
>> >         } catch (TripleHandlerException the) {
>> > -            throw new SpecificExitException("Error while closing
>> TripleHandler", the, 5);
>> > +            throw new ExitCodeException("Error while closing
>> TripleHandler", the, 5);
>> >         }
>> >     }
>> >
>> > -    private void closeAll(TripleHandler th, PrintStream os) {
>> > -             closeHandler(th);
>> > -            if(os != null) os.close();
>> > +    private void closeStreams() {
>> > +             closeHandler();
>> > +            if(outputStream != null) outputStream.close();
>> >     }
>> >
>> > -    private class SpecificExitException extends RuntimeException {
>> > +    protected class ExitCodeException extends RuntimeException {
>> >
>> >         private final int exitCode;
>> >
>> > -        public SpecificExitException(String message, Throwable cause,
>> int exitCode) {
>> > +        public ExitCodeException(String message, Throwable cause, int
>> exitCode) {
>> >             super(message, cause);
>> >             this.exitCode = exitCode;
>> >         }
>> > -        public SpecificExitException(String message, int exitCode) {
>> > +        public ExitCodeException(String message, int exitCode) {
>> >             super(message);
>> >             this.exitCode = exitCode;
>> >         }
>> > +        public ExitCodeException(int exitCode) {
>> > +            super();
>> > +            this.exitCode = exitCode;
>> > +        }
>> > +
>> > +        protected int getExitCode() {
>> > +            return exitCode;
>> > +        }
>> >     }
>> >
>> >  }
>> >
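
A side note on the Rover hunk above: the catch block that turned an exception into an exit status is removed, and the exception class gains a getExitCode() accessor. The place where the status is now computed is not visible in this part of the diff, so the following is only a minimal sketch of how a caller could do the translation. The class name ExitCodeSketch and the helper toExitCode are hypothetical; the fallback value 1 is taken from the removed code.

```java
public class ExitCodeSketch {

    /** Same shape as the ExitCodeException in the diff: an unchecked exception carrying a status. */
    public static class ExitCodeException extends RuntimeException {
        private final int exitCode;

        public ExitCodeException(String message, int exitCode) {
            super(message);
            this.exitCode = exitCode;
        }

        public int getExitCode() {
            return exitCode;
        }
    }

    /** Hypothetical caller: runs a task and maps its outcome to a process exit status. */
    public static int toExitCode(Runnable task) {
        try {
            task.run();
            return 0;
        } catch (ExitCodeException e) {
            // The exception itself says which status to report.
            return e.getExitCode();
        } catch (RuntimeException e) {
            // Any other failure falls back to 1, as the removed catch block did.
            return 1;
        }
    }

    public static void main(String[] args) {
        int code = toExitCode(() -> {
            throw new ExitCodeException("No suitable extractors found.", 2);
        });
        System.out.println(code); // prints 2
    }
}
```

Keeping the status inside the exception lets subclasses of the tool reuse the same codes (2 to 5 in the diff) without duplicating the instanceof check.
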
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -29,6 +29,13 @@ import java.util.Collection;
>> >  public interface ExtractorFactory<T extends Extractor<?>> extends
>> ExtractorDescription {
>> >
>> >     /**
>> > +     * Returns the extractor type.
>> > +     *
>> > +     * @return the not <code>null</code> extractor class.
>> > +     */
>> > +    Class<T> getExtractorType();
>> > +
>> > +    /**
>> >      * Creates an extractor instance.
>> >      *
>> >      * @return an instance of the extractor associated to this factory.
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -39,6 +39,7 @@ import org.deri.any23.extractor.microdat
>> >  import org.deri.any23.extractor.rdf.NQuadsExtractor;
>> >  import org.deri.any23.extractor.rdf.NTriplesExtractor;
>> >  import org.deri.any23.extractor.rdf.RDFXMLExtractor;
>> > +import org.deri.any23.extractor.rdf.TriXExtractor;
>> >  import org.deri.any23.extractor.rdf.TurtleExtractor;
>> >  import org.deri.any23.extractor.rdfa.RDFa11Extractor;
>> >  import org.deri.any23.extractor.rdfa.RDFaExtractor;
>> > @@ -79,6 +80,7 @@ public class ExtractorRegistry {
>> >                 instance.register(TurtleExtractor.factory);
>> >                 instance.register(NTriplesExtractor.factory);
>> >                 instance.register(NQuadsExtractor.factory);
>> > +                instance.register(TriXExtractor.factory);
>> >
>> if(conf.getFlagProperty("any23.extraction.rdfa.programmatic")) {
>> >                     instance.register(RDFa11Extractor.factory);
>> >                 } else {
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -83,9 +83,15 @@ public class SimpleExtractorFactory<T ex
>> >         return supportedMIMETypes;
>> >     }
>> >
>> > +    @Override
>> > +    public Class<T> getExtractorType() {
>> > +        return extractorClass;
>> > +    }
>> > +
>> >     /**
>> >      * @return an instance of type T concrete implementation of {@link
>> org.deri.any23.extractor.Extractor}
>> >      */
>> > +    @Override
>> >     public T createExtractor() {
>> >         try {
>> >             return extractorClass.newInstance();
>> > @@ -99,6 +105,7 @@ public class SimpleExtractorFactory<T ex
>> >     /**
>> >      * @return an input example
>> >      */
>> > +    @Override
>> >     public String getExampleInput() {
>> >         return exampleInput;
>> >     }
>> >
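
A side note on the new getExtractorType() method: exposing the class held by the factory allows a factory to be looked up by extractor type without instantiating anything. The sketch below is a simplified stand-in, not the Any23 API: Extractor, CsvLike, HtmlLike, SimpleFactory and findByType are all hypothetical names, and it uses getDeclaredConstructor().newInstance() where the diff uses Class.newInstance().

```java
import java.util.ArrayList;
import java.util.List;

public class FactoryTypeSketch {

    /** Hypothetical marker interface standing in for the real Extractor type. */
    public interface Extractor {}

    public static class CsvLike implements Extractor {}

    public static class HtmlLike implements Extractor {}

    /** Simplified factory: remembers the extractor class and can report or instantiate it. */
    public static class SimpleFactory<T extends Extractor> {
        private final Class<T> extractorClass;

        public SimpleFactory(Class<T> extractorClass) {
            this.extractorClass = extractorClass;
        }

        public Class<T> getExtractorType() {
            return extractorClass;
        }

        public T createExtractor() {
            try {
                return extractorClass.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + extractorClass, e);
            }
        }
    }

    /** Returns the first factory whose extractor type is exactly the given class, or null. */
    public static SimpleFactory<?> findByType(List<SimpleFactory<?>> factories, Class<?> type) {
        for (SimpleFactory<?> factory : factories) {
            if (factory.getExtractorType() == type) {
                return factory;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<SimpleFactory<?>> factories = new ArrayList<>();
        factories.add(new SimpleFactory<>(CsvLike.class));
        factories.add(new SimpleFactory<>(HtmlLike.class));

        SimpleFactory<?> found = findByType(factories, HtmlLike.class);
        System.out.println(found.createExtractor() instanceof HtmlLike); // prints true
    }
}
```
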
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -62,7 +62,7 @@ public class CSVExtractor implements Ext
>> >                     Arrays.asList(
>> >                             "text/csv;q=0.1"
>> >                     ),
>> > -                    null,
>> > +                    "example-csv.csv",
>> >                     CSVExtractor.class
>> >             );
>> >
>> > @@ -124,12 +124,29 @@ public class CSVExtractor implements Ext
>> >     }
>> >
>> >     /**
>> > +     * Check whether a number is an integer.
>> > +     *
>> > +     * @param number
>> > +     * @return
>> > +     */
>> > +    private boolean isInteger(String number) {
>> > +        try {
>> > +            Integer.valueOf(number);
>> > +            return true;
>> > +        } catch (NumberFormatException e) {
>> > +            return false;
>> > +        }
>> > +    }
>> > +
>> > +    /**
>> > +     * Check whether a number is a float.
>> > +     *
>> >      * @param number
>> >      * @return
>> >      */
>> > -    private boolean isNumber(String number) {
>> > +    private boolean isFloat(String number) {
>> >         try {
>> > -            Double.valueOf(number);
>> > +            Float.valueOf(number);
>> >             return true;
>> >         } catch (NumberFormatException e) {
>> >             return false;
>> > @@ -236,8 +253,10 @@ public class CSVExtractor implements Ext
>> >             object = new URIImpl(cell);
>> >         } else {
>> >             URI datatype = XMLSchema.STRING;
>> > -            if (isNumber(cell)) {
>> > +            if (isInteger(cell)) {
>> >                 datatype = XMLSchema.INTEGER;
>> > +            } else if(isFloat(cell)) {
>> > +                datatype = XMLSchema.FLOAT;
>> >             }
>> >             object = new LiteralImpl(cell, datatype);
>> >         }
>> >
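
A side note on the CSVExtractor hunk above: the cell datatype is now chosen by trying Integer.valueOf first and Float.valueOf second. The sketch below reproduces only that decision order so the edge cases are easy to see. The class name CsvDatatypeSketch and the string labels are hypothetical; the real code assigns XMLSchema.INTEGER, XMLSchema.FLOAT or XMLSchema.STRING.

```java
public class CsvDatatypeSketch {

    static boolean isInteger(String number) {
        try {
            Integer.valueOf(number);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    static boolean isFloat(String number) {
        try {
            Float.valueOf(number);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Same decision order as the diff: integer first, then float, else string. */
    public static String datatypeOf(String cell) {
        if (isInteger(cell)) {
            return "xsd:integer";
        } else if (isFloat(cell)) {
            return "xsd:float";
        }
        return "xsd:string";
    }

    public static void main(String[] args) {
        for (String cell : new String[] {"42", "3.14", "hello", "12345678901"}) {
            System.out.println(cell + " -> " + datatypeOf(cell));
        }
    }
}
```

One consequence worth keeping in mind: a whole number outside the 32-bit int range, such as 12345678901, fails Integer.valueOf and is then accepted by Float.valueOf, so it ends up typed as a float.
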
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -97,7 +97,7 @@ public class AdrExtractor extends Entity
>> >                     "html-mf-adr",
>> >                     PopularPrefixes.createSubset("rdf", "vcard"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-adr.html",
>> >                     AdrExtractor.class
>> >             );
>> >  }
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -47,7 +47,7 @@ public class GeoExtractor extends Entity
>> >                 "html-mf-geo",
>> >                 PopularPrefixes.createSubset("rdf", "vcard"),
>> >                 Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                null,
>> > +                "example-mf-geo.html",
>> >                 GeoExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -53,7 +53,7 @@ public class HCalendarExtractor extends
>> >                     "html-mf-hcalendar",
>> >                     PopularPrefixes.createSubset("rdf", "ical"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hcalendar.html",
>> >                     HCalendarExtractor.class);
>> >
>> >     private static final String[] Components = {"Vevent", "Vtodo",
>> "Vjournal", "Vfreebusy"};
>> > @@ -116,7 +116,7 @@ public class HCalendarExtractor extends
>> >     private boolean extractComponent(Node node, Resource cal, String
>> component) throws ExtractionException {
>> >         HTMLDocument compoNode = new HTMLDocument(node);
>> >         BNode evt = valueFactory.createBNode();
>> > -        addURIProperty(evt, RDF.TYPE, vICAL.getResource(component));
>> > +        addURIProperty(evt, RDF.TYPE, vICAL.getClass(component));
>> >         addTextProps(compoNode, evt);
>> >         addUrl(compoNode, evt);
>> >         addRRule(compoNode, evt);
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -61,7 +61,7 @@ public class HCardExtractor extends Enti
>> >                     "html-mf-hcard",
>> >                     PopularPrefixes.createSubset("rdf", "vcard"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hcard.html",
>> >                     HCardExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -82,7 +82,7 @@ public class HListingExtractor extends E
>> >                     "html-mf-hlisting",
>> >                     PopularPrefixes.createSubset("rdf", "hlisting"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hlisting.html",
>> >                     HListingExtractor.class
>> >             );
>> >
>> > @@ -106,7 +106,7 @@ public class HListingExtractor extends E
>> >         out.writeTriple(listing, RDF.TYPE, hLISTING.Listing);
>> >
>> >         for (String action : findActions(fragment)) {
>> > -            out.writeTriple(listing, hLISTING.action,
>> hLISTING.getResource(action));
>> > +            out.writeTriple(listing, hLISTING.action,
>> hLISTING.getClass(action));
>> >         }
>> >         out.writeTriple(listing, hLISTING.lister, addLister() );
>> >         addItem(listing);
>> > @@ -154,7 +154,7 @@ public class HListingExtractor extends E
>> >                     String value = node.getNodeValue();
>> >                     // do not use conditionallyAdd, it won't work cause
>> of evaluation rules
>> >                     if (!(null == value || "".equals(value))) {
>> > -                        URI property =
>> hLISTING.getPropertyCamelized(klass);
>> > +                        URI property =
>> hLISTING.getPropertyCamelCase(klass);
>> >                         conditionallyAddLiteralProperty(
>> >                                 node,
>> >                                 blankItem, property,
>> valueFactory.createLiteral(value)
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -29,7 +29,7 @@ public class HRecipeExtractor extends En
>> >                     "html-mf-hrecipe",
>> >                     PopularPrefixes.createSubset("rdf", "hrecipe"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hrecipe.html",
>> >                     HRecipeExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -48,7 +48,7 @@ public class HResumeExtractor extends En
>> >                     "html-mf-hresume",
>> >                     PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hresume.html",
>> >                     HResumeExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -53,7 +53,7 @@ public class HReviewExtractor extends En
>> >                     "html-mf-hreview",
>> >                     PopularPrefixes.createSubset("rdf", "vcard", "rev"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-hreview.html",
>> >                     HReviewExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -98,6 +98,6 @@ public class HeadLinkExtractor implement
>> >                     "html-head-links",
>> >                     PopularPrefixes.createSubset("xhtml", "dcterms"),
>> >                     Arrays.asList("text/html;q=0.05",
>> "application/xhtml+xml;q=0.05"),
>> > -                    null,
>> > +                    "example-head-link.html",
>> >                     HeadLinkExtractor.class);
>> >  }
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -50,7 +50,7 @@ public class ICBMExtractor implements Ta
>> >                     "html-head-icbm",
>> >                     PopularPrefixes.createSubset("geo", "rdf"),
>> >                     Arrays.asList("text/html;q=0.01",
>> "application/xhtml+xml;q=0.01"),
>> > -                    null,
>> > +                    "example-icbm.html",
>> >                     ICBMExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -51,7 +51,7 @@ public class LicenseExtractor implements
>> >                     "html-mf-license",
>> >                     PopularPrefixes.createSubset("xhtml"),
>> >                     Arrays.asList("text/html;q=0.01",
>> "application/xhtml+xml;q=0.01"),
>> > -                    null,
>> > +                    "example-mf-license.html",
>> >                     LicenseExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -44,7 +44,7 @@ public class SpeciesExtractor extends En
>> >                     "html-mf-species",
>> >                     PopularPrefixes.createSubset("rdf", "wo"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-mf-species.html",
>> >                     SpeciesExtractor.class
>> >             );
>> >
>> > @@ -147,7 +147,7 @@ public class SpeciesExtractor extends En
>> >
>> >     private URI resolveClassName(String clazz) {
>> >         String upperCaseClass = clazz.substring(0, 1);
>> > -        return vWO.getResource(
>> > +        return vWO.getClass(
>> >                 String.format("%s%s",
>> >                         upperCaseClass.toUpperCase(),
>> >                         clazz.substring(1)
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -56,7 +56,7 @@ public class TurtleHTMLExtractor impleme
>> >                     NAME,
>> >                     PopularPrefixes.get(),
>> >                     Arrays.asList("text/html;q=0.02",
>> "application/xhtml+xml;q=0.02"),
>> > -                    null,
>> > +                    "example-script-turtle.html",
>> >                     TurtleHTMLExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -61,7 +61,7 @@ public class XFNExtractor implements Tag
>> >                 "html-mf-xfn",
>> >                 PopularPrefixes.createSubset("rdf", "foaf", "xfn"),
>> >                 Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                null,
>> > +                "example-mf-xfn.html",
>> >                 XFNExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -68,7 +68,7 @@ public class MicrodataExtractor implemen
>> >                     "html-microdata",
>> >                     PopularPrefixes.createSubset("rdf", "doac", "foaf"),
>> >                     Arrays.asList("text/html;q=0.1",
>> "application/xhtml+xml;q=0.1"),
>> > -                    null,
>> > +                    "example-microdata.html",
>> >                     MicrodataExtractor.class
>> >             );
>> >
>> >
>> > Modified:
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> > URL:
>> http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
>> >
>> ==============================================================================
>> > ---
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> (original)
>> > +++
>> incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
>> Tue Jan 10 16:32:28 2012
>> > @@ -19,7 +19,7 @@ package org.deri.any23.extractor.rdf;
>> >  import org.deri.any23.extractor.ErrorReporter;
>> >  import org.deri.any23.extractor.ExtractionContext;
>> >  import org.deri.any23.extractor.ExtractionResult;
>> > -import org.deri.any23.parser.NQuadsParser;
>> > +import org.deri.any23.io.nquads.NQuadsParser;
>> >  import org.deri.any23.rdf.Any23ValueFactoryWrapper;
>> >  import org.openrdf.model.impl.ValueFactoryImpl;
>> >  import org.openrdf.rio.ParseErrorListener;
>> > @@ -28,6 +28,7 @@ import org.openrdf.rio.RDFParseException
>> >  import org.openrdf.rio.RDFParser;
>> >  import org.openrdf.rio.ntriples.NTriplesParser;
>> >  import org.openrdf.rio.rdfxml.RDFXMLParser;
>> > +import org.openrdf.rio.trix.TriXParser;
>> >  import org.openrdf.rio.turtle.TurtleParser;
>> >  import org.slf4j.Logger;
>> >  import org.slf4j.LoggerFactory;
>> > @@ -38,7 +39,7 @@ import java.io.Reader;
>> >
>> >  /**
>> >  * This factory provides a common logic for creating and configuring
>> correctly
>> > - * any RDF parser used within the library.
>> > + * any <i>RDF</i> parser used within the library.
>> >  *
>> >  * @author Michele Mostarda (mostarda@fbk.eu)
>> >  */
>> > @@ -119,7 +120,7 @@ public class RDFParserFactory {
>> >     }
>> >
>> >     /**
>> > -     * Returns a new instance of a configured {@link
>> org.deri.any23.parser.NQuadsParser}.
>> > +     * Returns a new instance of a configured {@link
>> org.deri.any23.io.nquads.NQuadsParser}.
>> >      *
>> >      * @param verifyDataType data verification enable if
>> <code>true</code>.
>> >      * @param stopAtFirstError the parser stops at first error if
>> <code>true</code>.
>> > @@ -139,6 +140,26 @@ public class RDFParserFactory {
>> >     }
>> >
>> >     /**
>> > +     * Returns a new instance of a configured {@link TriXParser}.
>> > +     *
>> > +     * @param verifyDataType data verification enable if
>> <code>true</code>.
>> > +     * @param stopAtFirstError the parser stops at first error if
>> <code>true</code>.
>> > +     * @param extractionContext the extraction context where the parser
>> is used.
>> > +     * @param extractionResult the output extraction result.
>> > +     * @return a new instance of a configured TriX parser.
>> > +     */
>> > +    public TriXParser getTriXParser(
>> > +            final boolean verifyDataType,
>> > +            final boolean stopAtFirstError,
>> > +            final ExtractionContext extractionContext,
>> > +            final ExtractionResult extractionResult
>> > +    ) {
>> > +        final TriXParser parser = new TriXParser();
>> > +        configureParser(parser, verifyDataType, stopAtFirstError,
>> extractionContext, extractionResult);
>> > +        return parser;
>> > +    }
>> > +
>> > +    /**
>> >      * Configures the given parser on the specified extraction result
>> >      * setting the policies for data verification and error handling.
>> >      *
>> >
>> >
>>
>
>
>
> --
> Michele Mostarda
> Senior Software Engineer
> skype: michele.mostarda
> twitter: micmos
> mail: me@michelemostarda.com
> site : http://www.michelemostarda.com
