incubator-any23-commits mailing list archives

From mosta...@apache.org
Subject svn commit: r1229627 [1/5] - in /incubator/any23/trunk: ./ any23-core/ any23-core/bin/ any23-core/src/main/java/org/deri/any23/ any23-core/src/main/java/org/deri/any23/cli/ any23-core/src/main/java/org/deri/any23/eval/ any23-core/src/main/java/org/deri...
Date Tue, 10 Jan 2012 16:32:33 GMT
Author: mostarda
Date: Tue Jan 10 16:32:28 2012
New Revision: 1229627

URL: http://svn.apache.org/viewvc?rev=1229627&view=rev
Log:
This commit synchronizes the discontinued Any23 Google Code SVN repo [1]
with the current Apache Any23 SVN repo, including the issues
worked on during the initial import transition phase.
These issues were tracked on the original Any23 Google Code Issue Tracker [2].
An extract of the original repository commit log follows.

This commit is related to issue ANY23-27.

[1] http://any23.googlecode.com/svn/trunk/
[2] http://code.google.com/p/any23/issues/list

==== BEGIN: Original Log ====

$ svn log -r 1548:HEAD
------------------------------------------------------------------------
r1548 | michele.mostarda | 2011-11-25 01:51:00 +0100(Ven, 25 Nov 2011) | 1 line

Improved numeric datatype assignment. This commit fixes issue #208.
------------------------------------------------------------------------
r1549 | michele.mostarda | 2011-11-26 13:48:29 +0100(Sab, 26 Nov 2011) | 1 line

Changed SINDICE vocab namespace to 'http://vocab.sindice.net/any23#'. Fixed HTMLMetaExtractorTest.java to match this new
namespace. Discovered and fixed an issue in the SINDICE.java vocabulary: the NS was declared as a resource instead of as a URI. Fixed
RDFSchemaUtilsTest.java, whose sizes were wrong due to the wrong NS declaration. This commit is related to issue #203.
------------------------------------------------------------------------
r1550 | michele.mostarda | 2011-11-26 15:37:32 +0100(Sab, 26 Nov 2011) | 1 line

Improved glossary in Vocab.java, replaced 'Resource' with 'Class'. Found a wrong declaration of Class(Resource) in the WO.java
vocabulary. Fixed and updated RDFSchemaUtils.java test. This commit is related to issue #198.
------------------------------------------------------------------------
r1551 | michele.mostarda | 2011-11-26 18:36:11 +0100(Sab, 26 Nov 2011) | 1 line

Added utility method.
------------------------------------------------------------------------
r1552 | michele.mostarda | 2011-11-26 18:39:46 +0100(Sab, 26 Nov 2011) | 1 line

Improved Vocabulary.java class: added support for comments to any resource. Improved RDFSchemaUtils.java serialization
support, added separators to RDFXML serialization. This commit is related to issue #198.
------------------------------------------------------------------------
r1553 | michele.mostarda | 2011-11-27 20:03:17 +0100(Dom, 27 Nov 2011) | 1 line

Added new OGP vocabulary (Open Graph Protocol http://ogp.me ). Improved prefix declaration parsing in RDFa11Parser, this
new parser is more tolerant of RDFa 1.0 and RDFa 1.1 prefix declarations. Fixed support for prefix mapping resolution in
RDFa11Parser, this allows the correct support for the structured properties introduced by the latest version of the Open
Graph Protocol (http://ogp.me/#structured). Updated RDFSchemaUtilsTest to the new output of vocabularies serialization.
Updated Any23PluginManagerTest to include a new class. This commit is related to issue #206.
------------------------------------------------------------------------
r1554 | michele.mostarda | 2011-11-27 20:55:46 +0100(Dom, 27 Nov 2011) | 1 line

Restricted scope of testGetClassesFromClasspath to avoid updating it every time a new class is added.
------------------------------------------------------------------------
r1555 | michele.mostarda | 2011-11-28 20:12:27 +0100(Lun, 28 Nov 2011) | 1 line

Improved validation mode support. Improved descriptions of Validation and Report fields. This commit is related to issue
#209.
------------------------------------------------------------------------
r1556 | michele.mostarda | 2011-11-28 21:22:49 +0100(Lun, 28 Nov 2011) | 1 line

Improved Any23 Service XML Report format documentation.
------------------------------------------------------------------------
r1557 | michele.mostarda | 2011-11-28 23:28:37 +0100(Lun, 28 Nov 2011) | 1 line

Added URL encoding to the source location path. This commit fixes issue #205. Chose not to write a formal test, since it
would require the creation of folders with spaces.
------------------------------------------------------------------------
r1558 | michele.mostarda | 2011-11-28 23:38:48 +0100(Lun, 28 Nov 2011) | 1 line

Removed obsolete section.
------------------------------------------------------------------------
r1559 | michele.mostarda | 2011-12-09 17:32:32 +0100(Ven, 09 Dic 2011) | 1 line

Improved Any23 facade, added method createDocumentSource() to simplify the extraction setup.
------------------------------------------------------------------------
r1560 | michele.mostarda | 2011-12-09 17:38:57 +0100(Ven, 09 Dic 2011) | 1 line

Refactored Rover CLI class to make it extensible from other CLI implementations.
------------------------------------------------------------------------
r1561 | michele.mostarda | 2011-12-10 14:23:54 +0100(Sab, 10 Dic 2011) | 1 line

Upload by wagon-svn
------------------------------------------------------------------------
r1562 | michele.mostarda | 2011-12-10 14:32:41 +0100(Sab, 10 Dic 2011) | 1 line

Upload by wagon-svn
------------------------------------------------------------------------
r1563 | michele.mostarda | 2011-12-10 14:37:52 +0100(Sab, 10 Dic 2011) | 1 line

Upload by wagon-svn
------------------------------------------------------------------------
r1564 | michele.mostarda | 2011-12-10 14:38:28 +0100(Sab, 10 Dic 2011) | 1 line

Upload by wagon-svn
------------------------------------------------------------------------
r1565 | michele.mostarda | 2011-12-10 14:44:13 +0100(Sab, 10 Dic 2011) | 3 lines

Removed wrong artifact name.


------------------------------------------------------------------------
r1566 | michele.mostarda | 2011-12-10 14:44:45 +0100(Sab, 10 Dic 2011) | 1 line

Upload by wagon-svn
------------------------------------------------------------------------
r1567 | michele.mostarda | 2011-12-10 14:45:21 +0100(Sab, 10 Dic 2011) | 1 line

Upload by wagon-svn
------------------------------------------------------------------------
r1568 | michele.mostarda | 2011-12-10 16:24:09 +0100(Sab, 10 Dic 2011) | 1 line

Removed no longer used jspf lib. Added crawler4j dependencies. Added README. This commit is related to issue #211.
------------------------------------------------------------------------
r1569 | michele.mostarda | 2011-12-10 16:26:47 +0100(Sab, 10 Dic 2011) | 1 line

Changed attributes visibility to facilitate the class extensibility.
------------------------------------------------------------------------
r1570 | michele.mostarda | 2011-12-10 16:28:26 +0100(Sab, 10 Dic 2011) | 1 line

Added helper methods to extract file lines as list of strings. Improved javadoc.
------------------------------------------------------------------------
r1571 | michele.mostarda | 2011-12-10 16:47:03 +0100(Sab, 10 Dic 2011) | 1 line

Added first version of basic-crawler plugin. This commit is related to issue #211.
------------------------------------------------------------------------
r1572 | michele.mostarda | 2011-12-10 16:48:51 +0100(Sab, 10 Dic 2011) | 1 line

Added plugins README.
------------------------------------------------------------------------
r1573 | michele.mostarda | 2011-12-10 16:54:01 +0100(Sab, 10 Dic 2011) | 1 line

Updated main README, added references to plugin and lib.
------------------------------------------------------------------------
r1574 | michele.mostarda | 2011-12-10 16:57:04 +0100(Sab, 10 Dic 2011) | 1 line

Fixed assembly name.
------------------------------------------------------------------------
r1575 | michele.mostarda | 2011-12-10 18:21:57 +0100(Sab, 10 Dic 2011) | 1 line

Fixed Tool signature. This commit is related to #211.
------------------------------------------------------------------------
r1576 | michele.mostarda | 2011-12-10 18:26:46 +0100(Sab, 10 Dic 2011) | 1 line

Improved logging.
------------------------------------------------------------------------
r1577 | michele.mostarda | 2011-12-10 18:31:54 +0100(Sab, 10 Dic 2011) | 1 line

Included plugin basic-crawler in reactor. Improved ToolRunner and Any23PluginManager tests to be compliant to the new
plugin classes. This commit is related to issue #211.
------------------------------------------------------------------------
r1578 | michele.mostarda | 2011-12-10 18:41:24 +0100(Sab, 10 Dic 2011) | 1 line

Fixed Crawler4j group id. Related to issue #211.
------------------------------------------------------------------------
r1579 | michele.mostarda | 2011-12-11 15:25:43 +0100(Dom, 11 Dic 2011) | 1 line

Improved plugin documentation. Introduced Office Scraper specific page. This commit is related to issue #213.
------------------------------------------------------------------------
r1580 | michele.mostarda | 2011-12-11 15:26:32 +0100(Dom, 11 Dic 2011) | 1 line

Fixed POST method documentation. Related to issue #213.
------------------------------------------------------------------------
r1581 | michele.mostarda | 2011-12-11 15:43:34 +0100(Dom, 11 Dic 2011) | 1 line

Fixed code snippets, prettified, added missing finalization logic. See issue #187.
------------------------------------------------------------------------
r1582 | michele.mostarda | 2011-12-11 16:08:39 +0100(Dom, 11 Dic 2011) | 1 line

Fixed var name. See #187.
------------------------------------------------------------------------
r1583 | michele.mostarda | 2011-12-11 16:09:34 +0100(Dom, 11 Dic 2011) | 1 line

Updated code snippets and tutorial, added explicit TripleHandler closure. This commit is related to issue #187.
------------------------------------------------------------------------
r1584 | michele.mostarda | 2011-12-11 16:34:48 +0100(Dom, 11 Dic 2011) | 1 line

Fixed data type handling management in NQuadsParser. This commit is related to issue #210.
------------------------------------------------------------------------
r1585 | michele.mostarda | 2011-12-11 17:03:34 +0100(Dom, 11 Dic 2011) | 1 line

Added missing JSON output format. See #214.
------------------------------------------------------------------------
r1586 | michele.mostarda | 2011-12-11 23:43:39 +0100(Dom, 11 Dic 2011) | 1 line

Added Sesame RIO TriX dependency. Added TriXWriter. Added TriX output format support to Rover. This commit is related to
issue #215.
------------------------------------------------------------------------
r1587 | michele.mostarda | 2011-12-12 00:00:10 +0100(Lun, 12 Dic 2011) | 1 line

Added Sesame TriX IO dependency. This commit is related to #215.
------------------------------------------------------------------------
r1588 | michele.mostarda | 2011-12-12 00:17:35 +0100(Lun, 12 Dic 2011) | 1 line

Some suppressed tests have been reactivated as Ignored.
------------------------------------------------------------------------
r1589 | michele.mostarda | 2011-12-12 00:37:41 +0100(Lun, 12 Dic 2011) | 1 line

Added TriX output format to the Any23 Service. Commit related to issue #215.
------------------------------------------------------------------------
r1590 | michele.mostarda | 2011-12-12 23:35:48 +0100(Lun, 12 Dic 2011) | 1 line

Improved FormatWriter management, added WriterRegistry. Improved Writer format management in Rover and WebResponder.
This commit is related to issues #215 and #216.
------------------------------------------------------------------------
r1591 | michele.mostarda | 2011-12-13 23:50:01 +0100(Mar, 13 Dic 2011) | 6 lines

Added TriXExtractor and textual example (example-trix.trx), added trix support in RDFParserFactory.
Registered TriXExtractor to the ExtractorRegistry.
Added TriX mimetype support in TikaMIMETypeDetector (through mimetypes.xml) and added specific test.
Added support and doc to TriX format in Any23 Service web page (form.html).
This commit is related to issue #215.

------------------------------------------------------------------------
r1592 | michele.mostarda | 2011-12-14 11:37:37 +0100(Mer, 14 Dic 2011) | 1 line

Fixed number of extractors (+1 after adding TriXExtractor). Commit related to issue #215.
------------------------------------------------------------------------
r1593 | michele.mostarda | 2011-12-17 14:21:59 +0100(Sab, 17 Dic 2011) | 1 line

Added method getExtractorType().
------------------------------------------------------------------------
r1594 | michele.mostarda | 2011-12-17 14:24:14 +0100(Sab, 17 Dic 2011) | 4 lines

Improved ExtractorDocumentation support, added missing format examples.
Improved output layout. This commit is related to issue #194.


------------------------------------------------------------------------
r1595 | michele.mostarda | 2011-12-17 15:52:53 +0100(Sab, 17 Dic 2011) | 1 line

Improved classpath management in Any23PluginManager. Renamed getClasses* to loadClasses*. This commit is related to
issue #212.
------------------------------------------------------------------------
r1596 | michele.mostarda | 2011-12-17 17:29:27 +0100(Sab, 17 Dic 2011) | 1 line

Separated log messages from specific output data.
------------------------------------------------------------------------
r1597 | michele.mostarda | 2011-12-17 17:31:06 +0100(Sab, 17 Dic 2011) | 1 line

Added human readable report printing support in ReportingTripleHandler and Rover.
------------------------------------------------------------------------
r1598 | michele.mostarda | 2011-12-17 17:38:03 +0100(Sab, 17 Dic 2011) | 1 line

Fixed major issue in output generation, added final activity report, help prettification. This commit is related to
issue #211.
------------------------------------------------------------------------
r1599 | michele.mostarda | 2011-12-17 17:56:01 +0100(Sab, 17 Dic 2011) | 1 line

Upgraded to Sesame 2.6.1. See issue #217.
------------------------------------------------------------------------
r1600 | michele.mostarda | 2011-12-17 18:03:10 +0100(Sab, 17 Dic 2011) | 1 line

Moved org.deri.any23.LogUtil to org.deri.any23.util.LogUtils . See issue #216
------------------------------------------------------------------------
r1601 | michele.mostarda | 2011-12-17 18:13:49 +0100(Sab, 17 Dic 2011) | 1 line

Moved org.deri.any23.parser to org.deri.any23.io.nquads . See issue #216.
------------------------------------------------------------------------
r1602 | michele.mostarda | 2011-12-18 13:55:23 +0100(Dom, 18 Dic 2011) | 1 line

Added specific Crawler CLI documentation. Updated general CLI documentation. This commit is related to issue #211.
------------------------------------------------------------------------
r1603 | michele.mostarda | 2011-12-18 14:34:07 +0100(Dom, 18 Dic 2011) | 4 lines

The Eval CLI Tool has been removed as well as the org.deri.any23.eval package classes related to it.
Updated tests verifying CLI tool detection.
This commit is related to issue #218.

------------------------------------------------------------------------
r1604 | michele.mostarda | 2011-12-18 17:11:24 +0100(Dom, 18 Dic 2011) | 5 lines

Added MimeDetector CLI Tool and test case, removed main() from
TikaMIMETypeDetector. Updated ToolRunnerTest to verify this new tool.
Updated CLI doc.
This commit is related to issue #219.

------------------------------------------------------------------------
r1605 | michele.mostarda | 2012-01-06 10:33:04 +0100(Ven, 06 Gen 2012) | 1 line

Added support for comment serialization. Related to issue #158.
------------------------------------------------------------------------
r1606 | michele.mostarda | 2012-01-06 10:35:26 +0100(Ven, 06 Gen 2012) | 1 line

Add support for annotation writing in FormatWriter implementations. This commit is related to issue #158.
------------------------------------------------------------------------
r1607 | michele.mostarda | 2012-01-06 10:43:41 +0100(Ven, 06 Gen 2012) | 1 line

Added support for 'annotate' flag in Any23 Service.
------------------------------------------------------------------------

==== END  : Original Log ====


Added:
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/TriXExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuads.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsParser.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/NQuadsWriter.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/io/nquads/package-info.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/LogUtils.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/OGP.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TriXWriter.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/Writer.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/WriterRegistry.java
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/csv/example-csv.csv
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-head-link.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-icbm.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-adr.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-geo.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcalendar.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hcard.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hlisting.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hrecipe.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hresume.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-hreview.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-license.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-species.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-mf-xfn.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/html/example-script-turtle.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/microdata/example-microdata.html
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdf/example-trix.trx
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/extractor/rdfa/example-rdfa11.html
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/MimeDetectorTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsParserTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/io/nquads/NQuadsWriterTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/VocabularyTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/writer/WriterRegistryTest.java
    incubator/any23/trunk/any23-core/src/test/resources/application/trix/
    incubator/any23/trunk/any23-core/src/test/resources/application/trix/test1.trx
    incubator/any23/trunk/any23-core/src/test/resources/html/rdfa/opengraph-structured-properties.html
    incubator/any23/trunk/any23-core/src/test/resources/org/deri/any23/extractor/csv/test-type.csv
    incubator/any23/trunk/lib/README.txt
    incubator/any23/trunk/plugins/README.txt
    incubator/any23/trunk/plugins/basic-crawler/
    incubator/any23/trunk/plugins/basic-crawler/pom.xml
    incubator/any23/trunk/plugins/basic-crawler/src/
    incubator/any23/trunk/plugins/basic-crawler/src/main/
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/cli/Crawler.java
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/CrawlerListener.java
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/DefaultWebCrawler.java
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SharedData.java
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/SiteCrawler.java
    incubator/any23/trunk/plugins/basic-crawler/src/main/java/org/deri/any23/plugin/crawler/package-info.java
    incubator/any23/trunk/plugins/basic-crawler/src/test/
    incubator/any23/trunk/plugins/basic-crawler/src/test/java/
    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/
    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/
    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/
    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/Any23OnlineTestBase.java
    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/
    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/cli/CrawlerTest.java
    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/
    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/
    incubator/any23/trunk/plugins/basic-crawler/src/test/java/org/deri/any23/plugin/crawler/SiteCrawlerTest.java
    incubator/any23/trunk/src/site/apt/plugin-office-scraper.apt
Removed:
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/LogUtil.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Eval.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/Count.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/LogEvaluator.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/eval/package-info.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuads.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsParser.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/NQuadsWriter.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/parser/package-info.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsParserTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/parser/NQuadsWriterTest.java
Modified:
    incubator/any23/trunk/README.txt
    incubator/any23/trunk/any23-core/bin/any23
    incubator/any23/trunk/any23-core/bin/any23tools
    incubator/any23/trunk/any23-core/pom.xml
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Extractor.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdfa/RDFa11Parser.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/mime/TikaMIMETypeDetector.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/plugin/Any23PluginManager.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/rdf/RDFUtils.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/FileUtils.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StreamUtils.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/util/StringUtils.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/DOAC.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/FOAF.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/GEO.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HLISTING.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/HRECIPE.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/ICAL.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/RDFSchemaUtils.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/SINDICE.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/Vocabulary.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/vocab/WO.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/FormatWriter.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/JSONWriter.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NQuadsWriter.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/NTriplesWriter.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFWriterTripleHandler.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/RDFXMLWriter.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/ReportingTripleHandler.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/TurtleWriter.java
    incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/writer/URIListWriter.java
    incubator/any23/trunk/any23-core/src/main/resources/org/deri/any23/mime/mimetypes.xml
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/Any23Test.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ExtractorDocumentationTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/cli/ToolRunnerTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/csv/CSVExtractorTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/AbstractExtractorTestCase.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/html/HTMLMetaExtractorTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/microdata/MicrodataExtractorTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ExtractorTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/extractor/rdfa/RDFa11ParserTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/mime/TikaMIMETypeDetectorTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/plugin/Any23PluginManagerTest.java
    incubator/any23/trunk/any23-core/src/test/java/org/deri/any23/vocab/RDFSchemaUtilsTest.java
    incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/Servlet.java
    incubator/any23/trunk/any23-service/src/main/java/org/deri/any23/servlet/WebResponder.java
    incubator/any23/trunk/any23-service/src/main/webapp/resources/form.html
    incubator/any23/trunk/any23-service/src/test/java/org/deri/any23/servlet/ServletTest.java
    incubator/any23/trunk/lib/install-deps.sh
    incubator/any23/trunk/plugins/integration-test/src/test/java/org/deri/any23/plugin/PluginIT.java
    incubator/any23/trunk/pom.xml
    incubator/any23/trunk/src/site/apt/any23-plugins.apt
    incubator/any23/trunk/src/site/apt/dev-data-conversion.apt
    incubator/any23/trunk/src/site/apt/dev-data-extraction.apt
    incubator/any23/trunk/src/site/apt/getting-started.apt
    incubator/any23/trunk/src/site/apt/plugin-html-scraper.apt
    incubator/any23/trunk/src/site/apt/service.apt
    incubator/any23/trunk/src/site/apt/supported-formats.apt

Modified: incubator/any23/trunk/README.txt
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/README.txt?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/README.txt (original)
+++ incubator/any23/trunk/README.txt Tue Jan 10 16:32:28 2012
@@ -20,7 +20,8 @@ Distribution Content
 
 any23-core           The library core codebase.
 any23-service        The library HTTP service codebase.
-plugins              Library plugins codebase.
+lib                  Contains the Any23 external deps (read lib/README.txt for further details).
+plugins              Library plugins codebase (read plugins/README.txt for further details).
 RELEASE-NOTES.txt    File reporting main release notes for every version.
 LICENSE.txt          Applicable project license.
 README.txt           This file.
@@ -240,15 +241,14 @@ Upload the produced packages in download
 
    http://code.google.com/p/any23/downloads/list
 
+--------------------
+Manage External Deps
+--------------------
 
-Fix Release Procedure
----------------------
-
-   Currently the *plugins/integration-test* module is excluded from the parent
-   reactor.
-   To fix it in tag follow procedure as described at issue #171:
-
-        http://code.google.com/p/any23/issues/detail?id=171
+::Developers interest only.::
 
+External Deps are libraries used by some Any23 modules which are
+not available in public Maven repositories. Such libraries are
+managed within the 'lib' dir.
 
 EOF

Modified: incubator/any23/trunk/any23-core/bin/any23
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/bin/any23 (original)
+++ incubator/any23/trunk/any23-core/bin/any23 Tue Jan 10 16:32:28 2012
@@ -9,12 +9,12 @@
 ANY23_ROOT="$(cd "$(dirname "$0")"; pwd -P)/.."
 
 if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then 
-    echo "Generating executable JAR..."
-    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
+    echo "Generating executable JAR..." >&2
+    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
         ||\
-    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
+    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
     	||\
-    { echo "Error while generating commandline assembly."; exit 1; }
+    { echo "Error while generating commandline assembly."  >&2; exit 1; }
 fi
 
 SEP=':'

Modified: incubator/any23/trunk/any23-core/bin/any23tools
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/bin/any23tools?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/bin/any23tools (original)
+++ incubator/any23/trunk/any23-core/bin/any23tools Tue Jan 10 16:32:28 2012
@@ -11,12 +11,12 @@ ANY23_ROOT="$(cd "$(dirname "$0")"; pwd 
 PLUGINS_DIR=plugins
 
 if [ ! -e $ANY23_ROOT/target/*-jar-with-dependencies.jar ]; then 
-    echo "Generating executable JAR..."
-    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
+    echo "Generating executable JAR..." >&2
+    mvn -o -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
         ||\
-    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly\
+    mvn    -f $ANY23_ROOT/pom.xml -Dmaven.test.skip=true clean assembly:assembly >&2 \
     	||\
-    { echo "Error while generating commandline assembly."; exit 1; }
+    { echo "Error while generating commandline assembly." >&2; exit 1; }
 fi
 
 SEP=':'
@@ -30,6 +30,7 @@ done
 # Plugins classpath.
 for jar in $(find $ANY23_ROOT/../$PLUGINS_DIR/*/target -name "*-plugin.jar" -depth 1)
 do
+  echo "Detected plugin $(basename "$jar") [$(dirname "$jar")]" >&2
   if [ ! -e "$jar" ]; then continue; fi
   CP="$CP$SEP$jar"
 done

Modified: incubator/any23/trunk/any23-core/pom.xml
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/pom.xml?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/pom.xml (original)
+++ incubator/any23/trunk/any23-core/pom.xml Tue Jan 10 16:32:28 2012
@@ -92,6 +92,10 @@
         </dependency>
         <dependency>
             <groupId>org.openrdf.sesame</groupId>
+            <artifactId>sesame-rio-trix</artifactId>
+        </dependency>
+        <dependency>
+            <groupId>org.openrdf.sesame</groupId>
             <artifactId>sesame-repository-sail</artifactId>
         </dependency>
         <dependency>

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/Any23.java Tue Jan 10 16:32:28 2012
@@ -258,6 +258,28 @@ public class Any23 {
     }
 
     /**
+     * Returns the most appropriate {@link DocumentSource} for the given <code>documentURI</code>.
+     *
+     * @param documentURI the document <i>URI</i>.
+     * @return a new instance of DocumentSource.
+     * @throws URISyntaxException if an error occurs while parsing the <code>documentURI</code> as a <i>URI</i>.
+     * @throws IOException if an error occurs while initializing the internal {@link HTTPClient}.
+     */
+    public DocumentSource createDocumentSource(String documentURI) throws URISyntaxException, IOException {
+        if(documentURI == null) throw new NullPointerException("documentURI cannot be null.");
+        if (documentURI.toLowerCase().startsWith("file:")) {
+            return new FileDocumentSource( new File(new URI(documentURI)) );
+        }
+        if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
+            return new HTTPDocumentSource(getHTTPClient(), documentURI);
+        }
+        throw new IllegalArgumentException(
+                String.format("Unsupported protocol for document URI: '%s' .", documentURI)
+        );
+    }
+
+
+    /**
      * Performs metadata extraction from the content of the given
      * <code>in</code> document source, sending the generated events
      * to the specified <code>outputHandler</code>.
@@ -363,13 +385,7 @@ public class Any23 {
     public ExtractionReport extract(ExtractionParameters eps, String documentURI, TripleHandler outputHandler)
     throws IOException, ExtractionException {
         try {
-            if (documentURI.toLowerCase().startsWith("file:")) {
-                return extract(eps, new FileDocumentSource(new File(new URI(documentURI))), outputHandler);
-            }
-            if (documentURI.toLowerCase().startsWith("http:") || documentURI.toLowerCase().startsWith("https:")) {
-                return extract(eps, new HTTPDocumentSource(getHTTPClient(), documentURI), outputHandler);
-            }
-            throw new ExtractionException("Not a valid absolute URI: " + documentURI);
+            return extract(eps, createDocumentSource(documentURI), outputHandler);
         } catch (URISyntaxException ex) {
             throw new ExtractionException("Error while extracting data from document URI.", ex);
         }
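
The createDocumentSource() method added above chooses a DocumentSource from the URI scheme, and extract() now delegates to it. A minimal standalone sketch of that dispatch follows. It assumes nothing from Any23 itself: the class name SchemeDispatchSketch and the string labels it returns are hypothetical stand-ins for FileDocumentSource and HTTPDocumentSource, URI.create() is used in place of new URI() so that syntax errors surface as IllegalArgumentException, and the lambda-free code should compile on any recent JDK.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical standalone sketch; not part of the Any23 codebase.
class SchemeDispatchSketch {

    // Returns "file" or "http" where the real method would build a
    // FileDocumentSource or an HTTPDocumentSource respectively.
    static String classify(String documentURI) {
        if (documentURI == null) {
            throw new NullPointerException("documentURI cannot be null.");
        }
        final String lower = documentURI.toLowerCase(Locale.ROOT);
        if (lower.startsWith("file:")) {
            URI.create(documentURI); // rejects syntactically invalid URIs
            return "file";
        }
        if (lower.startsWith("http:") || lower.startsWith("https:")) {
            return "http";
        }
        // Any other scheme is rejected with an unchecked exception.
        throw new IllegalArgumentException(
                String.format("Unsupported protocol for document URI: '%s'.", documentURI));
    }

    public static void main(String[] args) {
        System.out.println(classify("file:///tmp/page.html"));
        System.out.println(classify("HTTPS://example.org/"));
    }
}
```

As the diff reads, one consequence of the refactoring is that extract(eps, documentURI, outputHandler) no longer throws ExtractionException for an unsupported scheme: the IllegalArgumentException raised by createDocumentSource() is unchecked and is not caught by the URISyntaxException handler.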

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/ExtractorDocumentation.java Tue Jan 10 16:32:28 2012
@@ -16,7 +16,7 @@
 
 package org.deri.any23.cli;
 
-import org.deri.any23.LogUtil;
+import org.deri.any23.util.LogUtils;
 import org.deri.any23.extractor.ExampleInputOutput;
 import org.deri.any23.extractor.ExtractionException;
 import org.deri.any23.extractor.Extractor;
@@ -60,7 +60,7 @@ public class ExtractorDocumentation impl
     }
 
     public int run(String[] args) {
-        LogUtil.setDefaultLogging();
+        LogUtils.setDefaultLogging();
         try {
             if (args.length == 0) {
                 printUsage();
@@ -145,8 +145,8 @@ public class ExtractorDocumentation impl
      * Prints the list of all the available extractors.
      */
     public void printExtractorList() {
-        for (String extractorName : ExtractorRegistry.getInstance().getAllNames()) {
-            System.out.println(extractorName);
+        for(ExtractorFactory factory : ExtractorRegistry.getInstance().getExtractorGroup()) {
+            System.out.println( String.format("%25s [%15s]", factory.getExtractorName(), factory.getExtractorType()));
         }
     }
 
@@ -194,16 +194,20 @@ public class ExtractorDocumentation impl
             ExtractorFactory<?> factory = ExtractorRegistry.getInstance().getFactory(extractorName);
             ExampleInputOutput example = new ExampleInputOutput(factory);
             System.out.println("Extractor: " + extractorName);
-            System.out.println("  type: " + getType(factory));
-            String output = example.getExampleOutput();
-            if (output == null) {
-                System.out.println("(no example output)");
+            System.out.println("\ttype: " + getType(factory));
+            System.out.println();
+            final String exampleInput = example.getExampleInput();
+            if(exampleInput == null) {
+                System.out.println("(No Example Available)");
             } else {
-                System.out.println("-------- example output --------");
-                System.out.println(output);
+                System.out.println("-------- Example Input  --------");
+                System.out.println(exampleInput);
+                System.out.println("-------- Example Output --------");
+                String output = example.getExampleOutput();
+                System.out.println(output == null || output.trim().length() == 0 ? "(No Output Generated)" : output);
             }
-            System.out.println();
             System.out.println("================================");
+            System.out.println();
         }
     }
 

Added: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java?rev=1229627&view=auto
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java (added)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/MimeDetector.java Tue Jan 10 16:32:28 2012
@@ -0,0 +1,113 @@
+/*
+ * Copyright 2008-2010 Digital Enterprise Research Institute (DERI)
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *          http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.deri.any23.cli;
+
+import org.deri.any23.configuration.DefaultConfiguration;
+import org.deri.any23.http.DefaultHTTPClient;
+import org.deri.any23.http.HTTPClient;
+import org.deri.any23.http.HTTPClientConfiguration;
+import org.deri.any23.mime.MIMEType;
+import org.deri.any23.mime.MIMETypeDetector;
+import org.deri.any23.mime.TikaMIMETypeDetector;
+import org.deri.any23.source.DocumentSource;
+import org.deri.any23.source.FileDocumentSource;
+import org.deri.any23.source.HTTPDocumentSource;
+import org.deri.any23.source.StringDocumentSource;
+
+import java.io.File;
+import java.net.URISyntaxException;
+
+/**
+ * Commandline tool to detect <b>MIME Type</b>s from
+ * file, HTTP and direct input sources.
+ * The implementation of this tool is based on {@link TikaMIMETypeDetector}.
+ *
+ * @author Michele Mostarda (mostarda@fbk.eu)
+ */
+@ToolRunner.Description("MIME Type Detector Tool.")
+public class MimeDetector implements Tool{
+
+    public static final String FILE_DOCUMENT_PREFIX   = "file://";
+    public static final String INLINE_DOCUMENT_PREFIX = "inline://";
+    public static final String URL_DOCUMENT_RE        = "^https?://.*";
+
+    public static void main(String[] args) {
+        System.exit( new MimeDetector().run(args) );
+    }
+
+    @Override
+    public int run(String[] args) {
+          if(args.length != 1) {
+            System.err.println("USAGE: {http://path/to/resource.html|file:///path/to/local.file|inline:// some inline content}");
+            return 1;
+        }
+
+        final String document = args[0];
+        try {
+            final DocumentSource documentSource = createDocumentSource(document);
+            final MIMETypeDetector detector = new TikaMIMETypeDetector();
+            final MIMEType mimeType = detector.guessMIMEType(
+                    documentSource.getDocumentURI(),
+                    documentSource.openInputStream(),
+                    MIMEType.parse(documentSource.getContentType())
+            );
+            System.out.println(mimeType);
+            return 0;
+        } catch (Exception e) {
+            System.err.println("Error while detecting MIME Type.");
+            e.printStackTrace(System.err);
+            return 1;
+        }
+    }
+
+    private DocumentSource createDocumentSource(String document) throws URISyntaxException {
+        if(document.startsWith(FILE_DOCUMENT_PREFIX)) {
+            return new FileDocumentSource(
+                    new File(
+                            document.substring(FILE_DOCUMENT_PREFIX.length())
+                    )
+            );
+        }
+        if(document.startsWith(INLINE_DOCUMENT_PREFIX)) {
+            return new StringDocumentSource(
+                    document.substring(INLINE_DOCUMENT_PREFIX.length()),
+                    ""
+            );
+        }
+        if(document.matches(URL_DOCUMENT_RE)) {
+            final HTTPClient client = new DefaultHTTPClient();
+            // TODO: anonymous config class also used in Any23. centralize.
+            client.init(new HTTPClientConfiguration() {
+                public String getUserAgent() {
+                    return DefaultConfiguration.singleton().getPropertyOrFail("any23.http.user.agent.default");
+                }
+                public String getAcceptHeader() {
+                    return "";
+                }
+                public int getDefaultTimeout() {
+                    return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.timeout");
+                }
+                public int getMaxConnections() {
+                    return DefaultConfiguration.singleton().getPropertyIntOrFail("any23.http.client.max.connections");
+                }
+            });
+            return new HTTPDocumentSource(client, document);
+        }
+        throw new IllegalArgumentException("Unsupported protocol for document " + document);
+    }
+
+}
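
The new MimeDetector tool accepts three argument forms, told apart by prefix. The sketch below isolates that parsing step so the effect of each prefix is easy to see. The three constants are copied from the added file; the class name MimeInputSketch and the {kind, payload} return shape are hypothetical, introduced only for illustration.

```java
// Hypothetical standalone sketch; only the three constants come from MimeDetector.
class MimeInputSketch {

    static final String FILE_DOCUMENT_PREFIX   = "file://";
    static final String INLINE_DOCUMENT_PREFIX = "inline://";
    static final String URL_DOCUMENT_RE        = "^https?://.*";

    // Returns a two-element array {kind, payload}.
    static String[] parse(String document) {
        if (document.startsWith(FILE_DOCUMENT_PREFIX)) {
            // The prefix is stripped, leaving a plain file system path.
            return new String[] {"file", document.substring(FILE_DOCUMENT_PREFIX.length())};
        }
        if (document.startsWith(INLINE_DOCUMENT_PREFIX)) {
            // Everything after the prefix is the document content itself.
            return new String[] {"inline", document.substring(INLINE_DOCUMENT_PREFIX.length())};
        }
        if (document.matches(URL_DOCUMENT_RE)) {
            // HTTP and HTTPS URLs are passed through unchanged.
            return new String[] {"url", document};
        }
        throw new IllegalArgumentException("Unsupported protocol for document " + document);
    }

    public static void main(String[] args) {
        String[] samples = {"file:///tmp/a.html", "inline://<html/>", "http://example.org/"};
        for (String sample : samples) {
            String[] parsed = parse(sample);
            System.out.println(parsed[0] + " -> " + parsed[1]);
        }
    }
}
```

Unlike Any23.createDocumentSource(), which lower-cases the URI and matches on "file:", the prefix checks here are case-sensitive and require "file://", so the two entry points do not accept exactly the same inputs.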

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/cli/Rover.java Tue Jan 10 16:32:28 2012
@@ -23,7 +23,7 @@ import org.apache.commons.cli.Option;
 import org.apache.commons.cli.Options;
 import org.apache.commons.cli.PosixParser;
 import org.deri.any23.Any23;
-import org.deri.any23.LogUtil;
+import org.deri.any23.util.LogUtils;
 import org.deri.any23.configuration.Configuration;
 import org.deri.any23.configuration.DefaultConfiguration;
 import org.deri.any23.extractor.ExtractionException;
@@ -31,16 +31,13 @@ import org.deri.any23.extractor.Extracti
 import org.deri.any23.extractor.SingleDocumentExtraction;
 import org.deri.any23.filter.IgnoreAccidentalRDFa;
 import org.deri.any23.filter.IgnoreTitlesOfEmptyDocuments;
+import org.deri.any23.source.DocumentSource;
 import org.deri.any23.writer.BenchmarkTripleHandler;
 import org.deri.any23.writer.LoggingTripleHandler;
-import org.deri.any23.writer.NQuadsWriter;
-import org.deri.any23.writer.NTriplesWriter;
-import org.deri.any23.writer.RDFXMLWriter;
 import org.deri.any23.writer.ReportingTripleHandler;
 import org.deri.any23.writer.TripleHandler;
 import org.deri.any23.writer.TripleHandlerException;
-import org.deri.any23.writer.TurtleWriter;
-import org.deri.any23.writer.URIListWriter;
+import org.deri.any23.writer.WriterRegistry;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -51,6 +48,7 @@ import java.io.OutputStream;
 import java.io.PrintStream;
 import java.io.PrintWriter;
 import java.net.MalformedURLException;
+import java.net.URISyntaxException;
 import java.net.URL;
 
 import static org.deri.any23.extractor.ExtractionParameters.ValidationMode;
@@ -59,107 +57,106 @@ import static org.deri.any23.extractor.E
  * A default rover implementation. Goes and fetches a URL using an hint
  * as to what format should require, then tries to convert it to RDF.
  *
- * @author Gabriele Renzi
- * @author Richard Cyganiak (richard@cyganiak.de)
  * @author Michele Mostarda (mostarda@fbk.eu)
+ * @author Richard Cyganiak (richard@cyganiak.de)
+ * @author Gabriele Renzi
  */
 @ToolRunner.Description("Any23 Command Line Tool.")
 public class Rover implements Tool {
 
-    // Supported formats.
-    private static final String TURTLE_FORMAT  = "turtle";
-    private static final String NTRIPLE_FORMAT = "ntriples";
-    private static final String RDFXML_FORMAT  = "rdfxml";
-    private static final String NQUADS_FORMAT  = "nquads";
-    private static final String URIS_FORMAT    = "uris";
-
-    private static final String DEFAULT_FORMAT = TURTLE_FORMAT;
+    private static final String[] FORMATS = WriterRegistry.getInstance().getIdentifiers();
+    private static final int DEFAULT_FORMAT_INDEX = 0;
 
     private static final Logger logger = LoggerFactory.getLogger(Rover.class);
 
-    private static Options options;
+    private Options options;
 
-    public static void main(String[] args) {
-        System.exit( new Rover().run(args) );
-    }
+    private CommandLine commandLine;
 
-    public int run(String[] args) {
-        final CommandLineParser parser = new PosixParser();
-        final CommandLine commandLine;
+    private boolean verbose = false;
 
-        boolean verbose = false;
-        try {
-            options = createOptions();
-            commandLine = parser.parse(options, args);
+    private PrintStream outputStream;
+    private TripleHandler tripleHandler;
+    private ReportingTripleHandler reportingTripleHandler;
+    private BenchmarkTripleHandler benchmarkTripleHandler;
 
-            if (commandLine.hasOption("h")) {
-                printHelp();
-                return 0;
-            }
+    private ExtractionParameters eps;
+    private Any23 any23;
 
-            if (commandLine.hasOption('v')) {
-                verbose = true;
-                LogUtil.setVerboseLogging();
-            } else {
-                LogUtil.setDefaultLogging();
-            }
-
-            if (commandLine.getArgs().length < 1) {
-                printHelp();
-                throw new IllegalArgumentException("Expected at least 1 argument.");
-            }
+    protected boolean isVerbose() {
+        return verbose;
+    }
 
-            final String[] inputURIs      = argumentsToURIs(commandLine.getArgs());
-            final String[] extractorNames = getExtractors(commandLine);
+    public static void main(String[] args) {
+        System.exit( new Rover().run(args) );
+    }
 
-            PrintStream outputStream    = null;
-            TripleHandler tripleHandler = null;
-            try {
-                outputStream  = getOutputStream(commandLine);
+    public int run(String[] args) {
+        try {
+            final String[] uris = configure(args);
+            performExtraction(uris);
+            return 0;
+        } catch (Exception e) {
+            System.err.println( e.getMessage() );
+            final int exitCode = e instanceof ExitCodeException ? ((ExitCodeException) e).exitCode : 1;
+            if(verbose) e.printStackTrace(System.err);
+            return exitCode;
+        }
+    }
 
-                tripleHandler = getTripleHandler(commandLine, outputStream);
+    protected CommandLine getCommandLine() {
+        if(commandLine == null) throw new IllegalStateException("Rover must be configured first.");
+        return commandLine;
+    }
 
-                tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
+    protected String[] configure(String[] args) throws Exception {
+        final CommandLineParser parser = new PosixParser();
+        options = createOptions();
+        commandLine = parser.parse(options, args);
 
-                tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
-                final BenchmarkTripleHandler benchmarkTripleHandler =
-                        tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
+        if (commandLine.hasOption("h")) {
+            printHelp();
+            throw new ExitCodeException(0);
+        }
 
-                tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
+        if (commandLine.hasOption('v')) {
+            verbose = true;
+            LogUtils.setVerboseLogging();
+        } else {
+            LogUtils.setDefaultLogging();
+        }
 
-                final ReportingTripleHandler reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
+        if (commandLine.getArgs().length < 1) {
+            printHelp();
+            throw new IllegalArgumentException("Expected at least 1 argument.");
+        }
 
-                final ExtractionParameters eps = getExtractionParameters(commandLine);
+        final String[] inputURIs = argumentsToURIs(commandLine.getArgs());
+        final String[] extractorNames = getExtractors(commandLine);
 
-                final Any23 any23 = createAny23(extractorNames);
+        try {
+            outputStream  = getOutputStream(commandLine);
+            tripleHandler = getTripleHandler(commandLine, outputStream);
+            tripleHandler = decorateWithLogHandler(commandLine, tripleHandler);
+            tripleHandler = decorateWithStatisticsHandler(commandLine, tripleHandler);
 
-                final long start = System.currentTimeMillis();
-                for(String inputURI : inputURIs) {
-                    performExtraction(any23, eps, inputURI, reportingTripleHandler);
-                }
-                final long elapsed = System.currentTimeMillis() - start;
+            benchmarkTripleHandler =
+                    tripleHandler instanceof BenchmarkTripleHandler ? (BenchmarkTripleHandler) tripleHandler : null;
 
-                closeAll(tripleHandler, outputStream);
+            tripleHandler = decorateWithAccidentalTriplesFilter(commandLine, tripleHandler);
 
-                if (benchmarkTripleHandler != null) {
-                    System.err.println( benchmarkTripleHandler.report() );
-                }
+            reportingTripleHandler = new ReportingTripleHandler(tripleHandler);
+            eps = getExtractionParameters(commandLine);
+            any23 = createAny23(extractorNames);
 
-                logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
-                logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
-            } finally {
-                closeAll(tripleHandler, outputStream);
-            }
+            return inputURIs;
         } catch (Exception e) {
-            System.err.println(e.getMessage());
-            final int exitCode = e instanceof SpecificExitException ? ((SpecificExitException) e).exitCode : 1;
-            if(verbose) e.printStackTrace(System.err);
-            return exitCode;
+            closeStreams();
+            throw e;
         }
-        return 0;
     }
 
-    private Options createOptions() {
+    protected Options createOptions() {
         final Options options = new Options();
         options.addOption(
                 new Option("v", "verbose", false, "Show debug and progress information.")
@@ -178,13 +175,7 @@ public class Rover implements Tool {
                         "f",
                         "Output format",
                         true,
-                        "[" +
-                                TURTLE_FORMAT  + " (default), " +
-                                NTRIPLE_FORMAT + ", " +
-                                RDFXML_FORMAT  + ", " +
-                                NQUADS_FORMAT  + ", " +
-                                URIS_FORMAT    +
-                        "]"
+                        "[" +  printFormats(FORMATS, DEFAULT_FORMAT_INDEX) + "]"
                 )
         );
         options.addOption(
@@ -208,11 +199,51 @@ public class Rover implements Tool {
         return options;
     }
 
+    protected void performExtraction(DocumentSource documentSource) {
+        performExtraction(any23, eps, documentSource, reportingTripleHandler);
+    }
+
+    protected void performExtraction(String[] inputURIs) throws URISyntaxException, IOException {
+        try {
+            final long start = System.currentTimeMillis();
+            for (String inputURI : inputURIs) {
+                performExtraction( any23.createDocumentSource(inputURI) );
+            }
+            final long elapsed = System.currentTimeMillis() - start;
+
+            if (benchmarkTripleHandler != null) {
+                System.err.println(benchmarkTripleHandler.report());
+            }
+
+            logger.info("Extractors used: " + reportingTripleHandler.getExtractorNames());
+            logger.info(reportingTripleHandler.getTotalTriples() + " triples, " + elapsed + "ms");
+        } finally {
+            closeStreams();
+        }
+    }
+
+    protected String printReports() {
+        final StringBuilder sb = new StringBuilder();
+        if(benchmarkTripleHandler != null) sb.append( benchmarkTripleHandler.report() ).append('\n');
+        if(reportingTripleHandler != null) sb.append( reportingTripleHandler.printReport() ).append('\n');
+        return sb.toString();
+    }
+
     private void printHelp() {
         HelpFormatter formatter = new HelpFormatter();
         formatter.printHelp("[{<url>|<file>}]+", options, true);
     }
 
+    private String printFormats(String[] formats, int defaultIndex) {
+        final StringBuilder sb = new StringBuilder();
+        for (int i = 0; i < formats.length; i++) {
+            sb.append(formats[i]);
+            if(i == defaultIndex) sb.append(" (default)");
+            if(i < formats.length - 1) sb.append(", ");
+        }
+        return sb.toString();
+    }
+
     private String argumentToURI(String uri) {
         uri = uri.trim();
         if (uri.toLowerCase().startsWith("http:") || uri.toLowerCase().startsWith("https:")) {
@@ -268,27 +299,17 @@ public class Rover implements Tool {
 
     private TripleHandler getTripleHandler(CommandLine cl, OutputStream os) {
         final String FORMAT_OPTION = "f";
-        String format = DEFAULT_FORMAT;
+        String format = FORMATS[DEFAULT_FORMAT_INDEX];
         if (cl.hasOption(FORMAT_OPTION)) {
-            format = cl.getOptionValue(FORMAT_OPTION);
+            format = cl.getOptionValue(FORMAT_OPTION).toLowerCase();
         }
-        final TripleHandler outputHandler;
-        if (TURTLE_FORMAT.equalsIgnoreCase(format)) {
-            outputHandler = new TurtleWriter(os);
-        } else if (NTRIPLE_FORMAT.equalsIgnoreCase(format)) {
-            outputHandler = new NTriplesWriter(os);
-        } else if (RDFXML_FORMAT.equalsIgnoreCase(format)) {
-            outputHandler = new RDFXMLWriter(os);
-        } else if (NQUADS_FORMAT.equalsIgnoreCase(format)) {
-            outputHandler = new NQuadsWriter(os);
-        } else if (URIS_FORMAT.equalsIgnoreCase(format)) {
-            outputHandler = new URIListWriter(os);
-        } else {
+        try {
+            return WriterRegistry.getInstance().getWriterInstanceByIdentifier(format, os);
+        } catch (Exception e) {
             throw new IllegalArgumentException(
                     String.format("Invalid option value '%s' for option %s", format, FORMAT_OPTION)
             );
         }
-        return outputHandler;
     }
 
     private TripleHandler decorateWithAccidentalTriplesFilter(CommandLine cl, TripleHandler in) {
@@ -346,44 +367,54 @@ public class Rover implements Tool {
         return any23;
     }
 
-    private void performExtraction(Any23 any23, ExtractionParameters eps, String documentURI, TripleHandler th) {
+    private void performExtraction(
+            Any23 any23, ExtractionParameters eps, DocumentSource documentSource, TripleHandler th
+    ) {
         try {
-            if (! any23.extract(eps, documentURI, th).hasMatchingExtractors()) {
-                throw new SpecificExitException("No suitable extractors found.", 2);
+            if (! any23.extract(eps, documentSource, th).hasMatchingExtractors()) {
+                throw new ExitCodeException("No suitable extractors found.", 2);
             }
         } catch (ExtractionException ex) {
-            throw new SpecificExitException("Exception while extracting metadata.", ex, 3);
+            throw new ExitCodeException("Exception while extracting metadata.", ex, 3);
         } catch (IOException ex) {
-            throw new SpecificExitException("Exception while producing output.", ex, 4);
+            throw new ExitCodeException("Exception while producing output.", ex, 4);
         }
     }
 
-    private void closeHandler(TripleHandler th) {
-        if(th == null) return;
+    private void closeHandler() {
+        if(tripleHandler == null) return;
         try {
-            th.close();
+            tripleHandler.close();
         } catch (TripleHandlerException the) {
-            throw new SpecificExitException("Error while closing TripleHandler", the, 5);
+            throw new ExitCodeException("Error while closing TripleHandler", the, 5);
         }
     }
 
-    private void closeAll(TripleHandler th, PrintStream os) {
-             closeHandler(th);
-            if(os != null) os.close();
+    private void closeStreams() {
+        closeHandler();
+        if(outputStream != null) outputStream.close();
     }
 
-    private class SpecificExitException extends RuntimeException {
+    protected class ExitCodeException extends RuntimeException {
 
         private final int exitCode;
 
-        public SpecificExitException(String message, Throwable cause, int exitCode) {
+        public ExitCodeException(String message, Throwable cause, int exitCode) {
             super(message, cause);
             this.exitCode = exitCode;
         }
-        public SpecificExitException(String message, int exitCode) {
+        public ExitCodeException(String message, int exitCode) {
             super(message);
             this.exitCode = exitCode;
         }
+        public ExitCodeException(int exitCode) {
+            super();
+            this.exitCode = exitCode;
+        }
+
+        protected int getExitCode() {
+            return exitCode;
+        }
     }
 
 }
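
Reviewer note: the rename from SpecificExitException to ExitCodeException, together with the new getExitCode() accessor, suggests that the caller maps the exception to a process exit status. The diff does not show that caller, so the following is only a minimal sketch under that assumption. The class ExitCodeSketch and its extract() and run() methods are hypothetical names and are not part of Any23.

```java
public final class ExitCodeSketch {

    // Mirrors the shape of the exception in the diff: an unchecked
    // exception that carries the exit code the tool should return.
    static class ExitCodeException extends RuntimeException {
        private final int exitCode;

        ExitCodeException(String message, int exitCode) {
            super(message);
            this.exitCode = exitCode;
        }

        int getExitCode() {
            return exitCode;
        }
    }

    // Hypothetical unit of work: fails with code 2 when nothing can be extracted.
    static void extract(boolean extractorsFound) {
        if (!extractorsFound) {
            throw new ExitCodeException("No suitable extractors found.", 2);
        }
    }

    // Assumed caller: translates the exception into an exit status, 0 on success.
    static int run(boolean extractorsFound) {
        try {
            extract(extractorsFound);
            return 0;
        } catch (ExitCodeException e) {
            System.err.println(e.getMessage());
            return e.getExitCode();
        }
    }

    public static void main(String[] args) {
        System.out.println("success case exit code: " + run(true));
        System.out.println("failure case exit code: " + run(false));
    }
}
```

A real command-line entry point would presumably pass the value returned by run() to System.exit(); the sketch only prints it so that it stays easy to test.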

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorFactory.java Tue Jan 10 16:32:28 2012
@@ -29,6 +29,13 @@ import java.util.Collection;
 public interface ExtractorFactory<T extends Extractor<?>> extends ExtractorDescription {
 
     /**
+     * Returns the extractor type.
+     *
+     * @return the not <code>null</code> extractor class.
+     */
+    Class<T> getExtractorType();
+
+    /**
      * Creates an extractor instance.
      *
      * @return an instance of the extractor associated to this factory.

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/ExtractorRegistry.java Tue Jan 10 16:32:28 2012
@@ -39,6 +39,7 @@ import org.deri.any23.extractor.microdat
 import org.deri.any23.extractor.rdf.NQuadsExtractor;
 import org.deri.any23.extractor.rdf.NTriplesExtractor;
 import org.deri.any23.extractor.rdf.RDFXMLExtractor;
+import org.deri.any23.extractor.rdf.TriXExtractor;
 import org.deri.any23.extractor.rdf.TurtleExtractor;
 import org.deri.any23.extractor.rdfa.RDFa11Extractor;
 import org.deri.any23.extractor.rdfa.RDFaExtractor;
@@ -79,6 +80,7 @@ public class ExtractorRegistry {
                 instance.register(TurtleExtractor.factory);
                 instance.register(NTriplesExtractor.factory);
                 instance.register(NQuadsExtractor.factory);
+                instance.register(TriXExtractor.factory);
                 if(conf.getFlagProperty("any23.extraction.rdfa.programmatic")) {
                     instance.register(RDFa11Extractor.factory);
                 } else {

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/SimpleExtractorFactory.java Tue Jan 10 16:32:28 2012
@@ -83,9 +83,15 @@ public class SimpleExtractorFactory<T ex
         return supportedMIMETypes;
     }
 
+    @Override
+    public Class<T> getExtractorType() {
+        return extractorClass;
+    }
+
     /**
      * @return an instance of type T concrete implementation of {@link org.deri.any23.extractor.Extractor}
      */
+    @Override
     public T createExtractor() {
         try {
             return extractorClass.newInstance();
@@ -99,6 +105,7 @@ public class SimpleExtractorFactory<T ex
     /**
      * @return an input example
      */
+    @Override
     public String getExampleInput() {
         return exampleInput;
     }

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/csv/CSVExtractor.java Tue Jan 10 16:32:28 2012
@@ -62,7 +62,7 @@ public class CSVExtractor implements Ext
                     Arrays.asList(
                             "text/csv;q=0.1"
                     ),
-                    null,
+                    "example-csv.csv",
                     CSVExtractor.class
             );
 
@@ -124,12 +124,29 @@ public class CSVExtractor implements Ext
     }
 
     /**
+     * Check whether a number is an integer.
+     *
+     * @param number
+     * @return
+     */
+    private boolean isInteger(String number) {
+        try {
+            Integer.valueOf(number);
+            return true;
+        } catch (NumberFormatException e) {
+            return false;
+        }
+    }
+
+    /**
+     * Check whether a number is a float.
+     *
      * @param number
      * @return
      */
-    private boolean isNumber(String number) {
+    private boolean isFloat(String number) {
         try {
-            Double.valueOf(number);
+            Float.valueOf(number);
             return true;
         } catch (NumberFormatException e) {
             return false;
@@ -236,8 +253,10 @@ public class CSVExtractor implements Ext
             object = new URIImpl(cell);
         } else {
             URI datatype = XMLSchema.STRING;
-            if (isNumber(cell)) {
+            if (isInteger(cell)) {
                 datatype = XMLSchema.INTEGER;
+            } else if(isFloat(cell)) {
+                datatype = XMLSchema.FLOAT;
             }
             object = new LiteralImpl(cell, datatype);
         }
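
Reviewer note: the CSVExtractor hunks above replace the single isNumber() check with an integer check followed by a float check, so the order of the two tests now decides the datatype. The sketch below reproduces that decision with plain strings standing in for the XMLSchema constants. NumericDatatypeSketch and datatypeFor() are hypothetical names used for illustration only.

```java
public final class NumericDatatypeSketch {

    // Stand-in labels; the commit itself uses XMLSchema.INTEGER, FLOAT and STRING.
    static final String INTEGER = "xsd:integer";
    static final String FLOAT = "xsd:float";
    static final String STRING = "xsd:string";

    // Same test as isInteger() in the diff: parseable as a 32-bit int.
    static boolean isInteger(String number) {
        try {
            Integer.valueOf(number);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Same test as isFloat() in the diff: parseable by Float.valueOf.
    static boolean isFloat(String number) {
        try {
            Float.valueOf(number);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Integer is tried first; every integer string would also parse as a
    // float, so reversing the order would never yield the integer datatype.
    static String datatypeFor(String cell) {
        if (isInteger(cell)) {
            return INTEGER;
        }
        if (isFloat(cell)) {
            return FLOAT;
        }
        return STRING;
    }

    public static void main(String[] args) {
        for (String cell : new String[] {"42", "3.14", "hello", "3000000000"}) {
            System.out.println(cell + " -> " + datatypeFor(cell));
        }
    }
}
```

One consequence worth noting: a whole number outside the 32-bit int range, such as 3000000000, fails the integer test and is therefore typed as a float under this logic.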

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/AdrExtractor.java Tue Jan 10 16:32:28 2012
@@ -97,7 +97,7 @@ public class AdrExtractor extends Entity
                     "html-mf-adr",
                     PopularPrefixes.createSubset("rdf", "vcard"),
                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
-                    null,
+                    "example-mf-adr.html",
                     AdrExtractor.class
             );
 }

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/GeoExtractor.java Tue Jan 10 16:32:28 2012
@@ -47,7 +47,7 @@ public class GeoExtractor extends Entity
                 "html-mf-geo",
                 PopularPrefixes.createSubset("rdf", "vcard"),
                 Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
-                null,
+                "example-mf-geo.html",
                 GeoExtractor.class
             );
 

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCalendarExtractor.java Tue Jan 10 16:32:28 2012
@@ -53,7 +53,7 @@ public class HCalendarExtractor extends 
                     "html-mf-hcalendar",
                     PopularPrefixes.createSubset("rdf", "ical"),
                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
-                    null,
+                    "example-mf-hcalendar.html",
                     HCalendarExtractor.class);
 
     private static final String[] Components = {"Vevent", "Vtodo", "Vjournal", "Vfreebusy"};
@@ -116,7 +116,7 @@ public class HCalendarExtractor extends 
     private boolean extractComponent(Node node, Resource cal, String component) throws ExtractionException {
         HTMLDocument compoNode = new HTMLDocument(node);
         BNode evt = valueFactory.createBNode();
-        addURIProperty(evt, RDF.TYPE, vICAL.getResource(component));
+        addURIProperty(evt, RDF.TYPE, vICAL.getClass(component));
         addTextProps(compoNode, evt);
         addUrl(compoNode, evt);
         addRRule(compoNode, evt);

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HCardExtractor.java Tue Jan 10 16:32:28 2012
@@ -61,7 +61,7 @@ public class HCardExtractor extends Enti
                     "html-mf-hcard",
                     PopularPrefixes.createSubset("rdf", "vcard"),
                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
-                    null,
+                    "example-mf-hcard.html",
                     HCardExtractor.class
             );
 

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HListingExtractor.java Tue Jan 10 16:32:28 2012
@@ -82,7 +82,7 @@ public class HListingExtractor extends E
                     "html-mf-hlisting",
                     PopularPrefixes.createSubset("rdf", "hlisting"),
                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
-                    null,
+                    "example-mf-hlisting.html",
                     HListingExtractor.class
             );
 
@@ -106,7 +106,7 @@ public class HListingExtractor extends E
         out.writeTriple(listing, RDF.TYPE, hLISTING.Listing);
 
         for (String action : findActions(fragment)) {
-            out.writeTriple(listing, hLISTING.action, hLISTING.getResource(action));
+            out.writeTriple(listing, hLISTING.action, hLISTING.getClass(action));
         }
         out.writeTriple(listing, hLISTING.lister, addLister() );
         addItem(listing);
@@ -154,7 +154,7 @@ public class HListingExtractor extends E
                     String value = node.getNodeValue();
                     // do not use conditionallyAdd, it won't work cause of evaluation rules
                     if (!(null == value || "".equals(value))) {
-                        URI property = hLISTING.getPropertyCamelized(klass);
+                        URI property = hLISTING.getPropertyCamelCase(klass);
                         conditionallyAddLiteralProperty(
                                 node,
                                 blankItem, property, valueFactory.createLiteral(value)

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HRecipeExtractor.java Tue Jan 10 16:32:28 2012
@@ -29,7 +29,7 @@ public class HRecipeExtractor extends En
                     "html-mf-hrecipe",
                     PopularPrefixes.createSubset("rdf", "hrecipe"),
                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
-                    null,
+                    "example-mf-hrecipe.html",
                     HRecipeExtractor.class
             );
 

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HResumeExtractor.java Tue Jan 10 16:32:28 2012
@@ -48,7 +48,7 @@ public class HResumeExtractor extends En
                     "html-mf-hresume",
                     PopularPrefixes.createSubset("rdf", "doac", "foaf"),
                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
-                    null,
+                    "example-mf-hresume.html",
                     HResumeExtractor.class
             );
 

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HReviewExtractor.java Tue Jan 10 16:32:28 2012
@@ -53,7 +53,7 @@ public class HReviewExtractor extends En
                     "html-mf-hreview",
                     PopularPrefixes.createSubset("rdf", "vcard", "rev"),
                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
-                    null,
+                    "example-mf-hreview.html",
                     HReviewExtractor.class
             );
 

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/HeadLinkExtractor.java Tue Jan 10 16:32:28 2012
@@ -98,6 +98,6 @@ public class HeadLinkExtractor implement
                     "html-head-links",
                     PopularPrefixes.createSubset("xhtml", "dcterms"),
                     Arrays.asList("text/html;q=0.05", "application/xhtml+xml;q=0.05"),
-                    null,
+                    "example-head-link.html",
                     HeadLinkExtractor.class);
 }

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/ICBMExtractor.java Tue Jan 10 16:32:28 2012
@@ -50,7 +50,7 @@ public class ICBMExtractor implements Ta
                     "html-head-icbm",
                     PopularPrefixes.createSubset("geo", "rdf"),
                     Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
-                    null,
+                    "example-icbm.html",
                     ICBMExtractor.class
             );
 

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/LicenseExtractor.java Tue Jan 10 16:32:28 2012
@@ -51,7 +51,7 @@ public class LicenseExtractor implements
                     "html-mf-license",
                     PopularPrefixes.createSubset("xhtml"),
                     Arrays.asList("text/html;q=0.01", "application/xhtml+xml;q=0.01"),
-                    null,
+                    "example-mf-license.html",
                     LicenseExtractor.class
             );
 

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/SpeciesExtractor.java Tue Jan 10 16:32:28 2012
@@ -44,7 +44,7 @@ public class SpeciesExtractor extends En
                     "html-mf-species",
                     PopularPrefixes.createSubset("rdf", "wo"),
                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
-                    null,
+                    "example-mf-species.html",
                     SpeciesExtractor.class
             );
 
@@ -147,7 +147,7 @@ public class SpeciesExtractor extends En
 
     private URI resolveClassName(String clazz) {
         String upperCaseClass = clazz.substring(0, 1);
-        return vWO.getResource(
+        return vWO.getClass(
                 String.format("%s%s",
                         upperCaseClass.toUpperCase(),
                         clazz.substring(1)

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/TurtleHTMLExtractor.java Tue Jan 10 16:32:28 2012
@@ -56,7 +56,7 @@ public class TurtleHTMLExtractor impleme
                     NAME,
                     PopularPrefixes.get(),
                     Arrays.asList("text/html;q=0.02", "application/xhtml+xml;q=0.02"),
-                    null,
+                    "example-script-turtle.html",
                     TurtleHTMLExtractor.class
             );
 

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/html/XFNExtractor.java Tue Jan 10 16:32:28 2012
@@ -61,7 +61,7 @@ public class XFNExtractor implements Tag
                 "html-mf-xfn",
                 PopularPrefixes.createSubset("rdf", "foaf", "xfn"),
                 Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
-                null,
+                "example-mf-xfn.html",
                 XFNExtractor.class
             );
 

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/microdata/MicrodataExtractor.java Tue Jan 10 16:32:28 2012
@@ -68,7 +68,7 @@ public class MicrodataExtractor implemen
                     "html-microdata",
                     PopularPrefixes.createSubset("rdf", "doac", "foaf"),
                     Arrays.asList("text/html;q=0.1", "application/xhtml+xml;q=0.1"),
-                    null,
+                    "example-microdata.html",
                     MicrodataExtractor.class
             );
 

Modified: incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java
URL: http://svn.apache.org/viewvc/incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java?rev=1229627&r1=1229626&r2=1229627&view=diff
==============================================================================
--- incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java (original)
+++ incubator/any23/trunk/any23-core/src/main/java/org/deri/any23/extractor/rdf/RDFParserFactory.java Tue Jan 10 16:32:28 2012
@@ -19,7 +19,7 @@ package org.deri.any23.extractor.rdf;
 import org.deri.any23.extractor.ErrorReporter;
 import org.deri.any23.extractor.ExtractionContext;
 import org.deri.any23.extractor.ExtractionResult;
-import org.deri.any23.parser.NQuadsParser;
+import org.deri.any23.io.nquads.NQuadsParser;
 import org.deri.any23.rdf.Any23ValueFactoryWrapper;
 import org.openrdf.model.impl.ValueFactoryImpl;
 import org.openrdf.rio.ParseErrorListener;
@@ -28,6 +28,7 @@ import org.openrdf.rio.RDFParseException
 import org.openrdf.rio.RDFParser;
 import org.openrdf.rio.ntriples.NTriplesParser;
 import org.openrdf.rio.rdfxml.RDFXMLParser;
+import org.openrdf.rio.trix.TriXParser;
 import org.openrdf.rio.turtle.TurtleParser;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -38,7 +39,7 @@ import java.io.Reader;
 
 /**
  * This factory provides a common logic for creating and configuring correctly
- * any RDF parser used within the library.
+ * any <i>RDF</i> parser used within the library.
  *
  * @author Michele Mostarda (mostarda@fbk.eu)
  */
@@ -119,7 +120,7 @@ public class RDFParserFactory {
     }
 
     /**
-     * Returns a new instance of a configured {@link org.deri.any23.parser.NQuadsParser}.
+     * Returns a new instance of a configured {@link org.deri.any23.io.nquads.NQuadsParser}.
      *
      * @param verifyDataType data verification enable if <code>true</code>.
      * @param stopAtFirstError the parser stops at first error if <code>true</code>.
@@ -139,6 +140,26 @@ public class RDFParserFactory {
     }
 
     /**
+     * Returns a new instance of a configured {@link TriXParser}.
+     *
+     * @param verifyDataType data verification enable if <code>true</code>.
+     * @param stopAtFirstError the parser stops at first error if <code>true</code>.
+     * @param extractionContext the extraction context where the parser is used.
+     * @param extractionResult the output extraction result.
+     * @return a new instance of a configured TriX parser.
+     */
+    public TriXParser getTriXParser(
+            final boolean verifyDataType,
+            final boolean stopAtFirstError,
+            final ExtractionContext extractionContext,
+            final ExtractionResult extractionResult
+    ) {
+        final TriXParser parser = new TriXParser();
+        configureParser(parser, verifyDataType, stopAtFirstError, extractionContext, extractionResult);
+        return parser;
+    }
+
+    /**
      * Configures the given parser on the specified extraction result
      * setting the policies for data verification and error handling.
      *


