incubator-any23-dev mailing list archives

From Richard Cyganiak <richard.cygan...@deri.org>
Subject N-Quads in Re: Upgrade to Tika 1.2 [WAS] Re: [ANNOUNCE] Welcome Peter Ansell as Any23 PPMC member and committer
Date Wed, 08 Aug 2012 09:33:23 GMT
Hi Michele,

On 8 Aug 2012, at 10:12, Michele Mostarda wrote:
> the only thing I would stress is to avoid breaking the support
> for IRIs in N-Quads[0] present in the current Any23 version of the parser.
>
> I know it is not compliant with the N-Quads standard, but we introduced this feature
> because Sindice[1] (which uses Any23 to distill RDF content from collected pages)
> is constantly crawling a lot of N-Quads documents written with IRI encoding.

I'm not sure what you mean when you say that the IRI support in Any23 isn't compliant with
the N-Quads standard. Can you elaborate?

I'd say that N-Quads as defined in [0] supports IRIs.

Best,
Richard

> 
> What I suggest as a general approach is to add flags to enforce validation, or just to produce
> warnings when non-standard data is detected, rather than dropping support for data that is
> not fully standard.
> 
> I would also suggest promoting an upgrade of the N-Quads standard from URI to IRI support.
> Richard, any advice about this?
> 
> The best.
> Mic
> 
> [0] http://sw.deri.org/2008/07/n-quads/
> [1] http://sindice.com/
> 
