any23-dev mailing list archives

From govind nitk <govind.n...@gmail.com>
Subject Re: [DISCUSS] Release Any23 2.3?
Date Fri, 30 Nov 2018 14:21:30 GMT
One observation with the CLI tool.

With any23 2.2:

./bin/any23 rover "https://www.bbc.com/sport/football/46377603" -o /tmp/any23_2.2
------------------------------------------------------------------------
Apache Any23 :: rover
------------------------------------------------------------------------

Nov 30, 2018 7:45:32 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: JBIG2ImageReader not loaded. jbig2 files will be ignored
See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
for optional dependencies.
TIFFImageWriter not loaded. tiff files will not be processed
See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
for optional dependencies.
J2KImageReader not loaded. JPEG2000 files will not be processed.
See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
for optional dependencies.

Nov 30, 2018 7:45:32 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded.
Please provide the jar on your classpath to parse sqlite files.
See tika-parsers/pom.xml for the correct version.
0    [main] INFO  org.apache.any23.rdf.PopularPrefixes  - Loading prefixes from /org/apache/any23/prefixes/prefixes.properties
1113 [main] INFO  org.apache.any23.extractor.SingleDocumentExtraction  - Processing https://www.bbc.com/sport/football/46377603
3127 [main] INFO  org.apache.any23.cli.Rover  - Extractors used: [html-head-meta, html-head-title, html-rdfa11]
3127 [main] INFO  org.apache.any23.cli.Rover  - 55 triples, 3083ms

------------------------------------------------------------------------
Apache Any23 SUCCESS
Total time: 4s
Finished at: Fri Nov 30 19:45:35 IST 2018
Final Memory: 40M/143M
------------------------------------------------------------------------



With the any23 2.3 snapshot CLI built locally:

./bin/any23 rover "https://www.bbc.com/sport/football/46377603" -o /tmp/any23_2.3

1    [main] ERROR org.apache.any23.writer.WriterFactoryRegistry  - Found error loading a WriterFactory
java.util.ServiceConfigurationError: org.apache.any23.writer.WriterFactory: Provider org.apache.any23.cli.flows.PeopleExtractorFactory not found
    at java.util.ServiceLoader.fail(ServiceLoader.java:239)
    at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:372)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
    at org.apache.any23.writer.WriterFactoryRegistry.<init>(WriterFactoryRegistry.java:90)
    at org.apache.any23.writer.WriterFactoryRegistry$InstanceHolder.<clinit>(WriterFactoryRegistry.java:54)
    at org.apache.any23.writer.WriterFactoryRegistry.getInstance(WriterFactoryRegistry.java:129)
    at org.apache.any23.cli.Rover.<clinit>(Rover.java:76)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
    at org.apache.any23.cli.ToolRunner.execute(ToolRunner.java:95)
    at org.apache.any23.cli.ToolRunner.execute(ToolRunner.java:72)
    at org.apache.any23.cli.ToolRunner.main(ToolRunner.java:68)

------------------------------------------------------------------------
Apache Any23 :: rover
------------------------------------------------------------------------

2244 [main] WARN  org.apache.http.client.protocol.ResponseProcessCookies  - Invalid cookie header: "Set-Cookie: BBC-UID=ca727e6c3a3b33f842e8878f6fafd0e83567ff24f7978b58d536e1eec83ce2590Any23-CLI; expires=Tue, 29 Nov 2022 14:15:41 GMT; path=/; domain=.bbc.com". Invalid 'expires' attribute: Tue, 29 Nov 2022 14:15:41 GMT
4384 [main] INFO  org.apache.any23.cli.Rover  - Extractors used: [html-head-meta, html-scraper, html-head-title, html-rdfa11]
4384 [main] INFO  org.apache.any23.cli.Rover  - 59 triples, 2568ms

------------------------------------------------------------------------
Apache Any23 SUCCESS
Total time: 4s
Finished at: Fri Nov 30 19:45:43 IST 2018
Final Memory: 75M/187M
------------------------------------------------------------------------


With the locally built snapshot, the run starts with the following error, printed before the rover banner, although it still ends with SUCCESS:
[main] ERROR org.apache.any23.writer.WriterFactoryRegistry  - Found error loading a WriterFactory
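
For anyone reading the trace above: "Provider ... not found" is the message java.util.ServiceLoader produces when a provider-configuration file under META-INF/services names a class that cannot be loaded, and an error on one entry does not necessarily stop iteration over the rest. Below is a minimal, self-contained sketch of that behaviour. It is not Any23 code: Greeter, RealGreeter and example.missing.GhostGreeter are hypothetical names, and the skip-and-continue loop is only an assumption about how a registry might tolerate a bad entry. The ServiceLoader documentation describes recovery after such an error as best effort, not guaranteed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;

final class ServiceLoaderSketch {

    /** Hypothetical service interface (a stand-in, not an Any23 type). */
    public interface Greeter {
        String greet();
    }

    /** Hypothetical provider that really is on the classpath. */
    public static class RealGreeter implements Greeter {
        public RealGreeter() {}

        @Override
        public String greet() {
            return "hello";
        }
    }

    /** Loads providers one by one, recording failures instead of aborting. */
    static List<String> run() {
        List<String> report = new ArrayList<>();
        try {
            // Throwaway classpath root holding only a provider-configuration file.
            // (The temporary directory is not cleaned up in this sketch.)
            Path root = Files.createTempDirectory("spi-sketch");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
            // First entry names a class that does not exist; second one is real.
            Files.write(services.resolve(Greeter.class.getName()),
                    Arrays.asList("example.missing.GhostGreeter", RealGreeter.class.getName()),
                    StandardCharsets.UTF_8);

            try (URLClassLoader loader = new URLClassLoader(
                    new URL[] {root.toUri().toURL()},
                    ServiceLoaderSketch.class.getClassLoader())) {
                Iterator<Greeter> it = ServiceLoader.load(Greeter.class, loader).iterator();
                while (true) {
                    try {
                        if (!it.hasNext()) {
                            break;
                        }
                        report.add("loaded: " + it.next().greet());
                    } catch (ServiceConfigurationError e) {
                        // Same error type as in the trace; skip this entry and go on.
                        report.add("skipped: " + e.getMessage());
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return report;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this sketch the missing entry is reported and the real provider is still loaded, which is consistent with the 2.3 snapshot run logging the error and then carrying on to SUCCESS.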




On Thu, Nov 29, 2018 at 11:56 PM lewis john mcgibbney <lewismc@apache.org>
wrote:

> Hi dev@,
> Is there anything else we want to include in the 2.3 development drive or
> can we go ahead and produce a release candidate?
> Thanks
> Lewis
>
> --
> http://home.apache.org/~lewismc/
> http://people.apache.org/keys/committer/lewismc
>
