Date: Wed, 17 Jul 2013 19:59:46 +0000 (UTC)
From: "Andrey Kutuzov (JIRA)"
To: dev@any23.apache.org
Reply-To: dev@any23.apache.org
Subject: [jira] [Commented] (ANY23-165) "Invalid content" error if TITLE precedes encoding declaration in the document

    [ https://issues.apache.org/jira/browse/ANY23-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13711544#comment-13711544 ]

Andrey Kutuzov commented on ANY23-165:
--------------------------------------

Perhaps it has something to do with https://issues.apache.org/jira/browse/ANY23-115

> "Invalid content" error if TITLE precedes encoding declaration in the document
> ------------------------------------------------------------------------------
>
>                 Key: ANY23-165
>                 URL: https://issues.apache.org/jira/browse/ANY23-165
>             Project: Apache Any23
>          Issue Type: Bug
>          Components: encoding
>    Affects Versions: 0.8.0
>         Environment: Linux 2.6.18-308.11.1.el5 #1 SMP Tue Jul 10 08:48:43 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Andrey Kutuzov
>              Labels: encoding
>             Fix For: 0.9.0
>
>         Attachments: kinopoisk.html.gz
>
>
> When any23 is asked to extract semantics from a web document that is not in UTF-8 and in which TITLE precedes the encoding declaration, any23 fails with the error "Invalid content '"
> Example of such a URL:
> http://www.kinopoisk.ru/film/565993/
> A compressed dump of this page is attached.
> any23 http://www.kinopoisk.ru/film/565993/
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
> ------------------------------------------------------------------------
> Apache Any23 :: rover
> ------------------------------------------------------------------------
> @prefix dcterms: .
> dcterms:title "Ïèðàíüè 3DD" .
> ------------------------------------------------------------------------
> Apache Any23 FAILURE
> Execution terminated with errors: Invalid content ''
> Total time: 1s
> Finished at: Mon Jul 15 20:31:14 MSK 2013
> Final Memory: 67M/479M
> ------------------------------------------------------------------------

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
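
A note on the dcterms:title value in the quoted output: it looks like text that was decoded with the wrong single-byte charset. The sketch below is only an illustration of that symptom, not a diagnosis of Any23's internals. It assumes the page is served as windows-1251 and that the title was read as ISO-8859-1; neither is stated in the report. The name `redecode` is hypothetical.

```python
# Title literal as it appears in the quoted any23 output.
garbled_title = "Ïèðàíüè 3DD"


def redecode(text: str, wrong: str = "iso-8859-1", right: str = "windows-1251") -> str:
    """Recover the original bytes by re-encoding with the charset that was
    (presumably) used by mistake, then decode them with the intended one.

    Both charset names are assumptions made for this example.
    """
    return text.encode(wrong).decode(right)


if __name__ == "__main__":
    # Prints the Cyrillic title: Пираньи 3DD
    print(redecode(garbled_title))
```

If that assumption holds, the title was parsed before the charset declaration was taken into account, which matches the condition described in the issue (TITLE appearing before the encoding declaration).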