From: Lewis John McGibbney
Reply-To: user@nutch.apache.org
Date: Fri, 09 Feb 2018 20:45:12 -0000
Subject: Re: NUTCH-1129, Any23, microdata parsing, indexing, and extraction?
In-Reply-To: <77BEE181-F2B8-42FE-888D-46520B1E85AE@zion.com>

Hi David,

We are in the process of releasing Any23 2.2, which will include the fix. We can then come back to Nutch and make the upgrade, and you should be all set. Hopefully this will be done within about 72 hours.

In the meantime, you can clone, build, and deploy Any23 master. This will do the trick.

Lewis

On 2018/02/09 07:31:10, David Ferrero wrote:
> Thank you for this information. Since this is very much related to Any23 and microdata parsing, I’m going to ask what I believe is a related question but keep this same thread so it will be organized in one place:
>
> I noticed a lot of job boards such as dice.com, monster.com, etc. use http://schema.org/JobPosting information, however many seem to use