any23-user mailing list archives

From Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
Subject Re: Extracting Meta Tags
Date Tue, 08 Dec 2015 03:59:02 GMT
Hi Frank,

On Mon, Dec 7, 2015 at 3:50 PM, <user-digest-help@any23.apache.org> wrote:

>
> I'm trying to extract meta tags from webpages.  I'm using the code below
> but am finding that only a small subset of the meta tags is being returned.
> There are meta tags, like those for Facebook Open Graph, that I am
> interested in but that are not being returned.
>

The default Any23 configuration [0] enables extraction of HTML head meta
tags, so there is no need to change this setting; extraction of HTML meta
tags 'should' already be happening.
You are also correctly defining this within your code as below!
Can you please post an example of a URL we can test against?
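
To rule out an overridden setting on your side, here is a minimal sketch
that only uses java.util.Properties (not the Any23 API) to check the flag in
a properties text. The key name "any23.extraction.head.meta" is my
assumption about what [0] points at, and the class and method names are
hypothetical, so please verify them against the file before relying on this.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class HeadMetaFlagCheck {

    // Assumed key name; verify it against the properties file linked at [0].
    static final String HEAD_META_KEY = "any23.extraction.head.meta";

    // Returns true when the given properties text sets the flag to "on".
    // A missing key is reported as disabled, purely as a choice for this sketch.
    static boolean isHeadMetaEnabled(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail in practice.
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(HEAD_META_KEY, "off").trim();
        return "on".equalsIgnoreCase(value);
    }

    public static void main(String[] args) {
        // Hypothetical sample text standing in for the real properties file.
        String sample = HEAD_META_KEY + "=on\n";
        System.out.println(isHeadMetaEnabled(sample));
    }
}
```

If this reports the flag as on for the configuration you actually load, the
missing Open Graph tags are probably not a configuration problem.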
Thanks
Lewis

[0]
https://github.com/apache/any23/blob/master/api/src/main/resources/default-configuration.properties#L70
