forrest-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: [DRAFT] request Apache Forrest TLP setup
Date Tue, 01 Jun 2004 17:11:58 GMT
Antonio Gallardo wrote:

> Stefano Mazzocchi dijo:
> 
>>Antonio Gallardo wrote:
>>
>>
>>>I am aware there can be more pages generated, but at least 6 links for free
>>>on the net is a good deal, right? ;-)
>>
>>Antonio, having a link on the net is useless in google-sense if there is
>>no <a href=""> tag around it.
> 
> Not sure. I think they index even outside <a href="">. For example, see:
> 
> http://www.google.com.ni/search?hl=en&ie=UTF-8&q=%22www.emmss.com%22+cocoon+wiki&btnG=Search
> 
> The third result is a page with no <a href="">:
> 
> http://www.svg.org/wiki/ow.asp?p=OtherImplementations&a=diff
> 
>>Don't get me wrong, I'm the first to think that the web is becoming too
>>big of a place for open wikis to work (just like for open email to work),
>>but I think you are just irrationally overreacting if you think they get
>>that much benefit out of it.
> 
> I cannot get you wrong. ;-)
> 
> It doesn't take me too much time. I'm just curious about it, and I wonder
> why they are doing this. I don't read Chinese, so I don't know what they
> write. Why are they using robots to do that? If it doesn't produce more
> page results, then what is the deal behind it?
> 
> Why I think they are using robots:
> 
> I wrote a mail 15 days ago [1]. At that time they had about 19,300 results.
> Now they have 69,000 results on Google [2]. I don't believe people can do
> that by hand. Doing simple maths (and supposing people use 6 hours daily to
> sleep and eat), they would need to put up a link every 20 seconds. It must
> be a robot!

Of course it's a robot. My point is: the only useful attack on pagerank 
is the "linking attack", and using robots to hack wikis is a great 
strategy to improve the ranking of your site.
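
For what it's worth, the rate Antonio estimates above can be checked in a few lines. This is only a rough sketch: it uses the figures quoted from his mail and assumes, as he does, that every new Google result corresponds to one link placed by hand.

```python
# Figures quoted from Antonio's message; the variable names are illustrative only.
results_before = 19_300
results_after = 69_000
days_elapsed = 15
active_hours_per_day = 24 - 6  # 6 hours a day set aside for sleeping and eating

# Assumption: one new search result corresponds to one link that was posted.
new_links = results_after - results_before
active_seconds = days_elapsed * active_hours_per_day * 3600
seconds_per_link = active_seconds / new_links

print(f"{new_links} new results, roughly one every {seconds_per_link:.1f} seconds")
```

That comes out just under 20 seconds per link, which matches his estimate.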

Now, for this to work, the hacked page has to include an <a 
href="">...</a> tag pointing to your site; otherwise google doesn't use 
that information in its graph-analysis pagerank system.
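
To illustrate the distinction, here is a minimal sketch of how a crawler might separate link edges from plain URL mentions. It is not Google's actual implementation; the class and function names and the spam.example domain are made up for the example.

```python
import re
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects href targets of <a> tags, i.e. the edges a link graph would see."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def link_edges(html):
    """Return only the URLs that appear as <a href="..."> targets."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


def url_mentions(html):
    """Crude text search: finds URLs whether or not they sit inside an anchor."""
    return re.findall(r"https?://[^\s\"'<>]+", html)


# Hypothetical pages; spam.example is a placeholder domain.
plain_page = "<p>Visit http://spam.example/ today</p>"
linked_page = '<p>Visit <a href="http://spam.example/">our site</a> today</p>'

print(link_edges(plain_page))    # []
print(url_mentions(plain_page))  # ['http://spam.example/']
print(link_edges(linked_page))   # ['http://spam.example/']
```

The plain-text URL shows up in a text search but contributes no edge, which is the case I'm describing.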

So, this means that their web site URL gets tokenized and indexed (as 
you demonstrate above), but that will *NOT* help increase the pagerank 
value of the pages hosted at that URL.
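
A toy calculation makes the same point. The sketch below is a simplified textbook power iteration, not Google's real system; the damping factor of 0.85, the iteration count, and the three-page graphs are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified power-iteration PageRank over {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graphs. In the first, the wiki only mentions the spam URL as
# plain text, so there is no edge. In the second, the wiki links to it.
mention_only = {"wiki": ["home"], "home": ["wiki"], "spam": []}
with_anchor = {"wiki": ["home", "spam"], "home": ["wiki"], "spam": []}

rank_without = pagerank(mention_only)
rank_with = pagerank(with_anchor)
print("spam rank, URL as plain text:", rank_without["spam"])
print("spam rank, URL inside <a>:   ", rank_with["spam"])
```

In this toy model the spam page's score only rises above the baseline once the wiki page carries a real link to it.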

Again, my point is: it doesn't matter if they have their URL tokenized a 
million times; what's important is that the pagerank value of that URL 
remains rubbish. And for this we are safe if and only if the wiki diff 
emails don't get rendered with <a href=""> tags around URLs [but here 
I'm not sure if we do that!]

-- 
Stefano.

