forrest-dev mailing list archives

From Dave Brondsema <d...@brondsema.net>
Subject Re: [DRAFT] request Apache Forrest TLP setup
Date Tue, 01 Jun 2004 17:30:32 GMT
On Tue, 1 Jun 2004, Stefano Mazzocchi wrote:

> Antonio Gallardo wrote:
>
> > Stefano Mazzocchi dijo:
> >
> >>Antonio Gallardo wrote:
> >>
> >>
> >>>I am aware there can be more pages generated but at least 6 links for free
> >>>on the net is good deal, right? ;-)
> >>
> >>Antonio, having a link on the net is useless in google-sense if there is
> >>no <a href=""> tag around it.
> >
> > Not sure. I think they index even outside <a href="">. As a sample see:
> >
> > http://www.google.com.ni/search?hl=en&ie=UTF-8&q=%22www.emmss.com%22+cocoon+wiki&btnG=Search
> >
> > The third answer is a page with no <a href="">:
> >
> > http://www.svg.org/wiki/ow.asp?p=OtherImplementations&a=diff
> >
> >>Don't get me wrong, I'm the first to think that the web is becoming too
> >>big of a place for open wikis to work (just like to open email to work),
> >>but I think you are just irrationally overreacting if you think they get
> >>that much benefit out of it.
> >
> > I cannot get you wrong. ;-)
> >
> > It doesn't take me too much time. I'm just curious about it, and I wonder
> > why they are doing this. I don't read Chinese. I don't know what they write.
> > Why are they using robots to do that? If it is not for more page results,
> > then what is the deal behind it?
> >
> > Why I think they are using robots:
> >
> > I wrote a mail 15 days ago [1]. At that time they had about 19,300
> > results. Now they have 69,000 results on Google [2]. I don't believe
> > people can do that by hand. Doing simple maths (and supposing people use
> > 6 hours daily for sleeping and eating), they would need to put in a link
> > every 20 secs. It must be a robot!
>
> of course it's a robot. My point is: the only useful attack to pagerank
> is the "linking attack" and using robots to hack wikis is a great
> strategy to improve the ranking of your site.
>
> Now, in order to do this, you have to have the page including an <a
> href="">...</a> tag pointing to it, otherwise google doesn't use that
> information for its graph analysis pagerank system.
>
> So, this means that their web site URL gets tokenized and indexed (as
> you demonstrate above), but that will *NOT* help increase the pagerank
> value of the pages hosted on that URL.
>
> Again, my point is: it doesn't matter if they have their URL tokenized a
> million times, what's important is that the pagerank value of that URL
> remains rubbish. And for this we are safe if and only if the wiki diff
> emails don't get represented with <a href=""> tags around URLs [but here
> I'm not sure if we do that!]
>

But even if the diff emails don't help them, the wiki pages themselves and
the history pages still have all the links, with <a> tags around the URLs.
So the vandal still benefits.  And of course our wiki gets defaced and
looks bad until we take the time to fix it.
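
To make the distinction concrete, here is a minimal sketch, assuming (as
Stefano argues above) that a crawler only adds an edge to its link graph
for the href target of an <a> tag. This is an illustration of that
assumption, not Google's actual implementation. The AnchorCollector class,
the outbound_links helper, and the spam.example URL are all made up for the
example.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect href targets of <a> tags; URLs that appear only as text are ignored."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)


def outbound_links(html):
    """Return the list of href targets found in anchor tags of the given HTML."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


# Hypothetical diff email: the URL is plain text, so no link edge is found.
diff_mail = "<p>Added: http://spam.example/ to the page</p>"

# Hypothetical wiki or history page: the same URL is wrapped in an anchor tag.
wiki_page = '<p>Added: <a href="http://spam.example/">http://spam.example/</a></p>'

print(outbound_links(diff_mail))  # []
print(outbound_links(wiki_page))  # ['http://spam.example/']
```

Under that assumption the plain-text diff mail contributes nothing, while
the rendered wiki page and its history page each contribute a link.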

-- 
Dave Brondsema : dave@brondsema.net
http://www.brondsema.net : personal
http://www.splike.com : programming
http://csx.calvin.edu : student org
