cocoon-dev mailing list archives

From Bert Van Kets <b...@vankets.com>
Subject Re: Search Engine Optimization and Cocoon (long!)
Date Fri, 12 Apr 2002 22:10:16 GMT
SEO is a very vague and ever-changing science.  There's a lot to know about 
each and every search engine, and the rules are different for every one of 
them.  I will only give the major, more general rules to get you going.

----------------------------------------------------------------------------------------------------------------------------------------------------------

To know how your site can be found you have to know how people look for 
your site.  What do you do when you surf the web?  You look for textual 
information!  You enter a keyword or key phrase into a textbox and ask the 
search engine to query its database and come up with web pages that are as
relevant as possible.
- Remember the importance of the keyword or keyphrase here. -
Now you probably get that the trick is to know two things:
1) what keywords or keyphrases do people actually use
2) what are the rules that make a page relevant to a specific keyword or 
keyphrase (where do you apply the keywords)

Part 1: The keywords used by the visitor
----------------------------------------------------------
A. Try to find them yourself
Brainstorm about ALL possible keywords that can be applied to your 
site.  Write them all down.  You might end up with about 200 words or 
phrases.  Don't just list product names or descriptions of products, but 
also think about what people would search for when they don't know your 
product.  People who don't know what an electric drill is don't search for 
a drill but for a "hole in a wall".
Also avoid keywords that are too general.  If your site is about second 
hand cars, you don't want people searching for new cars or car 
parts.  Getting relevant visitors is what we are after.
B. See what your competitors are doing
Check out the sites from your competitors.  See what keywords they are 
using and more importantly the most important words they are using in the 
body of the major pages.
C. Filter out the most relevant keywords.
Go over the list and select the keywords you think are best.  Keep a list 
of about 50 keyphrases, knowing that people use keyphrases more than 
single keywords.
D. Test your keywords and keyphrases
Now go out to some of the major search engines like Google, AltaVista, 
Excite, MSN, AOL, Netscape, Northern Light, Lycos, etc. and see how many 
times each of your keywords is found.  This will give you an idea of the 
relevance of the words in sites.  Don't use this as a definitive guide to 
the relevance, only use it as an indication.
If you want to be absolutely sure about your keywords you have to use a 
database of keyphrases entered into a series of search engines.  WordTracker 
(http://www.wordtracker.com/) is the only service I know of that can do this.

After you have done all this hard work you should have a list of five to 
ten major keywords and ten to twenty secondary ones.  It is very important 
that you stick with these for a long time, so you'd better be sure you have 
the right ones.  SEO is a slooooow process and it can easily take 6 months 
to see the results of a decision you made.

Part 2: Where do you use the keywords
---------------------------------------------------------
A. What can be indexed?  Text, text and nothing but text.  Text in graphics 
will not be indexed!  Why do you think graphics have Alt tags?  Alt tags 
are compulsory, BTW!
So if you have a site with only a teeny weeny bit of text, how do you 
suppose your site can be indexed and found?  The first and most important 
rule of SEO is content, content and some more content.
B. Pick your page
Pick a page you want to use to promote a keyword.  When this page is 
indexed, the robot should conclude that the keyword is very important.  Use 
one page for each keyword you want to promote.  It is nearly impossible to 
promote multiple keywords on one page.
C. Page title
The page title is the most important part of a page.  Just as you judge a 
book by its title, a robot will judge the page by its title.  Most search 
engines like your keyword only once in the title.  Too much and you are 
regarded as spam, and that is the last thing you want.
Make sure the title tag is the first tag below the head tag.
D. The meta keywords and meta description tag.
More and more search engines skip these and try to filter out the keywords 
themselves.  Add your keyword and some variations like plurals.
The description tag is important since it will be used by the search engine 
to describe the page.  In the list returned by the search engine most of 
the time you will see the page title with the link on it and below it the 
description.  A good title and a good description can lure visitors by 
themselves (provided you can get your page in the top 30 in some search 
engines).
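Putting C and D together, a sketch of a head section for a hypothetical 
page promoting the keyword "spaghetti" (all names and texts here are made 
up for illustration):

```html
<head>
  <!-- Title first below head, keyword only once -->
  <title>Spaghetti recipes for every occasion</title>
  <!-- Keyword plus a few variations like plurals -->
  <meta name="keywords" content="spaghetti, spaghetti recipe, spaghetti recipes">
  <!-- Usually shown under the title in the result list -->
  <meta name="description" content="Classic spaghetti recipes with step by step instructions.">
</head>
```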
E. Page body
Keywords are only found relevant when they appear in the body of the 
page.  Use your keyword in the first and the last paragraph. That's where 
they will have the most relevance.  Use about 400 words in the body.
F. Links
A link on the word spaghetti that points to a page called spaghetti.html in 
the directory spaghetti, where that page has a high keyword relevance on 
spaghetti, MUST be about spaghetti, no?  Try to implement two to three 
links containing your keyword in the page.
This is VERY important: use ONLY <a href> links for your 
navigation.  Hardly any robot will interpret JavaScript, so window.location 
links are not followed.  Robots don't submit pages either.  Don't rely on 
navigation using <input> tags.
Flash or Applets are not followed either!  So forget about the fancy, 
flashy sites.  They don't get anywhere.  Cocoon makes navigation 
maintenance easy, why use client side scripts?
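To make the contrast concrete, a hypothetical navigation snippet (the 
paths are made up):

```html
<!-- Followed by robots: a plain href link, keyword in the link text -->
<a href="/spaghetti/spaghetti.html">spaghetti recipes</a>

<!-- NOT followed by robots: JavaScript navigation -->
<span onclick="window.location='/spaghetti/spaghetti.html'">spaghetti recipes</span>
```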
G. Alt tags
Alt tags are the only way to get the content of your images indexed.  Use them!
If you want to address as many visitors as possible, don't forget the 
visually impaired and give them something they can see in their own 
way.  Braille readers can only show text, so if there are no alt tags on 
your navigation buttons, these people won't know where they are, or where 
they can go.  Use XSLT to make sure every image gets its alt tag.
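As a sketch, an XSLT 1.0 template (assuming an HTML-like source where img 
carries a src attribute) that passes every img through and falls back to 
the file name when no alt is present.  A real description is of course 
better than a file name:

```xml
<!-- Copy every img; add an alt attribute when the source lacks one -->
<xsl:template match="img">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:if test="not(@alt)">
      <xsl:attribute name="alt">
        <xsl:value-of select="@src"/>
      </xsl:attribute>
    </xsl:if>
  </xsl:copy>
</xsl:template>
```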

Part 3: Site structure
------------------------------
The best way to structure your site is to use one keyword for every 
part of the site.  Use that keyword as the name of the directory where the 
files are located.  Of course in Cocoon you don't really need to place 
the XML files in that directory.  A pipeline that mimics the directory is 
good too.  The robot only needs to think the HTML files are in a certain 
directory.
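For example, a Cocoon 2 sitemap fragment along these lines (the file and 
stylesheet names are made up) serves spaghetti/*.html without the XML 
sources ever living in a spaghetti directory:

```xml
<map:pipeline>
  <!-- The robot sees spaghetti/carbonara.html; the XML lives elsewhere -->
  <map:match pattern="spaghetti/*.html">
    <map:generate src="content/{1}.xml"/>
    <map:transform src="stylesheets/page2html.xsl"/>
    <map:serialize type="html"/>
  </map:match>
</map:pipeline>
```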
Links between your pages are VERY important to Google, the most important 
search engine.  A good internal link structure is very important.  Make 
sure you link your pages to each other where possible.  The best thing you 
can do is use a regular HTML menu structure (like the Cocoon site).
The home page is regarded as the most important page in the site (like a 
book cover).  The links to and from that page are therefore important 
too.  So DON'T use splash screens!  They will mess up the whole site 
structure.
Another reason why you shouldn't use splash screens is that robots don't 
follow more than two or three links deep.  So if you use a splash screen 
you lose one level!
Use a sitemap and link it to the homepage.  It's even better if you don't 
call it "sitemap" and add some regular text on the page.  AltaVista won't 
follow or index pages containing only links.  I guess you get the 
picture.  Cocoon can generate a sitemap automatically.

Part 4: external links
------------------------------
Google was the first to use links between sites as a measure of 
relevance.  Sites having a lot of links to them are called authorities. 
Sites having a lot of links to other sites are called hubs.  Both of these 
types have a higher relevance.  In the last few years Google has added 
relevance to these links by introducing "themes".  Only links between your 
site and sites that have the same "theme" are regarded as relevant.  So 
beware of who links to you and who you link to.
Check the number of links to your site in search engines.  Most of the 
time you can do this by using link:www.yoursite.com
IBM has done a study where they found that only half the web is linked 
together.  The other half has links to or from the central part.  There are 
also some islands that don't even link to that main chunk.  IBM called this 
the bow tie theory.  To the left are the sites that link to the knot.  The 
knot contains sites that are linked both ways and to the right are sites 
that have links from the knot to them.  If you think about the way people 
surf the web, clicking from one page to another, from one site to the next, 
you can see that it is very important to be in the center part.  Get as 
many *relevant* links to and from your site as possible.

Part 5: submit
---------------------
A. Submit to major SE's
That you need to submit your pages is pretty obvious, but don't overdo
it.  If you submit too many times you can get on the spam list!  Check the 
server logs to see if your site is indexed.  If not, compare it with the 
search engine's help to see how long it normally takes to index a site.
Don't assume that your site will be indexed because you submitted.  Keep 
track of your submissions!
In an ideal situation you submit only the home page or the domain.  Let the 
robot find the rest, even if it takes some time.  This will give a higher 
relevance to the pages the robot found by itself.
B. Don't submit to thousands of SE's
Most of these SE's are simple lists and can't even be queried.  Some of 
them are SE's for specific topics.  You don't want a business site to be 
listed in a humor search engine, do you?
Focus on the 18 biggest search engines and you'll be amazed what that can 
do for your site.

Part 6: the results
--------------------------
Use the server logs to see
- what keywords are used by people to find your site
- what are the most popular pages (don't change these)
- which page do you need to adjust to get a higher ranking
- what search engine gives a lot of hits
- where are people coming from
- how long are people staying on the site
- what route are they following, adjust your navigation and page content to 
manipulate this
- through what page do they enter the site, find out why
- what robots visited the site

Part 7: other things you should know
-----------------------------------------------------
A. Querystrings (everything behind a ? in the URL)
Most major search engines hate querystrings.  They assume that query 
strings are used for database access and dynamic page generation.  This can 
lead them into a "black hole" where they eventually index a complete 
database.  AltaVista clearly states that they will index a page with 
querystrings, but won't follow any of its links.  Google is one of the 
first to start indexing pages with querystrings.  They are very cautious 
and will only go a certain number of levels deep.
B. Page extensions
This is a good one for M$ haters.  ASP pages are not indexed by 
AltaVista.  Other search engines are a bit cautious too.  .asp clearly 
says "server side scripting", so dynamic pages and possible hell for robots.
C. Give the robots what they want: well structured HTML
Cocoon is a perfect platform for this.  Through XSLT you can create perfect 
pages each and every time.  Providing your XSL files are perfect, of 
course.  When pages are created manually, chances are that some human error 
is made on some page and that the HTML doesn't render correctly.
D. Don't try to fool robots
It's very easy in Cocoon to provide different content to robots than you 
provide to a regular visitor.  Robots have their own client name.  Google's 
robot is called "googlebot".  The robot AltaVista is using is called 
"scooter".  If you provide similar but optimized content you could get 
away with it, but if you serve different, more popular content you are 
luring visitors to your site under false pretenses.  Search engines hate 
this since their users don't get what they are looking for.  They look for 
"sex" (the most important keyword on the net) and get a site about 
spaghetti.  Sites that use this technique will be put on a spam list and 
banned from the search engines.  If the spam violations keep coming, the 
IP address can get banned.  You can guess how happy your webmaster will be 
when he hears he has to move all his virtual hosts to another IP 
address.  You can start looking for another host right there and then.
E. Getting content
One of the major concerns in site creation is getting the content.  This is 
the strong point of Cocoon.  By separating presentation, logic and content 
it's a LOT easier.  Using the right XML editor, like XMLSpy, it's even easy 
for a customer to enter the texts themselves without corrupting the logic 
or navigation.  If you let the client edit the navigation XML files (e.g. 
the book.xml files in the Cocoon documentation) the client can update the 
site on his own.
Add dynamic form creation for updating the navigation files and WYSIWYG 
editing for the actual content, combine this with user authentication, and 
you've got one hell of a platform that gives the client total control over 
the site without any need for knowledge of the technology behind it.  This 
is what I'm building, BTW.
F. Browser support
Using client detection and different XSLT files it is rather easy to 
create a site that can be viewed in all browsers.  Make sure your site is 
viewable in
- IE 4 and up
- Netscape 4 and up
- Opera 4 and up
- Lynx (for the visually impaired a perfect test)
This way you can be sure you don't miss out on visitors simply because your 
site doesn't look right in their browser.  It's a bit of work, but always 
keep in mind that you must adjust yourself to the visitor and not the other 
way around.
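A sketch of what that selection could look like in the sitemap, assuming 
the browser selector is declared there and that the browser names below 
match that declaration (they are purely illustrative, as are the file 
names):

```xml
<map:match pattern="*.html">
  <map:generate src="content/{1}.xml"/>
  <!-- Pick a stylesheet per browser family -->
  <map:select type="browser">
    <map:when test="netscape4">
      <map:transform src="stylesheets/page2html-ns4.xsl"/>
    </map:when>
    <map:when test="lynx">
      <map:transform src="stylesheets/page2text.xsl"/>
    </map:when>
    <map:otherwise>
      <map:transform src="stylesheets/page2html.xsl"/>
    </map:otherwise>
  </map:select>
  <map:serialize type="html"/>
</map:match>
```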
G. Page content
Make sure EVERY page of your site answers these questions when the visitor 
gets there:
- Where am I?
- What can I find here?
- Where can I go?
If you don't answer one of these questions the visitor will leave, and 
that's NOT what we want.


Part 8: Want to know more about SEO?  Check out these sites
----------------------------------------------------------------------------------------

Firstplace Software
         Lots of info and unique promotion- and optimisation software
         http://www.firstplacesoftware.com

Search Engine Watch
         Search engine and spider info
         http://www.searchenginewatch.com

Search Engine World
         Search engine and spider info
         http://www.searchengineworld.com

Cre8PC
         Web design, tools, tutorials and web promotion
         http://www.cre8PC.com

AIM Pro
         Internet Marketing tips, tools and services
         http://www.aim-pro.com

SmallZine
         http://www.smallzine.nl/

About.com Web design tips
         Pure web design
         http://webdesign.about.com/cs/designtips/

Webmonkey
         A compilation of information
         http://hotwired.lycos.com/webmonkey/

AnyBrowser
Check your browser compatibility
         http://www.anybrowser.com/

Xenu link checking tool
         http://home.snafu.de/tilman/xenulink.html

Hitbox
         Web traffic analyser
         http://get.hitbox.com/cgi-bin/getit.cgi?hb&hb_intro

WordTracker
         Find the right keywords for your site
         http://www.wordtracker.com/

I-Marketeer
         Internet Marketing in Belgium
         http://www.i-marketeer.com/

Search Engine Optimization Strategies
         All kind of info regarding SEO
         http://strategies.topsitelistings.com/


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

