commons-dev mailing list archives

From Brad Neuberg <b...@columbia.edu>
Subject [feedparser] Patch to Refactor BlogService Infrastructure
Date Thu, 21 Oct 2004 16:50:52 GMT
This patch refactors the BlogService infrastructure. Background and details:

Before this patch we had an object named BlogService that encapsulated the
kinds of blog services we deal with, such as Flickr, Blogger, etc. This was
needed because many services have peculiarities that differ from the RSS
spec. For example, some major services don't offer RSS autodiscovery, and
others, such as TextAmerica, even have incorrect RSS autodiscovery links.
Aggregators such as Rojo or Bloglines would still like to be able to
autodiscover these feeds, since they represent major weblog providers that
users will expect to work.

Before this patch we had two classes, BlogServiceDiscovery and
ProbeLocator. These two classes took the BlogService object that had been
discovered and, if aggressive probing was enabled, used it to try to find a
feed (aggressive probing is turned off by default and can only be enabled
through a recompilation). A lot of blog-service-specific code was exposed in
these classes, and that approach breaks down for two newer "problem"
services, Flickr and Yahoo Groups.

In this patch I heavily refactored the way this portion of the system 
works, as follows:

Every BlogService object now encapsulates how to find its particular feed 
and how to discover if a given URL is of that BlogService type; instead of 
having the knowledge of how to deal with each kind of service sprinkled 
throughout the BlogServiceDiscovery and ProbeLocator classes, the 
information is now encapsulated in discrete subclasses of BlogService, such 
as org.apache.commons.feedparser.locate.blogservice.Blogger. The
BlogServiceDiscovery and ProbeLocator classes then use the visitor pattern, 
"asking" each kind of BlogService object to see if it is that weblog type 
and to locate its feed. This significantly cleans up the 
BlogServiceDiscovery and ProbeLocator code, and also makes it possible to 
support services that need more advanced workarounds, such as Flickr.
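
As a rough illustration of the design described above (not the code in the
patch), here is a minimal, self-contained sketch. Every name in it
(BlogServiceSketch, Service, BloggerLike, AolJournalLike, candidateFeeds) is
hypothetical. It assumes that the first service to recognize a URL wins, and
its generator check is a plain substring test rather than the meta-tag regex
the real classes use.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for the design; not the feedparser API. */
class BlogServiceSketch {

    /** Each service knows how to recognize itself and where its feeds usually live. */
    abstract static class Service {
        abstract boolean isThisService(String resource, String content);
        abstract String[] feedLocations();
    }

    static final class BloggerLike extends Service {
        boolean isThisService(String resource, String content) {
            // Domain check first, then a simplified generator check (content may be null).
            return resource.contains("blogspot.com")
                || (content != null && content.toLowerCase().contains("blogger"));
        }
        String[] feedLocations() { return new String[] { "atom.xml" }; }
    }

    static final class AolJournalLike extends Service {
        boolean isThisService(String resource, String content) {
            return resource.contains("journals.aol.com");
        }
        // Ordered by quality: Atom before RSS.
        String[] feedLocations() { return new String[] { "atom.xml", "rss.xml" }; }
    }

    static final Service[] SERVICES = { new AolJournalLike(), new BloggerLike() };

    /** Asks each service in turn; the first match supplies the candidate feed URLs. */
    static List<String> candidateFeeds(String resource, String content) {
        List<String> candidates = new ArrayList<>();
        String base = resource.endsWith("/") ? resource : resource + "/";
        for (Service service : SERVICES) {
            if (service.isThisService(resource, content)) {
                for (String location : service.feedLocations()) {
                    candidates.add(base + location);
                }
                break; // assumption: the first matching service wins
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        System.out.println(candidateFeeds("http://example.blogspot.com", null));
    }
}
```

The point of the sketch is that the calling code never mentions a specific
service. Supporting a new one means adding a subclass and registering it,
which is what lets the locator classes stay small.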


The following patch adds a new subpackage,
org.apache.commons.feedparser.locate.blogservice:

Index: src/java/org/apache/commons/feedparser/locate/blogservice/AOLJournal.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/AOLJournal.java,v
retrieving revision 1.1
diff -u -B -r1.1 AOLJournal.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/AOLJournal.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/AOLJournal.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,78 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the AOL Journal blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class AOLJournal extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = containsDomain(resource, "journals.aol.com");
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        FeedReference aolJournalLocations[] =
+            { new FeedReference("atom.xml", FeedReference.ATOM_MEDIA_TYPE),
+              new FeedReference("rss.xml", FeedReference.RSS_MEDIA_TYPE) };
+
+        return aolJournalLocations;
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/BlogService.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/BlogService.java,v
retrieving revision 1.1
diff -u -B -r1.1 BlogService.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/BlogService.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/BlogService.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,234 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import java.net.*;
+import java.util.*;
+import java.util.regex.*;
+
+import org.apache.commons.feedparser.*;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the different kinds of blog services that are available.  This
+ * is needed for two reasons.  First, sometimes it is useful to simply
+ * know what provider a given weblog is being hosted by, such as Blogger
+ * or PMachine, in order to use special, non-standard capabilities.  Second,
+ * many services have "quirks" that don't follow the standards, such as
+ * not supporting autodiscovery or supporting it incorrectly, and we
+ * therefore need to know what service we are dealing with so that we
+ * can find its feed.
+ *
+ * The BlogService object encapsulates how to determine if a given
+ * weblog is of that type and how to find its feeds.  Concrete subclasses,
+ * such as org.apache.commons.feedparser.locate.blogservice.Blogger,
+ * fill in this class and provide the actual way to determine these
+ * things for each blog service type.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public abstract class BlogService {
+    protected static List blogServices = new ArrayList();
+
+    /** Subclasses should have a static block similar to the following
+     *  <code>
+     *      {
+     *          BlogService.addBlogService(new MyBlogService());
+     *      }
+     *  </code>
+     */
+
+    /** Locates all the generator meta tags
+     *  (i.e. <meta name="generator" content="someGenerator"/>)
+     */
+    protected static Pattern metaTagsPattern =
+                Pattern.compile("<[\\s]*meta[\\w\\s=\"']*name=['\" ]generator[\"' ][\\w\\s=\"']*[^>]*",
+                                Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
+
+    /**
+     * A regex to find any trailing filename and strip it
+     */
+    protected static Pattern patternToStrip = Pattern.compile("[^/](/\\w*\\.\\w*$)");
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public abstract boolean hasValidAutoDiscovery();
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public abstract boolean isThisService(String resource, String content)
+                                                throws FeedParserException;
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public abstract FeedReference[] getFeedLocations(String resource,
+                                                     String content)
+                                                throws FeedParserException;
+
+    /** Determines if the weblog at the given resource is this blog service.
+     *  @param resource A full URI to this resource, such as
+     *  "http://www.codinginparadise.org".
+     *  @throws FeedParserException Thrown if an error occurs while
+     *  determining the type of this weblog.
+     */
+    public boolean isThisService(String resource) throws FeedParserException {
+        return isThisService(resource, null);
+    }
+
+    /** This method takes a resource, such as "http://www.codinginparadise.org/myweblog.php",
+     *  and gets the path necessary to build up a feed, such as
+     *  "http://www.codinginparadise.org/".  Basically it appends a slash
+     *  to the end if there is not one, and removes any file names that
+     *  might be at the end, such as "myweblog.php".
+     *
+     *  There is a special exception for some Blosxom blogs,
+     *  which have things inside of a cgi-script and 'hang' their RSS files
+     *  off of this cgi-bin.  For example,
+     *  http://www.bitbucketheaven.com/cgi-bin/blosxom.cgi has its RSS file
+     *  at http://www.bitbucketheaven.com/cgi-bin/blosxom.cgi/index.rss, so
+     *  we must return the blosxom.cgi at the end as well for this method.
+     *
+     *  @throws MalformedURLException Thrown if the given resource's URL is
+     *  incorrectly formatted.
+     */
+    public String getBaseFeedPath( String resource ) {
+        // strip off any query string or anchors
+        int end = resource.lastIndexOf( "#" );
+
+        if ( end != -1 )
+            resource = resource.substring( 0, end );
+
+        end = resource.lastIndexOf( "?" );
+
+        if ( end != -1 )
+            resource = resource.substring( 0, end );
+
+        Matcher fileMatcher = patternToStrip.matcher(resource);
+        if (fileMatcher.find()) {
+            String stringToStrip = fileMatcher.group(1);
+            int startStrip = resource.indexOf(stringToStrip);
+            resource = resource.substring(0, startStrip);
+        }
+
+        if ( ! resource.endsWith( "/" ) ) {
+            resource = resource + "/";
+        }
+
+        return resource;
+    }
+
+    public String toString() {
+        return this.getClass().getName();
+    }
+
+    public boolean equals(Object obj) {
+        if (obj == null)
+            return false;
+
+        if (obj instanceof BlogService == false)
+            return false;
+
+        return (obj.getClass().equals(this.getClass()));
+    }
+
+    public int hashCode() {
+        return this.getClass().hashCode();
+    }
+
+    /** Gets an array of all of the available BlogService implementations. */
+    public static BlogService[] getBlogServices() {
+        if (blogServices.size() == 0)
+            initializeBlogServices();
+
+        BlogService[] results = new BlogService[blogServices.size()];
+
+        return (BlogService[])blogServices.toArray(results);
+    }
+
+    // **** util code ***********************************************************
+    // These methods are useful for non-abstract subclasses of this object
+    // to actually implement their functionality.
+
+    /** Determines if the given resource contains the given domain name
+     *  fragment.
+     */
+    protected boolean containsDomain(String resource, String domain) {
+        return (resource.indexOf(domain) != -1);
+    }
+
+    /** Determines if the given content was generated by the given generator
+     *  (i.e. this document contains a meta tag with name="generator" and
+     *  content equal to the generatorType).
+     */
+    protected boolean hasGenerator(String content, String generatorType) {
+        if (content == null) {
+            return false;
+        }
+
+        Matcher metaTagsMatcher = metaTagsPattern.matcher(content);
+        if (metaTagsMatcher.find()) {
+            String metaTag = metaTagsMatcher.group(0).toLowerCase();
+            generatorType = generatorType.toLowerCase();
+            return (metaTag.indexOf(generatorType) != -1);
+        }
+        else {
+            return false;
+        }
+    }
+
+    protected static void initializeBlogServices() {
+        blogServices.add(new AOLJournal());
+        blogServices.add(new Blogger());
+        blogServices.add(new Blosxom());
+        blogServices.add(new DiaryLand());
+        blogServices.add(new ExpressionEngine());
+        blogServices.add(new Flickr());
+        blogServices.add(new GreyMatter());
+        blogServices.add(new iBlog());
+        blogServices.add(new LiveJournal());
+        blogServices.add(new Manila());
+        blogServices.add(new MovableType());
+        blogServices.add(new PMachine());
+        blogServices.add(new RadioUserland());
+        blogServices.add(new TextAmerica());
+        blogServices.add(new TextPattern());
+        blogServices.add(new Typepad());
+        blogServices.add(new WordPress());
+        blogServices.add(new Xanga());
+        blogServices.add(new YahooGroups());
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/Blogger.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/Blogger.java,v
retrieving revision 1.1
diff -u -B -r1.1 Blogger.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/Blogger.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/Blogger.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,81 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.*;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the Blogger blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class Blogger extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = containsDomain(resource, "blogspot.com");
+
+        if (results == false) {
+            results = hasGenerator(content, "blogger");
+        }
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                                     String content)
+                                                throws FeedParserException {
+        FeedReference bloggerLocations[] =
+            { new FeedReference("atom.xml", FeedReference.ATOM_MEDIA_TYPE) };
+
+        return bloggerLocations;
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/Blosxom.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/Blosxom.java,v
retrieving revision 1.1
diff -u -B -r1.1 Blosxom.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/Blosxom.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/Blosxom.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,133 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import java.net.MalformedURLException;
+import java.util.regex.*;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the Blosxom blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class Blosxom extends BlogService {
+
+    /** A pattern used to discover Blosxom blogs. */
+    private static Pattern blosxomPattern =
+                Pattern.compile("alt=[\"' ]powered by blosxom[\"' ]",
+                                Pattern.CASE_INSENSITIVE);
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        // This is the only kind of blog that we need to check for a
+        // 'Powered by Blosxom'.  We do this with the alt= value on the
+        // Powered By image.
+        // FIXME This might be fragile, but it is used across all of the
+        // Blosxom blogs I have looked at so far. Brad Neuberg, bkn3@columbia.edu
+
+        Matcher blosxomMatcher = blosxomPattern.matcher(content);
+        results = blosxomMatcher.find();
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        // there is sometimes an index.rss20 file, but Blosxom has a bug where
+        // it incorrectly responds to HTTP HEAD requests for that file,
+        // saying that it exists when it doesn't.  Most sites don't seem
+        // to have this file so we don't include it here.
+        // Brad Neuberg, bkn3@columbia.edu
+        FeedReference[] blosxomLocations =
+            { new FeedReference("index.rss", FeedReference.RSS_MEDIA_TYPE) };
+
+        return blosxomLocations;
+    }
+
+    /** This method takes a resource, such as "http://www.codinginparadise.org/myweblog.php",
+     *  and gets the path necessary to build up a feed, such as
+     *  "http://www.codinginparadise.org/".  Basically it appends a slash
+     *  to the end if there is not one, and removes any file names that
+     *  might be at the end, such as "myweblog.php".
+     *
+     *  There is a special exception for some Blosxom blogs,
+     *  which have things inside of a cgi-script and 'hang' their RSS files
+     *  off of this cgi-bin.  For example,
+     *  http://www.bitbucketheaven.com/cgi-bin/blosxom.cgi has its RSS file
+     *  at http://www.bitbucketheaven.com/cgi-bin/blosxom.cgi/index.rss, so
+     *  we must return the blosxom.cgi at the end as well for this method.
+     *
+     *  @throws MalformedURLException Thrown if the given resource's URL is
+     *  incorrectly formatted.
+     */
+    public String getBaseFeedPath( String resource ) {
+
+        // strip off any query string or anchors
+        int end = resource.lastIndexOf( "#" );
+
+        if ( end != -1 )
+            resource = resource.substring( 0, end );
+
+        end = resource.lastIndexOf( "?" );
+
+        if ( end != -1 )
+            resource = resource.substring( 0, end );
+
+        if ( ! resource.endsWith( "/" ) ) {
+            resource = resource + "/";
+        }
+
+        return resource;
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/DiaryLand.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/DiaryLand.java,v
retrieving revision 1.1
diff -u -B -r1.1 DiaryLand.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/DiaryLand.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/DiaryLand.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,77 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the DiaryLand blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class DiaryLand extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = containsDomain(resource, "diaryland.com");
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        // Diaryland doesn't offer feeds
+        FeedReference diaryLandLocations[] = new FeedReference[0];
+
+        return diaryLandLocations;
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/ExpressionEngine.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/ExpressionEngine.java,v
retrieving revision 1.1
diff -u -B -r1.1 ExpressionEngine.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/ExpressionEngine.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/ExpressionEngine.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,72 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the ExpressionEngine blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class ExpressionEngine extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        // FIXME: No way to detect this type of weblog right now
+        return false;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        // FIXME: Implement
+        return new FeedReference[0];
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/Flickr.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/Flickr.java,v
retrieving revision 1.1
diff -u -B -r1.1 Flickr.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/Flickr.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/Flickr.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,100 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the Flickr image blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class Flickr extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        return resource.indexOf( "flickr.com" ) != -1;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        resource = getBaseFeedPath(resource);
+        //  * Input: http://flickr.com/photos/tags/cats/
+        //  *
+        //  * Output: http://flickr.com/services/feeds/photos_public.gne?tags=cats&format=atom_03
+
+        if ( resource == null )
+            return new FeedReference[0];
+
+        int begin = resource.indexOf( "/tags/" );
+
+        //we can't continue here.
+        if ( begin == -1 )
+            return new FeedReference[0];
+
+        begin += 6;
+
+        int end = resource.lastIndexOf( "/" );
+        if ( end == -1 || end < begin )
+            end = resource.length();
+
+        String tag = resource.substring( begin, end );
+
+        String location = "http://flickr.com/services/feeds/photos_public.gne?tags=" +
+                          tag +
+                          "&format=atom_03";
+
+        FeedReference flickrLocations[] =
+                { new FeedReference(location,
+                                    FeedReference.ATOM_MEDIA_TYPE) };
+
+        return flickrLocations;
+    }
+}
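
For anyone reviewing the tag handling in Flickr.getFeedLocations() above, here is a minimal standalone sketch of the same string slicing. FlickrTagSketch and tagFeedUrl are hypothetical names that are not part of the patch, and the call to getBaseFeedPath() is left out because that helper lives in BlogService. Unlike the patch, the sketch returns null rather than an empty FeedReference array, and it also rejects an empty tag; both are choices made for illustration only.

```java
// Standalone sketch of the tag extraction; not part of the patch.
final class FlickrTagSketch {

    private static final String TAGS_MARKER = "/tags/";

    /** Returns the Atom feed URL for a Flickr tag page, or null if no tag is found. */
    static String tagFeedUrl(String resource) {
        if (resource == null)
            return null;

        int begin = resource.indexOf(TAGS_MARKER);
        if (begin == -1)
            return null;
        begin += TAGS_MARKER.length();

        // Stop at the last slash if it comes after the tag; otherwise
        // take everything up to the end of the string.
        int end = resource.lastIndexOf('/');
        if (end < begin)
            end = resource.length();

        String tag = resource.substring(begin, end);
        if (tag.length() == 0)
            return null;

        return "http://flickr.com/services/feeds/photos_public.gne?tags="
                + tag + "&format=atom_03";
    }

    public static void main(String[] args) {
        System.out.println(tagFeedUrl("http://flickr.com/photos/tags/cats/"));
        System.out.println(tagFeedUrl("http://flickr.com/photos/tags/cats"));
        System.out.println(tagFeedUrl("http://flickr.com/photos/someone/"));
    }
}
```

The first two calls yield the same feed URL with or without the trailing slash; the third has no "/tags/" segment and yields null.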
Index: src/java/org/apache/commons/feedparser/locate/blogservice/GreyMatter.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/GreyMatter.java,v
retrieving revision 1.1
diff -u -B -r1.1 GreyMatter.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/GreyMatter.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/GreyMatter.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,75 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the GreyMatter blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class GreyMatter extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = hasGenerator(content, "greymatter");
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        // FIXME: Implement
+        return new FeedReference[0];
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/LiveJournal.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/LiveJournal.java,v
retrieving revision 1.1
diff -u -B -r1.1 LiveJournal.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/LiveJournal.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/LiveJournal.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,78 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the LiveJournal blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class LiveJournal extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = containsDomain(resource, "livejournal.com");
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        FeedReference liveJournalLocations[] =
+            { new FeedReference("data/atom", FeedReference.ATOM_MEDIA_TYPE),
+              new FeedReference("data/rss", FeedReference.RSS_MEDIA_TYPE) };
+
+        return liveJournalLocations;
+    }
+}
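
The LiveJournal locations above ("data/atom", "data/rss") are relative paths, so the caller has to combine them with the weblog's URL before probing. This part of the patch does not show how that is done, so the following is only a sketch of one way to do it with java.net.URI. RelativeFeedSketch, resolveFeed and the example user URL are hypothetical and not part of the patch.

```java
import java.net.URI;

// Sketch of resolving a relative feed location against a weblog URL.
// This is an assumption about how a caller might use the relative
// paths; it is not taken from the patch.
final class RelativeFeedSketch {

    /** Resolves a relative feed path such as "data/atom" against the weblog URL. */
    static String resolveFeed(String weblogUrl, String relativeFeedPath) {
        return URI.create(weblogUrl).resolve(relativeFeedPath).toString();
    }

    public static void main(String[] args) {
        // With a trailing slash the feed path is appended to the weblog path.
        System.out.println(
            resolveFeed("http://www.livejournal.com/users/someone/", "data/atom"));
    }
}
```

Note that standard URI resolution replaces the last path segment when the base URL has no trailing slash, which is one reason a helper that normalizes the base path first is useful.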
Index: src/java/org/apache/commons/feedparser/locate/blogservice/Manila.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/Manila.java,v
retrieving revision 1.1
diff -u -B -r1.1 Manila.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/Manila.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/Manila.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,75 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the Manila blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class Manila extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        // FIXME: No way to detect this type of weblog right now
+        return false;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        FeedReference manilaLocations[] =
+            { new FeedReference("xml/rss.xml", FeedReference.RSS_MEDIA_TYPE),
+              new FeedReference("rss.xml", FeedReference.RSS_MEDIA_TYPE) };
+
+        return manilaLocations;
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/MovableType.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/MovableType.java,v
retrieving revision 1.1
diff -u -B -r1.1 MovableType.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/MovableType.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/MovableType.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,75 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the MovableType blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class MovableType extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = hasGenerator(content, "movabletype");
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        // FIXME: Implement
+        return new FeedReference[0];
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/PMachine.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/PMachine.java,v
retrieving revision 1.1
diff -u -B -r1.1 PMachine.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/PMachine.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/PMachine.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,85 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+import java.util.regex.*;
+
+/**
+ * Models the PMachine blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class PMachine extends BlogService {
+
+    /** A pattern used to discover PMachine blogs. */
+    private static Pattern pmachinePattern =
+                Pattern.compile("pmachine", Pattern.CASE_INSENSITIVE);
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        Matcher pmachineMatcher = pmachinePattern.matcher(resource);
+
+        results = pmachineMatcher.find();
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        FeedReference pmachineLocations[] =
+            { new FeedReference("index.xml", FeedReference.RSS_MEDIA_TYPE) };
+
+        return pmachineLocations;
+    }
+}
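
The PMachine check above is a case-insensitive search for "pmachine" anywhere in the resource URL. A small standalone sketch of that check follows, with a null guard added. PMachineDetectSketch, looksLikePMachine and the example URLs are hypothetical and not part of the patch.

```java
import java.util.regex.Pattern;

// Standalone sketch of the URL-based PMachine detection; not part of the patch.
final class PMachineDetectSketch {

    /** Compiled once; Pattern instances are immutable and safe to share. */
    private static final Pattern PMACHINE =
            Pattern.compile("pmachine", Pattern.CASE_INSENSITIVE);

    /** Returns true if "pmachine" appears anywhere in the URL, ignoring case. */
    static boolean looksLikePMachine(String resource) {
        return resource != null && PMACHINE.matcher(resource).find();
    }

    public static void main(String[] args) {
        System.out.println(looksLikePMachine("http://example.com/PMachine/weblog.php"));
        System.out.println(looksLikePMachine("http://example.com/blog/"));
    }
}
```

Because find() matches a substring anywhere in the URL, any address that happens to contain the word is treated as a PMachine weblog; this is the same trade-off the patch makes.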
Index: src/java/org/apache/commons/feedparser/locate/blogservice/RadioUserland.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/RadioUserland.java,v
retrieving revision 1.1
diff -u -B -r1.1 RadioUserland.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/RadioUserland.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/RadioUserland.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,81 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the Radio Userland blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class RadioUserland extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = containsDomain(resource, "radio.userland.com");
+
+        if (results == false) {
+            results = containsDomain(resource, "radio.weblogs.com");
+        }
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences that contains information on the
+     * usual locations this blog service contains its feed.  The feeds should
+     * be ordered by quality, so that higher quality feeds come before lower
+     * quality ones (i.e. you would want to have an Atom FeedReference
+     * object come before an RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        FeedReference radioUserlandLocations[] =
+            { new FeedReference("rss.xml", FeedReference.RSS_MEDIA_TYPE) };
+
+        return radioUserlandLocations;
+    }
+}
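
Several of these classes rely on containsDomain(), which is inherited from BlogService and not shown in this part of the patch. As a point of comparison, here is a sketch of a check that looks only at the host portion of the URL, so that a domain name appearing in a query string does not count. DomainCheckSketch and hostMatches are hypothetical names, and nothing here is meant to describe what containsDomain() actually does.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Sketch of a host-based domain check. This is an assumption made for
// illustration; it is not the containsDomain() implementation from the patch.
final class DomainCheckSketch {

    /** Returns true if the URL's host equals the domain or is a subdomain of it. */
    static boolean hostMatches(String resource, String domain) {
        if (resource == null)
            return false;
        try {
            String host = new URI(resource).getHost();
            if (host == null)
                return false;
            host = host.toLowerCase(Locale.ROOT);
            return host.equals(domain) || host.endsWith("." + domain);
        } catch (URISyntaxException e) {
            // Treat unparseable URLs as "not this service".
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hostMatches("http://radio.weblogs.com/0100000/", "radio.weblogs.com"));
        System.out.println(hostMatches("http://example.com/?ref=radio.weblogs.com", "radio.weblogs.com"));
    }
}
```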
Index: src/java/org/apache/commons/feedparser/locate/blogservice/TextAmerica.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/TextAmerica.java,v
retrieving revision 1.1
diff -u -B -r1.1 TextAmerica.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/TextAmerica.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/TextAmerica.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,77 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the TextAmerica blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class TextAmerica extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return false;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = containsDomain(resource, "textamerica.com");
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences describing the usual locations where
+     * this blog service keeps its feeds.  The feeds should be ordered by
+     * quality, so that higher quality feeds come before lower quality ones
+     * (e.g. you would want an Atom FeedReference object to come before an
+     * RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        FeedReference textAmericaLocations[] =
+            { new FeedReference("rss.aspx", FeedReference.RSS_MEDIA_TYPE) };
+
+        return textAmericaLocations;
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/TextPattern.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/TextPattern.java,v
retrieving revision 1.1
diff -u -B -r1.1 TextPattern.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/TextPattern.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/TextPattern.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,78 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the TextPattern blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class TextPattern extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = hasGenerator(content, "textpattern");
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences describing the usual locations where
+     * this blog service keeps its feeds.  The feeds should be ordered by
+     * quality, so that higher quality feeds come before lower quality ones
+     * (e.g. you would want an Atom FeedReference object to come before an
+     * RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        FeedReference textPatternLocations[] =
+            { new FeedReference("?atom=1", FeedReference.ATOM_MEDIA_TYPE),
+              new FeedReference("?rss=1", FeedReference.RSS_MEDIA_TYPE) };
+
+        return textPatternLocations;
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/Typepad.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/Typepad.java,v
retrieving revision 1.1
diff -u -B -r1.1 Typepad.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/Typepad.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/Typepad.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,82 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the TypePad blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class Typepad extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = containsDomain(resource, "typepad.com");
+
+        if (results == false) {
+            results = hasGenerator(content, "typepad");
+        }
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences describing the usual locations where
+     * this blog service keeps its feeds.  The feeds should be ordered by
+     * quality, so that higher quality feeds come before lower quality ones
+     * (e.g. you would want an Atom FeedReference object to come before an
+     * RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        FeedReference typepadLocations[] =
+            { new FeedReference("atom.xml", FeedReference.ATOM_MEDIA_TYPE),
+              new FeedReference("index.rdf", FeedReference.RSS_MEDIA_TYPE) };
+
+        return typepadLocations;
+    }
+}
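
The isThisService() methods in these classes lean on two helpers inherited from BlogService, containsDomain() and hasGenerator(), whose bodies are not part of this portion of the patch. As a rough sketch only, assuming containsDomain() is a case-insensitive substring check on the URI and hasGenerator() looks for a <meta name="generator"> tag in the page, the Typepad check could behave like this (the class name and both helper bodies below are hypothetical stand-ins, not the code from the patch):

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class ServiceDetectionSketch {

    // Hypothetical stand-in for BlogService.containsDomain():
    // a case-insensitive substring test on the resource URI.
    static boolean containsDomain(String resource, String domain) {
        return resource != null
            && resource.toLowerCase(Locale.ROOT)
                       .contains(domain.toLowerCase(Locale.ROOT));
    }

    // Hypothetical stand-in for BlogService.hasGenerator():
    // looks for <meta name="generator" content="...generator..."> in the HTML.
    static boolean hasGenerator(String content, String generator) {
        if (content == null) {
            return false;
        }
        Pattern meta = Pattern.compile(
            "<meta[^>]*name=[\"']generator[\"'][^>]*content=[\"'][^\"']*"
                .concat(Pattern.quote(generator)),
            Pattern.CASE_INSENSITIVE);
        return meta.matcher(content).find();
    }

    // Same decision order as Typepad.isThisService(): domain first,
    // then fall back to the generator tag.
    static boolean isTypepad(String resource, String content) {
        return containsDomain(resource, "typepad.com")
            || hasGenerator(content, "typepad");
    }

    public static void main(String[] args) {
        System.out.println(isTypepad("http://example.typepad.com/blog/", null));
    }
}
```

Checking the domain first means hosted TypePad weblogs are recognised without parsing the page, while the generator check picks up TypePad weblogs served from a custom domain.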
Index: src/java/org/apache/commons/feedparser/locate/blogservice/Unknown.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/Unknown.java,v
retrieving revision 1.1
diff -u -B -r1.1 Unknown.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/Unknown.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/Unknown.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,79 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models an unknown blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class Unknown extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        // FIXME: Will this break things because it is false?
+        return false;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        return false;
+    }
+
+    /**
+     * Returns an array of FeedReferences describing the usual locations where
+     * this blog service keeps its feeds.  The feeds should be ordered by
+     * quality, so that higher quality feeds come before lower quality ones
+     * (e.g. you would want an Atom FeedReference object to come before an
+     * RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        FeedReference unknownLocations[] =
+            { new FeedReference("atom.xml", FeedReference.ATOM_MEDIA_TYPE),
+              new FeedReference("index.rss", FeedReference.RSS_MEDIA_TYPE),
+              new FeedReference("rss.xml", FeedReference.RSS_MEDIA_TYPE),
+              new FeedReference("index.rdf", FeedReference.RSS_MEDIA_TYPE),
+              new FeedReference("index.xml", FeedReference.RSS_MEDIA_TYPE),
+              new FeedReference("xml/rss.xml", FeedReference.RSS_MEDIA_TYPE) };
+
+        return unknownLocations;
+    }
+}
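
Most of the locations returned by getFeedLocations() are relative ("atom.xml", "xml/rss.xml"), so the caller has to combine them with the weblog's URL before probing. How ProbeLocator does that is not shown in this portion of the patch; the following is only a sketch, assuming the resource is a directory-style URL and using java.net.URI.resolve(). The class and method names are made up for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

final class ProbeOrderSketch {

    // Turns relative feed locations into absolute candidate URLs,
    // preserving the quality ordering of the input array.
    static List<String> candidates(String resource, String[] relativeLocations) {
        // Assumption: the resource names a directory, so make sure it ends
        // with "/" before resolving; otherwise its last segment is replaced.
        URI base = URI.create(resource.endsWith("/") ? resource : resource + "/");
        List<String> urls = new ArrayList<String>();
        for (String location : relativeLocations) {
            urls.add(base.resolve(location).toString());
        }
        return urls;
    }

    public static void main(String[] args) {
        String[] unknownLocations = { "atom.xml", "index.rss", "xml/rss.xml" };
        for (String url : candidates("http://example.org/blog", unknownLocations)) {
            System.out.println(url);
        }
    }
}
```

Because the list order is kept, a prober that stops at the first URL that answers would pick the Atom feed over the RSS ones whenever both exist.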
Index: src/java/org/apache/commons/feedparser/locate/blogservice/WordPress.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/WordPress.java,v
retrieving revision 1.1
diff -u -B -r1.1 WordPress.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/WordPress.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/WordPress.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,79 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the WordPress blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class WordPress extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = hasGenerator(content, "wordpress");
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences describing the usual locations where
+     * this blog service keeps its feeds.  The feeds should be ordered by
+     * quality, so that higher quality feeds come before lower quality ones
+     * (e.g. you would want an Atom FeedReference object to come before an
+     * RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        FeedReference wordPressLocations[] =
+            { new FeedReference("wp-atom.php", FeedReference.ATOM_MEDIA_TYPE),
+              new FeedReference("wp-rss2.php", FeedReference.RSS_MEDIA_TYPE),
+              new FeedReference("wp-rss.php", FeedReference.RSS_MEDIA_TYPE) };
+
+        return wordPressLocations;
+    }
+}
Index: src/java/org/apache/commons/feedparser/locate/blogservice/Xanga.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/Xanga.java,v
retrieving revision 1.1
diff -u -B -r1.1 Xanga.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/Xanga.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/Xanga.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,101 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import java.net.*;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the Xanga blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class Xanga extends BlogService {
+
+    /**
+     * A regex to extract the user from a Xanga URL
+     */
+    private static Pattern xangaURLPattern = Pattern.compile(".*user=(\\w*)");
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = containsDomain(resource, "xanga.com");
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences describing the usual locations where
+     * this blog service keeps its feeds.  The feeds should be ordered by
+     * quality, so that higher quality feeds come before lower quality ones
+     * (e.g. you would want an Atom FeedReference object to come before an
+     * RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        // Xanga feeds have to be handled specially since they put their
+        // feeds at the location http://www.xanga.com/rss.aspx?user=username
+        String user = getXangaUser(resource);
+        FeedReference xangaLocations[] =
+            { new FeedReference("rss.aspx?user=" + user,
+                                FeedReference.RSS_MEDIA_TYPE) };
+
+        return xangaLocations;
+    }
+
+    /** Xanga's feed locations depend on the 'user' parameter in a
+     *  Xanga URI.  This method extracts the user from an
+     *  existing URI, such as http://www.xanga.com/home.aspx?user=wdfphillz.
+     */
+    protected String getXangaUser(String resource) {
+        Matcher xangaMatcher = xangaURLPattern.matcher(resource);
+        if (!xangaMatcher.matches())
+            throw new IllegalArgumentException("No user in Xanga URI: " + resource);
+        return xangaMatcher.group(1);
+    }
+}
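
Because getXangaUser() uses Matcher.matches(), the pattern .*user=(\w*) has to cover the entire URI, so a URI that carries anything other than word characters after the user value (a further "&page=2" parameter, say) will not match. A more tolerant variant is sketched below, assuming it is acceptable to report a missing user as null; the class name, method name and pattern are illustrative and not part of the patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class XangaUserSketch {

    // Matches the user parameter anywhere in the query string.
    private static final Pattern USER_PARAM = Pattern.compile("[?&]user=(\\w+)");

    // Returns the Xanga user name, or null when the URI has no user parameter.
    static String extractUser(String resource) {
        if (resource == null) {
            return null;
        }
        Matcher matcher = USER_PARAM.matcher(resource);
        // find() scans for the parameter instead of requiring the whole
        // URI to match, so trailing parameters are ignored.
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(
            extractUser("http://www.xanga.com/home.aspx?user=wdfphillz"));
    }
}
```

A caller in the style of getFeedLocations() could then return an empty FeedReference array when the result is null rather than building "rss.aspx?user=" with no name attached.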
Index: src/java/org/apache/commons/feedparser/locate/blogservice/YahooGroups.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/YahooGroups.java,v
retrieving revision 1.1
diff -u -B -r1.1 YahooGroups.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/YahooGroups.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/YahooGroups.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,100 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the Yahoo Groups service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class YahooGroups extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return false;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        boolean results = false;
+
+        results = containsDomain( resource, "groups.yahoo.com" );
+
+        return results;
+    }
+
+    /**
+     * Returns an array of FeedReferences describing the usual locations where
+     * this blog service keeps its feeds.  The feeds should be ordered by
+     * quality, so that higher quality feeds come before lower quality ones
+     * (e.g. you would want an Atom FeedReference object to come before an
+     * RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        //  * Input: http://groups.yahoo.com/group/aggregators/
+        //  *
+        //  * Output: http://rss.groups.yahoo.com/group/aggregators/rss
+        String location;
+
+        if ( resource == null )
+            return new FeedReference[0];
+
+        if ( resource.indexOf( "/group/" ) == -1  ||
+             resource.indexOf( "groups.yahoo.com" ) == -1 )
+            return new FeedReference[0];
+
+        location = "http://rss." +
+            resource.substring( "http://".length(), resource.length() )
+            ;
+
+        if ( location.endsWith( "/" ) ) {
+            location += "rss";
+        } else {
+            location += "/rss";
+        }
+
+        FeedReference yahooGroupsLocations[] =
+                { new FeedReference(location,
+                                    FeedReference.RSS_MEDIA_TYPE) };
+
+        return yahooGroupsLocations;
+    }
+}
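
The Yahoo Groups rewrite is easiest to follow as a standalone function. The sketch below mirrors the input/output pair given in the comment in getFeedLocations(), but returns a plain String (or null) instead of a FeedReference array and additionally assumes the resource starts with "http://groups.yahoo.com/group/"; the class and method names are hypothetical.

```java
final class YahooGroupsFeedSketch {

    private static final String SCHEME = "http://";

    // Maps http://groups.yahoo.com/group/NAME/ to
    // http://rss.groups.yahoo.com/group/NAME/rss, or returns null when the
    // resource is not a Yahoo Groups group URL.
    static String feedLocation(String resource) {
        if (resource == null
                || !resource.startsWith(SCHEME + "groups.yahoo.com/group/")) {
            return null;
        }
        // Insert the "rss." host prefix right after the scheme.
        String location = SCHEME + "rss." + resource.substring(SCHEME.length());
        // Append the "rss" path segment without doubling the slash.
        return location.endsWith("/") ? location + "rss" : location + "/rss";
    }

    public static void main(String[] args) {
        System.out.println(
            feedLocation("http://groups.yahoo.com/group/aggregators/"));
    }
}
```

Anchoring the check with startsWith() is stricter than the two indexOf() tests in the patch: a URL that merely mentions groups.yahoo.com somewhere in its query string is rejected instead of being rewritten.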
Index: src/java/org/apache/commons/feedparser/locate/blogservice/iBlog.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons-sandbox/feedparser/src/java/org/apache/commons/feedparser/locate/blogservice/iBlog.java,v
retrieving revision 1.1
diff -u -B -r1.1 iBlog.java
--- src/java/org/apache/commons/feedparser/locate/blogservice/iBlog.java	21 Oct 2004 01:06:11 -0000	1.1
+++ src/java/org/apache/commons/feedparser/locate/blogservice/iBlog.java	21 Oct 2004 01:12:23 -0000
@@ -0,0 +1,74 @@
+/*
+ * Copyright 1999,2004 The Apache Software Foundation.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.feedparser.locate.blogservice;
+
+import org.apache.commons.feedparser.FeedParserException;
+import org.apache.commons.feedparser.locate.*;
+
+/**
+ * Models the iBlog blog service, encapsulating whether a given weblog
+ * is this type of service and where it usually keeps its feeds.
+ *
+ * @author Brad Neuberg, bkn3@columbia.edu
+ */
+public class iBlog extends BlogService {
+
+    /** Returns whether we can trust the results of this blog service's
+     *  autodiscovery links.  For example, TextAmerica returns invalid
+     *  autodiscovery results.
+     */
+    public boolean hasValidAutoDiscovery() {
+        return true;
+    }
+
+    /** Determines if the weblog at the given resource and with the given
+     *  content is this blog service.
+     * @param resource A full URI to this resource, such as
+     * "http://www.codinginparadise.org".
+     * @param content The full HTML content at the resource's URL.
+     * @throws FeedParserException Thrown if an error occurs while
+     * determining the type of this weblog.
+     */
+    public boolean isThisService(String resource, String content)
+                                                throws FeedParserException {
+        // FIXME: No way to detect this type of weblog right now
+        return false;
+    }
+
+    /**
+     * Returns an array of FeedReferences describing the usual locations where
+     * this blog service keeps its feeds.  The feeds should be ordered by
+     * quality, so that higher quality feeds come before lower quality ones
+     * (e.g. you would want an Atom FeedReference object to come before an
+     * RSS 0.91 FeedReference object in this list).
+     * @param resource A URL to the given weblog that might be used to build
+     * up where feeds are usually located.
+     * @param content The full content of the resource URL, which might
+     * be useful to determine where feeds are usually located.  This can be
+     * null.
+     * @throws FeedParserException Thrown if an error occurs while trying
+     * to determine the usual locations of feeds for this service.
+     */
+    public FeedReference[] getFeedLocations(String resource,
+                                            String content)
+                                                throws FeedParserException {
+        FeedReference iBlogLocations[] =
+            { new FeedReference("rss.xml", FeedReference.RSS_MEDIA_TYPE) };
+
+        return iBlogLocations;
+    }
+}
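
Taken together, the subclasses above are meant to be consulted one after another: each is asked whether a weblog is its type, and the first that answers yes is used. Unknown.isThisService() always returns false, which suggests it serves as the fallback when nothing else claims the weblog. The loop below is only a sketch of that idea under those assumptions; the Service interface, DomainService class and identify() method are hypothetical and much smaller than the real BlogService.

```java
import java.util.Arrays;
import java.util.List;

final class ServiceDispatchSketch {

    // Hypothetical, trimmed-down stand-in for the patch's BlogService.
    interface Service {
        String name();
        boolean isThisService(String resource, String content);
    }

    // A service recognised purely by its domain, like TextAmerica or Xanga.
    static final class DomainService implements Service {
        private final String name;
        private final String domain;

        DomainService(String name, String domain) {
            this.name = name;
            this.domain = domain;
        }

        public String name() {
            return name;
        }

        public boolean isThisService(String resource, String content) {
            return resource != null && resource.contains(domain);
        }
    }

    // Asks each service in turn; falls back to "Unknown" when none matches.
    static String identify(String resource, String content,
                           List<Service> services) {
        for (Service service : services) {
            if (service.isThisService(resource, content)) {
                return service.name();
            }
        }
        return "Unknown";
    }

    public static void main(String[] args) {
        List<Service> services = Arrays.<Service>asList(
            new DomainService("TextAmerica", "textamerica.com"),
            new DomainService("Xanga", "xanga.com"));
        System.out.println(identify(
            "http://www.xanga.com/home.aspx?user=wdfphillz", null, services));
    }
}
```

With this shape, supporting a new service means adding one more entry to the list rather than touching the discovery code.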


---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org