Message-ID: <471D9D00.4060606@gmail.com>
Date: Tue, 23 Oct 2007 00:04:32 -0700
From: James M Snell
To: abdera-user@incubator.apache.org
Subject: Re: Error parsing certain feeds - troubleshooting steps?
References: <13358334.post@talk.nabble.com>
In-Reply-To: <13358334.post@talk.nabble.com>

Hello Kiran,

These steps are good. Whenever I come across something that Abdera
cannot parse, my first step is to always run it through the
feedvalidator. It almost invariably ends up being some error in the
formation of the feed. If the feed passes the validator, I always
retrieve the feed manually using either curl or a browser to check over
the feed myself.

- James

Kiran Subbaraman wrote:
> I noticed that some of the feeds are not parsed by Abdera, and also noticed
> some discussion on the list related to Feed validation. Based on this, these
> are the steps I follow to help determine if issues related to parsing a feed
> is an Abdera one, or just the malformed feed.
>
> These are the feeds that I tried:
> - http://www.feedsfarm.com/frontpage/health/atom - Does not parse
> - http://www.feedsfarm.com/frontpage/health/rss - Parses
> - http://feeds.feedburner.com/techtarget/tsscom/home - Does not parse
> - http://www.oreillynet.com/pub/feed/1?format=rss1 - Parses
> and a few more.
>
> Are these steps sufficient? Or do I need to perform any other checks that
> could help?
> Thanks,
> Kiran
>
>
> Step 1
> -------
> Use FeedValidator to check the validity of a feed. FeedValidator is
> available here: http://feedvalidator.org
> * Valid Atom feeds are processed by Abdera.
> * RSS processing is still being introduced into Abdera, therefore some valid
>   RSS feeds may still not be processed by Abdera.
>
> Step 2
> -------
> In addition, use curl to check if the feeds are being served correctly.
> curl can be obtained from: http://curl.haxx.se/download.html
> Try this: curl -X GET "http://news.google.com/?output=atom"
> In this particular example, Google will return a 403. Therefore the program
> needs to first establish a valid connection with Google, and then proceed
> with getting the feed content.
>
> Step 3
> -------
> I have also created a sample Java program to determine if a feed can be
> processed.
>
> import java.io.InputStream;
> import java.net.URL;
>
> import org.apache.abdera.Abdera;
> import org.apache.abdera.model.Document;
> import org.apache.abdera.model.Feed;
> import org.apache.abdera.parser.Parser;
>
> public class TestFeed {
>
>     public static void main(String[] args) throws Exception {
>
>         if (args.length != 1) {
>             System.err.println("Usage: java TestFeed <feed-url>");
>             return;
>         }
>
>         Parser parser = Abdera.getNewParser();
>         // try-with-resources closes the stream even if parsing fails
>         try (InputStream input = new URL(args[0]).openStream()) {
>             Document<Feed> doc = parser.parse(input);
>             // Throws ClassCastException if the root element is not a Feed
>             Feed feed = doc.getRoot();
>             System.out.println("Feed can be parsed");
>             System.out.println("Begin feed content -----");
>             feed.writeTo(System.out);
>             System.out.println("-------End feed content");
>         } catch (Exception e) {
>             e.printStackTrace();
>         }
>     }
>
> }
>
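
The first questions raised in this thread (is the content well-formed, and is it actually an Atom feed?) can be roughly approximated offline once the feed has been saved with curl or a browser. The sketch below is only an illustration under stated assumptions: it uses the JDK's built-in XML parser rather than Abdera, the names FeedSanityCheck, Diagnosis and diagnose are hypothetical, and it does no hardening against hostile XML. It checks far less than FeedValidator does, so it is a quick first filter at best.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Element;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Abdera.
final class FeedSanityCheck {

    enum Diagnosis { ATOM_FEED, OTHER_XML, NOT_WELL_FORMED }

    // Namespace URI of Atom 1.0 elements (RFC 4287).
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Classifies already-fetched feed content without touching the network.
    static Diagnosis diagnose(byte[] content) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Needed so that getNamespaceURI/getLocalName return useful values.
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Element root = builder.parse(new ByteArrayInputStream(content))
                                  .getDocumentElement();
            boolean isAtomFeed = ATOM_NS.equals(root.getNamespaceURI())
                    && "feed".equals(root.getLocalName());
            return isAtomFeed ? Diagnosis.ATOM_FEED : Diagnosis.OTHER_XML;
        } catch (SAXException e) {
            // The parser reports well-formedness problems as SAX exceptions.
            return Diagnosis.NOT_WELL_FORMED;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for an in-memory stream and a default parser setup.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java FeedSanityCheck <saved-feed-file>");
            return;
        }
        System.out.println(diagnose(Files.readAllBytes(Paths.get(args[0]))));
    }
}
```

A possible way to use it: save the feed with curl -o feed.xml "<feed-url>", then run java FeedSanityCheck feed.xml. A result of OTHER_XML for a URL that was expected to serve Atom (for example an RSS document or an HTML error page that happens to be well-formed) suggests the problem lies with what the server returns rather than with the parser.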