incubator-any23-dev mailing list archives

From armon <zhime...@gmail.com>
Subject Re: about the supported input format of any23
Date Thu, 21 Jun 2012 22:10:50 GMT
Yes, so how can I solve it? By the way, it still doesn't work when I save the XML part of the data from
http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=meaning
to a file; the XML file is attached.



armon


On Friday, June 22, 2012 at 5:59 AM, Lewis John Mcgibbney wrote:

> No, you're doing nothing incorrectly. I get pretty dismal results with
> the basic-crawler within Any23 as well; please see below.
> 
> lewismc@lewismc-HP-Mini-110-3100:~/ASF/trunk/runtime/local$ any23
> rover http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=meaning
> [1] 2956
> [2] 2957
> [3] 2958
> lewismc@lewismc-HP-Mini-110-3100:~/ASF/trunk/runtime/local$
> ------------------------------------------------------------------------
> Apache Any23 :: rover
> ------------------------------------------------------------------------
> 
> @prefix dcterms: <http://purl.org/dc/terms/> .
> 
> <http://en.wikipedia.org/w/api.php?action=query> dcterms:title
> "MediaWiki API Result" .
> 
> ------------------------------------------------------------------------
> Apache Any23 SUCCESS
> Total time: 2s
> Finished at: Thu Jun 21 22:53:27 BST 2012
> Final Memory: 24M/483M
> ------------------------------------------------------------
> [1] Done any23 rover
> http://en.wikipedia.org/w/api.php?action=query
> [2]- Done list=search
> [3]+ Done srwhat=text
> 
> The problem is that I don't know how crawler4j deals with some
> characters such as '?' within URL strings, or whether it treats them
> as queries. By the looks of the log output above, the URL
> string is being treated incorrectly.
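
One possible reading of the job-control lines in the output above (`[1] 2956`, `[2]- Done list=search`, `[3]+ Done srwhat=text`) is that the URL was typed unquoted, so the shell treated each `&` as "run in the background" and any23 only received the URL up to `action=query`. The sketch below illustrates that effect; `show_args` is a hypothetical stand-in for the any23 launcher, not part of Any23.

```shell
# Hypothetical stand-in for the any23 launcher: prints each argument it receives.
show_args() {
  printf 'arg: %s\n' "$@"
}

url='http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=meaning'

# Quoted: the whole URL arrives as a single argument.
quoted_output=$(show_args rover "$url")
echo "$quoted_output"

# Typed unquoted on a command line, the shell would split at each '&':
#   show_args rover http://en.wikipedia.org/w/api.php?action=query &
#   list=search &
#   srwhat=text &
#   srsearch=meaning
# so only the part before the first '&' reaches the program.
truncated_output=$(show_args rover "${url%%&*}")
echo "$truncated_output"
```

If this reading is right, wrapping the URL in single quotes on the `any23 rover` command line would at least pass the full query string through. Whether the extractors then produce more than the title triple for this API response is a separate question.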
> 
> Sitting above all of this is the fact that I don't think the wiki
> markup syntax is supported within Any23 parser implementations.
> 
> Lewis
> 
> 
> On Thu, Jun 21, 2012 at 10:29 PM, armon <zhimeng9@gmail.com> wrote:
> > And even when I copy the XML part of the data at the URL and use it as the
> > input content, it still doesn't work, but when I try an RDF file, it works
> > well. Am I doing anything incorrectly?
> > 
> > 
> > 2012/6/22 armon <zhimeng9@gmail.com>
> > 
> > > Hi Lewis, thanks very much for your reply. I am sorry to bother you so
> > > late.
> > > 
> > > the url I used was:
> > > 
> > > 
> > > http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=meaning
> > > 
> > > 
> > > and then I ran the command: ./any23 rover url (the URL shown above) to get
> > > the result.
> > > 
> > > thanks.
> > > 
> > > armon
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 2012/6/22 Lewis John Mcgibbney <lewis.mcgibbney@gmail.com>
> > > 
> > > > Hi Armon,
> > > > 
> > > > On Thu, Jun 21, 2012 at 4:15 PM, armon <zhimeng9@gmail.com> wrote:
> > > > > Hi,
> > > > >  I am currently transforming XML-format wiki data
> > > > 
> > > > Can you give a small example of this xml?
> > > > 
> > > > > (retrieved via the Wikipedia API) to Turtle,
> > > > 
> > > > Also a small example of your turtle
> > > > 
> > > > > but it seems that any23 doesn't
> > > > > work correctly. (I used the command: ./any23 rover url)
> > > > 
> > > > What do you get on stdout? I am easily able to use any23 parsers on
> > > > fetching structure from wikipedia pages... but this is not what you
> > > > are referring to... I need some more information from you please.
> > > > 
> > > > > 
> > > > >  Does any23 actually support the XML data retrieved via the Wikipedia
> > > > > API as an input format?
> > > > 
> > > > Please see above
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > --
> > > > Lewis
> 
> 
> 
> -- 
> Lewis

