incubator-any23-dev mailing list archives

From armon <zhime...@gmail.com>
Subject Re: about the supported input format of any23
Date Thu, 21 Jun 2012 22:12:35 GMT
and used the XML file as the input data, then ran the command ./any23 rover filename

armon


On Friday, June 22, 2012 at 6:10 AM, armon wrote:

>  Yep, so how can this be solved? BTW, it still doesn't work when I save the XML part of the data
> from http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=meaning
> to a file; the XML file is attached.
> 
> 
> 
> armon
> 
> 
> On Friday, June 22, 2012 at 5:59 AM, Lewis John Mcgibbney wrote:
> 
> > No, you're doing nothing incorrectly. I get pretty dismal results too
> > with basic-crawler within Any23; please see below.
> > 
> > lewismc@lewismc-HP-Mini-110-3100:~/ASF/trunk/runtime/local$ any23
> > rover http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=meaning
> > [1] 2956
> > [2] 2957
> > [3] 2958
> > lewismc@lewismc-HP-Mini-110-3100:~/ASF/trunk/runtime/local$
> > ------------------------------------------------------------------------
> > Apache Any23 :: rover
> > ------------------------------------------------------------------------
> > 
> > @prefix dcterms: <http://purl.org/dc/terms/> .
> > 
> > <http://en.wikipedia.org/w/api.php?action=query> dcterms:title
> > "MediaWiki API Result" .
> > 
> > ------------------------------------------------------------------------
> > Apache Any23 SUCCESS
> > Total time: 2s
> > Finished at: Thu Jun 21 22:53:27 BST 2012
> > Final Memory: 24M/483M
> > ------------------------------------------------------------
> > [1] Done any23 rover
> > http://en.wikipedia.org/w/api.php?action=query
> > [2]- Done list=search
> > [3]+ Done srwhat=text
> > 
> > The problem is that I don't know how crawler4j deals with some
> > characters such as '?' within URL strings, and whether it treats them
> > as queries or not. By the looks of the log output above, the URL
> > string is being treated incorrectly.
> > 
> > Sitting above all of this is the fact that I don't think the wiki
> > markup syntax is supported within Any23 parser implementations.
> > 
> > Lewis
> > 
> > 
> > On Thu, Jun 21, 2012 at 10:29 PM, armon <zhimeng9@gmail.com> wrote:
> > > And even when I copy the XML part of the data at the URL and use it as
> > > the input, it still doesn't work, but when I try an RDF file, it works
> > > well. Is there anything I'm doing incorrectly?
> > > 
> > > 
> > > 2012/6/22 armon <zhimeng9@gmail.com>
> > > 
> > > > Hi Lewis, thanks very much for your reply. I am sorry to disturb you so
> > > > late.
> > > > 
> > > > the url I used was:
> > > > 
> > > > 
> > > > http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=meaning
> > > > 
> > > > 
> > > > and then I ran the command ./any23 rover url (the URL shown above) to
> > > > get the result.
> > > > 
> > > > thanks.
> > > > 
> > > > armon
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 2012/6/22 Lewis John Mcgibbney <lewis.mcgibbney@gmail.com>
> > > > 
> > > > > Hi Armon,
> > > > > 
> > > > > On Thu, Jun 21, 2012 at 4:15 PM, armon <zhimeng9@gmail.com> wrote:
> > > > > > Hi,
> > > > > >  I am currently transforming XML-format wiki data
> > > > > 
> > > > > Can you give a small example of this xml?
> > > > > 
> > > > > > (retrieved by wikipedia API) to turtle,
> > > > > 
> > > > > Also a small example of your turtle
> > > > > 
> > > > > > but it seems that the any23 can't
> > > > > > work correctly. (I used the command: ./any23 rover url )
> > > > > 
> > > > > What do you get on stdout? I am easily able to use Any23 parsers for
> > > > > fetching structure from Wikipedia pages... but this is not what you
> > > > > are referring to... I need some more information from you, please.
> > > > > 
> > > > > > 
> > > > > >  Does Any23 actually support the XML data retrieved by the Wikipedia API
> > > > > > as the input format?
> > > > > 
> > > > > Please see above
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > --
> > > > > Lewis
> > 
> > 
> > 
> > -- 
> > Lewis
> 
> 
> Attachments: 
> - search.xml
> 
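
A note on the quoted console output: the job-control lines (such as "[1] 2956" and "Done srwhat=text") suggest that the shell, rather than Any23 or crawler4j, split the unquoted URL at each '&' and ran the pieces as background jobs, so rover only received the URL up to action=query. A minimal sketch of the quoting that avoids this is below. It assumes a POSIX shell, and show_args is a hypothetical stand-in (not part of Any23) so that the snippet runs without Any23 installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Any23): prints each argument it receives
# on its own line, standing in for `any23 rover` in this sketch.
show_args() {
    printf '%s\n' "$@"
}

# Single quotes keep '?' and '&' literal, so the whole URL is one argument.
url='http://en.wikipedia.org/w/api.php?action=query&list=search&srwhat=text&srsearch=meaning'

# Double quotes around the variable keep it as a single argument when passed on.
show_args "$url"

# With Any23 installed, the equivalent call would be:
#   ./any23 rover "$url"
```

Quoting only addresses the truncated URL; whether Any23 then extracts anything useful from the API response is a separate question, as discussed in the thread.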


