lucene-dev mailing list archives

From "Simon Willnauer" <simon.willna...@googlemail.com>
Subject GData-Server: SoC almost finished, server almost done so far....
Date Sun, 06 Aug 2006 19:07:31 GMT
Hey guys,

Summer of Code finishes on the 22nd of August, and all functionality
has been implemented so far. That does not mean the project is
finished, though. In this mail I will look back over the last 3 weeks
to give those of you who are interested a bit of an overview.

The proposal had 5 Milestones and up to number 4 everything has been
implemented.
Milestone 1 covered the basic functionality to serve feeds and enable
all the CRUD actions for the entries of a feed, followed by the
authentication component and the versioning. For the last 3 weeks I
have been working on integrating Lucene as a search / index component,
which worked quite well. I started by defining an index schema based
on an XML file (the server configuration), which lets users index
whatever elements of their entries they want in a quite flexible way.
Analyzers, boost and so on can be defined per field (more on that on
the wiki). The elements, attributes or other desired XML nodes are
selected with an XPath expression in the index schema. Each service on
the GData server has its own index and index schema. A service is a
kind of Entry / Feed implementation with defined extensions. This
means that you can create your own feed with special extension
elements on feeds and entries (more on that later). Each service has
'n' feeds and a feed has 'm' entries.
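
To make the XPath-per-field idea concrete, here is a minimal sketch in
Python (the server itself is Java). The schema dictionary, the field
names and the boost values are made up for illustration; the real
index schema is part of the server's XML configuration and is
described on the wiki.

```python
import xml.etree.ElementTree as ET

# Namespace prefix used in the path expressions below.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical schema: field name -> (path expression, boost).
# The real schema is an XML file and will look different.
INDEX_SCHEMA = {
    "title": ("atom:title", 4.0),
    "summary": ("atom:summary", 1.0),
    "author": ("atom:author/atom:name", 2.0),
}


def extract_fields(entry_xml, schema=INDEX_SCHEMA):
    """Return {field: (text, boost)} for every schema path that matches."""
    entry = ET.fromstring(entry_xml)
    fields = {}
    for name, (path, boost) in schema.items():
        node = entry.find(path, ATOM)
        # Skip fields whose node is missing or has no text.
        if node is not None and node.text:
            fields[name] = (node.text.strip(), boost)
    return fields


ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>GData server and Lucene</title>
  <author><name>Simon</name></author>
</entry>"""

print(extract_fields(ENTRY))
```

Each extracted value would then become one field of the indexed
document, with the analyzer and boost taken from the schema entry.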
The GData protocol defines a quite simple query syntax for searching
the server: there is just a simple overall query, a query for an
update range, an author query and a query for categories.
All queries are "AND" queries and not all of them can be used
together. So querying category and author together is not defined.
I thought that was a bit limited, and Lucene can do much more, so I
decided to provide all of the Lucene query syntax as well. I
"translate" the so-called GData query into a proper Lucene query if
needed, which lets users choose how sophisticated they want their
search to be. All queries are sent via HTTP GET parameters of the form
fieldname=searchquery.
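
As a rough illustration of the translation step, the sketch below
(Python, for brevity) turns GData-style parameters into one AND-joined
Lucene query string. The function name, the quoting of the range
bounds and the assumption that timestamps are indexed as sortable
strings are assumptions made for the example, not taken from the
server code.

```python
# Assumed fallback bounds for a half-open update range.
MIN_TS = "0000-01-01T00:00:00.000Z"
MAX_TS = "9999-12-31T23:59:59.999Z"


def to_lucene_query(params, default_field="content"):
    """Combine fieldname=searchquery pairs into one Lucene query string."""
    clauses = []
    for field, value in params.items():
        if field in ("updated-min", "updated-max"):
            continue  # handled below as a single range clause
        # "q" searches the default field; other names map to themselves.
        name = default_field if field == "q" else field
        clauses.append(f"{name}:({value})")
    if "updated-min" in params or "updated-max" in params:
        lo = params.get("updated-min", MIN_TS)
        hi = params.get("updated-max", MAX_TS)
        # Bounds are quoted because the timestamps contain colons.
        clauses.append(f'updated:["{lo}" TO "{hi}"]')
    return " AND ".join(clauses)


print(to_lucene_query({"q": "gdata", "title": "apache^4"}))
```

Because the values are passed through untouched, wildcards, fuzzy
terms and boosts written by the user stay valid Lucene syntax inside
each field clause.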

If you have some time to try it out, go to:
http://www.javawithchopsticks.de/gdata-server/feed/weblog/
Query parameters:
q (same as content, which is the default field), updated, title,
summary, content, author
You can use all of the Lucene syntax and the additional GData query
syntax (http://code.google.com/apis/gdata/protocol.html#Queries)

For the next two weeks I will spend some time on test coverage (Clover
says 75 % so far) and on the documentation on the Lucene wiki. I will
provide a "gdata-server in 20 minutes" tutorial and a FAQ, which
should be enough for some users to give it a go.

So is there a place for the gdata server on http://lucene.apache.org/java/?
Should it be located in the sandbox or as a contribution?

@Yonik let me know what you think.

Best regards to all, and thanks for your help during the project. I
guess I will stay around here for a while :). Thanks to all
java-devs!! Special thanks to Ian Holsman for being my mentor on
this project.

best regards simon

Some examples
(gdata like)
?updated-min=2006-08-05T22:35:54.464Z&updated-max=2006-09-05T22:35:54.464Z

?title=eval* (all terms beginning with eval)
?title=g*data (to find gdata or g-data)
?title=te?t (to find test or text)

Fuzzy Queries:

?q=sophistikatet~ (for sophisticated)

Or boost a term to influence the scoring:
?q=gdata&title=apache^4
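
Characters such as ^, * and ? should be percent-encoded when a query
is sent from a program rather than typed into a browser. The sketch
below (Python, standard library only) builds such URLs for the feed
mentioned earlier; the helper name feed_url is hypothetical and no
request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://www.javawithchopsticks.de/gdata-server/feed/weblog/"


def feed_url(params):
    """Return the feed URL with the given query parameters encoded."""
    return BASE + "?" + urlencode(params)


# Boosted title term combined with a default-field query.
print(feed_url({"q": "gdata", "title": "apache^4"}))

# Single-character wildcard; "?" is encoded so it is not read as a separator.
url = feed_url({"title": "te?t"})
print(url)

# Decoding the query string gives back the original Lucene syntax.
print(parse_qs(urlsplit(url).query))
```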
