From Marvin Humphrey
Subject Re: [lucy-dev] Generalize Tutorial for multiple host languages
Date Sun, 07 Nov 2010 16:06:08 GMT
On Fri, Nov 05, 2010 at 04:39:08PM +0100, Simon Willnauer wrote:
> > Our multi-chapter Tutorial is still Perl-specific, however.  It would be great
> > if we could adapt it for use across multiple host languages -- but right now,
> > it has dependencies which will not be available for every language/platform
> > combination.
> That would be absolutely awesome. I am not really into Perl (shame on
> me, I know), and something like that would definitely help me a lot. I
> find myself in the situation where I am going to need it sooner or later
> :)

We're not yet at the point where C is a viable host language binding for Lucy,
but each dependency we eliminate brings us closer.

> > The only downside is that easily-customizable sample applications are
> > compelling (see Ruby on Rails), and we'll be taking our "instant web search"
> > kit and making it less handy.
> I know it would be extra overhead, but could we maintain that alongside the
> getting-started example?

One thing I'm realizing is that I really don't want to contribute or maintain
C sample code which operates in a web context.  C is too prone to security
vulnerabilities, its string handling sucks so you need waaaaay more code, and
things like URI escaping and HTML tag stripping aren't offered by the standard
library and aren't easy to fake up.  It's the wrong language for a quickie CGI

I think it makes more sense for the C tutorial to operate in a command-line
context, even if the tutorials for other host language bindings target the
web.  But then we have a problem: the current HTML format of our sample corpus
isn't suitable.  The solution, I think, is to change all those docs to plain
text, with the title on the first line: 

    Amendment XIII 

    1. Neither slavery nor involuntary servitude, except as a punishment for
    crime whereof the party shall have been duly convicted, shall exist within
    the United States, or any place subject to their jurisdiction.

    2. Congress shall have power to enforce this article by appropriate
    legislation.

Plain text will work for either web or command-line context, and as a bonus,
for web-context tutorials we no longer have to either pull in an HTML parsing
dependency or do something hackish with regexes.
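
To make the format concrete, here is a rough C sketch of how a command-line
tutorial might split one of these plain-text documents into a title and a
body.  The PlainDoc struct and split_plain_doc() are hypothetical names made
up for illustration, not anything in Lucy, and the sketch assumes the whole
file has already been read into a NUL-terminated buffer:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type (not part of Lucy): two views into one buffer.
 * Nothing is copied, so the views are valid only as long as the buffer is. */
typedef struct {
    const char *title;      /* start of the first line */
    size_t      title_len;  /* length without trailing whitespace */
    const char *body;       /* first character after the blank line(s) */
    size_t      body_len;   /* remaining length, up to the NUL */
} PlainDoc;

/* Split a NUL-terminated document into its title (the first line) and its
 * body (everything after the newlines that follow the title). */
PlainDoc
split_plain_doc(const char *text) {
    PlainDoc    doc;
    size_t      len = strlen(text);
    const char *end = text + len;
    const char *eol = memchr(text, '\n', len);
    const char *body;

    /* The title runs up to the first newline, or to the end of the buffer
     * if the document has only one line. */
    doc.title     = text;
    doc.title_len = eol ? (size_t)(eol - text) : len;

    /* Trim trailing spaces, tabs, and carriage returns from the title. */
    while (doc.title_len > 0) {
        char c = doc.title[doc.title_len - 1];
        if (c != ' ' && c != '\t' && c != '\r') { break; }
        doc.title_len--;
    }

    /* The body starts after the title line; skip the blank separator. */
    body = eol ? eol + 1 : end;
    while (body < end && (*body == '\n' || *body == '\r')) { body++; }

    doc.body     = body;
    doc.body_len = (size_t)(end - body);
    return doc;
}
```

Since the fields are pointer-and-length pairs rather than NUL-terminated
strings, a caller would print the title with something along the lines of
printf("%.*s\n", (int)doc.title_len, doc.title).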

Marvin Humphrey

