lucene-java-user mailing list archives

From Benson Margulies <ben...@basistech.com>
Subject Re: Issue with documentation for org.apache.lucene.analysis.synonym.SynonymMap.Builder.add() method
Date Thu, 06 Sep 2012 18:07:19 GMT
On Thu, Sep 6, 2012 at 1:59 PM, Robert Muir <rcmuir@gmail.com> wrote:

> Thanks for reporting this Mark.
>
> I think it was not intended to have actual null characters here (or
> probably anywhere in javadocs).
>
> Our javadocs checkers should be failing on stuff like this...
>
> On Thu, Sep 6, 2012 at 1:52 PM, Mark Parker <godefroi@gmail.com> wrote:
> > I'm building documentation from the Lucene 4.0.0-BETA source (though
> > this was also an issue with the ALPHA source), and the output has null
> > characters in it. I believe that this is because the source looks like
> > this:
> >
> >     /**
> >      * Add a phrase->phrase synonym mapping.
> >      * Phrases are character sequences where words are
> >      * separated with character zero (\u0000).  Empty words
> >      * (two \u0000s in a row) are not allowed in the input nor
> >      * the output!
> >      *
> >      * @param input input phrase
> >      * @param output output phrase
> >      * @param includeOrig true if the original should be included
> >      */
> >
> > These \u0000 characters are converted to null (\0) characters in the
> > output, which are invalid in XML (I'm outputting XML). Indeed, this is
> > a problem in the built documentation at the Apache Lucene site
> > (http://lucene.apache.org/core/4_0_0-BETA/analyzers-common/org/apache/lucene/analysis/synonym/SynonymMap.Builder.html)
> > where the documentation looks like this (in my browser):
> >
>

Converted to U+0000 by what, I wonder? Javadoc shouldn't be doing that. If
it does, I wonder if we need \\u0000 instead?
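
A likely explanation, offered as a sketch rather than a confirmed diagnosis
of the Lucene build: the Java language translates Unicode escapes before the
source is tokenized, and that includes text inside comments, so any tool
that reads the source the way the compiler does sees a real NUL character
where the escape was written. Doubling the backslash stops the translation,
but in a comment both backslashes would then appear in the rendered text.
The class and method names below are made up for illustration; string
literals are used only because they make the effect easy to observe.

```java
public class UnicodeEscapeDemo {

    /** Renders each char as U+XXXX so that invisible characters become visible. */
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    /** Replaces each NUL char with the six visible characters backslash, u, 0, 0, 0, 0. */
    static String showNul(String s) {
        return s.replace("\0", "\\u0000");
    }

    public static void main(String[] args) {
        // Single backslash: the escape is translated to one NUL char
        // before the string literal is even parsed.
        String translated = "\u0000";

        // Doubled backslash: no Unicode escape is recognized here, so the
        // literal holds six ordinary characters.
        String untranslated = "\\u0000";

        System.out.println(translated.length() + " char:  " + codePoints(translated));
        System.out.println(untranslated.length() + " chars: " + codePoints(untranslated));

        // A words-separated-by-NUL phrase, made printable for logs or XML.
        System.out.println(showNul("wi" + "\0" + "fi"));
    }
}
```

If that is indeed the cause, spelling the character as U+0000 in the
comment, or using the HTML entity &#92; for the backslash, would sidestep
both the NUL and the doubled backslash.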


> > Add a phrase->phrase synonym mapping. Phrases are character sequences
> > where words are separated with character zero (). Empty words (two s
> > in a row) are not allowed in the input nor the output!
> >
> > The actual HTML file does have null characters at the two locations,
> > which may be technically correct, but not very helpful. I believe the
> > "\u0000" in the source ought to be escaped in some way, so that
> > something more meaningful than \0 ends up in the output. I'd submit a
> > patch, just for the prestige of it, but I don't have the slightest
> > idea what the change should be, not being a Java guy at all.
> >
> > For those interested in why I'm messing with this: I'm using
> > IKVM to convert the Java Lucene libraries to .NET assemblies (well,
> > one assembly) and converting the javadoc comments to XML documentation
> > for good IntelliSense in Visual Studio. It works wonderfully, and we
> > use it in very successful commercial software!
> >
> > Note that I'm not subscribed to the list, so please CC me if there are
> > questions.
> >
> > Mark
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
>
>
>
> --
> lucidworks.com
>
