lucene-dev mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: svn commit: r1382777 - /lucene/dev/trunk/solr/solrj/src/java/org/apache/solr/common/cloud/Slice.java
Date Mon, 10 Sep 2012 15:25:36 GMT
Basically here is what I'm proposing (if it works on your machine):

Index: dev-tools/scripts/checkJavadocLinks.py
===================================================================
--- dev-tools/scripts/checkJavadocLinks.py	(revision 1382919)
+++ dev-tools/scripts/checkJavadocLinks.py	(working copy)
@@ -24,7 +24,7 @@
 reAtt = re.compile(r"""(?:\s+([a-z]+)\s*=\s*("[^"]*"|'[^']?'|[^'"\s]+))+""", re.I)

 # Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
-reValidChar = re.compile("^[\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*$")
+reValidChar = re.compile("^[^\u0000-\u0008\u000B-\u000C\u000E-\u001F]*$")

 # silly emacs: '
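
As a rough illustration of what the patched pattern accepts and rejects, here is a minimal standalone sketch. The pattern is copied from the patch above; the is_valid helper is hypothetical and not part of checkJavadocLinks.py. The sketch assumes only that the goal is to catch stray control characters such as \u0000.

```python
import re

# Pattern from the proposed patch: accept any string that contains none of
# the C0 control characters other than tab (U+0009), LF (U+000A), CR (U+000D).
reValidChar = re.compile("^[^\u0000-\u0008\u000B-\u000C\u000E-\u001F]*$")


def is_valid(text):
    """Hypothetical helper: True if text contains no disallowed control character."""
    return reValidChar.match(text) is not None


print(is_valid("plain text\twith tab\r\n"))       # tab, CR, LF are allowed
print(is_valid("bad \u0000 char"))                # NUL is rejected
print(is_valid("supplementary \U00010400 char"))  # no supplementary range needed
print(is_valid("\uFFFE"))                         # accepted by the new pattern
```

Because the new pattern is a negated class, it no longer mentions the supplementary range at all. The trade-off is that it is looser than the XML Char production quoted in the comment: surrogates, FFFE, and FFFF are no longer rejected.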



On Mon, Sep 10, 2012 at 11:14 AM, Robert Muir <rcmuir@gmail.com> wrote:
> Hmm, this looks like my regular expression for checking valid characters
> (we had some javadocs that intended \u0000 and so on, but Java preprocesses
> these, actually giving us invalid XML).
>
> Can you try removing the supplementary ranges from the regex, just as a
> test? I don't fully understand the state of Python's Unicode
> support.
>
> On Mon, Sep 10, 2012 at 11:10 AM, Yonik Seeley <yonik@lucidworks.com> wrote:
>> Thanks for fixing that.
>>
>> I'm trying to run javadocs-lint myself, but it's not working:
>>
>> javadocs-lint:
>>      [exec] Traceback (most recent call last):
>>      [exec]   File
>> "/usr/local/bin/../Cellar/python3/3.2/lib/python3.2/functools.py",
>> line 176, in wrapper
>>      [exec]     result = cache[key]
>>      [exec] KeyError: (<class 'str'>, '^[\t\n\r
>> -\ud7ff\ue000-�𐀀-\U0010ffff]*$', 0)
>>      [exec]
>>      [exec] During handling of the above exception, another exception occurred:
>>      [exec]
>>      [exec] Traceback (most recent call last):
>>      [exec]   File
>> "/opt/code/lusolr_clean2/lucene/../dev-tools/scripts/checkJavadocLinks.py",
>> line 27, in <module>
>>      [exec]     reValidChar =
>> re.compile("^[\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*$")
>>      [exec]   File
>> "/usr/local/bin/../Cellar/python3/3.2/lib/python3.2/re.py", line 206,
>> in compile
>>      [exec]     return _compile(pattern, flags)
>>
>> Anyone have any pointers?
>>
>> -Yonik
>> http://lucidworks.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>
>
>
> --
> lucidworks.com



-- 
lucidworks.com


