incubator-ooo-dev mailing list archives

From "Dennis E. Hamilton" <dennis.hamil...@acm.org>
Subject ODF RegExp dependencies (was RE: RegExp replacement (was Re: Some more strange files ...))
Date Thu, 23 Jun 2011 23:06:02 GMT
I'm not versed in all of the ways that a RegExp engine is useful in the internals of OpenOffice.org,
but there are RegExp patterns presumably exposed to users and also carried in the ODF 1.2
document format.  

It struck me that ODF 1.2 is more specific about what the latter RegExp patterns are, and we
might want to see if there is a conflict.  (It might be nice to put this somewhere besides
the ooo-dev list.  Where would that be?)

The OASIS ODF 1.2 Committee Specification 01 (likely to become the final apart from editorial
changes that may be made before OASIS Standard ODF 1.2 is ratified) can be found via this
message:
<http://lists.oasis-open.org/archives/tc-announce/201103/msg00003.html>  
 
(Warning: The HTML pages are single files made from documents with thousands of pages.  You
won't like the page load times and scrolling around will be painful.  I recommend the PDF
or the ODF versions.  The PDFs are bigger but I find that Acrobat Reader provides better navigation
functions, having the equivalent of browser back and re-forward buttons that you can move
to the visible toolbar and use to chase cross-references without losing your place where you
came from.)

 - Dennis

SOME DETAILS (I think I found them all.)

PART 1 SPECIFICATION OF REGULAR EXPRESSION OCCURRENCES IN ODF XML ATTRIBUTES

In Part 1 (the main document model), there is a normative reference to 

[UTR18] Mark Davis, Andy Heninger, Unicode Regular Expressions, Unicode Technical Report #18,
http://www.unicode.org/reports/tr18/tr18-13.html, 2008.

There is a profile characteristic on the use of regular expressions in certain OpenFormula
string comparison operations in section 19.642 table:formula.

19.684 table:operator specifies operators that use regular expressions and has this odd sort-of-a
conformance statement:

"Regular expressions are implementation-dependent expressions that, at a minimum, conform
to the requirements of Conformance Clause C1 of [UTR18]."

[I take that to mean that a profile specifying the implementation-dependent expressions is
needed if there is meant to be interoperable interchange of documents that rely on those
particular ODF features.]
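
One rough way to look for such a conflict is to probe a candidate engine with sample patterns and see which ones it accepts at all. The sketch below does this against Python's built-in re module, used here purely as a stand-in engine; it is not the engine OpenOffice.org uses. The helper name engine_accepts and the sample patterns are my own illustrative choices, not taken from ODF 1.2 or [UTR18], and accepting a pattern says nothing about matching it with the right semantics.

```python
import re


def engine_accepts(pattern):
    """Return True if Python's re module compiles the pattern, else False."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# Hypothetical sample patterns, chosen only to illustrate the probing idea.
samples = {
    "hex notation": r"\u00e9",      # code point written in hex
    "property escape": r"\p{L}",    # Unicode property syntax
    "word boundary": r"\bcell\b",   # simple word boundaries
}

# Map each feature name to whether the stand-in engine accepted its pattern.
report = {name: engine_accepts(pattern) for name, pattern in samples.items()}

for name, accepted in report.items():
    print(f"{name}: {'accepted' if accepted else 'rejected'}")
```

A real comparison would have to run against the actual candidate engine (ICU, PCRE, or whatever is chosen) and check match results, not just whether a pattern compiles.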

19.744 table:use-wildcards says more.

PART 2 SPECIFICATION INVOLVING REGULAR EXPRESSIONS IN OPENFORMULA

Section 2.4 on Variances and handling of implementation-defined behaviors is applicable and
regular expressions are mentioned in the last list item, on Database criteria match patterns.

Section 3.4 defines the host-determined features that we see established for ODF in Part 1.

I'm just grabbing the mentions now:

6.9.1 General (Database Functions)
6.13.9 COUNTIF function
6.13.10 COUNTIFS function
6.13.34 VALUE uses a regular expression in the specification itself. Not sure whose RegExps
they are.
6.14.5 HLOOKUP
6.14.8 LOOKUP
6.14.9 MATCH
6.14.12 VLOOKUP
6.16.63 SUMIFS
6.18.5 AVERAGEIF
6.18.6 AVERAGEIFS
6.19.10 DECIMAL uses a regular expression in the specification (odd wording)
6.20.20 SEARCH
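
To make the dependency concrete, here is a minimal sketch of how a COUNTIF-style function might apply a regular expression criterion to a range of cells. It assumes Python's re module as the engine, and the function name countif_regex and the switches whole_cell and ignore_case are hypothetical stand-ins for the host-determined settings; none of this is the OpenFormula definition or the OpenOffice.org implementation.

```python
import re


def countif_regex(cells, pattern, whole_cell=True, ignore_case=True):
    """Count the text cells that match pattern.

    whole_cell and ignore_case are assumed, illustrative switches; the real
    behavior depends on the host settings and the engine in use.
    """
    flags = re.IGNORECASE if ignore_case else 0
    compiled = re.compile(pattern, flags)
    # Either the whole cell must match, or a match anywhere in the cell counts.
    matcher = compiled.fullmatch if whole_cell else compiled.search
    # Non-text cells (numbers, empty cells) are skipped in this sketch.
    return sum(1 for cell in cells if isinstance(cell, str) and matcher(cell))


cells = ["apple", "Apricot", "banana", 42, "pineapple"]
print(countif_regex(cells, "a.*"))                    # whole-cell matching
print(countif_regex(cells, "a.*", whole_cell=False))  # match anywhere in cell
```

The same criterion gives different counts under the two settings, so a replacement engine with different pattern syntax or matching rules could silently change the results of existing spreadsheets.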

-----Original Message-----
From: Greg Stein [mailto:gstein@gmail.com] 
Sent: Thursday, June 23, 2011 13:35
To: giffunip@tutopia.com
Cc: ooo-dev@incubator.apache.org
Subject: Re: RegExp replacement (was Re: Some more strange files in the OOo code)

[ ... ]

I was talking about the C++ wrappers that are part of PCRE itself.

For example:
  http://vcs.pcre.org/viewvc/code/trunk/pcrecpp.h?view=markup


Cheers,
-g

