directory-dev mailing list archives

From "Pepersack, Bob G" <Bob.Pepers...@GDIT.com>
Subject RE: Directory Studio: Backslash in DN breaks studio
Date Fri, 11 Mar 2016 13:18:36 GMT
Is there a configuration where I can set my keystore settings with a text editor, or some other
kind of editor?

When I build the project from "directory-studio-trunk.zip" with "mvn clean install -Dmaven.test.skip=true",
it throws this exception:

[INFO] Scanning for projects...
[ERROR] Internal error: org.eclipse.tycho.core.osgitools.OsgiManifestParserException: Exception
parsing OSGi MANIFEST C:\bit9prog\dev\Installation\directory-studio-trunk\plugins\aciitemeditor\META-INF\MANIFEST.MF:
Manifest file not found -> [Help 1]
org.apache.maven.InternalErrorException: Internal error: org.eclipse.tycho.core.osgitools.OsgiManifestParserException:
Exception parsing OSGi MANIFEST C:\bit9prog\dev\Installation\directory-studio-trunk\plugins\aciitemeditor\META-INF\MANIFEST.MF: Manifest file not found
        at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:166)
        at org.apache.maven.cli.MavenCli.execute(MavenCli.java:582)
        at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:214)
        at org.apache.maven.cli.MavenCli.main(MavenCli.java:158)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
        at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
        at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
Caused by: org.eclipse.tycho.core.osgitools.OsgiManifestParserException: Exception parsing
OSGi MANIFEST C:\bit9prog\dev\Installation\directory-studio-trunk\plugins\aciitemeditor\META-INF\MANIFEST.MF:
 Manifest file not found
        at org.eclipse.tycho.core.osgitools.DefaultBundleReader.loadManifestFromDirectory(DefaultBundleReader.java:95)
        at org.eclipse.tycho.core.osgitools.DefaultBundleReader.doLoadManifest(DefaultBundleReader.java:59)
        at org.eclipse.tycho.core.osgitools.DefaultBundleReader.loadManifest(DefaultBundleReader.java:50)
        at org.eclipse.tycho.core.osgitools.OsgiBundleProject.readArtifactKey(OsgiBundleProject.java:147)
        at org.eclipse.tycho.core.osgitools.OsgiBundleProject.setupProject(OsgiBundleProject.java:142)
        at org.eclipse.tycho.core.resolver.DefaultTychoResolver.setupProject(DefaultTychoResolver.java:74)
        at org.eclipse.tycho.core.maven.TychoMavenLifecycleParticipant.afterProjectsRead(TychoMavenLifecycleParticipant.java:90)
        at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:310)
        at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:154)
        ... 11 more

-----Original Message-----
From: Emmanuel Lécharny [mailto:elecharny@gmail.com] 
Sent: Thursday, March 10, 2016 7:18 PM
To: dev@directory.apache.org; users@directory.apache.org
Subject: Re: Directory Studio: Backslash in DN breaks studio

On 10/03/16 22:58, Stefan Seelmann wrote:
> On 03/09/2016 07:59 PM, Emmanuel Lécharny wrote:
>> On 09/03/16 18:54, Philip Peake wrote:
>> Can you be a bit more explicit ?
>>
> Probably same cause as in
> https://issues.apache.org/jira/browse/DIRSTUDIO-1087 and
> https://issues.apache.org/jira/browse/DIRSERVER-2109
>
I took some time last weekend to rethink the whole problem. There are a lot of things we
are doing wrong, IMO. Don't get me wrong though : most of the time, it simply works.

FTR, I am sending this mail to the dev list, copying it to the users list.

<this is going to be a long mail...>

First of all, we need to distinguish the clients from the server. They are two different beasts,
and we should assume the server *always* receives data that is potentially harmful and incorrect.

Then we also need to distinguish String values and Binary values. The reason we make this
distinction is that String values are going to be encoded in UTF-8, thus using multi-byte
sequences, and also because we need to convert them from UTF-8 to Unicode (and back).

Let's put aside the binary data at the moment.

The server
==========

Value
-----

We receive UTF-8 Strings, we convert them to Unicode and now we can process them in Java.
We do need this conversion because we need to check the values before injecting them into the
backend. Doing such checks in UTF-8 would be very impractical.
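
As a rough illustration of that first conversion step, here is a minimal sketch of a strict
UTF-8 decode that rejects malformed input instead of silently replacing it. The class and
method names are hypothetical and are not taken from the LDAP API; only the JDK charset
classes are real.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Apache Directory LDAP API.
final class Utf8Decoding {

    /** Decodes UTF-8 bytes strictly: malformed input is reported, never replaced. */
    static String decodeStrict(byte[] utf8) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(utf8))
                .toString();
        } catch (CharacterCodingException e) {
            // The server should assume incoming data may be incorrect.
            throw new IllegalArgumentException("Malformed UTF-8 value", e);
        }
    }

    public static void main(String[] args) {
        byte[] ok = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeStrict(ok));

        byte[] bad = { 'a', (byte) 0xC3 }; // truncated two-byte sequence
        try {
            decodeStrict(bad);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Reporting errors rather than substituting a replacement character keeps bad input from
reaching the later syntax checks in a silently altered form.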

There is one critical operation that is done on values when we process them : most of the
time, we need to compare them to another value,
typically when we have an index associated with this value, or when we have a search filter.
Comparing two values is sadly not as simple as doing a lexicographic comparison. We need
to 'prepare' the values according to some very specific rules, and we should also 'normalize'
those values according to some syntax.

A comparison is done following this process :

Val 1 -> normalization -> preparation-+
                                       \
                                        .--> Comparison
                                       /
Val 2 -> normalization -> preparation-+

We can save some processing if one of the two values has already been normalized or prepared.
Actually, we should do that only once for each value : when it is injected into the server
for the first time. But doing so would also induce some constraints, mainly disk usage : saving
many forms of the same data costs space, and time when it comes to reading them from disk. This
is all about balance...
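
The comparison pipeline can be sketched as follows. This is only an illustration under
stated assumptions : the normalizer shown is a hypothetical one for a case-insensitive
attribute, and prepare() is a much simplified stand-in for real LDAP string preparation
(it only applies NFKC and collapses spaces). None of these names come from the actual code.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical sketch: value -> normalization -> preparation -> comparison.
final class ValueComparison {

    /** Assumed normalizer for a case-insensitive attribute type. */
    static final UnaryOperator<String> CASE_IGNORE = s -> s.toLowerCase(Locale.ROOT);

    /**
     * Simplified stand-in for string preparation, the same for all values:
     * Unicode NFKC form, then trim and collapse runs of spaces.
     */
    static String prepare(String value) {
        String nfkc = Normalizer.normalize(value, Normalizer.Form.NFKC);
        return nfkc.trim().replaceAll(" +", " ");
    }

    /** Both sides go through the same two steps before being compared. */
    static int compare(String v1, String v2, UnaryOperator<String> normalizer) {
        String left = prepare(normalizer.apply(v1));
        String right = prepare(normalizer.apply(v2));
        return left.compareTo(right);
    }

    public static void main(String[] args) {
        // The two forms differ in case and spacing but compare as equal.
        System.out.println(compare("John   SMITH", " john smith", CASE_IGNORE));
    }
}
```

If the prepared form were stored alongside the original, the prepare() call on the stored
side could be skipped at comparison time, which is the space versus time balance mentioned
above.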

Anyway, most of the time, we get a value and we just need to store it into the backend after
having checked its syntax. And that's the key :
checking the syntax requires some prior processing. Here is how we proceed when we just need to
check the syntax :

Value --> normalization --> syntax check

There is no string preparation.

The normalization is specific to each AttributeType. The String Preparation is the same for
all the values.


Now, there are two specific use cases : filters, and DN.


Filter
------

A filter always contains a String that needs to be processed to give a tuple : <attributeType,
value>. There are rules that must be applied to transform the incoming filter to this tuple.
Once we have created this tuple, we can normalize and prepare the tuple's value : something
that might be complex, especially when dealing with substring matches.

So for a filter, the process is :

filter -> preProcessing -> Tuple<AttributeType, Value> -> normalization
-> preparation


The String preparation is required because the filter's value will be compared with what we
fetch from the backend.
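
To make the pre-processing step concrete, here is a minimal sketch that turns a simple
equality filter into an <attributeType, value> tuple. It only handles the "(attr=value)"
form and the backslash plus two hex digits escapes used in LDAP filter strings. The class
names are hypothetical, and normalization and preparation are left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: filter String -> preProcessing -> <attributeType, value>.
final class SimpleFilterSketch {

    /** Hypothetical tuple holding the attribute type and the unescaped value. */
    static final class Assertion {
        final String attributeType;
        final String value;

        Assertion(String attributeType, String value) {
            this.attributeType = attributeType;
            this.value = value;
        }
    }

    /** Replaces each \XX hex escape by the byte it stands for, then decodes UTF-8. */
    static String unescape(String escaped) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < escaped.length()) {
            if (escaped.charAt(i) == '\\') {
                if (i + 2 >= escaped.length()) {
                    throw new IllegalArgumentException("Truncated escape in: " + escaped);
                }
                int hi = Character.digit(escaped.charAt(i + 1), 16);
                int lo = Character.digit(escaped.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid escape in: " + escaped);
                }
                out.write((hi << 4) | lo);
                i += 3;
            } else {
                // Copy the UTF-8 bytes of the current code point unchanged.
                int cp = escaped.codePointAt(i);
                byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                out.write(utf8, 0, utf8.length);
                i += Character.charCount(cp);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Parses "(attr=value)" only; other filter types are out of scope here. */
    static Assertion parseEquality(String filter) {
        int last = filter.length() - 1;
        int eq = filter.indexOf('=');
        if (last < 1 || filter.charAt(0) != '(' || filter.charAt(last) != ')' || eq < 2) {
            throw new IllegalArgumentException("Not a simple equality filter: " + filter);
        }
        return new Assertion(filter.substring(1, eq), unescape(filter.substring(eq + 1, last)));
    }

    public static void main(String[] args) {
        Assertion a = parseEquality("(cn=a\\2ab)");
        System.out.println(a.attributeType + " / " + a.value);
    }
}
```

The tuple's value would then go through normalization and preparation before being
compared with what is fetched from the backend.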

DN
--

The DN is not a String. It's a list of RDNs, where each RDN is a list of AVAs, where each AVA
is a tuple <attributeType, Value>. However, like a filter, it is a String when it's received
or stored, and there are some specific rules to follow to transform that String
into RDNs. Bottom line, the DN preprocessing is the following :

DN String --> preProcessing -> Rdns, AVA, Tuple<AttributeType, value> [-> normalization
-> preparation] (for each AVA)

Again, the String preparation is needed because we will store the RDN into an index, and that
requires some comparison (note that it's not always the case, typically for attributeType
with a DN syntax).
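
The slicing part of that preprocessing can be sketched like this. It is a simplified
illustration, not the actual parser : it splits on unescaped ',' and '+', leaves the escapes
in the values untouched, and ignores quoted values and the normalization and preparation
steps. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: DN String -> RDNs -> AVAs -> {attributeType, value}.
final class DnSketch {

    /** Splits on an unescaped separator, keeping backslash escapes as they are. */
    static List<String> splitUnescaped(String input, char separator) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length()) {
                // Keep the backslash and the escaped char together.
                current.append(c).append(input.charAt(++i));
            } else if (c == separator) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    /** Returns one list per RDN, each holding its AVAs as {type, value} pairs. */
    static List<List<String[]>> parse(String dn) {
        List<List<String[]>> rdns = new ArrayList<>();
        for (String rdn : splitUnescaped(dn, ',')) {
            List<String[]> avas = new ArrayList<>();
            for (String ava : splitUnescaped(rdn, '+')) {
                int eq = ava.indexOf('=');
                if (eq < 1) {
                    throw new IllegalArgumentException("Invalid AVA: " + ava);
                }
                avas.add(new String[] { ava.substring(0, eq).trim(), ava.substring(eq + 1) });
            }
            rdns.add(avas);
        }
        return rdns;
    }

    public static void main(String[] args) {
        for (List<String[]> rdn : parse("cn=a\\,b+sn=c,ou=people,dc=example,dc=com")) {
            for (String[] ava : rdn) {
                System.out.println(ava[0] + " = " + ava[1]);
            }
            System.out.println("--");
        }
    }
}
```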


Comparing values
----------------

We saw that we need to normalize and prepare values before being able to compare them. A good
question would be : do we need to prepare the String beforehand or when we need to compare
values ? That's quite irrelevant : it's a choice that needs to be made at some point, but it
just impacts the performance and the storage size. We can consider that when we start comparing
two values, they are already prepared (either because we have stored a prepared version of
the String, or because we have just prepared the String on the fly before calling the compare
method).



The Client
==========

I will just talk about the Ldap API here, I'm not interested in any other client.

We have two flavors : schema aware and schema agnostic. We also have to consider two aspects
: when we send data to the server, and when we process the result.


Schema agnostic client
----------------------

There is not so much we can do here. We have no idea about what the value's syntax can be,
so we can't normalize the value. Bottom line, here is the basic processing of a value sent
to the server :

- we don't touch the values. At all. We just convert them from Unicode to UTF-8
- we pre-process filters to feed the SearchRequest. Values are unescaped (i.e. the escaped chars
are replaced by their binary counterpart)
- we don't touch the DN

When values are received from the server, we need to process the data this way :

- we don't touch the values, we just convert them from UTF-8 to Unicode
- we don't touch the DN : it's already in String format, we just convert it from UTF-8 to
Unicode

Schema aware client
-------------------

This is more complex, because now, we can process the values before sending them to the server.
This puts some load on the client side instead of pounding the server with incorrect data that
will get rejected anyway.

- Values : we normalize them, prepare them and check their syntax. At the end, we convert
the original value from Unicode to UTF-8. As we can see, we lose the normalized and prepared
value.
- Filter : we unescape them, then we convert them to UTF-8
- Dn : we parse it, unescape it, normalize each value, and at the end, if the DN is valid,
we send the original value as is, after having converted it to UTF-8


As we can see, all we do is check the values before sending them to the remote server,
except for the filter.

For the received values, we first convert them to Unicode and that's pretty much it.


Escaping
--------

DN and Filters need some pre-processing called unescaping when we have to transform them from
a String to an internal instance. For the Filter, this is always done on the client side; for
the DN, it is done on the server side. The idea is to transform those values from a String (human
readable) form to a binary form.


What we do wrong
----------------

We will only focus on the schema aware API here. This is what we use on the server side anyway...

* First, we are depending on the same API on both sides (client and server). This makes things
more complex, because the context is different. For instance, there is no need to parse the
DN on the client, but we still do it.  I'm not sure that we could easily avoid doing so.
To some extent, we are penalizing the client.

* The most complex situation is when we have to process the DN. This is always done in two
phases :
- slice the DN into RDNs, the RDNs into AVAs containing Values
- apply the schema on each value

We could easily imagine doing the processing in one single pass.
Actually, it is an error not to do so : this costs time, and the classes are therefore not
immutable.

* One specific problematic point is when we process escaped chars. For instance, something
like : 'cn=a\ \ \ b' is just a cn with a value containing 3 spaces. This is what should be
returned to the user, and not a value with only one space. *But* we will be able to retrieve
this value using any one of those filters : (cn=a b) or (cn=a  b) or (cn=a         b).
Actually the number of spaces is irrelevant when comparing
the value, but it is not when it comes to sending back the value to the user.
Again, it has everything to do with the distinction between storing values and comparing values.
For filters, we must unescape the String before sending it to the server. The server does
not handle the Filter as a String.

* The PrepareString class needs to be reviewed. We don't handle spaces the way it's supposed
to be done.
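
The escaped-space case in the list above can be illustrated with a small sketch that keeps
two forms apart : the value as it should be returned to the user, and a comparison-only form
where runs of spaces collapse. The helper names are hypothetical, hex escapes are not
handled, and the space handling is a simplification rather than what PrepareString does.

```java
// Hypothetical sketch: stored value versus comparison form for 'cn=a\ \ \ b'.
final class EscapedSpaces {

    /** Drops the backslash of simple backslash-char escapes (hex escapes not handled). */
    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '\\' && i + 1 < escaped.length()) {
                out.append(escaped.charAt(++i));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Comparison-only form: trimmed, with runs of spaces collapsed to one. */
    static String comparisonKey(String value) {
        return value.trim().replaceAll(" +", " ");
    }

    public static void main(String[] args) {
        // This is the value to store and to send back: three spaces are kept.
        String stored = unescape("a\\ \\ \\ b");
        System.out.println("[" + stored + "]");

        // Any number of spaces in the filter value matches when comparing.
        System.out.println(comparisonKey(stored).equals(comparisonKey("a b")));
        System.out.println(comparisonKey(stored).equals(comparisonKey("a         b")));
    }
}
```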

Value Class
-----------

I'm not exactly proud of it. It was a way to avoid having code like :

    if ( value instanceof String )
    {
        // This is a String
    }
    else
    {
        // This is a byte[]
    }

so now, we have StringValue and BinaryValue, both of which can be used with an AttributeType
when they are SchemaAware. In retrospect, I think the distinction between String and Binary
values was an error. We should have a Value, holding both, with a flag in it. Changing that
means we review the entire code, again...
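
One possible reading of "a Value, holding both, with a flag in it" is sketched below. This is
purely an assumption about what such a class could look like, with a hypothetical name and
hypothetical methods; it is not the existing StringValue or BinaryValue code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical single value class: bytes are always present, the String
// form only when the flag says the value is human readable.
final class UnifiedValue {

    private final byte[] bytes;
    private final String string;        // null for binary values
    private final boolean humanReadable;

    private UnifiedValue(byte[] bytes, String string, boolean humanReadable) {
        this.bytes = bytes;
        this.string = string;
        this.humanReadable = humanReadable;
    }

    /** Builds a String value, keeping its UTF-8 form next to it. */
    static UnifiedValue of(String value) {
        return new UnifiedValue(value.getBytes(StandardCharsets.UTF_8), value, true);
    }

    /** Builds a binary value; the array is copied to keep the instance immutable. */
    static UnifiedValue of(byte[] value) {
        return new UnifiedValue(value.clone(), null, false);
    }

    boolean isHumanReadable() {
        return humanReadable;
    }

    String getString() {
        if (!humanReadable) {
            throw new IllegalStateException("Binary value has no String form");
        }
        return string;
    }

    byte[] getBytes() {
        return bytes.clone();
    }

    public static void main(String[] args) {
        UnifiedValue text = UnifiedValue.of("John");
        UnifiedValue blob = UnifiedValue.of(new byte[] { 0x01, 0x02 });
        System.out.println(text.isHumanReadable() + " " + text.getString());
        System.out.println(blob.isHumanReadable() + " " + blob.getBytes().length);
    }
}
```

Callers would test the flag once instead of using instanceof on the value itself.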



Conclusion
==========

This is not a pleasant situation. We have some cases where we don't handle things correctly,
and this is largely due to some choices made a decade ago. Now, I don't think that this should
be kept as is. Sometimes a big refactoring is better than patching this and that...


Now, feel free to express yourself, I would be very happy to have your opinion.

Many thanks !
