db-ddlutils-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas Dudziak" <tom...@gmail.com>
Subject Re: ddlutils troubles and workarounds with svn, mysql, postgresql
Date Sun, 27 Jan 2008 22:45:03 GMT
Hi Eric,

answers/comments embedded below

> First, I tried to compile ddlutils from SVN. The
> problem here is that I failed to compile only one
> part of the toolkit. So after installing some of
> the usual packages, the build process told me that
> I would need forrest.apache.org - apparently to
> create the documentation in HTML and PDF. Forrest
> is a really big package... Next it told me that
> it would depend on two DTD thingies, too, luckily
> always mentioning where I can download them. Then
> it complained that I did not have the DTD or similar
> for some part of the data, which was when I gave up.

All these additional packages are required for generating the
documentation. If you really want to work on the trunk version, e.g.
to test out new features (otheriwise there is no reason to check out
from SVN, the source/binary packages will work just fine). then you
can simply build the JAR via

ant jar

There is no need to use 'dist' target if you don't want the full
documentation build.

> Next attempt was to download the 1.0 binary. It
> turned out that the binary already contains three
> things: The docs, the main jar, and most of the
> supplemental jars. Very nice. Of course I only
> found out after having downloaded the separate
> doc download and jar download... ;-).
>
> The binary zip seemed to be very easy to install,
> but I had troubles with the classpath. In the end,
> I just used the following:
>
> ant -d -v -lib ddlutils/lib/ -f my-ant-tasks.xml ...
>
> (I also copied the main ddlutils jar into .../lib/ )

If you look at the ant task documentation
(http://db.apache.org/ddlutils/ant/), you'll see a segment

<path id="runtime-classpath">
  <fileset dir="lib">
    <include name="**/*.jar"/>
    <include name="**/*.zip"/>

  </fileset>
</path>

This defines the classpath for the DdlUtils' Ant tasks, in this case
it says that all relevant libraries are in a local directory lib and
its subdirectories.
There are other ways to do it (such as what you ended up using) -
please refer to the Ant documentation for more info.

As for what downloadable artifact contains what, I filed a JIRA ticket
(http://issues.apache.org/jira/browse/DDLUTILS-191) to make that more
obvious on the web page.

> Now finally I was able to dump the mysql database
> that I had received as proprietary mysql dump file
> and then loaded into a mysql that I have installed
> especially for that purpose :-)
>
> Next, I tried to import that file into my normal
> postgresql database, again using ddlutils. This
> failed, and I had to change 2 things to work
> around the problem:
>
>
> Somehow mysql uses 0000-00-00 00:00:00 and null as
> synonyms for null for timestamps. However, you
> cannot insert such invalid (month/day are 0) date
> timestamps into postgresql databases. As workaround,
> either of the following did work: Search and replace
> all 0000-... timestamps by 1980-01-01 timestamps in
> the XML file, issue SQL commands for each column of
> type timestamp to update values which are null or
> earlier than 1970-01-01 into 1980-... timestamps,
> or search and replace all 0000-... timestamps into
> null timestamps... I did not test the latter as it
> would have been hard to explain to my text editor.

This should already be handled by DdlUtils. Could you file an issue in
JIRA and perhaps attach sample files to reproduce it (e.g. MySql SQL
to create and fill the DB, the Ant snippet that you used to read the
DB with DdlUtils) ?

> Second problem is that bugzilla uses indexes on
> some varchar columns. Postgresql uses btree indexes
> in a way where each value must be at most 8/3 kB.
> This caused errors for rows with long values. The
> solution was to comment out (in the XML file for
> the database schema) the part which tells that an
> index should be made for the longdescs.thetext
> column. I can live without that index ;-).

Mhmm, this looks like a DB-specific optimization (type of index) which
DdlUtils currently cannot handle. But feel free to file an improvement
issue in JIRA for this.

> As connectors, I used:
> mysql-connector-java-5.0.8-bin.jar
> postgresql-8.2-507.jdbc4.jar
> ... although I do not know if JDBC4 is of any
> added value for DDL-UTILS :-)

Not really :-)

> Next, I tried to export some data back from the
> postgresql database to XML files. It turned out
> that ddl-utils failed to read the database schema
> from the live database because of a missing schema
> function. I was unable to get ddl-utils to proceed
> even if I told it only to write the contents...
>
> Luckily, others have has similar problems before:
> ERROR: function information_schema._pg_keypositions() does not exist
> is mentioned on
> http://forum.hibernate.org/viewtopic.php?t=957909&view=next
> as is some workaround: It might make a difference which
> JDBC connector you use, but it always helps to:
>
> CREATE FUNCTION information_schema._pg_keypositions() RETURNS SETOF int4 AS
> $BODY$select g.s
>         from generate_series(1,current_setting('max_index_keys')::int,1)
>         as g(s)$BODY$
>   LANGUAGE 'sql' IMMUTABLE;
> ALTER FUNCTION information_schema._pg_keypositions() OWNER TO postgres;
>
> Note that you have to do this for the database in
> question, not for the global information schema.
> I think the latter only affects databases created
> in the future. The problem only seems to occur in
> postgresql 8.x ... An older postgres 7.4 of mine
> already does have a function of this name, but in
> a "flat" and less portable implementation:
>
> myolddb=> \df+ information_schema._pg_keypositions
>                                                                                List of
functions
>  Result data type |       Schema       |       Name       | Argument data types|  Owner
  | Language |                       Source code | Description
> - ------------------+--------------------+------------------+---------------------+----------+----------+----------------------------------------------------------+-------------
>  setof integer    | information_schema | _pg_keypositions || postgres | sql      | select
1 union all select 2 union all select 3 union all
>         select 4 union all select 5 union all select 6 union all
>         select 7 union all select 8 union all select 9 union all
>         select 10 union all select 11 union all select 12 union all
>         select 13 union all select 14 union all select 15 union all
>         select 16 union all select 17 union all select 18 union all
>         select 19 union all select 20 union all select 21 union all
>         select 22 union all select 23 union all select 24 union all
>         select 25 union all select 26 union all select 27 union all
>         select 28 union all select 29 union all select 30 union all
>         select 31 union all select 32 |
> (1 row)

DdlUtils is in  the same postiion here as Hibernate: this is a problem
between the JDBC driver and the database itself. As far as DdlUtils is
concerned, you can use a slightly older PostgreSql driver (8.0 or - if
it works - 8.1).

> Now things worked quite okay, but I still could not
> find any workaround for my last problem: The files
> always contain ALL data. By using schemapatterns of
> "public" or "p%" I can select between files with
> and without the indexes as part of the schema (no
> idea why it is "p%", just trial and error here),
> but I cannot use any TABLE NAME PATTERN. Would be
> great if ddl-utils could implement that filter!

This is already in the list of upcoming features:
https://issues.apache.org/jira/browse/DDLUTILS-87.

> I guess a sort of workaround would be to copy the
> whole database to another inside postgresql and
> then drop all tables that I do not want in the XML
> files and then use ddl-utils to export the new DB
> into XML files. Or use some good XML file editor
> to remove the unneeded tables, but that will be a
> problem because the data XML files can be 100s of
> megabytes for some of the DB which I want to use
> ddl-utils on... For the latter, it can be useful
> to say ANT_OPTS=-Xmx1536M in the shell first ;-).

You can also use XSLT using Ant's style task. E.g. check out this
article for more info:
http://www.ibm.com/developerworks/xml/library/x-antxsl/

> Last but not least, I was unable to dump some
> databases as "mortal" user because for unknown
> reason, pga_diagrams is only readable for the
> superuser on one postgresql installation while
> it is readily accessible even for low priv users
> on other installations...? It is interesting how
> much metadata is floating around in modern 8.x
> versions of postgresql :-).

Mhmm, you shouldn't have to be superuser to read a DB from postgres
with DdlUtils. Especially since there should be no reason to dump
pga_diagrams as it is (unless I'm mistaken) a system table.

> PS: troubles in the precompiled docs I downloaded
> later: The PDF uses very dark blue compared to the
> HTML and the UML diagrams are not shown in the PDF.
> And there is a content error where the docs say
> that some DB cannot do tinyint in -255..255 range
> but only in (signed byte) range. Some other place
> in the docs says that tinyint can be signed or
> unsigned byte - it does not mention anything like
> signed 9 bit value ;-). By the way, VARCHAR can be
> up to 10*1024*1024 Unicode chars, not 254 ASCII,
> at least in 8.x versions of PostgreSQL :-).

Can you file issues for these (UML not showing in PDF, VARCHAR in
Postgresql 8.x is 10MB max) ?
As for tinyint, databases and JDBC may have different definitions of
the range of this datatype. JDBC specifies tinyint as a value between
0 and 255 (http://java.sun.com/j2se/1.3/docs/guide/jdbc/getstart/mapping.html),
whereas databases may have them as -128 to 127 or 0 to 255 or
something similar. If the database has a differenet definition of
tinyint than JDBC, than DdlUtils will expand to the next biggest
JDBC/native datatype where appropriate.

cheers,
Tom

Mime
View raw message