manifoldcf-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Karl Wright <daddy...@gmail.com>
Subject Re: Crawling MySQL with latest MySQL connector fails
Date Mon, 20 Aug 2012 13:53:25 GMT
I checked in a fix for this ticket - please let me know if it allows
you to successfully crawl MySQL database tables.

Karl

On Mon, Aug 20, 2012 at 8:10 AM, Karl Wright <daddywri@gmail.com> wrote:
> There's some online chatter about this.  Apparently the JDBC 4.0
> specification was clarified in this regard, and MySQL's implementation
> follows the clarified meaning.  The recommended approach during this
> transition period is to allow the user to select which method they
> want to use to get the column name.  See CONNECTORS-509.
>
> Karl
>
> On Mon, Aug 20, 2012 at 8:00 AM, Karl Wright <daddywri@gmail.com> wrote:
>> Here's some additional info.
>>
>> The JDBC class "ResultSetMetaData" has two methods: getColumnName(),
>> and getColumnLabel().  For all supported databases, getColumnName()
>> returns the right thing, EXCEPT for MySQL, where you have to use
>> getColumnLabel() instead.
>>
>> I've abstracted the logic that does this in the main database classes
>> that underlie the framework, but for the JDBC Connector I tried to
>> make the connector be independent of any special logic, and instead
>> make it the responsibility of the query writer to know how to work
>> with their database.  Unfortunately, that strategy is failing when it
>> comes to MySQL because the JDBC driver is implemented in a way that is
>> inconsistent with the specification.
>>
>> So, we have two ways forward:
>>
>> (1) Change the logic to use getColumnLabel() always.  If we do this,
>> it will be necessary to test the JDBC connector against a PostgreSQL,
>> MySQL, and MSSQL database before we know what the effects are.  It is
>> possible everything will just work, but it is also possible that such
>> a change would break other people's jobs, and that would be no good.
>>
>> (2) Try to conditionalize the logic so that only for MySQL is
>> getColumnLabel() used.  This is less risky but results in messy code
>> that sooner or later would become unmaintainable.
>>
>> Karl
>>
>>
>> On Mon, Aug 20, 2012 at 6:22 AM, Karl Wright <daddywri@gmail.com> wrote:
>>> Hi Shigeki,
>>>
>>> This is critical functionality for ManifoldCF.  Quite a lot of
>>> ManifoldCF stuff won't work on MySQL if this is broken - not just
>>> crawling using the JDBC connector.  Are you successfully crawling with
>>> MySQL as the back-end?  If you are, that means that there is a way to
>>> do this right but the JDBC connector is not using it.
>>>
>>> I am testing with MySQL JDBC connector 5.1.18 here, which would
>>> indicate that that is the case.
>>>
>>> Could you open a ticket describing the problem, and I will look into
>>> this in some detail tonight?  Thanks,
>>> Karl
>>>
>>>
>>> On Mon, Aug 20, 2012 at 4:21 AM, Shigeki Kobayashi
>>> <shigeki.kobayashi3@g.softbank.co.jp> wrote:
>>>> Hi guys.
>>>>
>>>>
>>>> I am not sure if everyone has already noticed this, but this is to share
an
>>>> experimental fact of using MySQL connectors to crawl MySQL data.
>>>>
>>>> Using "AS" in Select queries in SeedQuery and DataQuery causes error
>>>> depending on versions of MySQL connectors.
>>>>
>>>> Env:
>>>> - ManifoldCF0.5
>>>> - Solr3.6
>>>> - MySQL5.5
>>>>
>>>> Example:
>>>>
>>>>  SeedQuery´╝ÜSELECT idfield AS $(IDCOLUMN) FROM documenttable
>>>>
>>>> Error Message:
>>>>   Bad seed query; doesn't return $(IDCOLUMN) column. Try using quotes around
>>>> $(IDCOLUMN) variable, e.g. "$(IDCOLUMN)".
>>>>
>>>> Cause of Error:
>>>>  MySQL connecors of over version 5.1 seem to have a bug that causes error
>>>> when you use "AS" in Select to put an alias for a column.
>>>>
>>>> Versions of MySQL Connector:
>>>>  mysql-connector-java-5.0.8.jar  -> OK
>>>>  mysql-connector-java-5.1.18.jar -> No Good
>>>>  mysql-connector-java-5.1.21.jar -> No Good
>>>>
>>>> Exception:
>>>> Using function  (e.g. sysdate() as) or fixed strings  (e.g. " fixed string"
>>>> as) followed by "as" does not cause error.
>>>>
>>>> Regards,
>>>>
>>>> Shigeki

Mime
View raw message