db-derby-dev mailing list archives

From "Harshvardhan Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DERBY-6937) Load the IMDB data set in Derby, obtain and adapt Join order Benchmark queries for use in derby
Date Mon, 05 Jun 2017 17:20:04 GMT

     [ https://issues.apache.org/jira/browse/DERBY-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harshvardhan Gupta updated DERBY-6937:
--------------------------------------
    Attachment: derby_script.sql
                imdb.diff

Please find the attached files. 'derby_script.sql' contains the exact script I used to
set up the tables. 'imdb.diff' contains the changes to ImportReadData.java.

There are two major changes in ImportReadData:
1) Handling NULL values, as I discussed earlier.
2) Handling escape characters.

The errors I saw related to 2) are also discussed here:
http://apache-database.10148.n7.nabble.com/Data-found-after-the-stop-delimiter-td100312.html

I handled escape characters inside Derby only. Other solutions, such as pre-processing the
data externally, also exist, as in the case of handling NULL values.
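
For comparison, the external pre-processing route mentioned above could look roughly
like the sketch below. It is not part of imdb.diff. It assumes the CSV files escape
an embedded double quote as \" and a backslash as \\, and it rewrites them into the
doubled-quote form that Derby's import expects for a delimiter inside a delimited
field. The class and method names are hypothetical, and for simplicity it does not
track whether a character sits inside a quoted field.

```java
// Hypothetical helper, not taken from imdb.diff.
final class EscapePreprocessor {

    /**
     * Rewrites one CSV line: \" becomes "" and \\ becomes \.
     * Assumption: backslash is the escape character in the input file.
     * A backslash followed by any other character is kept unchanged.
     */
    static String toDoubledQuotes(String line) {
        StringBuilder out = new StringBuilder(line.length() + 8);
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length()) {
                char next = line.charAt(i + 1);
                if (next == '"') {
                    out.append("\"\""); // escaped quote -> doubled quote
                    i++;
                    continue;
                }
                if (next == '\\') {
                    out.append('\\');   // escaped backslash -> single backslash
                    i++;
                    continue;
                }
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

Each line of a data file would be passed through toDoubledQuotes before the file is
handed to Derby's import procedure.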

'schema_derby.sql' and 'schematext.sql' that came with the dataset are mostly the same, apart
from the fields whose data type is just 'character varying' with no maximum length specified.
For those columns, I have used the 'CLOB' data type in Derby, as it is semantically equivalent
to 'character varying' of undefined length in Postgres.
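
The column type mapping described above can be sketched as a small function. This is
only an illustration of the rule, not the way the attached schema was produced. The
class name is hypothetical, and mapping a length-bounded 'character varying(n)' to
'VARCHAR(n)' is my assumption about the columns that were left unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the type choice; not part of the attachments.
final class PgToDerbyType {

    // Matches "character varying" with an optional "(n)" length.
    private static final Pattern VARYING =
            Pattern.compile("(?i)character varying\\s*(?:\\((\\d+)\\))?");

    /** Returns the Derby column type to use for a Postgres column type. */
    static String map(String pgType) {
        String trimmed = pgType.trim();
        Matcher m = VARYING.matcher(trimmed);
        if (!m.matches()) {
            return trimmed; // other types are passed through unchanged
        }
        String length = m.group(1);
        // No length given: use CLOB. Length given: assume VARCHAR(n).
        return length == null ? "CLOB" : "VARCHAR(" + length + ")";
    }
}
```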

You should be able to apply the imdb.diff patch and import the data without any problems now.
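
As a rough sketch of the import step over JDBC, the code below calls Derby's
SYSCS_UTIL.SYSCS_IMPORT_TABLE system procedure for one table. The class name, the
comma column delimiter, the double-quote character delimiter, and the UTF-8 codeset
are assumptions about the data files, not details taken from derby_script.sql.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical wrapper; the delimiter and codeset values are assumptions.
final class ImdbImport {

    // Arguments: schema, table, file, column delimiter, character delimiter,
    // codeset, replace flag.
    static final String IMPORT_CALL =
            "CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE(?, ?, ?, ?, ?, ?, ?)";

    static void importTable(Connection conn, String table, String csvPath)
            throws SQLException {
        try (CallableStatement cs = conn.prepareCall(IMPORT_CALL)) {
            cs.setNull(1, Types.VARCHAR);         // NULL schema: use the current schema
            cs.setString(2, table.toUpperCase()); // unquoted identifiers are upper case in Derby
            cs.setString(3, csvPath);
            cs.setString(4, ",");                 // assumed column delimiter
            cs.setString(5, "\"");                // assumed character delimiter
            cs.setString(6, "UTF-8");             // assumed file encoding
            cs.setShort(7, (short) 0);            // 0: append to existing rows
            cs.execute();
        }
    }
}
```

With an open connection to the database, a call such as
importTable(conn, "title", "title.csv") would load one file. Both names here are
placeholders.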

Thanks,
Vardhan





> Load the IMDB data set in Derby, obtain and adapt Join order Benchmark queries for use in derby
> ------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-6937
>                 URL: https://issues.apache.org/jira/browse/DERBY-6937
>             Project: Derby
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Harshvardhan Gupta
>            Assignee: Harshvardhan Gupta
>            Priority: Minor
>         Attachments: derby_script.sql, imdb.diff, schema_derby.sql
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
