pig-dev mailing list archives

From "Ankur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1229) allow pig to write output into a JDBC db
Date Tue, 06 Apr 2010 10:35:34 GMT

    [ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853843#action_12853843
] 

Ankur commented on PIG-1229:
----------------------------

So accepting the JDBC URL in setStoreLocation() exposes a flaw in Hadoop's Path class,
and it causes the test case to fail with the following exception:

java.net.URISyntaxException: Relative path in absolute URI: jdbc:hsqldb:file:/tmp/batchtest;hsqldb.default_table_type=cached;hsqldb.cache_rows=100
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: jdbc:hsqldb:file:/tmp/batchtest;hsqldb.default_table_type=cached;hsqldb.cache_rows=100
        at org.apache.hadoop.fs.Path.initialize(Path.java:140)
        at org.apache.hadoop.fs.Path.<init>(Path.java:126)
        at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:238)
        at org.apache.pig.StoreFunc.relToAbsPathForStoreLocation(StoreFunc.java:60)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.StoreClause(QueryParser.java:3587)
...
...
Caused by: java.net.URISyntaxException: Relative path in absolute URI: jdbc:hsqldb:file:/tmp/batchtest;hsqldb.default_table_type=cached;hsqldb.cache_rows=100
        at java.net.URI.checkPath(URI.java:1787)
        at java.net.URI.<init>(URI.java:735)
        at org.apache.hadoop.fs.Path.initialize(Path.java:137)

Looking at the code of Path.java, it seems to extract the scheme based on the first occurrence
of ':'. This causes the authority and path to be extracted incorrectly, resulting in the above
exception being thrown by java.net.URI.
However, if I try to initialize a URI directly with the URL string, no exception is thrown.
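
The difference can be reproduced with java.net.URI alone. The sketch below is a simplified stand-in, not the actual Path.java code: the class JdbcUriDemo and its helpers are hypothetical names, and the split on the first ':' is an assumption based on the description above and the stack trace. It parses the URL once as a single string and once after splitting it into scheme and path.

```java
import java.net.URI;
import java.net.URISyntaxException;

class JdbcUriDemo {
    // The JDBC URL from the failing test case.
    static final String JDBC_URL =
        "jdbc:hsqldb:file:/tmp/batchtest;hsqldb.default_table_type=cached;hsqldb.cache_rows=100";

    // Parse the whole string at once. java.net.URI accepts it as an opaque
    // URI: scheme "jdbc", with everything else as the scheme-specific part.
    static URI parseWhole(String s) {
        try {
            return new URI(s);
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assumed simplification of the split described above: text before the
    // first ':' becomes the scheme, the remainder becomes the path.
    // Returns the failure reason, or null if the URI was built successfully.
    static String splitAndBuild(String s) {
        int colon = s.indexOf(':');
        String scheme = s.substring(0, colon);
        String path = s.substring(colon + 1); // "hsqldb:file:/tmp/..." has no leading '/'
        try {
            new URI(scheme, null, path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }

    public static void main(String[] args) {
        URI whole = parseWhole(JDBC_URL);
        System.out.println("opaque: " + whole.isOpaque() + ", scheme: " + whole.getScheme());
        System.out.println("split failure: " + splitAndBuild(JDBC_URL));
    }
}
```

The multi-argument URI constructor rejects a path without a leading '/' whenever a scheme is given, which matches the "Relative path in absolute URI" message in the trace.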

As for the DB reachability check, I think it is OK to check availability at runtime and
fail if the DB is not available. We do this in prepareToWrite().
As for the performance enhancement, I think we can track that via a separate issue.

This patch has taken quite a while now, and I wouldn't want to delay it further by depending
on a Hadoop fix.

So if a reviewer does not find any blocking issues, my suggestion is to go ahead with
the commit.

> allow pig to write output into a JDBC db
> ----------------------------------------
>
>                 Key: PIG-1229
>                 URL: https://issues.apache.org/jira/browse/PIG-1229
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Ian Holsman
>            Assignee: Ankur
>            Priority: Minor
>             Fix For: 0.8.0
>
>         Attachments: jira-1229-v2.patch
>
>
> UDF to store data into a DB

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

