hadoop-pig-dev mailing list archives

From "Daniel Dai (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-501) Make branches/types work under cygwin
Date Fri, 17 Oct 2008 02:36:44 GMT

     [ https://issues.apache.org/jira/browse/PIG-501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-501:

    Attachment: javacc.jar

We've already made all unit tests pass under cygwin for trunk (see [PIG-243|https://issues.apache.org/jira/browse/PIG-243]).
We need to do the same for branches/types. Here are the problems I noticed and fixed for branches/types
under cygwin.

1. We cannot compile under cygwin. javacc fails in branches/types with the error message "Invalid
Escape character". We need to update javacc to the latest snapshot; the Javacc 4.1 (latest) release
does not solve the problem. This issue is described in https://javacc.dev.java.net/issues/show_bug.cgi?id=135.
We did not hit it before because only in branches/types do we let javacc generate output
to the directory "xxx\util", which contains "\u" and triggers the problem. So far I have not noticed
any problem with this latest javacc snapshot. One difference I noticed is that it sometimes generates
a slightly different message upon error, and to me the new message is more informative.
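
To illustrate why a directory name such as "xxx\util" is a problem: in Java source text, a backslash followed by "u" must be followed by four hex digits, otherwise it is rejected as a malformed Unicode escape. The sketch below assumes the javacc failure comes from the output path ending up in text that is read under that rule. The class and method names are hypothetical and are not part of javacc or Pig.

```java
final class UnicodeEscapeCheck {
    /**
     * Returns true if the text contains a backslash-u sequence that Java's
     * lexer would reject: an odd-length run of backslashes, then one or more
     * 'u' characters, not followed by four hex digits.
     */
    static boolean hasInvalidUnicodeEscape(String text) {
        int backslashes = 0; // length of the backslash run just before position i
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\') {
                backslashes++;
                continue;
            }
            if (c == 'u' && backslashes % 2 == 1) {
                int j = i;
                while (j < text.length() && text.charAt(j) == 'u') {
                    j++; // repeated 'u' markers are allowed
                }
                if (j + 4 > text.length()) {
                    return true; // not enough room for four hex digits
                }
                for (int k = j; k < j + 4; k++) {
                    if (Character.digit(text.charAt(k), 16) < 0) {
                        return true; // non-hex character inside the escape
                    }
                }
            }
            backslashes = 0;
        }
        return false;
    }
}
```

With this check, a path like xxx\util is flagged, while xxx\parser is not, which matches the observation that the problem only appeared once output went to a directory starting with "u".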

2. The trailing '\r' in input txt files causes problems, as before. I modified PigStorage.java
and TextLoader.java to deal with it. However, in branches/types PigStorage is no longer line
based, which makes the situation much harder. To handle it properly we would need to be able
to read ahead: if we see a '\n' ahead, we can discard the current '\r'. But the current code does not
look ahead, and the changes would be relatively big. So here I simply remove the trailing '\r' from
each field under cygwin. There is a semantic difference, but I think the chance that a trailing
'\r' carries meaningful information is low. Correct me if I am wrong.
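
A minimal sketch of the per-field approach, for discussion. It is not the code in the attached patch; the class name, method names, and the os.name check are assumptions made for illustration only.

```java
import java.util.Arrays;

final class TrailingCr {
    /** Assumption: detect a Windows host (and therefore cygwin) via os.name. */
    static boolean isWindows() {
        return System.getProperty("os.name", "").startsWith("Windows");
    }

    /**
     * Drops a single trailing carriage return from a field when stripping is
     * enabled. A carriage return in the middle of the field is left alone.
     */
    static byte[] stripTrailingCr(byte[] field, boolean enabled) {
        int n = field.length;
        if (enabled && n > 0 && field[n - 1] == '\r') {
            return Arrays.copyOf(field, n - 1); // copy without the last byte
        }
        return field; // nothing to strip
    }
}
```

A loader could call stripTrailingCr(field, TrailingCr.isWindows()) on each field. This also shows the semantic difference mentioned above: a field that legitimately ends in '\r' loses that byte.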

3. Most test errors under cygwin are caused by wrong path names, just like before. cygwin only
takes input/output file names in windows form, such as c:\ \ xxxx, and if we need to use
a complete scheme, we use "file:/c:\ \ xxxx". Notice there is an extra "/" between "file:" and the
file path, which does not exist in unix. Many test cases do not account for that and fail under cygwin.
Also, we need to use "\ \", not "\", as the separator, as described in [PIG-243|https://issues.apache.org/jira/browse/PIG-243].
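
As a rough sketch of the path form described in item 3, a test helper could build the name like this. It assumes that "\ \" above stands for a doubled backslash; the class and method names are hypothetical and do not exist in Pig.

```java
final class CygwinPaths {
    /** Doubles every backslash so the separator survives one level of unescaping. */
    static String escapeSeparators(String windowsPath) {
        return windowsPath.replace("\\", "\\\\");
    }

    /** Prefixes the escaped windows path with "file:/" (note the extra slash). */
    static String toFileName(String windowsPath) {
        return "file:/" + escapeSeparators(windowsPath);
    }
}
```

For example, the windows path c:\tmp\in.txt becomes file:/c:\\tmp\\in.txt under these assumptions.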

> Make branches/types work under cygwin
> -------------------------------------
>                 Key: PIG-501
>                 URL: https://issues.apache.org/jira/browse/PIG-501
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>         Environment: cygwin
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: types_branch
>         Attachments: javacc.jar, PIG_cygwin.Patch
> We've already made all unit tests pass under cygwin for trunk (see [PIG-243|https://issues.apache.org/jira/browse/PIG-243]).
> We need to do the same for branches/types.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
