hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-563) UDF for parsing the URL
Date Tue, 16 Jun 2009 05:00:07 GMT

    [ https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719926#action_12719926 ]

Zheng Shao commented on HIVE-563:
---------------------------------

Agree with Raghu. The String comparisons are still OK (I think moving to a static
hashmap would definitely help, but it's optional). Calling String.split(), however, is a
big performance hit. This is part of the reason scripting languages tend to be slower:
people like to use String.split() in those languages.

Can we cache "partToExtract" from the last call, and avoid calling String.split() again if
"partToExtract" didn't change (which is the normal case)?
Can we loop through the query string instead of calling String.split()?
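
To make the two suggestions concrete, here is a rough sketch of what they might look like.
It is not the code from patch_563.txt. The class QueryParamScanner, the method extract(),
and the idea that the cached value is a "key=" prefix derived from the requested key are
all assumptions made for illustration.

```java
// Hypothetical helper for illustration only; not taken from the HIVE-563 patch.
final class QueryParamScanner {
    // State cached from the previous call, reused while the key stays the same.
    private String lastKey;
    private String lastPrefix;

    /** Returns the value of key in a query such as "k1=v1&k2=v2", or null if absent. */
    String extract(String query, String key) {
        if (query == null || key == null) {
            return null;
        }
        // Rebuild the derived prefix only when the key changed since the last call.
        if (!key.equals(lastKey)) {
            lastKey = key;
            lastPrefix = key + "=";
        }
        int len = query.length();
        int start = 0;
        // Walk the query one "name=value" segment at a time; no array is allocated.
        while (start < len) {
            int end = query.indexOf('&', start);
            if (end < 0) {
                end = len;
            }
            // The prefix must match at the segment start and fit inside the segment.
            if (query.startsWith(lastPrefix, start)
                    && start + lastPrefix.length() <= end) {
                return query.substring(start + lastPrefix.length(), end);
            }
            start = end + 1;
        }
        return null;
    }
}
```

One instance would be reused across rows, so the prefix is built once as long as the key
stays the same, and the only per-row allocation is the returned substring.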


> UDF for parsing the URL
> -----------------------
>
>                 Key: HIVE-563
>                 URL: https://issues.apache.org/jira/browse/HIVE-563
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Server Infrastructure
>            Reporter: Suresh Antony
>            Assignee: Suresh Antony
>         Attachments: patch_563.txt, patch_563.txt.1
>
>
> Need a UDF to extract the parts of a URL from a URL string.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

