hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sunderlin, Mark" <mark.sunder...@teamaol.com>
Subject RE: Issues with Hive 'Like" and multiple '%' wildcard in a pattern
Date Thu, 13 Jun 2013 19:30:06 GMT
Hive users: It took me a while to create a standalone use case, but here we go:

In Oracle, this SQL returns false for both cases:

SELECT CASE
          WHEN '?&icid=mainmoiz%dl' LIKE ('%?icid=main%') THEN 'T'
          ELSE 'F'
       END
          AS case1,
       CASE
          WHEN '?&icid=mainmoiz%dl' LIKE ('%?icid=main%dl%') THEN 'T'
          ELSE 'F'
       END
          AS case2
  FROM DUAL
case1 case2
F	F

But in Hive 0.80, I get one False and one True:

SELECT CASE
          WHEN '?&icid=mainmoiz%dl' LIKE ('%?icid=main%') THEN 'T'
          ELSE 'F'
       END
          AS case1,
       CASE
          WHEN '?&icid=mainmoiz%dl' LIKE ('%?icid=main%dl%') THEN 'T'
          ELSE 'F'
       END
          AS case2
FROM DUAL - A dummy single row, single column table we installed
case1	case2
F	T



---
Mark E. Sunderlin
Solutions Architect |AOL NETWORKS BDM
P: 703-265-6935     |C: 540-327-6222 | AIM: MESunderlin
22000 AOL Way | Dulles, VA | 20166

From: Stephen Sprague [mailto:spragues@gmail.com] 
Sent: Wednesday, June 12, 2013 2:22 PM
To: user@hive.apache.org
Subject: Re: Issues with Hive 'Like" and multiple '%' wildcard in a pattern

Hi Mark,
i'm running v0.80 too and multiple '%'s work as expected for me. so. we're gonna need a see
a definitive test case from you.  
show your full string and show where the like clause fails to match. 
thanks,
Stephen.
PS here's my test:

hisql>select city from junk;
+------------------+
| city             |
+------------------+
| West Haven       |
| West Haven       |
| West Haven       |
| West Haven       |
| West Jordan      |
| West Chester     |
| Westminster      |
| West Chester     |
| West Chester     |
| Westlake         |
| West Chester     |
| West Chester     |
| West Chester     |
| Weston           |
| West Covina      |
| West Chester     |
| West Chester     |
| West Hollywood   |
| West Des Moines  |
| West Covina      |
| Westland         |
| West Valley City |
| West Valley City |
| West Valley City |
| West Valley City |
| Westminster      |
| West Memphis     |
| West Memphis     |
+------------------+
28 affected

hisql>select city from junk where city like '%West%v%';
+-------------+
| city        |
+-------------+
| West Haven  |
| West Haven  |
| West Haven  |
| West Haven  |
| West Covina |
| West Covina |
+-------------+
6 affected

hisql>select city from junk where city like '%West%v%a%';
+-------------+
| city        |
+-------------+
| West Covina |
| West Covina |
+-------------+
2 affected


On Wed, Jun 12, 2013 at 10:26 AM, Sunderlin, Mark <mark.sunderlin@teamaol.com> wrote:
We are using Hive 0.80.
 
---
Mark E. Sunderlin
Solutions Architect |AOL NETWORKS BDM
P: 703-265-6935     |C: 540-327-6222 | AIM: MESunderlin
22000 AOL Way | Dulles, VA | 20166
 
From: Stephen Sprague [mailto:spragues@gmail.com] 
Sent: Wednesday, June 12, 2013 1:00 PM
To: user@hive.apache.org
Subject: Re: Issues with Hive 'Like" and multiple '%' wildcard in a pattern
 
that seems pretty hard to believe. what version of hive are you using?
 
On Wed, Jun 12, 2013 at 6:27 AM, Sunderlin, Mark <mark.sunderlin@teamaol.com> wrote:
This seems to work just fine in other SQLs, but doesn't seem work in hive.

I need to have several wild card characters in my 'like' clause as follows.

In other SQLs, I want:  ....  where page_url_query like '%?icid=main%dl%'

But in Hive that doesn't match.  I have several work arounds.  I can write it as:

 where (page_url_query like '%?icid=main%')  and (page_url_query like '%dl%')

or, I can just use something like where length(regexp_extract(page__url_query, '.*\?cdid-main.*dl.*',0)
<> 0

But more important is two or more '%' wild cards in a Hive like expression a known issue?

---
Mark E. Sunderlin
Solutions Architect   |AOL NETWORKS BDM
P: 703-265-6935       |C: 540-327-6222 | AIM: MESunderlin
22000 AOL Way,  Dulles, VA  20166
 


Mime
View raw message