drill-issues mailing list archives

From "ganesh semalty (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-3946) How to run query with column name containing dot (.)
Date Sun, 18 Oct 2015 03:05:05 GMT

     [ https://issues.apache.org/jira/browse/DRILL-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ganesh semalty updated DRILL-3946:
----------------------------------
    Description: 
Hi,

I store raw data in Hadoop and use Apache Drill to query it. The data file records look
like this:

! TICKET_NBR 1 ! GSI 81 ! 3100.2.11.2 1131 ! 3100.2.112.1 14/05/2014 22:45:59 ! 3100.2.22.3 16 ! 3100.2.10.1 4

where 3100.2.11.2 = 1131,
3100.2.112.1 = 14/05/2014 22:45:59,
and so on; that is, each ! mark separates one key-value pair.
If I take the raw file (after replacing each ! with a comma), put it in Hadoop, and view the
records in Drill Explorer, they appear as follows:

TICKET_NBR 1 <---> GSI 81 <---> 3100.2.11.2 1131 <--->

(I am using <---> to show the column boundaries here.) I do not know in which column
"3100.2.11.2" will appear in another row; it can be any column.
QUERY 1: Is it possible to search every column of the table for a matching string?

Even if I format the data so that the header contains "3100.2.11.2" and so on, with each
row's data in the columns below it, I am unable to query it through SQL, because the column
name contains a dot (.):

select "3100.2.11.2" from TABLE where "3100.2.121.1" = ABC;

The query fails even if I surround the column name with square brackets "[" "]".
I converted my .csv file to .json, but Drill Explorer still represents the data in the same
manner, so the query still doesn't work.

QUERY 2: I do not understand what is meant by the claim that Apache Drill can work on
unstructured data as well. Is it necessary to format the data before querying it?

Even when I replaced 3100.2.22.8 with 3100_2_22_8 and so on, I still get an error:

0: jdbc:drill:zk=local> select 3100_2_22_8 from hadoop.`/user/hduser/samplecdl_odo_1.json`;
Error: DATA_READ ERROR: Error parsing JSON - Cannot read from the middle of a record. Current token was START_ARRAY

File  /user/hduser/samplecdl_odo_1.json
Record  1
Fragment 0:0
[Error Id: a2bf3e90-1052-43c3-9a3a-fcff74666591 on ubuntu:31010] (state=,code=0)
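
A minimal sketch of the kind of pre-formatting asked about in QUERY 2, assuming each record is a
single line of !-separated "key value" pairs as in the sample above. The function names
parse_record and to_json_lines are hypothetical. It is also an assumption, not confirmed in this
report, that Drill can read a file containing one JSON object per line and that the START_ARRAY
error comes from a top-level array in the converted file. Dots in keys are replaced with
underscores, as attempted above.

```python
import json


def parse_record(line):
    """Split one '!'-delimited record into a {key: value} dict.

    The first space in each chunk separates the key from its value, so
    values such as '14/05/2014 22:45:59' keep their internal spaces.
    """
    fields = {}
    for chunk in line.split("!"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece before the leading '!'
        key, _, value = chunk.partition(" ")
        # Replace dots so the key contains no '.' (assumed workaround).
        fields[key.replace(".", "_")] = value.strip()
    return fields


def to_json_lines(lines):
    """Render records as one JSON object per line (no enclosing array)."""
    return "\n".join(json.dumps(parse_record(l)) for l in lines if l.strip())


sample = (
    "! TICKET_NBR 1 ! GSI 81 ! 3100.2.11.2 1131 "
    "! 3100.2.112.1 14/05/2014 22:45:59 ! 3100.2.22.3 16 ! 3100.2.10.1 4"
)

if __name__ == "__main__":
    print(to_json_lines([sample]))
```

Because every record becomes a key-to-value mapping, a key such as 3100_2_11_2 can be looked up by
name regardless of its position in the original line, which is the situation described in QUERY 1.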


> How to run query with column name containing dot (.)
> ----------------------------------------------------
>
>                 Key: DRILL-3946
>                 URL: https://issues.apache.org/jira/browse/DRILL-3946
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.1.0
>         Environment: Apache Drill 1.1.0
> Ubuntu
>            Reporter: ganesh semalty
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
