apex-dev mailing list archives

From Munagala Ramanath <...@datatorrent.com>
Subject Re: [malhar-users] HDFS file read
Date Thu, 01 Oct 2015 15:50:28 GMT
A couple of questions:

(a) Why is there a backslash in the separator string? Are you trying to
split on the non-printable ASCII character with code 1?
(b) The first line does not contain the substring "001", so what does the
split() call return, and what did you expect?
(c) The second line does contain the substring "001", so again, what do you
expect for this line, and what are you getting?
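
To make the distinction behind these questions concrete, here is a minimal Java sketch. It is not code from the thread: the class name SeparatorCheck, its helper methods, and the sample rows are made up for illustration. It assumes the separator is either the single control character with code 1 (Ctrl-A) or the four visible characters backslash, 0, 0, 1, and it adds a small helper that prints character codes so you can see which one the file really contains.

```java
import java.util.regex.Pattern;

public class SeparatorCheck {

    // Case 1: the file contains the single control character with code 1 (Ctrl-A).
    // The Java literal "\001" is an octal escape for that one character.
    static String[] splitOnControlA(String line) {
        // A limit of -1 keeps trailing empty columns instead of dropping them.
        return line.split("\001", -1);
    }

    // Case 2: the file contains the four visible characters backslash, 0, 0, 1.
    // Pattern.quote stops the regex engine from treating the backslash as an escape.
    static String[] splitOnLiteralText(String line) {
        return line.split(Pattern.quote("\\001"), -1);
    }

    // Renders each character as a hex code so an invisible separator shows up.
    static String toHexCodes(String line) {
        StringBuilder sb = new StringBuilder();
        for (char c : line.toCharArray()) {
            sb.append(String.format("%04x ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Made-up rows, not the data from the thread.
        String withControlChar = "a" + (char) 1 + "b" + (char) 1 + "c";
        String withLiteralText = "a\\001b\\001c";

        System.out.println(splitOnControlA(withControlChar).length);
        System.out.println(splitOnLiteralText(withLiteralText).length);
        System.out.println(splitOnControlA(withLiteralText).length);
        System.out.println(toHexCodes(withControlChar));
    }
}
```

Running toHexCodes on the string returned by readLine() is a quick way to tell the two cases apart before deciding which split to use.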



On Thu, Oct 1, 2015 at 1:32 AM, <kalikrishna.pasumarti@gmail.com> wrote:

>
> Hi,
> Below you can find the row which we are trying to split.
>
> 1855003555798283MFRAPS1858-11-17F1302015-08-282015-08-28
> 18:29:44CHG9003REGCA201508P
>
> Thanks,
> krishna
>
> On Thursday, October 1, 2015 at 1:33:12 PM UTC+5:30, Ashwin Chandra Putta
> wrote:
>>
>> Krishna,
>>
>> Can you paste the line you are trying to split?
>>
>> Regards,
>> Ashwin.
>> On Oct 1, 2015 12:22 AM, <kalikrishn...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am using AbstractFileInputOperator to read an HDFS file. Once I get the
>>> first row, I try to split it on \001, but the separator is not
>>> recognized.
>>> The code is below for reference.
>>>
>>>     String temp = br.readLine();
>>>     if (temp != null) {
>>>         arr = temp.split("\001");
>>>     }
>>>
>>>
>>> Thanks,
>>>
>>> krishna
>>>
>>> On Wednesday, September 30, 2015 at 6:54:26 PM UTC+5:30, Tushar Gosavi
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Moving this thread to dev@apex.
>>>>
>>>> Which operator are you using for reading HDFS files? If you have
>>>> written your own
>>>> operator for parsing, then can you please check your parsing logic
>>>> separately and
>>>> make sure that it works before adding it into the operator.
>>>>
>>>> - Tushar.
>>>>
>>>>
>>>> On Wed, Sep 30, 2015 at 4:11 PM, <kalikrishn...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> My requirement is to read an HDFS file that uses "\001" as the separator.
>>>>> While developing the code in DataTorrent, it is unable to find the \001
>>>>> separator in the file. The row actually has 15 columns, but it is being
>>>>> read as a single column.
>>>>>
>>>>> Kindly suggest how to overcome this.
>>>>>
>>>>> Below you can find the sample data of HDFS file.
>>>>> 1855003555798283DTVDTV2015-08-07E2600077594.992015-08-282015-08-28
>>>>> 18:29:42CHG9003REGCA201508P
>>>>> 1910001924128448DTVDTV2013-02-07P21407.22015-08-282015-08-28
>>>>> 15:20:24CHG9002REGIL201508P
>>>>>
>>>>> There are 2 rows in total, and each row has 15 columns.
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Malhar" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to malhar-users...@googlegroups.com.
>>>>> To post to this group, send email to malhar...@googlegroups.com.
>>>>> Visit this group at http://groups.google.com/group/malhar-users.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> “I'd have blown my top, because I want to beat this damn thing,
>>>>  as long as I've gone this far. I can't just leave it after I've found
>>>>  out so much about it. I have to keep going to find out ultimately
>>>> what is the matter with it in the end.”
>>>>                 Richard P. Feynman
>>>>
>>
