hadoop-common-user mailing list archives

From Ravikant Dindokar <ravikant.i...@gmail.com>
Subject Re: how to assign unique ID (Long Value) in mapper
Date Tue, 30 Jun 2015 16:01:09 GMT
Thanks Gabriel.


On Tue, Jun 30, 2015 at 1:04 AM, gabriel balan <gabriel.balan@oracle.com>
wrote:

>  Hi
>
> Rather than trying to figure out the line number of the current line, you
> can use the byte offset of the current line.
> It's just as unique as the line number, and much easier to obtain:
> TextInputFormat (FileInputFormat) uses it as the key.
>
> Keys are the position in the file, and values are the line of text.
>
> If you have multiple files, you may want to combine the file offset with
> the file name (path) to get a unique id. See how to get the input file
> name in the mapper.
>
> hth
> Gabriel Balan
>
>
> On 6/26/2015 5:29 AM, Ravikant Dindokar wrote:
>
> The problem can be thought of as assigning a line number to each line. Is
> there any built-in functionality in Hadoop which can do this?
>
> On Fri, Jun 26, 2015 at 1:11 PM, Ravikant Dindokar <
> ravikant.iisc@gmail.com> wrote:
>
>> Yes, there can be loops in the graph.
>>
>> On Fri, Jun 26, 2015 at 9:09 AM, Harshit Mathur <mathursharp@gmail.com>
>> wrote:
>>
>>> Are there loops in your graph?
>>>
>>>
>>> On Thu, Jun 25, 2015 at 10:39 PM, Ravikant Dindokar <
>>> ravikant.iisc@gmail.com> wrote:
>>>
>>>>    Hi Hadoop user,
>>>>
>>>>  I have a file containing one line for each edge in the graph with two
>>>> vertex ids (source & sink).
>>>> sample:
>>>> 1    2 (here 1 is source and 2 is sink node for the edge)
>>>> 1    5
>>>> 2    3
>>>> 4    2
>>>> 4    3
>>>>  I want to assign a unique ID (a Long value) to each edge, i.e. to each
>>>> line of the file.
>>>>
>>>>  How can I ensure that unique values are assigned across distributed
>>>> mapper processes?
>>>>
>>>>  Note: The file is large, so using only one reducer is not feasible.
>>>>
>>>>  Thanks
>>>>  Ravikant
>>>>
>>>
>>>
>>>
>>>  --
>>> Harshit Mathur
>>>
>>
>>
>
> --
> The statements and opinions expressed here are my own and do not necessarily
> represent those of Oracle Corporation.
>
>
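
A minimal sketch of the byte-offset suggestion quoted above, written in plain Java so it runs without a cluster. It packs a small per-file index and the line's byte offset into one non-negative long. The class name EdgeIds, the 23/40-bit split, and the notion of a "file index" are assumptions made for illustration and do not come from the thread. In a real mapper using TextInputFormat, the offset would arrive as the LongWritable key and the file path could be read from the input split (for example via FileSplit.getPath()). How each path is mapped to a stable index, such as its position in the sorted list of input paths, is left open here and must be the same in every mapper.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: combines (file index, byte offset) into a unique long.
final class EdgeIds {
    // Assumption: 40 bits for the offset (files up to 1 TiB),
    // 23 bits for the file index, sign bit left at zero.
    static final int OFFSET_BITS = 40;
    static final long MAX_OFFSET = (1L << OFFSET_BITS) - 1;
    static final int MAX_FILE_INDEX = (1 << (63 - OFFSET_BITS)) - 1;

    // Packs the file index into the high bits and the offset into the low bits.
    static long pack(int fileIndex, long byteOffset) {
        if (fileIndex < 0 || fileIndex > MAX_FILE_INDEX) {
            throw new IllegalArgumentException("file index out of range: " + fileIndex);
        }
        if (byteOffset < 0 || byteOffset > MAX_OFFSET) {
            throw new IllegalArgumentException("byte offset out of range: " + byteOffset);
        }
        return ((long) fileIndex << OFFSET_BITS) | byteOffset;
    }

    static int fileIndexOf(long id) {
        return (int) (id >>> OFFSET_BITS);
    }

    static long offsetOf(long id) {
        return id & MAX_OFFSET;
    }

    // Local stand-in for the keys a line-oriented input format would supply:
    // the byte offset at which each line starts.
    static List<Long> lineOffsets(String content) {
        List<Long> offsets = new ArrayList<>();
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        long start = 0;
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == '\n') {
                offsets.add(start);
                start = i + 1;
            }
        }
        if (start < bytes.length) {
            offsets.add(start); // last line without a trailing newline
        }
        return offsets;
    }

    public static void main(String[] args) {
        // The sample edge list from the original question, tab-separated.
        String sample = "1\t2\n1\t5\n2\t3\n4\t2\n4\t3\n";
        for (long offset : lineOffsets(sample)) {
            System.out.println(pack(1, offset));
        }
    }
}
```

Because two lines of the same file never start at the same byte, and two files never share an index, the packed values are distinct without any coordination between mappers. The IDs are not consecutive, which matters only if the downstream job needs dense numbering.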
