hadoop-mapreduce-dev mailing list archives

From Pedro Costa <psdc1...@gmail.com>
Subject Re: Reduce output is strange
Date Tue, 03 Apr 2012 15:25:16 GMT
What I want to ask is:

- how do I read the values from sequence files that are block-compressed,
record-compressed, or uncompressed?

- how do I know whether a sequence file is block-compressed,
record-compressed, or uncompressed?

- how do I know whether a file is a sequence file or a text file?
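
For the last two questions, one option is to look at the first bytes of the file. The sketch below is only an illustration, not Hadoop's own reader. It assumes the version 6 header layout that is visible in the dump quoted further down: the magic bytes SEQ, a version byte, the key and value class names, then two one-byte flags for compression and block compression, followed by the codec class name when the file is compressed. The names read_vint, read_string and describe_header are made up for this example, and the variable-length integer decoding reflects my understanding of Hadoop's encoding rather than a checked specification.

```python
import struct


def read_vint(buf, pos):
    """Decode one Hadoop-style variable-length integer; return (value, next_pos)."""
    first = struct.unpack_from("b", buf, pos)[0]  # signed first byte
    pos += 1
    if first >= -112:  # small values are stored in the first byte itself
        return first, pos
    negative = first < -120
    extra = (-120 - first) if negative else (-112 - first)  # bytes that follow
    value = int.from_bytes(buf[pos:pos + extra], "big")
    return (~value if negative else value), pos + extra


def read_string(buf, pos):
    """Read a length-prefixed UTF-8 string; return (text, next_pos)."""
    length, pos = read_vint(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


def describe_header(buf):
    """Classify a file from its leading bytes (assumes header version 6)."""
    if buf[:3] != b"SEQ":
        # No magic bytes: not a sequence file, so possibly plain text.
        return {"sequence_file": False}
    version = buf[3]
    if version != 6:
        raise ValueError("this sketch only understands header version 6")
    key_class, pos = read_string(buf, 4)
    value_class, pos = read_string(buf, pos)
    compressed, block = bool(buf[pos]), bool(buf[pos + 1])
    pos += 2
    codec = None
    if compressed:
        codec, pos = read_string(buf, pos)
    if block:
        kind = "block"
    elif compressed:
        kind = "record"
    else:
        kind = "none"
    return {
        "sequence_file": True,
        "version": version,
        "key_class": key_class,
        "value_class": value_class,
        "compression": kind,
        "codec": codec,
    }


if __name__ == "__main__":
    name = b"org.apache.hadoop.io.Text"
    prefix = bytes([len(name)]) + name
    print(describe_header(b"SEQ\x06" + prefix + prefix + b"\x00\x00"))
    print(describe_header(b"apple\t3\n"))
```

In Java, the Hadoop class SequenceFile.Reader exposes the same flags through isCompressed() and isBlockCompressed(), and it decompresses keys and values transparently for all three layouts, so reading the values does not require different code per layout.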



On 3 April 2012 16:01, Pedro Costa <psdc1978@gmail.com> wrote:

> If I want to compare 2 sequence files to see if they are the same, how do
> I compare them?
>
>
>
> On 19 December 2011 14:43, Robert Evans <evans@yahoo-inc.com> wrote:
>
>> Oh, I forgot to say that some of the random characters really are random
>> characters.  Sequence files store a set of random characters as sync
>> points within the file.  This allows the file to be split easily, without
>> a high risk that the random sequence appears inside the data itself just by
>> chance.
>>
>> --Bobby Evans
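
The sync marker also bears on the question of comparing two sequence files: because each writer generates its own random marker, two files holding identical records will normally differ byte for byte. A minimal sketch of a record-level comparison follows. It assumes uncompressed version 6 files with no metadata and class names shorter than 128 bytes. The helpers build_image, logical_content and same_content are hypothetical names written for this illustration and are not part of Hadoop.

```python
import os
import struct

CLASS_NAME = b"org.apache.hadoop.io.Text"


def build_image(records, sync, sync_between=False):
    """Assemble the bytes of a tiny uncompressed sequence file (illustration only)."""
    name = bytes([len(CLASS_NAME)]) + CLASS_NAME
    out = b"SEQ\x06" + name + name + b"\x00\x00" + struct.pack(">i", 0) + sync
    for i, (key, value) in enumerate(records):
        if sync_between and i > 0:
            # A record length of -1 announces a sync marker; forced here for the demo.
            out += struct.pack(">i", -1) + sync
        out += struct.pack(">ii", len(key) + len(value), len(key)) + key + value
    return out


def logical_content(buf):
    """Return (key_class, value_class, [(key_bytes, value_bytes), ...])."""
    if buf[:4] != b"SEQ\x06":
        raise ValueError("not a version 6 sequence file")
    pos, names = 4, []
    for _ in range(2):  # key class name, then value class name
        n = buf[pos]  # assumes a one-byte length (< 128)
        names.append(buf[pos + 1:pos + 1 + n].decode("utf-8"))
        pos += 1 + n
    if buf[pos] or buf[pos + 1]:
        raise ValueError("compressed files are outside this sketch")
    (meta_count,) = struct.unpack_from(">i", buf, pos + 2)
    if meta_count:
        raise ValueError("metadata is outside this sketch")
    pos += 6
    sync = buf[pos:pos + 16]
    pos += 16
    records = []
    while pos < len(buf):
        (rec_len,) = struct.unpack_from(">i", buf, pos)
        pos += 4
        if rec_len == -1:  # sync escape: skip the 16-byte marker
            if buf[pos:pos + 16] != sync:
                raise ValueError("sync marker does not match the header")
            pos += 16
            continue
        (key_len,) = struct.unpack_from(">i", buf, pos)
        pos += 4
        records.append((buf[pos:pos + key_len], buf[pos + key_len:pos + rec_len]))
        pos += rec_len
    return names[0], names[1], records


def same_content(a, b):
    """Compare two file images by class names and records, ignoring sync markers."""
    return logical_content(a) == logical_content(b)


if __name__ == "__main__":
    recs = [(b"\x05apple", b"\x011"), (b"\x06banana", b"\x012")]
    first = build_image(recs, os.urandom(16))
    second = build_image(recs, os.urandom(16))
    print("bytes equal:", first == second)
    print("records equal:", same_content(first, second))
```

The comparison here is on the serialized key and value bytes, which is enough when both files use the same key and value classes. For compressed files the records would first have to be read through a real reader, such as Hadoop's SequenceFile.Reader.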
>>
>> On 12/19/11 7:51 AM, "Pedro Costa" <psdc1978@gmail.com> wrote:
>>
>> Hi,
>>
>> In Hadoop MapReduce, I ran the webdatascan example, and the reduce
>> output is in a SequenceFile. The result is shown here (
>> http://paste.lisp.org/display/126572). What is the trash (random
>> characters), like "u 265
>> 0000100 330 320 252 " \n # ; 374 5 211 V ' 340 376" in the output? Is the
>> output correct?
>>
>>
>> 0000000   S   E   Q 006 031   o   r   g   .   a   p   a   c   h   e   .
>> 0000020   h   a   d   o   o   p   .   i   o   .   T   e   x   t 031   o
>> 0000040   r   g   .   a   p   a   c   h   e   .   h   a   d   o   o   p
>> 0000060   .   i   o   .   T   e   x   t  \0  \0  \0  \0  \0  \0   u 265
>> 0000100 330 320 252   "  \n   #   ; 374   5 211   V   ' 340 376  \0  \0
>> 0000120  \0   X  \0  \0  \0     037   a   p   p   l   e       a   p   p
>> 0000140   l   e       b   a   n   a   n   a       a   p   p   l   e
>> 0000160   a   p   p   l   e       7   c   a   r   r   o   t       c   a
>> 0000200   r   r   o   t       c   a   r   r   o   t       c   a   r   r
>> 0000220   o   t       a   p   p   l   e       b   a   n   a   n   a
>> 0000240   c   a   r   r   o   t       b   a   n   a   n   a
>> 0000256
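
As a worked reading of the dump above, the sketch below retypes its 174 bytes and walks through them field by field. It suggests that the "trash" is the 16-byte sync marker, surrounded by binary length fields, and that the output is a well-formed uncompressed file with one Text key and one Text value. Some caveats: the bytes are retyped by hand from the od output, the trailing spaces at the ends of the dump lines are inferred from the length fields because they are not visible in the listing, and the names DUMP, read_text and decode are invented for this example.

```python
import struct

TEXT = b"\031org.apache.hadoop.io.Text"  # length byte 031 (25) + class name
SYNC = b"u\265\330\320\252\"\n#;\3745\211V'\340\376"  # the 16 "trash" bytes
KEY = b"apple apple banana apple apple "
VALUE = b"carrot carrot carrot carrot apple banana carrot banana "

DUMP = (
    b"SEQ\006" + TEXT + TEXT  # magic, version 6, key class, value class
    + b"\0\0"                 # not compressed, not block compressed
    + b"\0\0\0\0"             # zero metadata entries
    + SYNC                    # sync marker, starts at octal offset 76
    + b"\0\0\0X"              # record length: 0x58 = 88
    + b"\0\0\0 "              # key length: 0x20 = 32
    + b"\037" + KEY           # Text key: length byte 037 (31) + bytes
    + b"7" + VALUE            # Text value: length byte '7' (55) + bytes
)


def read_text(buf, pos):
    """Read a Text value whose length fits in one byte; return (str, next_pos)."""
    n = buf[pos]
    if n > 127:
        raise ValueError("multi-byte lengths are outside this sketch")
    return buf[pos + 1:pos + 1 + n].decode("utf-8"), pos + 1 + n


def decode(buf):
    """Split a one-record, uncompressed, version 6 file image into named fields."""
    fields = {"magic": buf[:3], "version": buf[3]}
    fields["key_class"], pos = read_text(buf, 4)
    fields["value_class"], pos = read_text(buf, pos)
    fields["compressed"] = bool(buf[pos])
    fields["block_compressed"] = bool(buf[pos + 1])
    (fields["metadata_entries"],) = struct.unpack_from(">i", buf, pos + 2)
    pos += 6
    fields["sync_offset"], fields["sync"] = pos, buf[pos:pos + 16]
    pos += 16
    fields["record_length"], fields["key_length"] = struct.unpack_from(">ii", buf, pos)
    pos += 8
    fields["key"], value_pos = read_text(buf, pos)
    fields["value"], fields["end"] = read_text(buf, value_pos)
    return fields


if __name__ == "__main__":
    for name, value in decode(DUMP).items():
        print(f"{name}: {value!r}")
```

The record length of 88 is the 32-byte serialized key plus the 56-byte serialized value, and the final offset matches the 0000256 that closes the dump.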
>>
>>
>> --
>> Thanks,
>>
>>
>
>
> --
> Best regards,
>
>


-- 
Best regards,
