hadoop-mapreduce-user mailing list archives

From Mahesh Balija <balijamahesh....@gmail.com>
Subject Re: discrepancy du in dfs are fs
Date Thu, 29 Nov 2012 15:31:04 GMT
Hi Andy,

       I am not sure, but first check which unit (bytes, KB, MB, etc.) MySQL
reports the table size in.
       If the units match, MySQL may be storing additional metadata, which
could account for the difference.

       Another possibility is that your HDFS data is compressed or stored as
sequence files.

Best,
Mahesh Balija,
Calsoft Labs.

On Thu, Nov 29, 2012 at 8:48 PM, Kartashov, Andy <Andy.Kartashov@mpac.ca> wrote:

> I also see a discrepancy when Sqoop'ing data from MySQL. Both MySQL
> "select count(*) from .." and sqoop eval --query "select count(*) .."
> return the same number of rows. But after importing the data into HDFS,
> hadoop fs -du shows the imported data at roughly 1/2 the size of the actual
> table in the MySQL DB. Is that normal?
>
> Cheers.
>
>
> -----Original Message-----
> From: "Christoph Böhm" [mailto:listenbruder@gmx.net]
> Sent: Wednesday, November 28, 2012 3:10 PM
> To: user@hadoop.apache.org
> Subject: Re: discrepancy du in dfs are fs
>
>
> You're right.
> "du -b" returns the expected value.
>
> Thanks.
> Chris
>
> -------- Original Message --------
> > Date: Wed, 28 Nov 2012 20:17:18 +0530
> > From: Mahesh Balija <balijamahesh.mca@gmail.com>
> > To: user@hadoop.apache.org
> > Subject: Re: discrepancy du in dfs are fs
>
> > Hi Chris,
> >
> >           Can you try the following on your local machine,
> >
> >                du -b myfile.txt
> >
> >           and compare the result with the output of hadoop fs -du myfile.txt.
> >
> > Best,
> > Mahesh Balija,
> > Calsoft Labs.
> >
> > On Wed, Nov 28, 2012 at 7:43 PM, <listenbruder@gmx.net> wrote:
> >
> > >
> > > Hi all,
> > >
> > > I wonder why there is a difference between "du" on HDFS and "get" + "du"
> > > on my local machine.
> > >
> > > Here is an example:
> > >
> > > hadoop fs -du myfile.txt
> > > > 81355258
> > >
> > > hadoop fs -get myfile.txt .
> > > du myfile.txt
> > > > 34919
> > >
> > > --- nevertheless ---
> > >
> > > hadoop fs -cat  myfile.txt | wc -l
> > > > 4789943
> > >
> > > cat myfile.txt | wc -l
> > > > 4789943
> > >
> > >
> > > Any idea?
> > > Thanks.
> > > Chris
> > >
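
The du question in this thread was resolved by du -b, and the difference
comes down to what each form of the command counts. The following is a
minimal sketch, assuming GNU coreutils: du -b reports the apparent size in
bytes, while plain du reports allocated disk space in blocks, so the two
numbers are not directly comparable. The file sample.txt and its 5000-byte
size are hypothetical, chosen only for illustration; no Hadoop installation
is needed to run it.

```shell
# Sketch only: sample.txt is a hypothetical file created for this demo.
workdir=$(mktemp -d)
file="$workdir/sample.txt"

# Write exactly 5000 bytes.
head -c 5000 /dev/zero > "$file"

# Apparent size in bytes (GNU du: -b means --apparent-size --block-size=1).
apparent=$(du -b "$file" | cut -f1)

# Default du: allocated disk space in blocks, not a byte count.
# The value depends on the filesystem and on the block-size settings.
blocks=$(du "$file" | cut -f1)

# Byte count obtained by reading the file, independent of du.
bytes=$(wc -c < "$file" | tr -d ' ')

echo "du -b : $apparent"
echo "du    : $blocks"
echo "wc -c : $bytes"

rm -rf "$workdir"
```

On a cluster, the du -b figure is the one to compare against the output of
hadoop fs -du myfile.txt, as confirmed earlier in the thread.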
