htrace-dev mailing list archives

From "Colin P. McCabe" <cmcc...@apache.org>
Subject Re: Trace HBase/HDFS with HTrace
Date Thu, 12 Feb 2015 03:08:25 GMT
No, I think I'm the one who's missing something. :)

I will give that a try next time I'm testing out end-to-end tracing.

thanks guys.
Colin

On Wed, Feb 11, 2015 at 4:36 PM, Enis Söztutar <enis.soz@gmail.com> wrote:
> mvn install just installs the artifacts into your local cache, which you can
> then use when building other projects. So there's no need to define a
> file-based local repo. Am I missing something?
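The `mvn install` route Enis describes can be sketched as follows (the checkout paths are hypothetical examples, not from the thread):

```shell
# Build Hadoop and install its artifacts into the local cache
# (~/.m2/repository), where other local builds can resolve them.
cd ~/src/hadoop        # hypothetical checkout path
mvn clean install -DskipTests

# HBase can now resolve the 2.7.0-SNAPSHOT artifacts from the local cache.
cd ~/src/hbase         # hypothetical checkout path
mvn clean package -DskipTests -Dhadoop.profile=2.0 \
    -Dhadoop.version=2.7.0-SNAPSHOT
```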
>
> Enis
>
> On Wed, Feb 11, 2015 at 12:36 PM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
>
>> Oh, I see. I was assuming a local build of Hadoop snapshot installed into
>> the local cache.
>>
>> On Wednesday, February 11, 2015, Colin P. McCabe <cmccabe@apache.org>
>> wrote:
>>
>> > On Wed, Feb 11, 2015 at 11:27 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
>> > > I don't recall the hadoop release repo restriction being a problem, but
>> > > I haven't tested it lately. See if you can just specify the release
>> > > version with -Dhadoop.version or -Dhadoop-two.version.
>> > >
>> >
>> > Sorry, it's been a while since I did this... I guess the question is
>> > whether 2.7.0-SNAPSHOT is available in Maven-land somewhere?  If so,
>> > then Chunxu should forget all that stuff I said and just build HBase
>> > with -Dhadoop.version=2.7.0-SNAPSHOT.
>> >
>> > > I would go against branch-1.0, as this will be the imminent 1.0.0
>> > > release and has HTrace 3.1.0-incubating.
>> >
>> > Thanks.
>> >
>> > Colin
>> >
>> >
>> > >
>> > > -n
>> > >
>> > > On Wed, Feb 11, 2015 at 11:13 AM, Colin P. McCabe <cmccabe@apache.org>
>> > > wrote:
>> > >
>> > >> Thanks for trying stuff out!  Sorry that this is a little difficult at
>> > >> the moment.
>> > >>
>> > >> To really do this right, you would want to be using Hadoop with HTrace
>> > >> 3.1.0, and HBase with HTrace 3.1.0.  Unfortunately, there hasn't been
>> > >> a new release of Hadoop with HTrace 3.1.0.  The only existing releases
>> > >> of Hadoop use an older version of the HTrace library.  So you will
>> > >> have to build from source.
>> > >>
>> > >> If you check out Hadoop's "branch-2" branch (currently, this branch
>> > >> represents what will be in the 2.7 release, when it is cut), and build
>> > >> that, you will get the latest.  Then you have to build a version of
>> > >> HBase against the version of Hadoop you have built.
>> > >>
>> > >> By default, HBase's Maven build will build against upstream release
>> > >> versions of Hadoop only. So just setting
>> > >> -Dhadoop.version=2.7.0-SNAPSHOT is not enough, since it won't know
>> > >> where to find the jars.  To get around this problem, you can create
>> > >> your own local maven repo. Here's how.
>> > >>
>> > >> In hadoop/pom.xml, add these lines to the distributionManagement
>> > >> stanza:
>> > >>
>> > >> +    <repository>
>> > >> +      <id>localdump</id>
>> > >> +      <url>file:///home/cmccabe/localdump/releases</url>
>> > >> +    </repository>
>> > >> +    <snapshotRepository>
>> > >> +      <id>localdump</id>
>> > >> +      <url>file:///home/cmccabe/localdump/snapshots</url>
>> > >> +    </snapshotRepository>
>> > >>
>> > >> Comment out the repositories that are already there.
>> > >>
>> > >> Now run mkdir /home/cmccabe/localdump.
>> > >>
>> > >> Then, in your hadoop tree, run mvn deploy -DskipTests.
>> > >>
>> > >> You should get a localdump directory that has files kind of like this:
>> > >>
>> > >> ...
>> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop
>> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce
>> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/maven-metadata.xml.md5
>> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT
>> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/maven-metadata.xml.md5
>> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/hadoop-mapreduce-2.7.0-20121120.230341-1.pom.sha1
>> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/maven-metadata.xml
>> > >> ...
>> > >>
>> > >> Now, add the following lines to your HBase pom.xml:
>> > >>
>> > >>    <repositories>
>> > >>      <repository>
>> > >> +      <id>localdump</id>
>> > >> +      <url>file:///home/cmccabe/localdump</url>
>> > >> +      <name>Local Dump</name>
>> > >> +      <snapshots>
>> > >> +        <enabled>true</enabled>
>> > >> +      </snapshots>
>> > >> +      <releases>
>> > >> +        <enabled>true</enabled>
>> > >> +      </releases>
>> > >> +    </repository>
>> > >> +    <repository>
>> > >>
>> > >> This will allow you to run something like:
>> > >> mvn test -Dtest=TestMiniClusterLoadSequential -PlocalTests
>> > >> -DredirectTestOutputToFile=true -Dhadoop.profile=2.0
>> > >> -Dhadoop.version=2.7.0-SNAPSHOT -Dcdh.hadoop.version=2.7.0-SNAPSHOT
>> > >>
>> > >> Once we do a new release of Hadoop with HTrace 3.1.0 this will get a
>> > >> lot easier.
>> > >>
>> > >> Related: does anyone know the best git branch of HBase to build from
>> > >> for this kind of testing?  I've been meaning to do some end-to-end
>> > >> testing (it's been on my TODO list for a while).
>> > >>
>> > >> best,
>> > >> Colin
>> > >>
>> > >> On Wed, Feb 11, 2015 at 7:55 AM, Chunxu Tang <chunxutang@gmail.com> wrote:
>> > >> > Hi all,
>> > >> >
>> > >> > Now I’m using HTrace to trace request-level data flows in HBase and
>> > >> > HDFS. I have successfully traced HBase and HDFS with HTrace,
>> > >> > respectively.
>> > >> >
>> > >> > After that, I combined HBase and HDFS, and I want to send just a
>> > >> > PUT/GET request to HBase but trace the whole data flow in both HBase
>> > >> > and HDFS. In my opinion, when I send a request such as Get to HBase,
>> > >> > it will eventually read the blocks on HDFS, so I should be able to
>> > >> > construct a whole data-flow trace through HBase and HDFS. However, I
>> > >> > can only get tracing data for HBase, with no data for HDFS.
>> > >> >
>> > >> > Could you give me any suggestions on how to trace the data flow in
>> > >> > both HBase and HDFS? Does anyone have similar experience? Do I need
>> > >> > to modify the source code? And which part(s) should I touch? If I
>> > >> > need to modify the code, I will try to create a patch for that.
>> > >> >
>> > >> > Thank you.
>> > >> >
>> > >> > My Configurations:
>> > >> > Hadoop version: 2.6.0
>> > >> > HBase version: 0.99.2
>> > >> > HTrace version: htrace-master
>> > >> > OS: Ubuntu 12.04
>> > >> >
>> > >> >
>> > >> > Joshua
>> > >>
>> >
>>
