htrace-dev mailing list archives

From Chunxu Tang <chunxut...@gmail.com>
Subject Re: Trace HBase/HDFS with HTrace
Date Thu, 12 Feb 2015 21:23:42 GMT
Hi all,

Thanks for your detailed replies!

I have now tested end-to-end tracing with two versions of HBase (0.98.10 and
0.99.2), each combined with Hadoop 2.6.0 and htrace-master (3.0.4), and both
attempts failed. HBase 0.98.10 actually ships with the htrace 2.0.4 core, so it
is expected that I get no traces there. HBase 0.99.2, however, does have the
htrace 3.0.4 core, yet I still cannot get traces of HDFS; I can only get traces
of HBase.

I think the first thing I need to confirm is that I am using a correct method
to implement the end-to-end test. I'm not sure whether it's appropriate to post
the whole source code on the mailing list, so here are just the core chunks of
my client code:

public void run() {
        Configuration conf = HBaseConfiguration.create();
        // Register the span receivers for both HBase and HDFS.
        org.apache.hadoop.hbase.trace.SpanReceiverHost.getInstance(conf);
        org.apache.hadoop.tracing.SpanReceiverHost.getInstance(new HdfsConfiguration());

        // Start an always-sampled span that should cover the whole Get request.
        TraceScope ts = Trace.startSpan("Gets", Sampler.ALWAYS);
        HTable table = new HTable(conf, "t1");
        Get get = new Get(Bytes.toBytes("r1"));
        table.get(get);
        ...
}
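
As far as I understand, the scope has to be closed for the span to be delivered
to the receiver, so the method ends with something along these lines (a sketch,
not my exact code):

        // Closing the scope is what reports the "Gets" span to the receiver.
        ts.close();
        table.close();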

Right now I only get traces of HBase, ending at the HFileReaderV2.readBlock()
function. Is my testing method correct? And since I'm not familiar with the new
version of HTrace, or with HBase/HDFS built against the new htrace core, could
you give me some suggestions for narrowing down where the problem may occur?
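
In case the problem is in my configuration rather than the code, these are
roughly the span receiver settings I am using (I am not fully sure the property
names are right for these versions, which may itself be the problem). In
hbase-site.xml:

<property>
  <name>hbase.trace.spanreceiver.classes</name>
  <value>org.htrace.impl.LocalFileSpanReceiver</value>
</property>

And in hdfs-site.xml:

<property>
  <name>hadoop.htrace.spanreceiver.classes</name>
  <value>org.htrace.impl.LocalFileSpanReceiver</value>
</property>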

Thank you all.

Joshua

2015-02-11 22:08 GMT-05:00 Colin P. McCabe <cmccabe@apache.org>:

> No, I think I'm the one who's missing something. :)
>
> I will give that a try next time I'm testing out end-to-end tracing.
>
> thanks guys.
> Colin
>
> On Wed, Feb 11, 2015 at 4:36 PM, Enis Söztutar <enis.soz@gmail.com> wrote:
> > mvn install just installs it into the local cache, which you can then use
> > for building other projects, so there's no need to define a file-based
> > local repo. Am I missing something?
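> >
> > For example, something like this (untested; version just illustrative):
> >
> >   cd hadoop && mvn clean install -DskipTests
> >   cd ../hbase && mvn clean install -DskipTests -Dhadoop.version=2.7.0-SNAPSHOT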
> >
> > Enis
> >
> > On Wed, Feb 11, 2015 at 12:36 PM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
> >
> >> Oh, I see. I was assuming a local build of a Hadoop snapshot installed
> >> into the local cache.
> >>
> >> On Wednesday, February 11, 2015, Colin P. McCabe <cmccabe@apache.org> wrote:
> >>
> >> > On Wed, Feb 11, 2015 at 11:27 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
> >> > > I don't recall the hadoop release repo restriction being a problem,
> >> > > but I haven't tested it lately. See if you can just specify the
> >> > > release version with -Dhadoop.version or -Dhadoop-two.version.
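> >> > >
> >> > > For instance (illustrative only):
> >> > >
> >> > >   mvn clean install -DskipTests -Dhadoop-two.version=2.6.0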
> >> > >
> >> >
> >> > Sorry, it's been a while since I did this... I guess the question is
> >> > whether 2.7.0-SNAPSHOT is available in Maven-land somewhere?  If so,
> >> > then Chunxu should forget all that stuff I said, and just build HBase
> >> > with -Dhadoop.version=2.7.0-SNAPSHOT
> >> >
> >> > > I would go against branch-1.0, as this will be the imminent 1.0.0
> >> > > release and has HTrace 3.1.0-incubating.
> >> >
> >> > Thanks.
> >> >
> >> > Colin
> >> >
> >> >
> >> > >
> >> > > -n
> >> > >
> >> > > On Wed, Feb 11, 2015 at 11:13 AM, Colin P. McCabe <cmccabe@apache.org> wrote:
> >> > >
> >> > >> Thanks for trying stuff out!  Sorry that this is a little difficult
> >> > >> at the moment.
> >> > >>
> >> > >> To really do this right, you would want to be using Hadoop with
> >> > >> HTrace 3.1.0, and HBase with HTrace 3.1.0.  Unfortunately, there
> >> > >> hasn't been a new release of Hadoop with HTrace 3.1.0.  The only
> >> > >> existing releases of Hadoop use an older version of the HTrace
> >> > >> library.  So you will have to build from source.
> >> > >>
> >> > >> If you check out Hadoop's "branch-2" branch (currently, this branch
> >> > >> represents what will be in the 2.7 release, when it is cut), and
> >> > >> build that, you will get the latest.  Then you have to build a
> >> > >> version of HBase against the version of Hadoop you have built.
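> >> > >>
> >> > >> Roughly (the exact mirror URL may differ for you):
> >> > >>
> >> > >>   git clone https://git-wip-us.apache.org/repos/asf/hadoop.git
> >> > >>   cd hadoop
> >> > >>   git checkout branch-2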
> >> > >>
> >> > >> By default, HBase's Maven build will build against upstream release
> >> > >> versions of Hadoop only.  So just setting
> >> > >> -Dhadoop.version=2.7.0-SNAPSHOT is not enough, since it won't know
> >> > >> where to find the jars.  To get around this problem, you can create
> >> > >> your own local maven repo.  Here's how.
> >> > >>
> >> > >> In hadoop/pom.xml, add these lines to the distributionManagement
> >> > >> stanza:
> >> > >>
> >> > >> +    <repository>
> >> > >> +      <id>localdump</id>
> >> > >> +      <url>file:///home/cmccabe/localdump/releases</url>
> >> > >> +    </repository>
> >> > >> +    <snapshotRepository>
> >> > >> +      <id>localdump</id>
> >> > >> +      <url>file:///home/cmccabe/localdump/snapshots</url>
> >> > >> +    </snapshotRepository>
> >> > >>
> >> > >> Comment out the repositories that are already there.
> >> > >>
> >> > >> Now run mkdir /home/cmccabe/localdump.
> >> > >>
> >> > >> Then, in your hadoop tree, run mvn deploy -DskipTests.
> >> > >>
> >> > >> You should get a localdump directory that has files kind of like
> >> > >> this:
> >> > >>
> >> > >> ...
> >> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop
> >> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce
> >> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/maven-metadata.xml.md5
> >> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT
> >> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/maven-metadata.xml.md5
> >> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/hadoop-mapreduce-2.7.0-20121120.230341-1.pom.sha1
> >> > >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/maven-metadata.xml
> >> > >> ...
> >> > >>
> >> > >> Now, add the following lines to your HBase pom.xml:
> >> > >>
> >> > >>    <repositories>
> >> > >>      <repository>
> >> > >> +      <id>localdump</id>
> >> > >> +      <url>file:///home/cmccabe/localdump</url>
> >> > >> +      <name>Local Dump</name>
> >> > >> +      <snapshots>
> >> > >> +        <enabled>true</enabled>
> >> > >> +      </snapshots>
> >> > >> +      <releases>
> >> > >> +        <enabled>true</enabled>
> >> > >> +      </releases>
> >> > >> +    </repository>
> >> > >> +    <repository>
> >> > >>
> >> > >> This will allow you to run something like:
> >> > >> mvn test -Dtest=TestMiniClusterLoadSequential -PlocalTests
> >> > >> -DredirectTestOutputToFile=true -Dhadoop.profile=2.0
> >> > >> -Dhadoop.version=2.7.0-SNAPSHOT -Dcdh.hadoop.version=2.7.0-SNAPSHOT
> >> > >>
> >> > >> Once we do a new release of Hadoop with HTrace 3.1.0 this will get
> >> > >> a lot easier.
> >> > >>
> >> > >> Related: Does anyone know what the best git branch to build from for
> >> > >> HBase would be for this kind of testing?  I've been meaning to do
> >> > >> some end-to-end testing (it's been on my TODO for a while).
> >> > >>
> >> > >> best,
> >> > >> Colin
> >> > >>
> >> > >> On Wed, Feb 11, 2015 at 7:55 AM, Chunxu Tang <chunxutang@gmail.com> wrote:
> >> > >> > Hi all,
> >> > >> >
> >> > >> > I'm currently using HTrace to trace request-level data flows in
> >> > >> > HBase and HDFS. I have successfully traced each of HBase and HDFS
> >> > >> > with HTrace separately.
> >> > >> >
> >> > >> > After that, I combined HBase and HDFS together: I want to send
> >> > >> > just a PUT/GET request to HBase, but trace the whole data flow
> >> > >> > through both HBase and HDFS. My understanding is that when I send
> >> > >> > a request such as a Get to HBase, it will eventually read blocks
> >> > >> > from HDFS, so I should be able to construct a whole data-flow
> >> > >> > trace spanning HBase and HDFS. In fact, however, I can only get
> >> > >> > tracing data from HBase, with no data from HDFS.
> >> > >> >
> >> > >> > Could you give me any suggestions on how to trace the data flow
> >> > >> > through both HBase and HDFS? Does anyone have similar experience?
> >> > >> > Do I need to modify the source code, and if so, which part(s)
> >> > >> > should I touch? If I need to modify the code, I will try to
> >> > >> > create a patch for that.
> >> > >> >
> >> > >> > Thank you.
> >> > >> >
> >> > >> > My Configurations:
> >> > >> > Hadoop version: 2.6.0
> >> > >> > HBase version: 0.99.2
> >> > >> > HTrace version: htrace-master
> >> > >> > OS: Ubuntu 12.04
> >> > >> >
> >> > >> >
> >> > >> > Joshua
> >> > >>
> >> >
> >>
>
