tvm-commits mailing list archives

From: tqc...@apache.org
Subject: [incubator-tvm-site] branch asf-site updated: Build at Mon Mar 30 15:47:27 PDT 2020
Date: Mon, 30 Mar 2020 22:49:38 GMT
This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-tvm-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new c506653  Build at Mon Mar 30 15:47:27 PDT 2020
c506653 is described below

commit c506653d9088268817ed126e709cd084331b1516
Author: tqchen <tianqi.tchen@gmail.com>
AuthorDate: Mon Mar 30 15:47:27 2020 -0700

    Build at Mon Mar 30 15:47:27 PDT 2020
---
 2018/07/12/vta-release-announcement.html | 10 +++++-----
 2019/03/18/tvm-apache-announcement.html  |  2 +-
 atom.xml                                 | 14 +++++++-------
 rss.xml                                  | 16 ++++++++--------
 vta.html                                 |  4 ++--
 5 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/2018/07/12/vta-release-announcement.html b/2018/07/12/vta-release-announcement.html
index 9304549..08c2b6e 100644
--- a/2018/07/12/vta-release-announcement.html
+++ b/2018/07/12/vta-release-announcement.html
@@ -168,7 +168,7 @@
 
 <p>VTA is more than a standalone accelerator design: it’s an end-to-end solution
that includes drivers, a JIT runtime, and an optimizing compiler stack based on TVM. The current
release includes a behavioral hardware simulator, as well as the infrastructure to deploy
VTA on low-cost FPGA hardware for fast prototyping. By extending the TVM stack with a customizable,
and open source deep learning hardware accelerator design, we are exposing a transparent end-to-end
deep learning stack from [...]
 
-<p style="text-align: center"><img src="http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png"
alt="image" width="50%" /></p>
+<p style="text-align: center"><img src="https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png"
alt="image" width="50%" /></p>
 
 <p>The VTA and TVM stack together constitute a blueprint for end-to-end, accelerator-centric
deep learning system that can:</p>
 
@@ -223,7 +223,7 @@ The extendability of the compiler stack, combined with the ability to
modify the
 <p>The Vanilla Tensor Accelerator (VTA) is a generic deep learning accelerator built
around a GEMM core, which performs dense matrix multiplication at a high computational throughput.
 The design is inspired by mainstream deep learning accelerators, of the likes of Google’s
TPU accelerator. The design adopts decoupled access-execute to hide memory access latency
and maximize utilization of compute resources. To a broader extent, VTA can serve as a template
deep learning accelerator design, exposing a clean tensor computation abstraction to the compiler
stack.</p>
 
-<p style="text-align: center"><img src="http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png"
alt="image" width="60%" /></p>
+<p style="text-align: center"><img src="https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png"
alt="image" width="60%" /></p>
 
 <p>The figure above presents a high-level overview of the VTA hardware organization.
VTA is composed of four modules that communicate between each other via FIFO queues and single-writer/single-reader
SRAM memory blocks, to allow for task-level pipeline parallelism.
 The compute module performs both dense linear algebra computation with its GEMM core, and
general computation with its tensor ALU.
@@ -240,7 +240,7 @@ The first approach, which doesn’t require special hardware is to run
deep lear
 This simulator back-end is readily available for developers to experiment with.
 The second approach relies on an off-the-shelf and low-cost FPGA development board – the
<a href="http://www.pynq.io/">Pynq board</a>, which exposes a reconfigurable FPGA
fabric and an ARM SoC.</p>
 
-<p style="text-align: center"><img src="http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png"
alt="image" width="70%" /></p>
+<p style="text-align: center"><img src="https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png"
alt="image" width="70%" /></p>
 
 <p>The VTA release offers a simple compilation and deployment flow of the VTA hardware
design and TVM workloads on the Pynq platform, with the help of an RPC server interface.
 The RPC server handles FPGA reconfiguration tasks and TVM module invocation offloading onto
the VTA runtime.
@@ -263,7 +263,7 @@ While this platform is meant for prototyping (the 2012 FPGA cannot compete
with
 <p>A popular method used to assess the efficient use of hardware are roofline diagrams:
given a hardware design, how efficiently are different workloads utilizing the hardware compute
and memory resources. The roofline plot below shows the throughput achieved on different convolution
layers of the ResNet-18 inference benchmark. Each layer has a different arithmetic intensity,
i.e. compute to data movement ratio.
 In the left half, convolution layers are bandwidth limited, whereas on the right half, they
are compute limited.</p>
 
-<p style="text-align: center"><img src="http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png"
alt="image" width="60%" /></p>
+<p style="text-align: center"><img src="https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png"
alt="image" width="60%" /></p>
 
 <p>The goal behind designing a hardware architecture, and a compiler stack is to bring
each workload as close as possible to the roofline of the target hardware.
 The roofline plot shows the effects of having the hardware and compiler work together to
maximize utilization of the available hardware resources.
@@ -272,7 +272,7 @@ The result is an overall higher utilization of the available compute and
memory
 
 <h3 id="end-to-end-resnet-18-evaluation">End to end ResNet-18 evaluation</h3>
 
-<p style="text-align: center"><img src="http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png"
alt="image" width="60%" /></p>
+<p style="text-align: center"><img src="https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png"
alt="image" width="60%" /></p>
 
 <p>A benefit of having a complete compiler stack built for VTA is the ability to run
end-to-end workloads. This is compelling in the context of hardware acceleration because we
need to understand what performance bottlenecks, and Amdahl limitations stand in the way to
obtaining faster performance.
 The bar plot above shows inference performance with and without offloading the ResNet convolutional
layers to the FPGA-based VTA design, on the Pynq board’s ARM Cortex A9 SoC.
diff --git a/2019/03/18/tvm-apache-announcement.html b/2019/03/18/tvm-apache-announcement.html
index 98e350d..b154327 100644
--- a/2019/03/18/tvm-apache-announcement.html
+++ b/2019/03/18/tvm-apache-announcement.html
@@ -168,7 +168,7 @@
 
 <p style="text-align: center"><img src="/images/main/tvm-stack.png" alt="image"
width="70%" /></p>
 
-<p>TVM stack began as a research project at the <a href="https://sampl.cs.washington.edu/">SAMPL
group</a> of Paul G. Allen School of Computer Science &amp; Engineering, University
of Washington. The project uses the loop-level IR and several optimizations from the <a
href="http://halide-lang.org/">Halide project</a>, in addition to <a href="https://tvm.ai/about">a
full deep learning compiler stack</a> to support machine learning workloads for diverse
hardware backends.</p>
+<p>TVM stack began as a research project at the <a href="https://sampl.cs.washington.edu/">SAMPL
group</a> of Paul G. Allen School of Computer Science &amp; Engineering, University
of Washington. The project uses the loop-level IR and several optimizations from the <a
href="http://halide-lang.org/">Halide project</a>, in addition to <a href="https://tvm.apache.org/about">a
full deep learning compiler stack</a> to support machine learning workloads for diverse
hardware backends.</p>
 
 <p>Since its introduction, the project was driven by an open source community involving
multiple industry and academic institutions. Currently, the TVM stack includes a high-level
differentiable programming IR for high-level optimization, a machine learning driven program
optimizer and VTA – a fully open sourced deep learning accelerator. The community brings
innovations from machine learning, compiler systems, programming languages, and computer architecture
to build a full-stack open s [...]
 
diff --git a/atom.xml b/atom.xml
index 4a77194..775f322 100644
--- a/atom.xml
+++ b/atom.xml
@@ -4,7 +4,7 @@
  <title>TVM</title>
  <link href="https://tvm.apache.org" rel="self"/>
  <link href="https://tvm.apache.org"/>
- <updated>2020-03-30T11:16:12-07:00</updated>
+ <updated>2020-03-30T15:47:25-07:00</updated>
  <id>https://tvm.apache.org</id>
  <author>
    <name></name>
@@ -269,7 +269,7 @@ We show that automatic optimization in TVM makes it easy and flexible
to support
 
 &lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;/images/main/tvm-stack.png&quot;
alt=&quot;image&quot; width=&quot;70%&quot; /&gt;&lt;/p&gt;
 
-&lt;p&gt;TVM stack began as a research project at the &lt;a href=&quot;https://sampl.cs.washington.edu/&quot;&gt;SAMPL
group&lt;/a&gt; of Paul G. Allen School of Computer Science &amp;amp; Engineering,
University of Washington. The project uses the loop-level IR and several optimizations from
the &lt;a href=&quot;http://halide-lang.org/&quot;&gt;Halide project&lt;/a&gt;,
in addition to &lt;a href=&quot;https://tvm.ai/about&quot;&gt;a full deep
learning compiler stack&lt;/a&gt; to support [...]
+&lt;p&gt;TVM stack began as a research project at the &lt;a href=&quot;https://sampl.cs.washington.edu/&quot;&gt;SAMPL
group&lt;/a&gt; of Paul G. Allen School of Computer Science &amp;amp; Engineering,
University of Washington. The project uses the loop-level IR and several optimizations from
the &lt;a href=&quot;http://halide-lang.org/&quot;&gt;Halide project&lt;/a&gt;,
in addition to &lt;a href=&quot;https://tvm.apache.org/about&quot;&gt;a full
deep learning compiler stack&lt;/a&gt; to [...]
 
 &lt;p&gt;Since its introduction, the project was driven by an open source community
involving multiple industry and academic institutions. Currently, the TVM stack includes a
high-level differentiable programming IR for high-level optimization, a machine learning driven
program optimizer and VTA – a fully open sourced deep learning accelerator. The community
brings innovations from machine learning, compiler systems, programming languages, and computer
architecture to build a full-stack  [...]
 
@@ -1276,7 +1276,7 @@ support, and can be used to implement convenient converters, such as
 
 &lt;p&gt;VTA is more than a standalone accelerator design: it’s an end-to-end solution
that includes drivers, a JIT runtime, and an optimizing compiler stack based on TVM. The current
release includes a behavioral hardware simulator, as well as the infrastructure to deploy
VTA on low-cost FPGA hardware for fast prototyping. By extending the TVM stack with a customizable,
and open source deep learning hardware accelerator design, we are exposing a transparent end-to-end
deep learning stac [...]
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png&quot;
alt=&quot;image&quot; width=&quot;50%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png&quot;
alt=&quot;image&quot; width=&quot;50%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The VTA and TVM stack together constitute a blueprint for end-to-end, accelerator-centric
deep learning system that can:&lt;/p&gt;
 
@@ -1331,7 +1331,7 @@ The extendability of the compiler stack, combined with the ability to
modify the
 &lt;p&gt;The Vanilla Tensor Accelerator (VTA) is a generic deep learning accelerator
built around a GEMM core, which performs dense matrix multiplication at a high computational
throughput.
 The design is inspired by mainstream deep learning accelerators, of the likes of Google’s
TPU accelerator. The design adopts decoupled access-execute to hide memory access latency
and maximize utilization of compute resources. To a broader extent, VTA can serve as a template
deep learning accelerator design, exposing a clean tensor computation abstraction to the compiler
stack.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The figure above presents a high-level overview of the VTA hardware organization.
VTA is composed of four modules that communicate between each other via FIFO queues and single-writer/single-reader
SRAM memory blocks, to allow for task-level pipeline parallelism.
 The compute module performs both dense linear algebra computation with its GEMM core, and
general computation with its tensor ALU.
@@ -1348,7 +1348,7 @@ The first approach, which doesn’t require special hardware is to run
deep lear
 This simulator back-end is readily available for developers to experiment with.
 The second approach relies on an off-the-shelf and low-cost FPGA development board – the
&lt;a href=&quot;http://www.pynq.io/&quot;&gt;Pynq board&lt;/a&gt;,
which exposes a reconfigurable FPGA fabric and an ARM SoC.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png&quot;
alt=&quot;image&quot; width=&quot;70%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png&quot;
alt=&quot;image&quot; width=&quot;70%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The VTA release offers a simple compilation and deployment flow of the VTA
hardware design and TVM workloads on the Pynq platform, with the help of an RPC server interface.
 The RPC server handles FPGA reconfiguration tasks and TVM module invocation offloading onto
the VTA runtime.
@@ -1371,7 +1371,7 @@ While this platform is meant for prototyping (the 2012 FPGA cannot compete
with
 &lt;p&gt;A popular method used to assess the efficient use of hardware are roofline
diagrams: given a hardware design, how efficiently are different workloads utilizing the hardware
compute and memory resources. The roofline plot below shows the throughput achieved on different
convolution layers of the ResNet-18 inference benchmark. Each layer has a different arithmetic
intensity, i.e. compute to data movement ratio.
 In the left half, convolution layers are bandwidth limited, whereas on the right half, they
are compute limited.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The goal behind designing a hardware architecture, and a compiler stack
is to bring each workload as close as possible to the roofline of the target hardware.
 The roofline plot shows the effects of having the hardware and compiler work together to
maximize utilization of the available hardware resources.
@@ -1380,7 +1380,7 @@ The result is an overall higher utilization of the available compute
and memory
 
 &lt;h3 id=&quot;end-to-end-resnet-18-evaluation&quot;&gt;End to end ResNet-18
evaluation&lt;/h3&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;A benefit of having a complete compiler stack built for VTA is the ability
to run end-to-end workloads. This is compelling in the context of hardware acceleration because
we need to understand what performance bottlenecks, and Amdahl limitations stand in the way
to obtaining faster performance.
 The bar plot above shows inference performance with and without offloading the ResNet convolutional
layers to the FPGA-based VTA design, on the Pynq board’s ARM Cortex A9 SoC.
diff --git a/rss.xml b/rss.xml
index 967dd59..dc52cef 100644
--- a/rss.xml
+++ b/rss.xml
@@ -5,8 +5,8 @@
         <description>TVM - </description>
         <link>https://tvm.apache.org</link>
         <atom:link href="https://tvm.apache.org" rel="self" type="application/rss+xml"
/>
-        <lastBuildDate>Mon, 30 Mar 2020 11:16:12 -0700</lastBuildDate>
-        <pubDate>Mon, 30 Mar 2020 11:16:12 -0700</pubDate>
+        <lastBuildDate>Mon, 30 Mar 2020 15:47:25 -0700</lastBuildDate>
+        <pubDate>Mon, 30 Mar 2020 15:47:25 -0700</pubDate>
         <ttl>60</ttl>
 
 
@@ -264,7 +264,7 @@ We show that automatic optimization in TVM makes it easy and flexible
to support
 
 &lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;/images/main/tvm-stack.png&quot;
alt=&quot;image&quot; width=&quot;70%&quot; /&gt;&lt;/p&gt;
 
-&lt;p&gt;TVM stack began as a research project at the &lt;a href=&quot;https://sampl.cs.washington.edu/&quot;&gt;SAMPL
group&lt;/a&gt; of Paul G. Allen School of Computer Science &amp;amp; Engineering,
University of Washington. The project uses the loop-level IR and several optimizations from
the &lt;a href=&quot;http://halide-lang.org/&quot;&gt;Halide project&lt;/a&gt;,
in addition to &lt;a href=&quot;https://tvm.ai/about&quot;&gt;a full deep
learning compiler stack&lt;/a&gt; to support [...]
+&lt;p&gt;TVM stack began as a research project at the &lt;a href=&quot;https://sampl.cs.washington.edu/&quot;&gt;SAMPL
group&lt;/a&gt; of Paul G. Allen School of Computer Science &amp;amp; Engineering,
University of Washington. The project uses the loop-level IR and several optimizations from
the &lt;a href=&quot;http://halide-lang.org/&quot;&gt;Halide project&lt;/a&gt;,
in addition to &lt;a href=&quot;https://tvm.apache.org/about&quot;&gt;a full
deep learning compiler stack&lt;/a&gt; to [...]
 
 &lt;p&gt;Since its introduction, the project was driven by an open source community
involving multiple industry and academic institutions. Currently, the TVM stack includes a
high-level differentiable programming IR for high-level optimization, a machine learning driven
program optimizer and VTA – a fully open sourced deep learning accelerator. The community
brings innovations from machine learning, compiler systems, programming languages, and computer
architecture to build a full-stack  [...]
 
@@ -1271,7 +1271,7 @@ support, and can be used to implement convenient converters, such as
 
 &lt;p&gt;VTA is more than a standalone accelerator design: it’s an end-to-end solution
that includes drivers, a JIT runtime, and an optimizing compiler stack based on TVM. The current
release includes a behavioral hardware simulator, as well as the infrastructure to deploy
VTA on low-cost FPGA hardware for fast prototyping. By extending the TVM stack with a customizable,
and open source deep learning hardware accelerator design, we are exposing a transparent end-to-end
deep learning stac [...]
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png&quot;
alt=&quot;image&quot; width=&quot;50%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png&quot;
alt=&quot;image&quot; width=&quot;50%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The VTA and TVM stack together constitute a blueprint for end-to-end, accelerator-centric
deep learning system that can:&lt;/p&gt;
 
@@ -1326,7 +1326,7 @@ The extendability of the compiler stack, combined with the ability to
modify the
 &lt;p&gt;The Vanilla Tensor Accelerator (VTA) is a generic deep learning accelerator
built around a GEMM core, which performs dense matrix multiplication at a high computational
throughput.
 The design is inspired by mainstream deep learning accelerators, of the likes of Google’s
TPU accelerator. The design adopts decoupled access-execute to hide memory access latency
and maximize utilization of compute resources. To a broader extent, VTA can serve as a template
deep learning accelerator design, exposing a clean tensor computation abstraction to the compiler
stack.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The figure above presents a high-level overview of the VTA hardware organization.
VTA is composed of four modules that communicate between each other via FIFO queues and single-writer/single-reader
SRAM memory blocks, to allow for task-level pipeline parallelism.
 The compute module performs both dense linear algebra computation with its GEMM core, and
general computation with its tensor ALU.
@@ -1343,7 +1343,7 @@ The first approach, which doesn’t require special hardware is to run
deep lear
 This simulator back-end is readily available for developers to experiment with.
 The second approach relies on an off-the-shelf and low-cost FPGA development board – the
&lt;a href=&quot;http://www.pynq.io/&quot;&gt;Pynq board&lt;/a&gt;,
which exposes a reconfigurable FPGA fabric and an ARM SoC.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png&quot;
alt=&quot;image&quot; width=&quot;70%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png&quot;
alt=&quot;image&quot; width=&quot;70%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The VTA release offers a simple compilation and deployment flow of the VTA
hardware design and TVM workloads on the Pynq platform, with the help of an RPC server interface.
 The RPC server handles FPGA reconfiguration tasks and TVM module invocation offloading onto
the VTA runtime.
@@ -1366,7 +1366,7 @@ While this platform is meant for prototyping (the 2012 FPGA cannot compete
with
 &lt;p&gt;A popular method used to assess the efficient use of hardware are roofline
diagrams: given a hardware design, how efficiently are different workloads utilizing the hardware
compute and memory resources. The roofline plot below shows the throughput achieved on different
convolution layers of the ResNet-18 inference benchmark. Each layer has a different arithmetic
intensity, i.e. compute to data movement ratio.
 In the left half, convolution layers are bandwidth limited, whereas on the right half, they
are compute limited.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The goal behind designing a hardware architecture, and a compiler stack
is to bring each workload as close as possible to the roofline of the target hardware.
 The roofline plot shows the effects of having the hardware and compiler work together to
maximize utilization of the available hardware resources.
@@ -1375,7 +1375,7 @@ The result is an overall higher utilization of the available compute
and memory
 
 &lt;h3 id=&quot;end-to-end-resnet-18-evaluation&quot;&gt;End to end ResNet-18
evaluation&lt;/h3&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png&quot;
alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;A benefit of having a complete compiler stack built for VTA is the ability
to run end-to-end workloads. This is compelling in the context of hardware acceleration because
we need to understand what performance bottlenecks, and Amdahl limitations stand in the way
to obtaining faster performance.
 The bar plot above shows inference performance with and without offloading the ResNet convolutional
layers to the FPGA-based VTA design, on the Pynq board’s ARM Cortex A9 SoC.
diff --git a/vta.html b/vta.html
index e7ad980..ce54668 100644
--- a/vta.html
+++ b/vta.html
@@ -159,7 +159,7 @@ The current release includes a behavioral hardware simulator, as well
as the inf
 By extending the TVM stack with a customizable, and open source deep learning hardware accelerator
design, we are exposing a transparent end-to-end deep learning stack from the high-level deep
learning framework, down to the actual hardware design and implementation.
 This forms a truly end-to-end, from software-to-hardware open source stack for deep learning
systems.</p>
 
-<p style="text-align: center"><img src="http://raw.githubusercontent.com/uwsampl/web-data/master/vta/blogpost/vta_stack.png"
alt="image" width="50%" /></p>
+<p style="text-align: center"><img src="https://raw.githubusercontent.com/uwsampl/web-data/master/vta/blogpost/vta_stack.png"
alt="image" width="50%" /></p>
 
 <p>The VTA and TVM stack together constitute a blueprint for end-to-end, accelerator-centric
deep learning system that can:</p>
 
@@ -174,7 +174,7 @@ TVM is now an effort undergoing incubation at The Apache Software Foundation
(AS
 driven by an open source community involving multiple industry and academic institutions
 under the Apache way.</p>
 
-<p>Read more about VTA in the <a href="https://tvm.ai/2018/07/12/vta-release-announcement.html">TVM
blog post</a>, or in the <a href="https://arxiv.org/abs/1807.04188">VTA techreport</a>.</p>
+<p>Read more about VTA in the <a href="https://tvm.apache.org/2018/07/12/vta-release-announcement.html">TVM
blog post</a>, or in the <a href="https://arxiv.org/abs/1807.04188">VTA techreport</a>.</p>
 
       </div>
     </div>

