arrow-dev mailing list archives

From "Melik-Adamyan, Areg" <>
Subject RE: [Discuss] Benchmarking infrastructure
Date Fri, 29 Mar 2019 19:21:14 GMT
>When you say "output is parsed", how is that exactly? We don't have any scripts in the
repository to do this yet (I have some comments on this below). We also have to collect machine
information and insert that into the database. From my perspective we have quite a bit
of engineering work on this topic ("benchmark execution and data collection") to do.
Yes, I wrote one as a test. It can then POST the JSON structure to the needed endpoint.
Everything else will be done in the
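As a rough illustration of the parse-and-POST flow described here (the endpoint URL and the stored fields are placeholders, not the actual infrastructure):

```python
import json
import urllib.request

def parse_gbenchmark_output(raw_json):
    # Google Benchmark can emit results as JSON (--benchmark_format=json);
    # pull out a few fields worth storing for each benchmark.
    data = json.loads(raw_json)
    return [(b["name"], b["real_time"], b["time_unit"])
            for b in data.get("benchmarks", [])]

def post_results(rows, endpoint):
    # The endpoint is hypothetical; the real ingestion API was not yet
    # defined at the time of this thread.
    payload = json.dumps(rows).encode("utf-8")
    req = urllib.request.Request(
        endpoint, data=payload,
        headers={"Content-Type": "application/json"})
    return urllib.request.urlopen(req)
```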

>My team and I have some physical hardware (including an Aarch64 Jetson TX2 machine, might
be interesting to see what the ARM64 results look like) where we'd like to run benchmarks
and upload the results also, so we need to write some documentation about how to add a new
machine and set up a cron job of some kind.
If it can run Linux, then we can set it up.
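A minimal sketch of the machine-information collection mentioned above, using only the Python standard library (the field names are an assumption; the actual schema was still undecided):

```python
import os
import platform

def collect_machine_info():
    # Basic host fingerprint to store alongside benchmark runs, so results
    # from e.g. an x86_64 box and an Aarch64 Jetson TX2 can be told apart.
    return {
        "hostname": platform.node(),
        "architecture": platform.machine(),  # e.g. "x86_64" or "aarch64"
        "os": platform.system(),
        "os_release": platform.release(),
        "cpu_count": os.cpu_count(),
    }
```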

>I'd like to eventually have a bot that we can ask to run a benchmark comparison versus
master. Reporting on all PRs automatically might be quite a bit of work (and load on the machines)
You should be able to choose the comparison between any two points: master vs. a PR, master now
vs. master yesterday, etc.
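A sketch of what a comparison between any two points could look like, given two result sets keyed by benchmark name (the structure is illustrative, not the actual report format):

```python
def compare_runs(baseline, contender):
    # baseline/contender: dicts mapping benchmark name -> real_time.
    # Returns a per-benchmark ratio for benchmarks present in both runs;
    # a ratio > 1.0 means the contender is slower than the baseline.
    common = baseline.keys() & contender.keys()
    return {name: contender[name] / baseline[name] for name in common}
```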

>I thought the idea (based on our past e-mail discussions) was that we would implement
benchmark collectors (as programs in the Arrow git
repository) for each benchmarking framework, starting with gbenchmark and expanding to include
ASV (for Python) and then others
I'll open a PR; I'm happy to put it into Arrow.

>It seems like writing the benchmark collector script that runs the benchmarks, collects
machine information, and inserts data into an instance of the database is the next milestone.
Until that's done it seems difficult to do much else
OK, I will update JIRA 5070 and link it to 5071.
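A sketch of the "insert data into the database" step of the collector milestone, using sqlite3 purely as a stand-in, since the actual hosted database and its schema were still open questions in this thread:

```python
import sqlite3

def insert_run(db_path, machine, benchmark, real_time, time_unit):
    # Store one benchmark measurement together with the machine it ran on.
    # The table layout here is illustrative only.
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS runs (
        machine TEXT, benchmark TEXT, real_time REAL, time_unit TEXT)""")
    conn.execute("INSERT INTO runs VALUES (?, ?, ?, ?)",
                 (machine, benchmark, real_time, time_unit))
    conn.commit()
    conn.close()
```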
