impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Taras Bobrovytsky (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5181: Extract PYPI metadata from a webpage
Date Fri, 07 Apr 2017 20:02:29 GMT
Hello Lars Volker,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6579

to look at the new patch set (#2).

Change subject: IMPALA-5181: Extract PYPI metadata from a webpage
......................................................................

IMPALA-5181: Extract PYPI metadata from a webpage

There were some build failures due to a failure to download a JSON file
containing package metadata from PYPI. We need to switch to downloading
this from a PYPI mirror. In order to be able to download the metadata
from a PYPI mirror, we need be able to extract the data from a web page,
because PYPI mirrors do not always have a JSON interface.

We implement a regex based html parser in this patch. Also, we increase
the number of download attempts and randomly vary the amount of time
between each attempt.

Testing:
- Tested locally against PYPI and a PYPI mirror.
- Ran a private build that passed (which used a PYPI mirror).

Change-Id: If3845a0d5f568d4352e3cc4883596736974fd7de
---
M infra/python/deps/pip_download.py
1 file changed, 57 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/6579/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If3845a0d5f568d4352e3cc4883596736974fd7de
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>

Mime
View raw message