impala-issues mailing list archives

From "Taras Bobrovytsky (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (IMPALA-5181) Make it possible to get Python package metadata from an HTML web page in pip_download.py
Date Sat, 08 Apr 2017 01:54:41 GMT

     [ https://issues.apache.org/jira/browse/IMPALA-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Taras Bobrovytsky resolved IMPALA-5181.
---------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.9.0

{code}
commit 4a79c9e7e3928f919b5fb60bab4145ba886d6252
Author: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Date:   Thu Mar 30 13:08:21 2017 -0700

    IMPALA-5181: Extract PYPI metadata from a webpage
    
    There were some build failures due to a failure to download a JSON file
    containing package metadata from PYPI. We need to switch to downloading
    this from a PYPI mirror. In order to be able to download the metadata
    from a PYPI mirror, we need to be able to extract the data from a web page,
    because PYPI mirrors do not always have a JSON interface.
    
    We implement a regex based html parser in this patch. Also, we increase
    the number of download attempts and randomly vary the amount of time
    between each attempt.
    
    Testing:
    - Tested locally against PYPI and a PYPI mirror.
    - Ran a private build that passed (which used a PYPI mirror).
    
    Change-Id: If3845a0d5f568d4352e3cc4883596736974fd7de
    Reviewed-on: http://gerrit.cloudera.org:8080/6579
    Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
    Tested-by: Impala Public Jenkins
{code}
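
The retry behaviour described in the commit message (more download attempts, with a randomly varied wait between them) could look roughly like the sketch below. This is not the code from the patch: the `retry` helper, its parameters, and the default values are assumptions made for illustration only.

```python
import random
import time


def retry(func, attempts=5, base_delay=1.0, sleep=time.sleep, rng=random.random):
    """Call func() until it succeeds or the attempts are used up.

    Hypothetical helper, not taken from pip_download.py. The `sleep` and
    `rng` parameters exist only so the waiting can be replaced in tests.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                # Out of attempts: let the last error propagate to the caller.
                raise
            # Wait roughly between base_delay and 2 * base_delay seconds, so
            # that parallel builds do not all retry at the same moment.
            sleep(base_delay * (1 + rng()))
```

A caller would wrap the download in a zero-argument function, for example `retry(lambda: fetch(url))`, where `fetch` stands for whatever download routine is in use.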

> Make it possible to get Python package metadata from an HTML web page in pip_download.py
> ----------------------------------------------------------------------------------------
>
>                 Key: IMPALA-5181
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5181
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>            Reporter: Taras Bobrovytsky
>            Assignee: Taras Bobrovytsky
>             Fix For: Impala 2.9.0
>
>
> Currently pip_download.py allows retrieving Python package metadata only from a JSON
> file, for example https://pypi.python.org/pypi/pyparsing/json. This is a problem because some
> PYPI mirrors might not implement this interface.
> The data in the JSON file should also be accessible through a web interface - for example,
> https://pypi.python.org/simple/pyparsing/
> pip_download.py should be able to parse the web page and extract the information we need.
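
To illustrate the kind of regex-based extraction the issue asks for, here is a minimal sketch. It assumes the page lists each file as an anchor whose href ends in a `#<algorithm>=<hex digest>` fragment; the sample HTML, the pattern, and the `extract_files` name are all hypothetical and are not taken from pip_download.py. Real mirror pages may be laid out differently.

```python
import re

# Hypothetical sample of a "simple" index page; paths and digests are made up.
SAMPLE_HTML = """
<html><body>
<a href="../../packages/ab/cd/pyparsing-2.0.3.tar.gz#md5=0123456789abcdef0123456789abcdef">pyparsing-2.0.3.tar.gz</a><br/>
<a href="../../packages/ef/01/pyparsing-2.1.0.zip#md5=fedcba9876543210fedcba9876543210">pyparsing-2.1.0.zip</a><br/>
</body></html>
"""

# Matches an anchor whose href carries a "#<algo>=<hex digest>" fragment and
# captures the file path, the digest algorithm, the digest, and the link text.
LINK_RE = re.compile(
    r'<a\s+[^>]*href="(?P<path>[^"#]+)#(?P<algo>\w+)=(?P<digest>[0-9a-fA-F]+)"[^>]*>'
    r'(?P<name>[^<]+)</a>')


def extract_files(html):
    """Return one dict per downloadable file found on the page."""
    return [m.groupdict() for m in LINK_RE.finditer(html)]
```

Calling `extract_files(SAMPLE_HTML)` yields two entries, each with `path`, `algo`, `digest`, and `name` keys. A regex like this is brittle compared with a real HTML parser, which is the usual trade-off of this approach.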



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
