From: zivanfi@apache.org
To: "commits@parquet.apache.org"
Reply-To: dev@parquet.apache.org
Date: Mon, 04 Jun 2018 15:35:52 +0000
Subject: [parquet-mr] branch master updated: PARQUET-1311: Update README.md (#487)
Message-ID: <152812655218.5951.12297108250196093444@gitbox.apache.org>
X-Git-Repo: parquet-mr
X-Git-Refname: refs/heads/master
X-Git-Oldrev: 345e2d541128471641e76aaa44dd5046f199197d
X-Git-Newrev: aed9097640c7adffe1151b32e86b5efc3702c657

This is an automated email from the ASF dual-hosted git repository.

zivanfi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-mr.git

The following commit(s) were added to refs/heads/master by this push:
     new aed9097  PARQUET-1311: Update README.md (#487)
aed9097 is described below

commit aed9097640c7adffe1151b32e86b5efc3702c657
Author: nandorKollar
AuthorDate: Mon Jun 4 17:35:47 2018 +0200

    PARQUET-1311: Update README.md (#487)

    parquet-mr documentation was not up to date:
    - pointed to broken URLs
    - instructed to install old Thrift version
    - current version was stated as 1.8.1, although 1.10.0 is already released
---
 README.md     | 86 ++++++++++++++++++++++++++++------------------------------
 dev/README.md |  4 +--
 2 files changed, 43 insertions(+), 47 deletions(-)

diff --git a/README.md b/README.md
index f084f50..4b6b96a 100644
--- a/README.md
+++ b/README.md
@@ -20,9 +20,9 @@
 Parquet MR [![Build Status](https://travis-ci.org/apache/parquet-mr.svg?branch=master)](http://travis-ci.org/apache/parquet-mr)
 ======

-Parquet-MR contains the java implementation of the [Parquet format](https://github.com/apache/parquet-format).
+Parquet-MR contains the java implementation of the [Parquet format](https://github.com/apache/parquet-format).
 Parquet is a columnar storage format for Hadoop; it provides efficient storage and encoding of data.
-Parquet uses the [record shredding and assembly algorithm](https://github.com/Parquet/parquet-mr/wiki/The-striping-and-assembly-algorithms-from-the-Dremel-paper) described in the Dremel paper to represent nested structures.
+Parquet uses the [record shredding and assembly algorithm](https://github.com/julienledem/redelm/wiki/The-striping-and-assembly-algorithms-from-the-Dremel-paper) described in the Dremel paper to represent nested structures.

 You can find some details about the format and intended use cases in our [Hadoop Summit 2013 presentation](http://www.slideshare.net/julienledem/parquet-hadoop-summit-2013)

@@ -49,11 +49,11 @@ sudo ldconfig
 To build and install the thrift compiler, run:

 ```
-wget -nv http://archive.apache.org/dist/thrift/0.7.0/thrift-0.7.0.tar.gz
-tar xzf thrift-0.7.0.tar.gz
-cd thrift-0.7.0
+wget -nv http://archive.apache.org/dist/thrift/0.9.3/thrift-0.9.3.tar.gz
+tar xzf thrift-0.9.3.tar.gz
+cd thrift-0.9.3
 chmod +x ./configure
-./configure --disable-gen-erl --disable-gen-hs --without-ruby --without-haskell --without-erlang
+./configure --disable-gen-erl --disable-gen-hs --without-ruby --without-haskell --without-erlang --without-php --without-nodejs
 sudo make install
 ```

@@ -67,31 +67,29 @@ LC_ALL=C mvn clean install

 ## Features

-Parquet is a very active project, and new features are being added quickly; below is the state as of June 2013.
-Feature | In trunk | In dev | Planned | Expected release
-Type-specific encoding | YES | 1.0
-Hive integration | YES (28) | 1.0
-Pig integration | YES | 1.0
-Cascading integration | YES | 1.0
-Crunch integration | YES (CRUNCH-277) | 1.0
-Impala integration | YES (non-nested) | 1.0
-Java Map/Reduce API | YES | 1.0
-Native Avro support | YES | 1.0
-Native Thrift support | YES | 1.0
-Complex structure support | YES | 1.0
-Future-proofed versioning | YES | 1.0
-RLE | YES | 1.0
-Bit Packing | YES | 1.0
-Adaptive dictionary encoding | YES | 1.0
-Predicate pushdown | YES (68) | 1.0
-Column stats | YES | 2.0
-Delta encoding | YES | 2.0
-Native Protocol Buffers support | YES | 1.0
-Index pages | YES | 2.0
+Parquet is a very active project, and new features are being added quickly. Here are a few features:
+
+
+* Type-specific encoding
+* Hive integration
+* Pig integration
+* Cascading integration
+* Crunch integration
+* Apache Arrow integration
+* Apache Scrooge integration
+* Impala integration (non-nested)
+* Java Map/Reduce API
+* Native Avro support
+* Native Thrift support
+* Native Protocol Buffers support
+* Complex structure support
+* Run-length encoding (RLE)
+* Bit Packing
+* Adaptive dictionary encoding
+* Predicate pushdown
+* Column stats
+* Delta encoding
+* Index pages

 ## Map/Reduce integration

@@ -138,46 +136,44 @@ Hive integration is provided via the [parquet-hive](https://github.com/apache/pa

 ## Build

-to run the unit tests:
-mvn test
+To run the unit tests: `mvn test`

-to build the jars:
-mvn package
+To build the jars: `mvn package`

 The build runs in [Travis CI](http://travis-ci.org/apache/parquet-mr): [![Build Status](https://travis-ci.org/apache/parquet-mr.svg?branch=master)](http://travis-ci.org/apache/parquet-mr)

 ## Add Parquet as a dependency in Maven

-The current release is version `1.8.1`
+The current release is version `1.10.0`

 ```xml
   <dependencies>
     <dependency>
       <groupId>org.apache.parquet</groupId>
       <artifactId>parquet-common</artifactId>
-      <version>1.8.1</version>
+      <version>1.10.0</version>
     </dependency>
     <dependency>
       <groupId>org.apache.parquet</groupId>
       <artifactId>parquet-encoding</artifactId>
-      <version>1.8.1</version>
+      <version>1.10.0</version>
     </dependency>
     <dependency>
       <groupId>org.apache.parquet</groupId>
       <artifactId>parquet-column</artifactId>
-      <version>1.8.1</version>
+      <version>1.10.0</version>
     </dependency>
     <dependency>
       <groupId>org.apache.parquet</groupId>
       <artifactId>parquet-hadoop</artifactId>
-      <version>1.8.1</version>
+      <version>1.10.0</version>
     </dependency>
   </dependencies>
 ```

 ### How To Contribute

-We prefer to receive contributions in the form of GitHub pull requests. Please send pull requests against the [github.com/apache/parquet-mr](https://github.com/apache/parquet-mr) repository. If you've previously forked Parquet from its old location, you will need to add a remote or update your origin remote to https://github.com/apache/parquet-mr.git
+We prefer to receive contributions in the form of GitHub pull requests. Please send pull requests against the [parquet-mr](https://github.com/apache/parquet-mr) Git repository. If you've previously forked Parquet from its old location, you will need to add a remote or update your origin remote to https://github.com/apache/parquet-mr.git

 If you are looking for some ideas on what to contribute, check out jira issues for this project labeled ["pick-me-up"](https://issues.apache.org/jira/browse/PARQUET-5?jql=project%20%3D%20PARQUET%20and%20labels%20%3D%20pick-me-up%20and%20status%20%3D%20open). Comment on the issue and/or contact [dev@parquet.apache.org](http://mail-archives.apache.org/mod_mbox/parquet-dev/) with your questions and ideas.

@@ -189,8 +185,8 @@ To contribute a patch:
  1. Break your work into small, single-purpose patches if possible. It’s much harder to merge in a large change with a lot of disjoint features.
  2. Create a JIRA for your patch on the [Parquet Project JIRA](https://issues.apache.org/jira/browse/PARQUET).
  3. Submit the patch as a GitHub pull request against the master branch. For a tutorial, see the GitHub guides on forking a repo and sending a pull request. Prefix your pull request name with the JIRA name (ex: https://github.com/apache/parquet-mr/pull/240).
- 4. Make sure that your code passes the unit tests. You can run the tests with `mvn test` in the root directory.
- 5. Add new unit tests for your code.
+ 4. Make sure that your code passes the unit tests. You can run the tests with `mvn test` in the root directory.
+ 5. Add new unit tests for your code.

 We tend to do fairly close readings of pull requests, and you may get a lot of comments. Some common issues that are not code structure related, but still important:
 * Use 2 spaces for whitespace. Not tabs, not 4 spaces. The number of the spacing shall be 2.
@@ -212,11 +208,11 @@ We hold ourselves and the Parquet developer community to two codes of conduct:
 2. [The Twitter OSS Code of Conduct](https://github.com/twitter/code-of-conduct/blob/master/code-of-conduct.md)

 ## Discussions
-* Mailing list: [dev@parquet.apache.org](http://mail-archives.apache.org/mod_mbox/parquet-dev/)
+* Mailing list: [dev@parquet.apache.org](http://mail-archives.apache.org/mod_mbox/parquet-dev/)
 * Bug trackter: [jira](https://issues.apache.org/jira/browse/PARQUET)
 * Discussions also take place in github pull requests

 ## License
 Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
-See also:
+See also:
diff --git a/dev/README.md b/dev/README.md
index 8fe30e0..b984b11 100644
--- a/dev/README.md
+++ b/dev/README.md
@@ -27,7 +27,7 @@ Merging a pull request requires being a committer on the project.
 have an apache and apache-github remote setup
 ```
 git remote add apache-github https://github.com/apache/parquet-mr.git
-git remote add apache https://git-wip-us.apache.org/repos/asf/parquet-mr.git
+git remote add apache https://gitbox.apache.org/repos/asf?p=parquet-mr.git
 ```
 run the following command
 ```
@@ -50,7 +50,7 @@ source repo/branch
 target master
 url https://api.github.com/repos/apache/parquet-mr/pulls/X

-Proceed with merging pull request #3? (y/n):
+Proceed with merging pull request #3? (y/n):
 ```
 If this looks good, type y and hit enter.
 ```

--
To stop receiving notification emails like this one, please contact
zivanfi@apache.org.