Date: Thu, 19 Oct 2017 09:56:00 +0000 (UTC)
From: "Steve Loughran (JIRA)"
To: common-dev@hadoop.apache.org
Subject: [jira] [Created] (HADOOP-14965) s3a input stream "normal" fadvise mode to be adaptive

Steve Loughran created HADOOP-14965:
---------------------------------------

             Summary: s3a input stream "normal" fadvise mode to be adaptive
                 Key: HADOOP-14965
                 URL: https://issues.apache.org/jira/browse/HADOOP-14965
             Project: Hadoop Common
          Issue Type: Sub-task
            Reporter: Steve Loughran

HADOOP-14535 added seek optimisation to wasb, but rather than requiring the caller to declare sequential vs. random IO, the stream works it out for itself:

# it defaults to sequential IO, with lazy seek
# if the caller ever seeks backwards, it switches to random IO.

This means that the usual access pattern of columnar stores (go to the end of the file, read the summary, then go to the columns and work forwards) switches to random IO after that first seek back (cost: one aborted HTTP connection).

Where this should benefit the most is in downstream apps which work with different data sources in the same object store, running off the same app config, but with different read patterns.
I'm seeing exactly this in some of my Spark tests, where it's near impossible to set things up so that .gz files are read sequentially but ORC data is read with random IO.

I propose: "normal" fadvise => adaptive; "sequential" => sequential always; "random" => random from the outset.
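
To make the proposed behaviour concrete, here is a minimal sketch of the three modes as a small state machine. It is a toy model, not the S3A or wasb implementation: the names `Policy`, `AdaptiveStream`, `random_io`, `connection_open` and `aborted_connections` are all hypothetical, and the only rule it encodes is the one described above (start sequential, switch to random IO on the first backward seek, pay for one aborted connection).

```python
from enum import Enum


class Policy(Enum):
    """The three fadvise modes named in the proposal."""
    SEQUENTIAL = "sequential"
    NORMAL = "normal"
    RANDOM = "random"


class AdaptiveStream:
    """Toy model of an input stream's IO policy (hypothetical, not Hadoop code)."""

    def __init__(self, fadvise: Policy = Policy.NORMAL):
        self.fadvise = fadvise
        # "random" is random from the outset; the other two start sequential.
        self.random_io = fadvise is Policy.RANDOM
        self.pos = 0
        self.connection_open = False
        self.aborted_connections = 0

    def seek(self, target: int) -> None:
        # Lazy seek: only the position is recorded here, no request is issued.
        backwards = target < self.pos
        if backwards and self.fadvise is Policy.NORMAL and not self.random_io:
            # First backward seek in "normal" mode: switch to random IO.
            self.random_io = True
            if self.connection_open:
                # The long sequential request is abandoned, once.
                self.aborted_connections += 1
                self.connection_open = False
        self.pos = target

    def read(self, length: int) -> None:
        # Reading is what actually opens (or reuses) a connection.
        self.connection_open = True
        self.pos += length


def columnar_read_pattern(file_length: int = 10_000) -> AdaptiveStream:
    """Footer first, then back to the columns, then forwards."""
    stream = AdaptiveStream(Policy.NORMAL)
    stream.seek(file_length - 100)   # go to the end of the file
    stream.read(100)                 # read the summary
    stream.seek(0)                   # first seek back: switch to random IO
    stream.read(500)                 # work forwards through the columns
    return stream
```

With this model, a stream in "normal" mode that is only ever read forwards (the .gz case) never leaves sequential IO, while the columnar pattern switches exactly once; "sequential" and "random" never change, whatever the caller does.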