From: Nicholas Hakobian
Date: Sat, 31 Dec 2016 13:48:38 -0800
Subject: Re: Custom delimiter file load
To: A Shaikh
Cc: "user @spark"

See the documentation for the options given to the csv function:

http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameReader@csv(paths:String*):org.apache.spark.sql.DataFrame

The options can be passed to the DataFrameReader class with its option/options functions (a similar syntax is also available in PySpark).

-Nick

Nicholas Szandor Hakobian, Ph.D.
Senior Data Scientist
Rally Health
nicholas.hakobian@rallyhealth.com

On Sat, Dec 31, 2016 at 9:58 AM, A Shaikh wrote:

> In PySpark 2, loading a file with any delimiter into a DataFrame is pretty
> straightforward:
> spark.read.csv(file, schema=, sep='|')
>
> Is there something similar in Spark 2 in Scala, e.g. spark.read.csv(path,
> sep='|')?
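
For reference, the option/options approach described in the reply might look roughly like the following in Scala. This is a minimal sketch, assuming Spark 2.x running in local mode. The object name PipeDelimitedLoad, the helper readDelimited, and the sample file contents are made up for illustration and are not part of the original thread. The "sep" and "header" keys are options of the built-in CSV reader.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

object PipeDelimitedLoad {
  // Hypothetical helper: reads a delimited text file that has a header row.
  def readDelimited(spark: SparkSession, path: String, sep: String): DataFrame =
    spark.read
      .option("sep", sep)        // field delimiter; the CSV reader defaults to ","
      .option("header", "true")  // treat the first line as column names
      .csv(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("pipe-delimited-load")
      .master("local[*]")
      .getOrCreate()

    // Write a tiny sample file so the sketch runs on its own.
    val path = Files.createTempFile("sample", ".psv")
    Files.write(path, "id|name\n1|alice\n2|bob\n".getBytes("UTF-8"))

    // One option(...) call per setting.
    val df = readDelimited(spark, path.toString, "|")
    df.show()

    // The same settings passed in a single call through options(Map(...)).
    val df2 = spark.read
      .options(Map("sep" -> "|", "header" -> "true"))
      .csv(path.toString)
    df2.printSchema()

    spark.stop()
  }
}
```

A schema can also be supplied through the reader's schema(...) method before calling csv, which plays the role of the schema= argument in the PySpark call quoted above.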