spark-user mailing list archives

From Saatvik Shah <saatvikshah1...@gmail.com>
Subject Re: Best alternative for Category Type in Spark Dataframe
Date Fri, 16 Jun 2017 14:40:43 GMT
Hi Pralabh,

I want the ability to create a column whose values are restricted to a
specific set of predefined values.
For example, suppose I have a column called EMOTION: I want to ensure each
row value is one of HAPPY, SAD, ANGRY, NEUTRAL, NA.

Thanks and Regards,
Saatvik Shah

On Fri, Jun 16, 2017 at 10:30 AM, Pralabh Kumar <pralabhkumar@gmail.com>
wrote:

> Hi Saatvik,
>
> Can you please provide an example of what exactly you want?
>
>
>
> On 16-Jun-2017 7:40 PM, "Saatvik Shah" <saatvikshah1994@gmail.com> wrote:
>
>> Hi Yan,
>>
>> Basically the reason I was looking for the categorical datatype is as
>> given here
>> <https://pandas.pydata.org/pandas-docs/stable/categorical.html>: ability
>> to fix column values to specific categories. Is it possible to create a
>> user defined data type which could do so?
>>
>> Thanks and Regards,
>> Saatvik Shah
>>
>> On Fri, Jun 16, 2017 at 1:42 AM, 颜发才(Yan Facai) <facai.yan@gmail.com>
>> wrote:
>>
>>> You can use some Transformers to handle categorical data,
>>> For example,
>>> StringIndexer encodes a string column of labels to a column of label
>>> indices:
>>> http://spark.apache.org/docs/latest/ml-features.html#stringindexer
>>>
>>>
>>> On Thu, Jun 15, 2017 at 10:19 PM, saatvikshah1994 <
>>> saatvikshah1994@gmail.com> wrote:
>>>
>>>> Hi,
>>>> I'm trying to convert a Pandas -> Spark dataframe. One of the columns I
>>>> have is of the Category type in Pandas. But there does not seem to be
>>>> support for this same type in Spark. What is the best alternative?
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Best-alternative-for-Category-Type-in-Spark-Dataframe-tp28764.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>>>>
>>>>
>>>
>>
>>
>> --
>> *Saatvik Shah,*
>> *1st  Year,*
>> *Masters in the School of Computer Science,*
>> *Carnegie Mellon University*
>>
>> *https://saatvikshah1994.github.io/ <https://saatvikshah1994.github.io/>*
>>
>


-- 
*Saatvik Shah,*
*1st  Year,*
*Masters in the School of Computer Science,*
*Carnegie Mellon University*

*https://saatvikshah1994.github.io/ <https://saatvikshah1994.github.io/>*
