Label Encoding
Before Applying Label Encoding:
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.preprocessing import LabelEncoder

# Load the Titanic dataset
titanic = fetch_openml(name='titanic', version=1, as_frame=True)

# Convert the data and target into a DataFrame
df = pd.concat([titanic.data, titanic.target], axis=1)

# Display the first few rows of the DataFrame before label encoding
df.head()
After Applying Label Encoding:
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.preprocessing import LabelEncoder

# Load the Titanic dataset
titanic = fetch_openml(name='titanic', version=1, as_frame=True)

# Convert the data and target into a DataFrame
df = pd.concat([titanic.data, titanic.target], axis=1)

# Define columns for label encoding
columns_to_encode = ['sex', 'embarked', 'pclass', 'survived']

# Apply label encoding
label_encoder = LabelEncoder()
for column in columns_to_encode:
    df[column] = label_encoder.fit_transform(df[column])

# Display the first few rows of the DataFrame after label encoding
df.head()
This Python code performs the following tasks:

Import necessary libraries: The code begins by importing the required libraries: pandas, fetch_openml from sklearn.datasets, and LabelEncoder from sklearn.preprocessing.

Load the Titanic dataset: Using the fetch_openml function, it loads the Titanic dataset with the specified name ('titanic') and version (1) as a DataFrame (as_frame=True). This dataset contains information about passengers aboard the Titanic.

Convert the data and target into a DataFrame: The code concatenates the data and target of the Titanic dataset into a single DataFrame called df. This DataFrame contains both the features (data) and the target variable ('survived').

Define columns for label encoding: It specifies the columns that need to be label encoded. In this case, the columns are 'sex', 'embarked', 'pclass', and 'survived'.

Apply label encoding: A LabelEncoder object is created, and a loop iterates over the columns specified for label encoding. For each column, the fit_transform method of the LabelEncoder object is applied to encode the categorical values into numeric labels.

Display the first few rows of the DataFrame after label encoding: Finally, the code displays the first few rows of the DataFrame (df) after label encoding using the head() method.
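
The loop described above reuses a single LabelEncoder, so once it finishes only the mapping learned for the last column is still available. The following is a minimal sketch, assuming a small hand-made DataFrame rather than the real dataset (the names `toy`, `encoders`, and `encoded` are illustrative and not part of the original code), that keeps one fitted encoder per column so the integer codes can be converted back to the original labels:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for a few Titanic columns (not the real dataset).
toy = pd.DataFrame({
    "sex": ["male", "female", "female", "male"],
    "embarked": ["S", "C", "Q", "S"],
})

# One encoder per column, so each mapping stays available after the loop.
encoders = {}
encoded = toy.copy()
for column in ["sex", "embarked"]:
    encoders[column] = LabelEncoder()
    encoded[column] = encoders[column].fit_transform(toy[column])

# Recover the original labels from the integer codes.
decoded_sex = encoders["sex"].inverse_transform(encoded["sex"])
print(encoded)
print(list(decoded_sex))
```

Keeping the fitted encoders also makes it possible to apply the same mapping to new rows later with transform(). Note that the scikit-learn documentation describes LabelEncoder as intended for target labels; for feature columns, OrdinalEncoder from sklearn.preprocessing is the documented alternative and can encode several columns in one call.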
Label encoding replaces categorical values with integer labels, making the data easier to use with machine learning algorithms that require numerical input. LabelEncoder assigns the integers in sorted order of the labels, so in the 'sex' column 'female' is encoded as 0 and 'male' as 1, while in the 'embarked' column the ports 'C', 'Q', and 'S' are encoded as 0, 1, and 2 respectively. These integers are only identifiers: the numeric order they suggest carries no meaning for columns such as 'sex' or 'embarked'.
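
To check which integer each category actually receives, the classes_ attribute of a fitted encoder can be inspected: LabelEncoder stores the unique labels in sorted order, and the position of a label in classes_ is its code. A short sketch on hand-written values (the helper `label_mapping` and the sample lists are illustrative assumptions, not taken from the dataset):

```python
from sklearn.preprocessing import LabelEncoder


def label_mapping(values):
    """Return {label: code} as assigned by a freshly fitted LabelEncoder."""
    encoder = LabelEncoder().fit(values)
    # classes_ holds the sorted unique labels; the index of each is its code.
    return {str(label): int(code) for code, label in enumerate(encoder.classes_)}


sex_mapping = label_mapping(["male", "female", "male"])
embarked_mapping = label_mapping(["S", "C", "Q", "S"])
print(sex_mapping)
print(embarked_mapping)
```

Printing the mapping this way avoids guessing which code belongs to which category when reading the encoded DataFrame.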