BBA Analytics Using Python Practice Notebook

1 Import titanic dataset. Display dataframe. Encode the sex column with male as 0 and female as 1, and fill the embarked column so it holds only "S", "C", and "Q". Interpret the result.

In [4]:

import pandas as pd
import seaborn as sns

# Load the Titanic dataset from Seaborn
titanic = sns.load_dataset('titanic')

# Display the first few rows of the DataFrame
print("Original DataFrame:")
print(titanic.head())

# Encode the 'sex' column: 0 for male, 1 for female
titanic['sex'] = titanic['sex'].map({'male': 0, 'female': 1})

# Fill missing 'embarked' values with 'S' so the column holds only 'S', 'C', and 'Q'
titanic['embarked'] = titanic['embarked'].fillna('S')

# Display the modified DataFrame
print("\nDataFrame after filling values:")
titanic.head()

Original DataFrame:
   survived  pclass     sex   age  sibsp  parch     fare embarked  class  \
0         0       3    male  22.0      1      0   7.2500        S  Third   
1         1       1  female  38.0      1      0  71.2833        C  First   
2         1       3  female  26.0      0      0   7.9250        S  Third   
3         1       1  female  35.0      1      0  53.1000        S  First   
4         0       3    male  35.0      0      0   8.0500        S  Third   

     who  adult_male deck  embark_town alive  alone  
0    man        True  NaN  Southampton    no  False  
1  woman       False    C    Cherbourg   yes  False  
2  woman       False  NaN  Southampton   yes   True  
3  woman       False    C  Southampton   yes  False  
4    man        True  NaN  Southampton    no   True  

DataFrame after filling values:

Out[4]:

	survived	pclass	sex	age	sibsp	parch	fare	embarked	class	who	adult_male	deck	embark_town	alive	alone
0	0	3	0	22.0	1	0	7.2500	S	Third	man	True	NaN	Southampton	no	False
1	1	1	1	38.0	1	0	71.2833	C	First	woman	False	C	Cherbourg	yes	False
2	1	3	1	26.0	0	0	7.9250	S	Third	woman	False	NaN	Southampton	yes	True
3	1	1	1	35.0	1	0	53.1000	S	First	woman	False	C	Southampton	yes	False
4	0	3	0	35.0	0	0	8.0500	S	Third	man	True	NaN	Southampton	no	True
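
The task also asks for an interpretation of the result. One way to read an encoded frame is to confirm that the mapping left no gaps and then compare survival across the encoded groups. The sketch below does this on a small hand-made sample rather than the full dataset; the rows and the names `sample`, `most_common_port`, and `survival_by_sex` are illustrative assumptions, and filling with the most frequent port is a swapped-in alternative to hard-coding "S".

```python
import pandas as pd

# Hypothetical mini-sample shaped like the Titanic columns used above.
sample = pd.DataFrame({
    "survived": [0, 1, 1, 1, 0, 0],
    "sex": ["male", "female", "female", "female", "male", "male"],
    "embarked": ["S", "C", "S", None, "S", "Q"],
})

# Encode sex: male -> 0, female -> 1. Labels missing from the dict become NaN.
sample["sex"] = sample["sex"].map({"male": 0, "female": 1})

# Fill missing ports with the most frequent port instead of hard-coding "S".
most_common_port = sample["embarked"].mode()[0]
sample["embarked"] = sample["embarked"].fillna(most_common_port)

# With 'survived' stored as 0/1, its mean per group is a survival rate.
survival_by_sex = sample.groupby("sex")["survived"].mean()
print(survival_by_sex)
print(sample["embarked"].value_counts())
```

Because `sex` is now numeric, it can also take part in numeric summaries such as correlations, which is the usual reason for encoding it.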

2 Import titanic dataset. Find null values. Display total null values.

If there are null values, fill them with the column mean. Display the dataframe after filling the null values.

In [8]:

import pandas as pd
import seaborn as sns

# Load the Titanic dataset from Seaborn
titanic = sns.load_dataset('titanic')

# Display total null values per column
print("Total null values before filling:")
print(titanic.isnull().sum())

# Fill null values with the column mean. The mean exists only for numeric
# columns, so request numeric_only=True; text columns such as 'deck' and
# 'embarked' keep their nulls.
titanic.fillna(titanic.mean(numeric_only=True), inplace=True)

# Display DataFrame after filling null values with mean
print("\nDataFrame after filling null values with mean:")
titanic.head()

Total null values before filling:
survived         0
pclass           0
sex              0
age            177
sibsp            0
parch            0
fare             0
embarked         2
class            0
who              0
adult_male       0
deck           688
embark_town      2
alive            0
alone            0
dtype: int64

DataFrame after filling null values with mean:

Out[8]:

	survived	pclass	sex	age	sibsp	parch	fare	embarked	class	who	adult_male	deck	embark_town	alive	alone
0	0	3	male	22.0	1	0	7.2500	S	Third	man	True	NaN	Southampton	no	False
1	1	1	female	38.0	1	0	71.2833	C	First	woman	False	C	Cherbourg	yes	False
2	1	3	female	26.0	0	0	7.9250	S	Third	woman	False	NaN	Southampton	yes	True
3	1	1	female	35.0	1	0	53.1000	S	First	woman	False	C	Southampton	yes	False
4	0	3	male	35.0	0	0	8.0500	S	Third	man	True	NaN	Southampton	no	True
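
Note that `deck` still shows NaN in the output above: a mean exists only for numeric columns, so text columns are left untouched. A minimal sketch of one common workaround, filling numeric columns with the mean and the remaining columns with the most frequent value, is shown below. The frame `df` and its values are hypothetical, and using the mode for text columns is an assumption about what is acceptable for the exercise.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with gaps in one numeric and one text column.
df = pd.DataFrame({
    "age": [22.0, np.nan, 26.0, 36.0],
    "embarked": ["S", "C", None, "S"],
})

# Numeric columns: replace NaN with the column mean.
numeric_cols = df.select_dtypes(include="number").columns
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())

# Other columns: the mean is undefined, so use the most frequent value.
for col in df.columns.difference(numeric_cols):
    df[col] = df[col].fillna(df[col].mode()[0])

print(df)
print("Remaining nulls:", int(df.isnull().sum().sum()))
```

Here the missing age becomes 28.0, the mean of 22, 26, and 36, and the missing port becomes "S".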

3 Import any dataset and apply the mean, median, mode, corr, cov, and std functions

In [9]:

import pandas as pd
import seaborn as sns

# Load the dataset from Seaborn
iris = sns.load_dataset('iris')

# Display the first few rows of the dataset
print("First few rows of the dataset:")
print(iris.head())

# Keep only the numeric columns; 'species' is text and has no mean or correlation
numeric = iris.select_dtypes(include='number')

# Calculate mean
print("\nMean values:")
print(numeric.mean())

# Calculate median
print("\nMedian values:")
print(numeric.median())

# Calculate mode (defined for text columns too, so use the full frame)
print("\nMode values:")
print(iris.mode())

# Calculate correlation
print("\nCorrelation matrix:")
print(numeric.corr())

# Calculate covariance
print("\nCovariance matrix:")
print(numeric.cov())

# Calculate standard deviation
print("\nStandard deviation values:")
print(numeric.std())

First few rows of the dataset:
   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

Mean values:
sepal_length    5.843333
sepal_width     3.057333
petal_length    3.758000
petal_width     1.199333
dtype: float64

Median values:
sepal_length    5.80
sepal_width     3.00
petal_length    4.35
petal_width     1.30
dtype: float64

Mode values:
   sepal_length  sepal_width  petal_length  petal_width     species
0           5.0          3.0           1.4          0.2      setosa
1           NaN          NaN           1.5          NaN  versicolor
2           NaN          NaN           NaN          NaN   virginica

Correlation matrix:
              sepal_length  sepal_width  petal_length  petal_width
sepal_length      1.000000    -0.117570      0.871754     0.817941
sepal_width      -0.117570     1.000000     -0.428440    -0.366126
petal_length      0.871754    -0.428440      1.000000     0.962865
petal_width       0.817941    -0.366126      0.962865     1.000000

Covariance matrix:
              sepal_length  sepal_width  petal_length  petal_width
sepal_length      0.685694    -0.042434      1.274315     0.516271
sepal_width      -0.042434     0.189979     -0.329656    -0.121639
petal_length      1.274315    -0.329656      3.116278     1.295609
petal_width       0.516271    -0.121639      1.295609     0.581006

Standard deviation values:
sepal_length    0.828066
sepal_width     0.435866
petal_length    1.765298
petal_width     0.762238
dtype: float64
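
The three matrices above are linked: a Pearson correlation is the covariance divided by the product of the two standard deviations. Using the printed values for sepal_length and petal_length, 1.274315 / (0.828066 × 1.765298) ≈ 0.8718, which agrees with the 0.871754 in the correlation matrix. The sketch below checks the same identity on a small hypothetical frame; the data and the names `df`, `corr_manual`, and `corr_pandas` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical paired measurements (not the iris data).
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, 5.0, 4.0, 5.0],
})

# pandas uses the sample versions (ddof=1) for both cov and std.
cov_xy = df["x"].cov(df["y"])
std_x = df["x"].std()
std_y = df["y"].std()

# Pearson correlation is covariance rescaled by both standard deviations.
corr_manual = cov_xy / (std_x * std_y)
corr_pandas = df["x"].corr(df["y"])

print(round(corr_manual, 6), round(corr_pandas, 6))
```

Covariance depends on the units of the columns, while correlation is unit-free and always lies between -1 and 1, which is why the correlation matrix is usually the easier one to interpret.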

4 Import an array package and perform the following operations

Addition
Multiplication
Product of an array's elements

In [11]:

import numpy as np

# Define two arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Addition (element by element)
addition_result = arr1 + arr2
print("Addition result:", addition_result)

# Multiplication (element by element)
multiplication_result = arr1 * arr2
print("Multiplication result:", multiplication_result)

# Product of array elements
array_product = np.prod(arr1)
print("Product of array elements:", array_product)

Addition result: [5 7 9]
Multiplication result: [ 4 10 18]
Product of array elements: 6
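
The `*` operator above multiplies element by element, which is easy to confuse with a dot product or with the product of all elements. The following sketch puts the three side by side and also shows the `axis` argument of `np.prod` on a 2-D array; the array `grid` and the variable names are illustrative additions, not part of the original exercise.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

elementwise = np.multiply(arr1, arr2)  # same as arr1 * arr2 -> [4, 10, 18]
dot = np.dot(arr1, arr2)               # 1*4 + 2*5 + 3*6 = 32
prod_all = np.prod(arr2)               # 4 * 5 * 6 = 120

# On a 2-D array, axis selects the direction that is multiplied together.
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
prod_cols = np.prod(grid, axis=0)      # down each column -> [4, 10, 18]
prod_rows = np.prod(grid, axis=1)      # along each row   -> [6, 120]

print(elementwise, dot, prod_all)
print(prod_cols, prod_rows)
```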

5 Use any dataset and draw a boxplot. Interpret the boxplot.

In [12]:

import seaborn as sns
import matplotlib.pyplot as plt

# Load the dataset directly from Seaborn (for example, the 'iris' dataset)
iris = sns.load_dataset('iris')

# Draw a boxplot for each numerical column
sns.boxplot(data=iris)

# Show the plot
plt.show()

Here is how to read each part of the box plot, from top to bottom:

The top whisker extends to the largest value that lies within 1.5 × IQR above the upper quartile.
The upper quartile is the 75th percentile, which means that 75% of the data points fall below this value.
The box represents the middle 50% of the data points; its height is the interquartile range (IQR).
The line inside the box is the median, the 50th percentile.
The lower quartile is the 25th percentile, which means that 25% of the data points fall below this value.
The bottom whisker extends to the smallest value that lies within 1.5 × IQR below the lower quartile.
The outliers are any data points that fall outside the whiskers; they are drawn as individual points.
Box plots are a useful way to quickly visualize the distribution of data, including the center, spread, and outliers. They can be used to compare data sets or to identify patterns in data.
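
The quantities in that breakdown can be computed directly, which helps when checking a plot by hand. The sketch below uses a small hypothetical sample (not the iris data) and the conventional 1.5 × IQR rule; the variable names are illustrative.

```python
import numpy as np

# Hypothetical sample with one unusually large value.
values = np.array([2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 6.0, 7.0, 20.0])

# Quartiles and interquartile range.
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Fences: points beyond these limits are treated as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers stop at the most extreme data points inside the fences.
inside = values[(values >= lower_fence) & (values <= upper_fence)]
lower_whisker, upper_whisker = inside.min(), inside.max()
outliers = values[(values < lower_fence) | (values > upper_fence)]

print("Q1, median, Q3:", q1, median, q3)
print("Whiskers:", lower_whisker, upper_whisker)
print("Outliers:", outliers)
```

In this sample the box spans 3 to 6, the whiskers reach 2 and 7, and the value 20 is flagged as an outlier because it lies above the upper fence of 10.5.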

Posted by: Dr. Manisha More
