Clustering in Power BI: A Step-by-Step Guide

Clustering is a data analysis technique used to group similar data points together based on their characteristics. In Power BI, clustering helps you identify patterns, segment data, and make better decisions. This guide explains two approaches: using Power BI’s built-in clustering and using the Python visual.

1. Clustering in Power BI (Without Python)

Step 1: Prepare Your Data

Ensure your dataset contains numeric fields for clustering (e.g., Sales, Profit, Quantity).
You can also place a categorical field (e.g., Region, Product Category) in the Values field well.
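Before building the chart, it can help to confirm which columns are numeric. The following is a minimal sketch using pandas outside Power BI; the sample rows are made up for illustration, and the column names simply mirror the examples above.

```python
import pandas as pd

# Hypothetical sample rows standing in for a sales table; the values are made up.
df = pd.DataFrame({
    "Region": ["East", "West", "South", "East"],
    "Sales": [250.0, 120.5, 310.0, 95.0],
    "Profit": [40.0, -5.5, 62.0, 12.0],
    "Quantity": [3, 2, 5, 1],
})

# Columns with a numeric dtype are candidates for the X and Y axes.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# Everything else is treated here as categorical (e.g., for Values or Legend).
categorical_cols = [c for c in df.columns if c not in numeric_cols]

print("Numeric:", numeric_cols)
print("Categorical:", categorical_cols)
```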

Step 2: Create a Scatter Plot

Go to Visualizations → Select Scatter Chart.
Assign X-axis and Y-axis numeric fields.
Optionally, add a Legend field to color the points.

Step 3: Enable Clustering

Select the scatter plot.
Click the three dots (...) for More options → click Automatically find clusters.
Adjust the settings if required and set the number of clusters as per your requirement. The default is Auto, in which case Power BI chooses the number of clusters itself (for example, 6). Then click OK.
Power BI will automatically group data into clusters based on similarity.
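To illustrate what "grouping by similarity" means, here is a small sketch in Python. This guide does not state which algorithm Power BI uses internally, so K-Means from scikit-learn is swapped in purely as a stand-in, and the (Sales, Profit) pairs are made up.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up (Sales, Profit) pairs forming two obvious groups.
points = np.array([
    [100, 10], [110, 12], [105, 11],      # low sales, low profit
    [900, 200], [950, 210], [920, 205],   # high sales, high profit
])

# Stand-in algorithm (assumption): K-Means with two clusters.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(points)

# Points that sit close together receive the same cluster label.
print(labels)
```

The label numbers themselves are arbitrary; what matters is which points share a label.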

Step 4: Customize Clusters

Choose the number of clusters manually or let Power BI decide automatically.
Clusters will be color-coded for easy visualization.
You can hover over points to see cluster membership.

Use case: Segment customers based on purchase behavior, identify high-profit regions, or group products based on sales and quantity.

2. Clustering in Power BI Using Python Visual

Sometimes you need more advanced clustering or want to experiment with different algorithms. Power BI allows you to use Python scripts for clustering.

Step 1: Prepare Your Data

Load your dataset in Power BI.
Ensure numeric fields are included (e.g., Sales, Profit, Quantity).

Step 2: Add a Python Visual

Click on the Python visual icon in the Visualizations pane.
Drag the numeric fields you want to cluster into the Values section.
Power BI automatically makes these fields available as a dataframe called dataset in Python.
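When developing a script outside Power BI, the dataset dataframe does not exist yet. One possible approach, shown as a sketch below, is to build a stand-in with made-up values so the same script can be tested locally. The duplicate removal mirrors what Power BI states in the script preamble; the sample numbers are hypothetical.

```python
import pandas as pd

# Inside Power BI, `dataset` is supplied by the Python visual.
# Outside Power BI the name is undefined, so fall back to a stand-in
# with made-up values for local testing.
try:
    dataset
except NameError:
    dataset = pd.DataFrame({
        "Sales": [250.0, 120.5, 310.0, 250.0],
        "Profit": [40.0, -5.5, 62.0, 40.0],
        "Quantity": [3, 2, 5, 3],
    })

# Power BI removes duplicated rows before the script runs; do the same locally.
df = dataset.drop_duplicates()

print(df.shape)
```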

Step 3: Run Python for Clustering

Use Python to apply clustering algorithms like K-Means or Hierarchical Clustering.
The output can be scatter plots with clusters, color-coded points, or even advanced 3D visualizations.
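As an example of the hierarchical option, the sketch below uses scikit-learn's AgglomerativeClustering. The data values are made up, and the choice of two clusters and Ward linkage is an assumption for illustration only.

```python
import pandas as pd
from sklearn.cluster import AgglomerativeClustering

# Made-up stand-in for the `dataset` dataframe that Power BI would supply.
dataset = pd.DataFrame({
    "Sales": [100, 120, 110, 900, 950, 880],
    "Profit": [10, 15, 12, 200, 220, 190],
    "Quantity": [1, 2, 1, 9, 10, 8],
})

X = dataset[["Sales", "Profit", "Quantity"]]

# Hierarchical (agglomerative) clustering; n_clusters=2 is an arbitrary choice.
model = AgglomerativeClustering(n_clusters=2, linkage="ward")
dataset["Cluster"] = model.fit_predict(X)

print(dataset)
```

In a Python visual, the script would normally end by drawing a matplotlib chart of the labelled points and calling plt.show().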

Step 4: Customize Clusters

Change the number of clusters or clustering method depending on your analysis needs.
Python allows flexible clustering on multiple dimensions, which may not be possible with the built-in clustering tool.
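When clustering on several dimensions, columns with large raw values (such as Sales) can dominate the distance calculation. The sketch below shows one common way to handle this, by standardizing the columns before K-Means. The scaling step is an addition not covered in the steps above, and the data is made up.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Made-up stand-in for the `dataset` dataframe that Power BI would supply.
dataset = pd.DataFrame({
    "Sales": [100, 120, 110, 900, 950, 880],
    "Profit": [10, 15, 12, 200, 220, 190],
    "Quantity": [1, 2, 1, 9, 10, 8],
})

features = ["Sales", "Profit", "Quantity"]

# Rescale each column to mean 0 and unit variance so that no single
# column dominates the distances used by K-Means.
X_scaled = StandardScaler().fit_transform(dataset[features])

# Cluster on all three dimensions at once; n_clusters=2 is an arbitrary choice.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
dataset["Cluster"] = kmeans.fit_predict(X_scaled)

print(dataset)
```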

Use case: Segment products by multiple metrics, perform customer profiling, or analyze patterns in large datasets like Superstore or sales records.

3. Tips for Effective Clustering in Power BI

Always use numeric fields for X and Y axes.
Add Legend fields for better visualization.
Use the Python visual when you need advanced customization or more than 2 dimensions.
Keep the number of clusters meaningful for interpretation.
Hover over points to understand cluster composition.
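Besides hovering over points, cluster composition can also be summarized in a table. The sketch below assumes cluster labels have already been assigned; both the rows and the labels are hypothetical.

```python
import pandas as pd

# Made-up rows with hypothetical cluster labels already assigned.
df = pd.DataFrame({
    "Sales": [100.0, 120.0, 900.0, 950.0],
    "Profit": [10.0, 14.0, 200.0, 220.0],
    "Cluster": [0, 0, 1, 1],
})

# Average Sales and Profit per cluster describe what each group looks like.
profile = df.groupby("Cluster")[["Sales", "Profit"]].mean()

# The number of points per cluster shows whether a group is large enough to matter.
profile["Count"] = df.groupby("Cluster").size()

print(profile)
```

A cluster with very few points, or one whose averages barely differ from another's, may be a sign that the number of clusters is too high.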

Clustering in Power BI Using Python Visual

Clustering is the process of grouping similar data points based on patterns in the data. In Power BI, the Python visual allows you to apply advanced clustering algorithms, like K-Means, for more flexibility compared to the built-in clustering option.

Step 1: Prepare Your Data

Load your dataset into Power BI (e.g., Sample Superstore dataset).
Make sure your dataset contains numeric fields for clustering, such as:
- Sales
- Profit
- Quantity

Step 2: Add a Python Visual

Click the Python visual (Py icon) in the Visualizations pane.
Drag the numeric fields you want to cluster into the Values section.
Power BI automatically passes these fields to Python as a dataframe called dataset.

Step 3: Write Python Code for Clustering

# The following code to create a dataframe and remove duplicated rows is always executed and acts as a preamble for your script:
# dataset = pandas.DataFrame(Sales, Profit, Quantity)
# dataset = dataset.drop_duplicates()

# Paste or type your script code here:

# Import libraries
import pandas as pd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
import seaborn as sns

# The dataset from Power BI is automatically called 'dataset'
df = dataset.copy()

# Optional: check the data
print(df.head())

# Select only numeric columns for clustering
X = df[['Sales', 'Profit', 'Quantity']]

# KMeans clustering (choose number of clusters, e.g., 3)
kmeans = KMeans(n_clusters=3, random_state=42)
df['Cluster'] = kmeans.fit_predict(X)

# Plot clusters using seaborn
plt.figure(figsize=(8,6))
sns.scatterplot(data=df, x='Sales', y='Profit', hue='Cluster', palette='Set2', s=100)
plt.title('K-Means Clustering: Sales vs Profit')
plt.show()

To change the number of clusters, edit the n_clusters value in the script and run the Python script again.
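Instead of changing the number by hand each time, several values can be compared in one run. The sketch below uses the silhouette score from scikit-learn as one possible yardstick (higher generally means better separated clusters). This metric is an addition not used in the script above, and the data is made up.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Made-up stand-in for the `dataset` dataframe that Power BI would supply.
dataset = pd.DataFrame({
    "Sales": [100, 110, 105, 500, 520, 510, 900, 950, 920],
    "Profit": [10, 12, 11, 100, 105, 102, 200, 210, 205],
    "Quantity": [1, 2, 1, 5, 6, 5, 9, 10, 9],
})

X = dataset[["Sales", "Profit", "Quantity"]]

# Fit K-Means for several cluster counts and record the silhouette score of each.
scores = {}
for k in range(2, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The cluster count with the highest score is one reasonable candidate.
best_k = max(scores, key=scores.get)

print(scores)
print("Suggested number of clusters:", best_k)
```

The score is only a guide; the chosen number should still make sense for the business question.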

मधूषाब्लॉग्स

Posted by: Dr.Manisha More
