Basics of Hypothesis Testing in Power BI
Introduction
Hypothesis testing is a fundamental statistical method used to make decisions or inferences about population parameters based on sample data. It helps answer questions like:
-
Is a new marketing campaign increasing sales?
-
Is the average salary of employees more than ₹50,000?
-
Do two different regions perform equally in terms of revenue?
Concept Explanation
Population Entire group of interest (e.g., all students in India)
Sample A part of the population selected for analysis
Hypothesis A statement about a population parameter (e.g., average, proportion)
Test Statistic A standardized value used to decide whether to reject the null hypothesis
P-value Probability of observing your result if the null hypothesis is true
Significance Level (α) Threshold for decision-making (commonly 0.05)
Types of Hypotheses
Hypothesis testing is a fundamental statistical method used to make decisions or inferences about population parameters based on sample data. It helps answer questions like:
-
Is a new marketing campaign increasing sales?
-
Is the average salary of employees more than ₹50,000?
-
Do two different regions perform equally in terms of revenue?
Concept | Explanation |
---|
Population | Entire group of interest (e.g., all students in India) |
Sample | A part of the population selected for analysis |
Hypothesis | A statement about a population parameter (e.g., average, proportion) |
Test Statistic | A standardized value used to decide whether to reject the null hypothesis |
P-value | Probability of observing your result if the null hypothesis is true |
Significance Level (α) | Threshold for decision-making (commonly 0.05) |
Types of Hypotheses
1. Null Hypothesis (H₀)
-
The assumption that there is no effect or no difference.
-
It is always tested with the aim of rejecting it.
-
Example:
-
“There is no difference in average sales between North and South.”
-
“The average customer satisfaction score is 4.0.”
The assumption that there is no effect or no difference.
It is always tested with the aim of rejecting it.
Example:
-
“There is no difference in average sales between North and South.”
-
“The average customer satisfaction score is 4.0.”
2. Alternative Hypothesis (H₁ or Ha)
-
The assumption that there is an effect, or a difference exists.
-
It challenges the null.
-
Example:
-
“The average customer satisfaction score is not 4.0.”
-
“North region sales are significantly different from South.”
The assumption that there is an effect, or a difference exists.
It challenges the null.
Example:
-
“The average customer satisfaction score is not 4.0.”
-
“North region sales are significantly different from South.”
Step 1: Define Hypotheses
Clearly state H₀ and H₁.
🔸 Example:
-
H₀: μ = ₹50,000
-
H₁: μ ≠ ₹50,000
Step 2: Set Significance Level (α)
Common choices are:
-
0.05 (5% chance of error)
-
0.01 (1% chance of error)
Step 3: Choose the Test Type
Based on data and assumptions:
Condition | Test |
---|---|
Sample size < 30 and population SD unknown | T-Test |
Sample size ≥ 30 and population SD known | Z-Test |
Comparing means of two groups | Two-sample T-Test |
Comparing proportions | Z-Test for proportions |
Step 5: Find the Critical Value / P-value
-
Compare calculated T or Z value to the critical value from tables
-
OR use P-value approach:
-
If P-value < α → Reject H₀
-
If P-value ≥ α → Fail to reject H₀
-
Compare calculated T or Z value to the critical value from tables
-
OR use P-value approach:
-
If P-value < α → Reject H₀
-
If P-value ≥ α → Fail to reject H₀
Step 6: Make a Decision
-
If test statistic lies in rejection region → Reject H₀
-
Else → Fail to Reject H₀
-
If test statistic lies in rejection region → Reject H₀
-
Else → Fail to Reject H₀
Step 7: Draw Conclusion
Interpret in context of the problem:
“There is enough evidence to suggest that the average sales are different from ₹50,000.”
Power BI Step-by-Step Instructions
Dataset : Salesdata
Sample Size = 30
Used for: One Sample T-Test
Hypothesis:
-
H₀: Mean Sales = ₹50,000
-
H₁: Mean Sales ≠ ₹50,000
Interpret in context of the problem:
“There is enough evidence to suggest that the average sales are different from ₹50,000.”
Power BI Step-by-Step Instructions
Dataset : Salesdata
Sample Size = 30Used for: One Sample T-Test
Hypothesis:
-
H₀: Mean Sales = ₹50,000
-
H₁: Mean Sales ≠ ₹50,000
1. Load Data into Power BI
-
Open Power BI Desktop
-
Click Home > Get Data > Excel
-
Browse and select SalesData.xlsx
-
Load the SalesData table
-
Open Power BI Desktop
-
Click Home > Get Data > Excel
-
Browse and select
SalesData.xlsx
-
Load the SalesData table
2. Create DAX Measures for Analysis
Go to Modeling > New Measure and create these one-by-one:
1. Mean Sales
Dax:
Mean_Sales = AVERAGE(SalesData[SalesAmount])
2. Standard Deviation
Dax:
StdDev_Sales = STDEV.S(SalesData[SalesAmount])
3. Sample Size
Dax:
Sample_Size = COUNT(SalesData[SalesAmount])
4. T-Statistic (assuming population mean = 50000)
Dax:
T_Value =
VAR xBar = [Mean_Sales]
VAR mu = 50000
VAR s = [Sample_Size]
VAR n = [Sample_Size]
RETURN
DIVIDE(xBar - mu, s / SQRT(n))
Go to Modeling > New Measure and create these one-by-one:
1. Mean Sales
Dax:
Mean_Sales = AVERAGE(SalesData[SalesAmount])
2. Standard Deviation
Dax:
StdDev_Sales = STDEV.S(SalesData[SalesAmount])
3. Sample Size
Dax:
Sample_Size = COUNT(SalesData[SalesAmount])
4. T-Statistic (assuming population mean = 50000)
Dax:
T_Value =
VAR xBar = [Mean_Sales]
VAR mu = 50000
VAR s = [Sample_Size]
VAR n = [Sample_Size]
RETURN
DIVIDE(xBar - mu, s / SQRT(n))
Following is Mathematical Formula for
One-Sample T-Test Formula
Where:
Symbol Meaning Sample Mean (average of observed data) Population Mean (hypothesized mean under H₀) Sample Standard Deviation Sample Size Test Statistic (T-value)
5. T-Critical Value (for α = 0.05, two-tailed)
Dax:T_Critical = T.INV(0.975, [Sample_Size] - 1)
6. Hypothesis Result
Dax:
Test_Result =
VAR t_stat = ABS([T_Value])
VAR t_crit = [T_Critical]
RETURN
IF(t_stat > t_crit, "Reject Null Hypothesis", "Fail to Reject Null Hypothesis")
Symbol | Meaning |
---|---|
Sample Mean (average of observed data) | |
Population Mean (hypothesized mean under H₀) | |
Sample Standard Deviation | |
Sample Size | |
Test Statistic (T-value) |
5. T-Critical Value (for α = 0.05, two-tailed)
T_Critical = T.INV(0.975, [Sample_Size] - 1)
6. Hypothesis Result
Dax:
Test_Result =
VAR t_stat = ABS([T_Value])
VAR t_crit = [T_Critical]
RETURN
IF(t_stat > t_crit, "Reject Null Hypothesis", "Fail to Reject Null Hypothesis")
3. Build Power BI Visuals
Switch to Report View:
Switch to Report View:
➤ Use Card Visuals for:
-
Mean_Sales
-
StdDev_Sales
-
Sample_Size
-
T_Value
-
T_Critical
-
Test_Result
-
Mean_Sales
-
StdDev_Sales
-
Sample_Size
-
T_Value
-
T_Critical
-
Test_Result
Use Bar Chart to show average sales by region:
-
X-axis: Region
-
Y-axis: Average of SalesAmount
-
X-axis:
Region
-
Y-axis:
Average of SalesAmount
4. Interpret Results
Based on the dashboard:
-
If |T_Value| > T_Critical
, you reject the null hypothesis
-
If not, you fail to reject it
Explain:
“If T_Value is 2.1 and T_Critical is 2.045, we reject H₀, and conclude average sales are statistically different from ₹50,000.”
Based on the dashboard:
-
If
|T_Value| > T_Critical
, you reject the null hypothesis -
If not, you fail to reject it
Explain:
“If T_Value is 2.1 and T_Critical is 2.045, we reject H₀, and conclude average sales are statistically different from ₹50,000.”
0 टिप्पण्या
कृपया तुमच्या प्रियजनांना लेख शेअर करा आणि तुमचा अभिप्राय जरूर नोंदवा. 🙏 🙏