Geospatial and Demographic Analysis of Indian Census Data Using Power BI via Google Drive Integration

Geospatial and Demographic Analysis of Indian Census Data Using Power BI via Google Drive Integration

Objective:
To analyze India's demographic and geographic distribution by leveraging Power BI for meaningful insights into population, settlements, and resource density using census data hosted on Google Drive.

To effectively plan resources and infrastructure, policymakers need to analyze population distribution, settlement types, and density across Indian states. This project guides students to use Power BI to connect live census data stored on Google Drive, transform the dataset using Power Query, and create a visual demographic report.

 Download Dataset

Steps to Connect Google Drive Dataset to Power BI via Web URL + Data Transformation


STEP 1: Prepare and Publish the File from Google Drive

  1. Open Google Drive and locate the .xlsx or .csv file.

  2. Double-click to open the file in Google Sheets (you can convert Excel to Google Sheets if not already).

  3. Go to the top menu bar: File > Share > Publish to web.

  4. In the “Publish to the web” dialog:

    • Under Link, choose:

      • Entire Document or a Specific Sheet (recommended).

    • Under Format, select:

      • Microsoft Excel (.xlsx) OR

      • Comma-separated values (.csv) (for cleaner data import).

  5. Click the Publish button.

  6. A public URL will be generated – Copy that URL.


STEP 2: Load Data into Power BI from Published Web URL

  1. Open Power BI Desktop.

  2. Click Home > Get Data > Web.

  3. In the URL field, paste the published Google Sheet/CSV URL.

  4. Click OK.

  5. In the Navigator window, Power BI will detect the table or sheet.

  6. Select the table and choose:

    • Load (if data is clean), or

    • Transform Data to open Power Query Editor.


STEP 3: Perform Data Transformation in Power Query Editor

In Power Query Editor, follow these transformations based on your dataset:


A. Rename Columns

  • Use friendly names:

    • PersonsTotalPopulation

    • Area (In sq. km)AreaSqKm

    • Population per sq. kmPopulationDensity

    • Total/Rural/UrbanRegionType

    • NameRegionName

    • Etc.


B. Remove Unwanted Rows or Columns

  • Remove top rows if headers repeat.

  • Delete columns like State Code, District Code if not needed for analysis.


C. Change Data Types

  • Ensure correct types:

    • Whole Number: Population, households, villages

    • Decimal Number: Area, density

    • Text: Region names, region type


D. Fill or Replace Nulls (if any)

  • Use Fill Down or Replace Errors where data is missing.


E. Create Custom Columns (Optional)

Example custom columns:

1. Total Villages

= [Inhabited Villages] + [Uninhabited Villages]

2. Gender Ratio

= ([Females] / [Males]) * 1000

3. Population per Household

= [TotalPopulation] / [Number of Households]

F. Filter by Region Type (Optional)

  • Use filters to analyze only Total, Rural, or Urban.


STEP 4: Apply and Load

  1. After all transformations are complete, click Close & Apply.

  2. Your dataset is now ready in Power BI for visual analysis.


Research Questions with Steps in Power BI


Research Question 1:

"Which Indian regions have the highest and lowest population densities?"

✅ Steps:

  1. Ensure column Population per sq. km or compute it using:

    PopulationDensity = DIVIDE([TotalPopulation], [AreaSqKm])
    
  2. Create a Map visual:

    • Location: RegionName

    • Values: PopulationDensity

  3. Add a slicer for RegionType (Total/Rural/Urban).

  4. Use Conditional Formatting in a Table or Matrix to highlight highest/lowest densities.


Research Question 2:

"How does the distribution of villages and towns vary across states and region types?"

✅ Steps:

  1. In Power Query, ensure columns for:

    • Inhabited Villages

    • Uninhabited Villages

    • Number of Towns

  2. Create a Stacked Bar Chart:

    • Axis: RegionName

    • Values: Total Villages and Number of Towns

  3. Add a slicer for RegionType.


Research Question 3:

"What is the gender distribution across regions?"

✅ Steps:

  1. Use the existing Males and Females columns.

  2. Create:

    • Pie Chart for Gender Proportion

    • Bar Chart for side-by-side comparison

  3. Add Tooltip to show exact counts.

  4. Create a DAX Measure for Gender Ratio:

    GenderRatio = DIVIDE([Females], [Males]) * 1000
    

    Display this using Cards or in Tooltips.


Research Question 4:

"Which regions have the highest number of households per sq. km?"

✅ Steps:

  1. Create a new DAX Measure:

    HouseholdsPerSqKm = DIVIDE([Number of Households], [AreaSqKm])
    
  2. Use a Bar Chart:

    • Axis: RegionName

    • Value: HouseholdsPerSqKm

  3. Add filters or sorting to show top/bottom 10 regions.


Research Question 5:

"Is there a correlation between total population and area size?"

✅ Steps:

  1. Create a Scatter Chart:

    • X-axis: AreaSqKm

    • Y-axis: TotalPopulation

    • Details: RegionName

  2. Optionally, add RegionType as a Legend.

  3. Use Trend Line to show correlation.


Research Question 6:

"Which states have urban dominance versus rural dominance?"

✅ Steps:

  1. Use the RegionType field to filter Urban and Rural.

  2. Create:

    • A Table showing count of towns vs villages.

    • A Stacked Column Chart:

      • Axis: RegionName

      • Values: Urban Population, Rural Population (if available separately) or infer from type.

  3. Use Filters to switch views.


Research Question 7:

"What is the overall population distribution across states?"

✅ Steps:

  1. Create a Bar Chart:

    • Axis: RegionName

    • Value: TotalPopulation

  2. Add a Map or Donut Chart for visual storytelling.

  3. Use Slicers to filter by RegionType.


टिप्पणी पोस्ट करा

0 टिप्पण्या