How To Do a Chi-Square Test in Excel (With 2 Methods)

Updated January 9, 2023

Researchers and financial analysts often use the null hypothesis when analyzing their observed data. This hypothesis states that there isn't a statistical relationship between different variables, though it may not always be accurate. Knowing how to do a chi-square test can help you check whether your observed data differs from the expected data by more than chance alone would explain. In this article, we teach you how to do a chi-square test in Excel via two methods.


What is a chi-square test?

A chi-square test is a statistical hypothesis test that helps you decide whether to reject the null hypothesis. It compares observed values with expected values to produce a p-value, which is the probability of seeing a difference at least as large as the one in your data if the null hypothesis were true. You can then compare the p-value to alpha, the significance level you choose in advance, to decide whether the data gives you reason to reject the null hypothesis. Researchers often use the chi-square test to evaluate the fairness of their sampling groups and analyze the relationships between different variables. Financial analysts may also use this test to make more accurate predictions regarding investment advice.


How to do a chi-square test in Excel

Here's how to do a chi-square test in Excel:

1. Set up 3 columns

Open your spreadsheet and create three columns with these headings:

  • Category

  • Observed data

  • Expected data

Enter your categories and observed data in their respective columns. These steps consider an example where a teacher wants to test whether students are equally likely to have a birthday in the first half of the year as in the second half. Here's an example of how their spreadsheet might appear if they surveyed 500 students:


     A               B               C
1    Category        Observed data   Expected data
2    January-June    260
3    July-December   240
4




2. Calculate the expected data

The expected data is what you would expect to observe if the null hypothesis were true. When you assume that every category is equally likely, you can calculate this value by totaling the observed data and dividing it by the number of categories. In the example above, the teacher could enter the formula "=SUM(B2:B3)" into B4 to get 500, which is the total number of students the teacher surveyed.

Then, the teacher could enter the formula "=B$4/2" into C2. This formula yields 250, which is the number of students the teacher expects to say their birthday is in January-June. The dollar sign keeps the reference fixed on row 4, so they can copy this formula to all other rows in the "Expected data" column without it shifting to an empty cell. The example above would display expected data of 250 for the number of students with birthdays in both January-June and July-December.
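If you want to check the spreadsheet arithmetic outside of Excel, here is a minimal Python sketch of the same calculation. It assumes, as the example does, that every category is equally likely. The variable names are chosen for illustration and aren't part of Excel.

```python
# Observed counts from the example table (cells B2:B3).
observed = {"January-June": 260, "July-December": 240}

# Total sample size, the equivalent of =SUM(B2:B3).
total = sum(observed.values())

# Assumption: every category is equally likely under the null hypothesis,
# so each expected count is the total divided by the number of categories.
expected = {category: total / len(observed) for category in observed}

print(total)     # 500
print(expected)  # {'January-June': 250.0, 'July-December': 250.0}
```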

3. Calculate the p-value

Designate a cell to contain the p-value. In this cell, type "=CHISQ.TEST(" to begin the formula. Enter the observed data as the first argument, type a comma and enter the expected data as the second argument. Add a closing parenthesis and press "Enter" on your keyboard. In the example above, the formula would be "=CHISQ.TEST(B2:B3, C2:C3)". Pressing "Enter" yields a result of 0.3710933695.
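To see roughly what happens behind the formula, the sketch below reproduces the example's p-value in Python using only the standard library. It relies on the fact that two categories give one degree of freedom, and for that case the chi-square tail probability reduces to the complementary error function. This is an illustration of the math, not a description of how Excel implements CHISQ.TEST internally.

```python
import math

observed = [260, 240]      # cells B2:B3
expected = [250.0, 250.0]  # cells C2:C3

# Chi-square statistic: sum of (observed - expected)^2 / expected.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With two categories there is one degree of freedom, and for that case
# the upper-tail probability has the closed form erfc(sqrt(statistic / 2)).
p_value = math.erfc(math.sqrt(statistic / 2))

print(round(statistic, 4))  # 0.8
print(round(p_value, 4))    # 0.3711
```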

4. Compare the p-value to alpha

Alpha is the significance level, which is the probability you're willing to accept of rejecting the null hypothesis when it's actually true. In the example above, imagine the teacher designates the standard alpha value of 0.05. You can compare your p-value to alpha to determine whether to reject the null hypothesis. Because the p-value of 0.3710933695 is greater than 0.05, the teacher doesn't reject the null hypothesis, which suggests that the observed data is consistent with the expected data.
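The comparison itself is a single rule, sketched below in Python. The function name and the wording of the returned messages are illustrative assumptions, and 0.05 is only the default because the example uses it.

```python
def compare_to_alpha(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision for a given p-value and significance level."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# The p-value from the birthday example.
print(compare_to_alpha(0.3710933695))  # fail to reject the null hypothesis
```

In a spreadsheet, an IF formula that checks whether the p-value cell is less than 0.05 can play the same role.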


How to do a chi-square test in Excel when comparing multiple categories

Here's how to do a chi-square test in Excel when comparing multiple categories:

1. Set up an "Observed data" table

Open your spreadsheet and create rows for all the values of your first variable. Then, create columns for all the values of your second variable. Add a final row and column to calculate the totals. Title this table "Observed data." In each "Total" cell, add a formula that calculates the sum of all the values in its respective row or column. Note that the final "Total" cell in the bottom-right corner of the table calculates the total sample size.

A student who wants to analyze whether the gender ratio of a high school's enrollment differs across grade levels might create the following "Observed data" table after collecting survey results:



     A           B      C        D
1                Male   Female   Total
2    Freshman    101    99       200
3    Sophomore   127    123      250
4    Junior      110    115      225
5    Senior      114    111      225
6    Total       452    448      900

2. Create an "Expected data" table

Select the titles and data you entered for the "Observed data" table. Right-click and select "Copy." Select an empty cell in the spreadsheet that has enough space next to and below it. Right-click, click "Paste special" and select "Values." Label this table "Expected data." Highlight all values other than the totals and press "Delete" on your keyboard.


3. Calculate the expected values

In the "Expected data" table, calculate the expected value of each empty cell. Select the first empty cell and enter a formula that multiplies the cell's row total by its column total and divides the product by the total sample size. Using the cell references from the "Observed data" table above, the formula for freshman males would be "=(D2*B6)/D6", which yields approximately 100.44. Repeat this process for all the other empty cells, adjusting the row and column totals each time.
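As a cross-check on the spreadsheet, the following Python sketch applies the same rule (row total times column total, divided by the sample size) to the example's observed counts. The dictionary layout and variable names are assumptions made for illustration.

```python
# Observed counts from the "Observed data" table, without the totals.
observed = {
    "Freshman": {"Male": 101, "Female": 99},
    "Sophomore": {"Male": 127, "Female": 123},
    "Junior": {"Male": 110, "Female": 115},
    "Senior": {"Male": 114, "Female": 111},
}

# Row totals (column D), column totals (row 6) and the sample size (D6).
row_totals = {grade: sum(counts.values()) for grade, counts in observed.items()}
col_totals = {}
for counts in observed.values():
    for gender, count in counts.items():
        col_totals[gender] = col_totals.get(gender, 0) + count
grand_total = sum(row_totals.values())

# Expected count for each cell: row total * column total / sample size.
expected = {
    grade: {
        gender: row_totals[grade] * col_totals[gender] / grand_total
        for gender in col_totals
    }
    for grade in observed
}

for grade, counts in expected.items():
    print(grade, {gender: round(value, 2) for gender, value in counts.items()})
```

Each row of the expected table adds up to the same total as the matching row of the observed table, which is a quick way to spot a mistyped reference.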

4. Calculate the p-value

Select a cell outside of the "Observed data" and "Expected data" tables. Type "=CHISQ.TEST(" to begin the formula. Select the observed values in the "Observed data" table, type a comma and select the expected values in the "Expected data" table. Be sure to exclude the totals and headings from both selections. Add a closing parenthesis and press "Enter" on your keyboard.
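If you'd like to verify the result independently, here is a hedged Python sketch of the same test using only the standard library. The helper chi_square_sf is a hypothetical name, not an Excel or library function. It uses closed-form expressions for the chi-square tail probability that apply to whole-number degrees of freedom, and it takes the degrees of freedom as (rows - 1) × (columns - 1), which is the usual convention for a two-way table.

```python
import math


def chi_square_sf(statistic: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution (integer df >= 1)."""
    half = statistic / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^k / k! for k = 0 .. df/2 - 1.
        return math.exp(-half) * sum(
            half ** k / math.factorial(k) for k in range(df // 2)
        )
    # Odd df: erfc(sqrt(x/2)) plus a finite sum of half-integer power terms.
    tail = math.erfc(math.sqrt(half))
    tail += math.exp(-half) * sum(
        half ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (df - 1) // 2 + 1)
    )
    return tail


# Observed counts: one row per grade level, columns are Male and Female.
observed = [[101, 99], [127, 123], [110, 115], [114, 111]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts: row total * column total / sample size.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic summed over every cell of the table.
statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi_square_sf(statistic, df)

print(f"statistic={statistic:.4f}, df={df}, p-value={p_value:.4f}")
```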

5. Compare the p-value to alpha

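Finally, you can compare the p-value to the alpha value to determine whether to reject the null hypothesis, which here states that gender and grade level are independent. With the observed and expected values above, the chi-square statistic is about 0.22 with three degrees of freedom, which gives a p-value of approximately 0.97. Because this is greater than the standard alpha value of 0.05, you don't reject the null hypothesis. The small differences between the observed and expected values are consistent with chance, so the data provides no evidence that the gender distribution differs across grade levels.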

