Shoe Size vs. Height
Statistical Analysis and Linear Regression
with FATHOM

Jay Yohe
Susquehanna Township High School
Grades 10-12
Algebra II

Prerequisite skills:

    Algebra I (basic skills in graphing linear functions)
    Basic knowledge of statistics (mean, median, mode, range, standard deviation, histograms and box plots)

Objectives:
 

Students will analyze 1-variable distributions (utilizing histograms and box plots) and determine measures of center and spread of the data (utilizing mean, median, range, standard deviation, etc.).
Students will enter and analyze data points in a scatter plot using Fathom.
Students will use the movable line option in Fathom to estimate the line of best fit.
Students will determine "r" squared, the coefficient of determination (the square of the correlation coefficient "r"), for the line of best fit.
Students will observe the least squares regression line of best fit and obtain the equation for the line of best fit.

PA Math Standards:
 

Academic Standard: Description

2.5.11 Mathematical Problem Solving and Communication (B) Use symbols, mathematical terminology, standard notation, mathematical rules, graphing and other types of mathematical representations to communicate observations, predictions, concepts, procedures, generalizations, ideas and results.
2.6.11 Statistics and Data Analysis (B) Use appropriate technology to organize and analyze data taken from the local community.
2.6.11 Statistics and Data Analysis (C) Determine the regression equation of best fit (e.g., linear, quadratic, exponential)
2.6.11 Statistics and Data Analysis (D) Make predictions using interpolation, extrapolation, regression and estimation using technology to verify them.
2.6.11 Statistics and Data Analysis (E) Determine the validity of the sampling method described in a given study.
2.6.11 Statistics and Data Analysis (H) Use sampling techniques to draw inferences about large populations.
2.6.11 Statistics and Data Analysis (I) Describe the normal curve and use its properties to answer questions about sets of data that are assumed to be normally distributed.
2.7.11 Probability and Predictions (C) Draw and justify a conclusion regarding the validity of a probability or statistical argument.

Materials and Resources:
 

Surveys (completed by a sample population)
Computers with FATHOM (or at least one computer display station with FATHOM installed)
Supplemental lessons on exploring data by hand, exploring regressions via the TI-83 calculator and exploring regressions via EXCEL
Supplemental Exercises from Advanced Algebra by Holt, Rinehart and Winston ©1977

Instruction Mode:

    *Discovery learning in cooperative groups
    *Individual graded explorations for closure activity

Anticipatory Set:

Instructions/Procedures:

(A) Entering Data into FATHOM
The sample data for the instructions presented in this lesson (available as a downloadable EXCEL data set) was collected at the Eisenhower Institute at Shippensburg University in July 2001.  Of the approximately 85 mathematics teachers attending the institute, 67 returned completed surveys.

In FATHOM, drag down a new collection by pulling the "open box" icon off the menu bar to a blank area of the document as illustrated:
[Figure: Collection]

Highlight and copy the data from the attached EXCEL spreadsheet (enter male data and female data into separate collection boxes).  Right click on the collection and select "paste".  With the collection still highlighted, choose "Case Table" from the "Insert" menu.  The data should now appear in the FATHOM case table (double click on Collection 1 to rename it).
[Figure: Case Table]

Alternatively, highlight the collection and choose "Case Table" from the "Insert" menu to bring down a table.  Create the headings by clicking on <new> and entering the category names.  You can type the data into these new columns manually or copy and paste it into the columns from another source.  Observe the collected data below after it was placed into FATHOM.

[Figures: Male Data; Female Data Part 1; Female Data Part 2; Female Data Part 3]



(B) Analyze 1-Variable Statistics  (use Male data for class discussion and assign Female data for lab work)
 



Enrichment (Normal Curve)

  1. You can form the "Normal Curve" distribution over your original Histogram.  Since the total area under a normal curve must equal 1, first rescale the Histogram.
  2. Rescale the Histogram by choosing "Scale" from the "Graph" menu and choosing "Density":

     [Figure: Density]

  3. Now form the Normal Distribution curve by highlighting the Histogram and right clicking.  Choose "Plot Function".
  4. In the dialog box click "Functions", "Distributions", "Normal" and double click on "normalDensity".
  5. Since you must know the mean and standard deviation to form the Normal Curve, enter the following into the formula:

     [Figure: Normal Formula]

  6. You should see the normal curve drawn over the Histogram.
  7. Apply the empirical rule to your data set to see if it appears to have a normal distribution (the sample for this exercise has too few men to be a good sample of a normal distribution).  What is the empirical rule?
  8. Do you think that the population graph of height for all men in the world forms a normal distribution?  Justify your response and/or provide other distributions that might be more accurate!
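
For classes that want to check the normal-curve work outside FATHOM, the following is a minimal Python sketch of the same idea. It assumes a short list of heights that was invented for illustration (it is not the Eisenhower Institute data), and the names normal_density and share_within are hypothetical helpers, not FATHOM functions. It evaluates the normal density from a mean and standard deviation and compares the share of data within 1, 2 and 3 standard deviations of the mean with the usual empirical-rule figures.

```python
import math
import statistics

# Hypothetical heights in inches, invented for illustration only.
# These are NOT the Eisenhower Institute survey values.
heights = [66, 67, 68, 68, 69, 69, 70, 70, 70, 71, 71, 72, 72, 73, 74, 75]

mean = statistics.mean(heights)
sd = statistics.stdev(heights)  # sample standard deviation


def normal_density(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def share_within(data, mu, sigma, k):
    """Fraction of the data lying within k standard deviations of the mean."""
    inside = [x for x in data if abs(x - mu) <= k * sigma]
    return len(inside) / len(data)


# Compare the observed shares with the empirical rule (about 68%, 95%, 99.7%).
for k, expected in [(1, 0.68), (2, 0.95), (3, 0.997)]:
    observed = share_within(heights, mean, sd, k)
    print(f"within {k} SD: observed {observed:.2f}, empirical rule about {expected}")

# Density at the mean, i.e. the peak of the curve drawn over a density histogram.
print(f"peak density: {normal_density(mean, mean, sd):.4f}")
```

With a sample this small the observed shares can differ noticeably from the empirical rule, which is a useful lead-in to the question about sample size above.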


(C) Analyze 1-Variable Statistics  (use Male data for class discussion and assign Female data for lab work)

Have students complete all activities from part (B) above using male shoe size.  Answer, for shoe size, the same questions that were posed in part (B).
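
As a cross-check on the 1-variable statistics FATHOM reports, the measures of center and spread can also be computed with a short Python sketch. The shoe sizes below are invented for illustration and are not the survey data; describe and five_number_summary are hypothetical helper names. Quartile conventions differ between tools, so the box plot values here may not match FATHOM, the TI-83 or EXCEL exactly.

```python
import statistics

# Hypothetical shoe sizes, invented for illustration only (not the survey data).
shoe_sizes = [8.5, 9, 9.5, 10, 10, 10.5, 10.5, 11, 11, 11.5, 12, 13]


def five_number_summary(data):
    """Minimum, Q1, median, Q3, maximum: the values a box plot displays."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)


def describe(data):
    """Measures of center and spread listed in the lesson objectives."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.multimode(data),  # list, since ties are possible
        "range": max(data) - min(data),
        "stdev": statistics.stdev(data),  # sample standard deviation
    }


for name, value in describe(shoe_sizes).items():
    print(f"{name}: {value}")
print("five-number summary:", five_number_summary(shoe_sizes))
```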



(D) Analyze 2-Variable Statistics  (use Male data for class discussion and assign Female data for lab work)

  1. Now drag a new graph onto a blank portion of your FATHOM screen.
  2. Drag Height from the Case Table to the x-axis of the graph and drag Shoe Size from the Case Table to the y-axis of the graph.  This will form a scatter plot:

     [Figure: Scatter Plot]
     Does it appear that the data points form a line?
     What type of correlation best describes this data:  positive, negative or zero?
     Is the linear relationship among the points weak, moderate, strong or non-existent?

  3. With the graph highlighted, select "Movable Line" from the "Graph" menu.  Slide the line up and down by dragging the middle.  Change the slope of the line by grabbing the ends.  Drag this movable line to estimate the line of best fit!

     [Figure: Movable Line]

  4. Choose "Show Squares" from the "Graph" menu.  The side of each square is the vertical distance from an actual y value to the y value given by your movable line.  Your goal is to place the line so that the sum of the areas of these squares is as small as possible.  Observe:

     [Figure: Squares]

  5. Now choose "Least Squares Line" from the "Graph" menu to see how well you did with the movable line:

     [Figure: Scatter Movable]

  6. This is a perfect opportunity to demonstrate the meaning of the least sum of squares.  Compare the sum for the movable line you created with the sum for the least squares line generated by FATHOM.  The line that fits the data best minimizes the sum of the squared differences between the actual values and the values on the regression line.
  7. You can also observe the residual plot, which shows the difference between each actual y value and the y value generated by the regression line.  First select "Movable Line" from the "Graph" menu again to turn it off.  Then, from the "Graph" menu, choose "Make Residual Plot":

     [Figure: Residual Line]

  8. Drag one of the points on the graph.  What does this do to the residuals?  What does this do to the "Least Squares Line"?
  9. Based on the regression line, what is the predicted shoe size for a man 6'3" tall?
  10. Based on the regression line, what is the predicted height for a man with a shoe size of 12?
  11. What does the slope of the regression line really mean?
  12. What does the y-intercept of the regression line really mean?
  13. What human traits might have a better correlation than shoe size vs height?  How would you test your idea?  What population do you wish to include?  If you can't measure the entire population, how would you determine an appropriate sample to measure these traits?
  14. Write a plan (include the sample/population) to measure the correlation between two human traits other than shoe size vs height.
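
The steps in part (D) can be sketched in Python for classes that want to verify the results by computation. Everything here is an assumption made for illustration: the (height, shoe size) pairs are invented and are not the Eisenhower Institute data, the "guess" line stands in for a hand-placed movable line, and the function names are hypothetical. The sketch computes the least squares line from the standard formulas, compares its sum of squares with the guess, lists the residuals, and reports "r" squared.

```python
# Hypothetical (height in inches, shoe size) pairs, invented for illustration.
heights = [66, 67, 68, 69, 70, 71, 72, 73, 74, 75]
shoe_sizes = [8.5, 9, 9.5, 10, 10, 11, 11, 11.5, 12.5, 13]


def least_squares(xs, ys):
    """Slope and intercept of the line minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def sum_of_squares(xs, ys, slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in y accounted for by the line."""
    y_bar = sum(ys) / len(ys)
    total = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sum_of_squares(xs, ys, slope, intercept) / total


slope, intercept = least_squares(heights, shoe_sizes)
residuals = [y - (slope * x + intercept) for x, y in zip(heights, shoe_sizes)]

# A hand-placed "movable line" guess (hypothetical values).
guess_slope, guess_intercept = 0.5, -24.5

print(f"least squares line: size = {slope:.3f} * height + {intercept:.3f}")
print("guess sum of squares:", sum_of_squares(heights, shoe_sizes, guess_slope, guess_intercept))
print("least squares sum of squares:", sum_of_squares(heights, shoe_sizes, slope, intercept))
print("r squared:", r_squared(heights, shoe_sizes, slope, intercept))

# Predictions: 6'3" is 75 inches; the height for size 12 inverts the same line.
predicted_size = slope * 75 + intercept
predicted_height = (12 - intercept) / slope
print(f"predicted size at 75 in: {predicted_size:.1f}")
print(f"height giving size 12 on this line: {predicted_height:.1f}")
```

Note that solving the shoe-size line backward for height is only one way to answer the height question. Regressing height on shoe size (swapping the axes) generally gives a different line, which is worth raising in class discussion.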


(E) Final Exploration (Optional)

Combine all the data for males and females and make a histogram of height (a sample of the combined table is shown below):
[Figures: Combined Table Sample; Eisenhower Histogram (total Height)]
What kind of distribution occurs?  Also create a summary table of the results (select "Summary Table" from the "Insert" menu and drag the appropriate data to the down arrow and the horizontal arrow):
[Figure: Summary Table]
In the sample data above, females clearly make up the majority of the institute.  Do you think the graphs would have the same traits if the samples of females and males were each of sufficient size (>30) and were identical in size?  Explain.  What do the combined heights of your data show?  Explain!  Do you find any interesting traits regarding mean vs median for all heights of males and females?
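
The summary table comparison of mean vs median can also be sketched in Python. The records below are invented for illustration (they are not the institute data), and summarize_by_group is a hypothetical helper. The sketch groups heights by sex, adds a combined "All" group, and reports the count, mean and median for each, so that students can see how unequal group sizes pull the combined mean and median apart.

```python
import statistics

# Hypothetical (sex, height in inches) records, invented for illustration only.
records = [
    ("F", 62), ("F", 63), ("F", 64), ("F", 64), ("F", 65), ("F", 66), ("F", 67),
    ("M", 68), ("M", 70), ("M", 71), ("M", 73),
]


def summarize_by_group(rows):
    """Count, mean and median of the values for each label and for all rows."""
    groups = {}
    for label, value in rows:
        groups.setdefault(label, []).append(value)
    groups["All"] = [value for _, value in rows]  # combined group
    return {
        label: {
            "n": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for label, values in groups.items()
    }


for label, stats in summarize_by_group(records).items():
    print(f"{label}: n={stats['n']}, mean={stats['mean']:.2f}, median={stats['median']}")
```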



Assessment:

(1)  Have students work in groups to repeat the full analysis for the females in the sample population, answering for females all the same questions that were posed in parts (A) through (E) for males.  Students will work in small cooperative groups on the female data in the computer lab utilizing FATHOM.  If students don't have access to FATHOM, they will conduct most of the data analysis on TI-83 calculators or in EXCEL.

(2)  Students will submit a plan for analyzing another set of human traits for homework (refer to the questions at the end of part (D) above).  Students will receive homework points for submitting their proposal.  Students will share their ideas in class during a follow-up session.



References:

    *Eisenhower Institute presentations July 2001
    *Advanced Algebra by Holt, Rinehart and Winston ©1977 (page 28)
    *FATHOM (Key Curriculum Press)


FATHOM samples for all exercises and homework based on the Eisenhower sample data are available for download.  You need the FATHOM software package from Key Curriculum Press to view these documents.

  1. Male Lab 1-Variable Statistics
  2. Female Lab 1-Variable Statistics
  3. 2-Variable Scatter Plots and Linear Regression Equations for males and females
  4. Least Squares Determination and Residuals

A Shane/Greydon Production