Reading: Fitting Linear Models to Data
In the real world, rarely do things follow trends perfectly. When we expect the trend to behave linearly, or when inspection suggests the trend is behaving linearly, it is often desirable to find an equation to approximate the data. Finding an equation to approximate the data helps us understand the behavior of the data and allows us to use the linear model to make predictions about the data, inside and outside of the data range.
Example 1
The table below shows the number of cricket chirps in 15 seconds and the air temperature in degrees Fahrenheit [1]. Plot this data, and determine whether the data appears to be linearly related.
Chirps (in 15 seconds) | 44 | 35 | 20.4 | 33 | 31 | 35 | 18.5 | 37 | 26 |
Temperature (°F) | 80.5 | 70.5 | 57 | 66 | 68 | 72 | 52 | 73.5 | 53 |
Flashback
- What descriptive variables would you choose to represent Temperature & Chirps?
- Which variable is the independent variable and which is the dependent variable?
- Based on this data and the graph, what is a reasonable domain & range?
- Based on the data alone, is this function one-to-one? Explain.
Example 2
Using the table of values from the previous example, find a linear function that fits the data by "eyeballing" a line that seems to fit.
Interpolation and Extrapolation
Interpolation: When we predict a value inside the domain and range of the data.
Extrapolation: When we predict a value outside the domain and range of the data.
For the temperature as a function of chirps in our hand-drawn model above:
- Interpolation would occur if we used our model to predict temperature when the values for chirps are between 18.5 and 44.
- Extrapolation would occur if we used our model to predict temperature when the values for chirps are less than 18.5 or greater than 44.
Example 3
- Would predicting the temperature when crickets are chirping 30 times in 15 seconds be interpolation or extrapolation? Make the prediction, and discuss if it is reasonable.
- Would predicting the number of chirps crickets will make at 40 degrees be interpolation or extrapolation? Make the prediction, and discuss if it is reasonable.
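The two questions above can be explored with a short sketch. The hand-drawn model is not reproduced here, so the code assumes a hypothetical eyeballed line, T = 30 + 1.2c, chosen only because it agrees with the Try it Now answer of 54 degrees at 20 chirps. The function names are illustrative, and a different eyeballed line would give somewhat different predictions.

```python
# Cricket data from Example 1 (chirps counted in 15 seconds, temperature in °F).
CHIRPS = [44, 35, 20.4, 33, 31, 35, 18.5, 37, 26]
TEMPS = [80.5, 70.5, 57, 66, 68, 72, 52, 73.5, 53]

# ASSUMPTION: a hypothetical eyeballed line T = 30 + 1.2c, not the text's drawn model.
INTERCEPT = 30.0
SLOPE = 1.2


def temp_from_chirps(chirps):
    """Predict temperature (°F) from chirps in 15 seconds using the assumed line."""
    return INTERCEPT + SLOPE * chirps


def chirps_from_temp(temp):
    """Solve the assumed line for chirps, given a temperature (°F)."""
    return (temp - INTERCEPT) / SLOPE


def kind_of_prediction(value, data):
    """Label a prediction by whether the input lies inside the observed data."""
    if min(data) <= value <= max(data):
        return "interpolation"
    return "extrapolation"


# 30 chirps lies between 18.5 and 44, the observed range of chirps.
print(kind_of_prediction(30, CHIRPS), round(temp_from_chirps(30), 1))

# 40 degrees lies below 52, the lowest observed temperature.
print(kind_of_prediction(40, TEMPS), round(chirps_from_temp(40), 1))
```

Under this assumed line, the first prediction falls inside the observed data and is easy to compare with nearby measurements. The second lies well below every recorded temperature, so there is no data to confirm that the linear trend continues that far.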
Try it Now 1
What temperature would you predict if you counted 20 chirps in 15 seconds?
Fitting Lines with Technology
While eyeballing a line works reasonably well, there are statistical techniques for fitting a line to data that minimize the differences between the line and the data values. (Technically, the method minimizes the sum of the squared vertical differences between the line and the data values.) This technique is called least-squares regression, and it can be computed by many graphing calculators, spreadsheet software like Excel or Google Docs, statistical software, and many web-based calculators [2].
Example 4
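As a minimal sketch of the least-squares idea described above, the code below fits a line to the cricket data from Example 1 using the standard closed-form formulas for the slope and intercept. The function name `least_squares_line` is illustrative, and the sketch is not meant to reproduce the output of any particular calculator or spreadsheet.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared
    vertical differences between the line and the points (xs, ys)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations in x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Cricket data from Example 1.
chirps = [44, 35, 20.4, 33, 31, 35, 18.5, 37, 26]
temps = [80.5, 70.5, 57, 66, 68, 72, 52, 73.5, 53]

slope, intercept = least_squares_line(chirps, temps)
print(f"T = {intercept:.3f} + {slope:.3f} * c")
```

For this data the fitted line is roughly T = 30.28 + 1.14c, where c is the number of chirps in 15 seconds.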
Correlation Coefficient
The correlation coefficient is a value, r, between –1 and 1.
- r > 0 suggests a positive (increasing) relationship
- r < 0 suggests a negative (decreasing) relationship
- The closer the value is to 0, the more scattered the data
- The closer the value is to 1 or –1, the less scattered the data
The correlation coefficient provides an easy way to get some idea of how close to a line the data falls. We should only compute the correlation coefficient for data that follows a linear pattern; if the data exhibits a non-linear pattern, the correlation coefficient is meaningless. To get a sense for the relationship between the value of r and the graph of the data, here are some large data sets with their correlation coefficients:
[Figure: Examples of Correlation Coefficient Values]
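To make the meaning of r more concrete, the sketch below computes it from the usual definition: the sum of products of deviations divided by the square root of the product of the summed squared deviations. The function name and the small point sets are made up purely for illustration and are not data from this section.

```python
import math


def correlation_coefficient(xs, ys):
    """Return the correlation coefficient r for paired data (xs, ys)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative points (not from the text).
xs = [1, 2, 3, 4]
print(correlation_coefficient(xs, [2, 4, 6, 8]))  # points exactly on a rising line
print(correlation_coefficient(xs, [8, 6, 4, 2]))  # points exactly on a falling line
print(correlation_coefficient(xs, [3, 1, 4, 2]))  # scattered points
```

Points lying exactly on a rising or falling line give r at the extremes of 1 and –1, while the scattered set gives a value much closer to 0. The same function can be applied to any paired data, such as the cricket table from Example 1.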
Example 5
Calculate the correlation coefficient for our cricket data.
Because the data appears to follow a linear pattern, we can use technology to calculate r = 0.9509. Since this value is very close to 1, it suggests a strong increasing linear relationship.
Example 6
Gasoline consumption in the US has been increasing steadily. Consumption data from 1994 to 2004 is shown below [3]. Determine if the trend is linear, and if so, find a model for the data. Use the model to predict the consumption in 2008.
Year | '94 | '95 | '96 | '97 | '98 | '99 | '00 | '01 | '02 | '03 | '04 |
Consumption (billions of gallons) | 113 | 116 | 118 | 119 | 123 | 125 | 126 | 128 | 131 | 133 | 136 |
Try it Now 2
Use the model created by technology in Example 6 to predict the gas consumption in 2011. Is this an interpolation or an extrapolation?
Important Topics of this Section
- Fitting linear models to data by hand
- Fitting linear models to data using technology
- Interpolation
- Extrapolation
- Correlation coefficient
Flashback Answers
- T = Temperature, C = Chirps (answers may vary)
- Independent (Chirps), Dependent (Temperature)
- Reasonable Domain (18.5, 44), Reasonable Range (52, 80.5) (answers may vary)
- No, it is not one-to-one; there are two different output values for 35 chirps.
Try it Now Answers
- 54 degrees Fahrenheit
- 150.871 billion gallons, extrapolation
Works Cited
1. Selected data from http://classic.globe.gov/fsl/scientistsblog/2007/10/. Retrieved Aug 3, 2010.
2. For example, http://www.shodor.org/unchem/math/lls/leastsq.html
3. http://www.bts.gov/publications/national_transport...
Licenses & Attributions
CC licensed content, Shared previously
- Chapter 2: Functions. Authored by: David Lippman and Melonie Rasmussen. Located at: http://www.opentextbookstore.com/precalc/1.4/Chapter%202.doc. License: CC BY: Attribution.
- Imagecreator. Provided by: Wikimedia Commons. Located at: https://commons.wikimedia.org/wiki/File:Correlation_examples.png. License: Public Domain: No Known Copyright.