Susan E Johnston
Postdoc in Ecological Genetics, University of Turku, Finland
A quick and easy function to plot lm() results with ggplot2 in R
Sometimes it's nice to quickly visualise the data that went into a simple linear regression, especially when you are performing lots of tests at once. Here is a quick and dirty solution with ggplot2 to create the following plot:Let's try it out using the
iris
dataset in R:data(iris)
head(iris)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
Create fit1
, a linear regression of Sepal.Length and Petal.Width. Normally we would quickly plot the data in R base graphics:fit1 <- lm(Sepal.Length ~ Petal.Width, data = iris)
summary(fit1)
##
## Call:
## lm(formula = Sepal.Length ~ Petal.Width, data = iris)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.3882 -0.2936 -0.0439 0.2643 1.3452
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.7776 0.0729 65.5 <2e-16 ***
## Petal.Width 0.8886 0.0514 17.3 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.478 on 148 degrees of freedom
## Multiple R-squared: 0.669, Adjusted R-squared: 0.667
## F-statistic: 299 on 1 and 148 DF, p-value: <2e-16
##
plot(Sepal.Length ~ Petal.Width, data = iris)
abline(fit1)
However, we can create a quick function that will pull the data out of a linear regression, and return important values (R-squares, slope, intercept and P value) at the top of a nice ggplot graph with the regression line:
ggplotRegression <- function (fit) {
require(ggplot2)
ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) +
geom_point() +
stat_smooth(method = "lm", col = "red") +
opts(title = paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 5),
"; Intercept =",signif(fit$coef[[1]],5 ),
"; Slope =",signif(fit$coef[[2]], 5),
"; P =",signif(summary(fit)$coef[2,4], 5)))
}
After specifying this function, all you would have to run is:fit1 <- lm(Sepal.Length ~ Petal.Width, data = iris)
ggplotRegression(fit1)
or even justggplotRegression(lm(Sepal.Length ~ Petal.Width, data = iris))
Disclaimer: This is not suitable for non-normal distributions or categorical variables!
No comments:
Post a Comment