Create function to automatically create plots from summary(fit <- lm( y ~ x1 + x2 +... xn))

Developer (https://www.devze.com), 2023-04-12 10:47. Source: Internet

I am running the same regression several times with small alterations to the x variables. After determining the fit and the significance of each variable for this linear regression model, I want to view all the major plots. Instead of creating each plot one by one, I want a function that loops through my variables (x1...xn) from the following model.

fit <- lm(y ~ x1 + x2 + ... + xn)

The plots I want to create for each x in the model above are: 1) x versus y, 2) x versus predicted y, 3) x versus residuals, 4) x versus time, where time is not a variable used in the regression but is provided in the data frame the data comes from.

I know how to access the coefficients from fit. However, I am not able to take the coefficient names from the summary and reuse them in a function for creating the plots, because the names are character strings.

I hope my question has been clearly described and hasn't been asked already.

Thanks!

Create some mock data

dat <- data.frame(x1=rnorm(100), x2=rnorm(100,4,5), x3=rnorm(100,8,27), 
  x4=rnorm(100,-6,0.1), t=(1:100)+runif(100,-2,2))
dat <- transform(dat, y=x1+4*x2+3.6*x3+4.7*x4+rnorm(100,3,50))

Make the fit

fit <- lm(y~x1+x2+x3+x4, data=dat)

Compute the predicted values

dat$yhat <- predict(fit)

Compute the residuals

dat$resid <- residuals(fit)

Get a vector of the variable names

vars <- names(coef(fit))[-1]
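
Note that names(coef(fit))[-1] matches the data frame columns only when every term is a plain numeric variable. The following is a minimal sketch, assuming a model with a factor predictor, of recovering the underlying column names with all.vars() instead. The data frame d, its columns, and fit_g are made up for illustration and are not part of the original answer.

```r
# Hypothetical data with a factor predictor, for illustration only
set.seed(1)
d <- data.frame(x1 = rnorm(20), g = factor(rep(c("a", "b"), 10)))
d$y <- d$x1 + rnorm(20)
fit_g <- lm(y ~ x1 + g, data = d)

# Coefficient names include the factor level ("gb"), which is not a column of d
coef_names <- names(coef(fit_g))[-1]

# all.vars() lists the variables used in the formula; the first is the response
pred_vars <- all.vars(formula(fit_g))[-1]
```

Here coef_names contains "gb" while pred_vars contains "g", so only the latter can be used to look up columns in the data frame.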

A plot can be made from this character representation of the name if you use it to build a string version of a formula and convert that with as.formula(). The four plots are below, wrapped in a loop over all the vars. The loop is surrounded by setting ask to TRUE so that you get a chance to see each plot. Alternatively, you can arrange multiple plots on the screen or write them all to files to review later.

opar <- par(ask=TRUE)
for (v in vars) {
  plot(as.formula(paste("y~",v)), data=dat)
  plot(as.formula(paste("yhat~",v)), data=dat)
  plot(as.formula(paste("resid~",v)), data=dat)
  plot(as.formula(paste("t~",v)), data=dat)
}
par(opar)
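
Since the question asks for a function, the loop can be wrapped into one. This is only a sketch: the name plot_fit_vars and its arguments are hypothetical, it assumes every term in the model is a plain column of the data frame (no factors, interactions, or transformations), and it indexes columns with [[ ]] rather than building formulas. When a file name is given, the plots go to a PDF with four panels per page, one page per variable.

```r
# Hypothetical helper: name and arguments are illustrative, not from the original answer
plot_fit_vars <- function(fit, data, time_var, file = NULL) {
  all_v <- all.vars(formula(fit))
  resp  <- all_v[1]                       # response name, e.g. "y"
  vars  <- all_v[-1]                      # predictor names as character strings
  yhat  <- predict(fit, newdata = data)   # predicted values
  res   <- data[[resp]] - yhat            # residuals

  if (!is.null(file)) {
    pdf(file)                             # write all pages to one PDF
    on.exit(dev.off(), add = TRUE)
    par(mfrow = c(2, 2))
  } else {
    opar <- par(mfrow = c(2, 2), ask = TRUE)
    on.exit(par(opar), add = TRUE)
  }

  for (v in vars) {
    x <- data[[v]]                        # look up the column by its name
    plot(x, data[[resp]],     xlab = v, ylab = resp)
    plot(x, yhat,             xlab = v, ylab = "predicted")
    plot(x, res,              xlab = v, ylab = "residual")
    plot(x, data[[time_var]], xlab = v, ylab = time_var)
  }
  invisible(vars)
}

# Example call on small mock data, writing the plots to a temporary PDF
set.seed(1)
demo <- data.frame(x1 = rnorm(50), x2 = rnorm(50), t = 1:50)
demo$y <- demo$x1 + 2 * demo$x2 + rnorm(50)
out <- tempfile(fileext = ".pdf")
plotted <- plot_fit_vars(lm(y ~ x1 + x2, data = demo), demo,
                         time_var = "t", file = out)
```

Leaving file as NULL draws on the current device instead and pauses between pages in an interactive session.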

The coefficients are stored in the fit objects as you say, but you can access them generically in a function by referring to them this way:

x <- 1:10
y <- x*3 + rnorm(length(x)) # one noise value per observation
plot(x,y)

fit <- lm(y~x)
fit$coefficients[1] # intercept
fit$coefficients[2] # slope
str(fit) # a lot of info, but you can see how the fit is stored

My guess is that when you say you know how to access the coefficients, you are getting them from summary(fit), which is harder to work with than taking them directly from the fit. By using fit$coefficients[1] etc. you don't need the name of the variable in your function.

Three options to directly answer what I think was the question: How to access the coefficients using character arguments:

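x <- 1:10
y <- x*3 + rnorm(length(x)) # one noise value per observation
fit <- lm(y~x)
# 1: index by a literal name
fit$coefficients["x"]
# 2: index by a name stored in a character variable
coefname <- "x"
fit$coefficients[coefname]
# 3: same, using the coef() extractor
coef(fit)[coefname]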
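
If the values are wanted from summary(fit) rather than from the fit itself, the coefficient table returned by coef(summary(fit)) is a matrix whose rows are named after the coefficients, so it can be indexed by character as well. A small sketch with made-up data (the names xs, ys, fit_s, tab, and nm are illustrative):

```r
set.seed(1)
xs <- 1:10
ys <- xs * 3 + rnorm(10)
fit_s <- lm(ys ~ xs)

tab <- coef(summary(fit_s))   # matrix with one row per coefficient
nm  <- "xs"                   # coefficient name held as a character string
est <- tab[nm, "Estimate"]    # same value as coef(fit_s)[nm]
pv  <- tab[nm, "Pr(>|t|)"]    # p-value for that coefficient
```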

If the question was how to plot the various functions then you should supply a sufficiently complex construction (in R) to allow demonstration of methods with a well-specified set of objects.