Brief Introduction

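Since the most important variable is the student representatives' attendance rate (出席率), I looked for relationships between it and the other variables in the two datasets. I tried college, year, term, and various voting measures such as the invalid-ballot rate, the approval rate, and each college's share of the student population. Unfortunately, every linear regression model I fit had weak explanatory power (the two shown below both have an R-squared under 0.09), so I simply present a few of the (failed) models here.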

## 
## Call:
## lm(formula = 出席率 ~ college_vote_rate + college_vote_rate + 
##     competitive + college_population_rate, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.69859 -0.22968  0.02842  0.23313  0.52872 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             0.451053   0.098050   4.600 6.22e-06 ***
## college_vote_rate       1.542225   0.511472   3.015  0.00279 ** 
## competitive             0.092421   0.039837   2.320  0.02101 *  
## college_population_rate 0.004395   0.735543   0.006  0.99524    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3002 on 302 degrees of freedom
## Multiple R-squared:  0.0865, Adjusted R-squared:  0.07743 
## F-statistic: 9.532 on 3 and 302 DF,  p-value: 4.929e-06
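
As a reading aid, the coefficient table above can be written out as a fitted equation. This is only a restatement of the printed estimates, rounded to three decimals. The symbols are shorthand introduced here, not names from the analysis: $\hat{y}$ is the predicted 出席率, and $x_1$, $x_2$, $x_3$ stand for `college_vote_rate`, `competitive`, and `college_population_rate`. The second line checks the reported R-squared against the F-statistic, with $k = 3$ predictors and 302 residual degrees of freedom.

```latex
\[
\hat{y} = 0.451 + 1.542\,x_1 + 0.092\,x_2 + 0.004\,x_3
\]
\[
R^2 = \frac{kF}{kF + \mathrm{df}_{\mathrm{res}}}
    = \frac{3 \times 9.532}{3 \times 9.532 + 302}
    \approx 0.0865
\]
```

Read this way, the model accounts for under 9% of the variance in attendance, and the estimate for $x_3$ is statistically indistinguishable from zero (p-value 0.995).
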
## 
## Call:
## lm(formula = 出席率 ~ college_vote_rate, data = df_attnd_vote)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.66147 -0.22097  0.02742  0.22181  0.51441 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.46016    0.02995  15.366  < 2e-16 ***
## college_vote_rate  2.00690    0.42123   4.764 2.94e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3019 on 304 degrees of freedom
## Multiple R-squared:  0.06948, Adjusted R-squared:  0.06642 
## F-statistic:  22.7 on 1 and 304 DF,  p-value: 2.939e-06
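
The single-predictor output above can likewise be restated as a fitted line. This is a sketch built only from the printed numbers, with $\hat{y}$ denoting the predicted 出席率 and $x_1$ denoting `college_vote_rate` (both symbols are shorthand introduced here). With one predictor, the F-statistic equals the squared t value of the slope, which gives a quick consistency check on the table.

```latex
\[
\hat{y} = 0.460 + 2.007\,x_1
\]
\[
F = t^2 = 4.764^2 \approx 22.70,
\qquad
R^2 = \frac{F}{F + \mathrm{df}_{\mathrm{res}}}
    = \frac{22.7}{22.7 + 304}
    \approx 0.0695
\]
```

Assuming both variables are proportions on a 0 to 1 scale (the residual range suggests this for attendance, but the output does not state it), a 0.10 higher college vote rate corresponds to roughly 0.20 higher predicted attendance. The slope is clearly non-zero, yet the model still explains only about 7% of the variance.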