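Since the most important variable is the attendance rate of student representatives (出席率), I looked for relationships between it and the other variables in the two datasets. I tried college, year, term, and various election figures such as the invalid-ballot rate, the approval rate, and each college's share of the student population. Unfortunately, every linear regression model had weak explanatory power, so below I simply show a few of the (failed) models.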
##
## Call:
## lm(formula = 出席率 ~ college_vote_rate + college_vote_rate +
## competitive + college_population_rate, data = data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.69859 -0.22968  0.02842  0.23313  0.52872
##
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)
## (Intercept)             0.451053   0.098050   4.600 6.22e-06 ***
## college_vote_rate       1.542225   0.511472   3.015  0.00279 **
## competitive             0.092421   0.039837   2.320  0.02101 *
## college_population_rate 0.004395   0.735543   0.006  0.99524
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3002 on 302 degrees of freedom
## Multiple R-squared: 0.0865, Adjusted R-squared: 0.07743
## F-statistic: 9.532 on 3 and 302 DF, p-value: 4.929e-06
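As a quick sanity check on how little this model explains, the adjusted R-squared can be recomputed from the reported figures. This is only a sketch: the symbols $n$ and $p$ are my own shorthand, and I am assuming $n = 306$ observations (302 residual degrees of freedom plus 4 estimated coefficients) and $p = 3$ predictors. The formula lists `college_vote_rate` twice, but R keeps only one copy of a duplicated term, which is why three slopes appear.

$$
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} = 1 - (1 - 0.0865)\cdot\frac{305}{302} \approx 0.0774
$$

In other words, the three predictors together account for less than 9% of the variance in 出席率.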
##
## Call:
## lm(formula = 出席率 ~ college_vote_rate, data = df_attnd_vote)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.66147 -0.22097  0.02742  0.22181  0.51441
##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)        0.46016    0.02995  15.366  < 2e-16 ***
## college_vote_rate  2.00690    0.42123   4.764 2.94e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3019 on 304 degrees of freedom
## Multiple R-squared: 0.06948, Adjusted R-squared: 0.06642
## F-statistic: 22.7 on 1 and 304 DF, p-value: 2.939e-06
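Read as an equation, the single-predictor fit above is roughly the following, where $y$ is 出席率 and $x$ is `college_vote_rate`. The symbols $y$ and $x$ are my own shorthand, and I am assuming both variables are proportions on a 0 to 1 scale, which the output itself does not state.

$$
\hat{y} \approx 0.460 + 2.007\,x
$$

Under that assumption, a one percentage point rise in the college vote rate ($\Delta x = 0.01$) goes with a rise of about 0.02 in predicted attendance. Even so, $R^2 \approx 0.069$ means the line explains only about 7% of the variance. The reported statistics are also internally consistent: with a single predictor, $F = t^2$, and $4.764^2 \approx 22.7$.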