Grade retention is increasingly viewed as an important alternative to social promotion, yet the evidence to date does not disentangle how the effect of grade retention varies by abilities and over time. The key challenge is differential selection of students into retention across grades and by abilities. Because existing quasi-experimental methods cannot address this question, we develop a new strategy that is a hybrid between a control function and a generalization of the fixed effects approach. Applying our method to nationally representative longitudinal data, we find evidence of dynamic selection into retention and find that the treatment effect of retention varies considerably across grades and across students' unobservable abilities. Our strategy can be applied more broadly to many settings with time-varying or multiple treatments.