Contributed equally to this work with: Daniel P. Maes, Julia Tucher Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing Affiliations Department of Mathematics and Statistics, Williams College, Williamstown, MA, United States of America, Department of Mathematics, University of Michigan, Ann Arbor, MI, United States of America
Contributed equally to this work with: Daniel P. Maes, Julia Tucher Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing * E-mail: jjt4@williams.edu Affiliation Department of Mathematics and Statistics, Williams College, Williamstown, MA, United States of America
Roles Formal analysis, Investigation, Methodology, Project administration, Resources, Validation, Visualization, Writing – original draft, Writing – review & editing Affiliations Department of Mathematics and Statistics, Williams College, Williamstown, MA, United States of America, Institute for the Quantitative Study of Inclusion, Diversity, and Equity, Williamstown, MA, United States of America
Black and Latinx students are underrepresented on most public university campuses. At the same time, affirmative action policies are controversial and legally fraught. The Supreme Court has ruled that affirmative action should help a minoritized group achieve a critical mass of representation. While the idea of critical mass is frequently invoked in law and in policy, the term remains ill-defined and hence difficult to operationalize. Motivated by these challenges, we build a mathematical model to forecast undergraduate student body racial/ethnic demographics on public university campuses. Our model takes the form of a Markov chain that tracks students through application, admission, matriculation, retention, and graduation. Using publicly available data, we calibrate our model for two different campuses within the University of California system, test it for accuracy, and make a 10-year prediction. We also propose a coarse definition of critical mass and use our model to assess progress towards it at the University of California-Berkeley. If no policy changes are made over the next decade, we predict that the Latinx population on campus will move towards critical mass but not achieve it, and that the Black student population will decrease, moving further below critical mass. Because affirmative action is banned in California and in nine other states, it is worthwhile to consider alternative policies for diversifying a campus, including targeted recruitment and retention efforts. Our modeling framework provides a setting in which to test the efficacy of affirmative action and of these alternative policies.
Citation: Maes DP, Tucher J, Topaz CM (2021) Affirmative action, critical mass, and a predictive model of undergraduate student body demographics. PLoS ONE 16(5): e0250266. https://doi.org/10.1371/journal.pone.0250266
Editor: Christopher M. Danforth, University of Vermont, UNITED STATES
Received: November 30, 2020; Accepted: April 1, 2021; Published: May 12, 2021
Copyright: © 2021 Maes et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data used to build our model are available from the National Center for Education Statistics (https://nces.ed.gov/programs/digest/d15/tables/dt15_219.30.asp), the Regents of the University of California Undergraduate Admissions Summary (https://www.universityofcalifornia.edu/infocenter/admissions-residency-and-ethnicity), and the Regents of the University of California Undergraduate Graduation Rates (https://www.universityofcalifornia.edu/infocenter/ug-outcomes).
Funding: We acknowledge support from Williams College through the senior thesis and science research assistant programs. Author DPM acknowledges support from the National Science Foundation Graduate Research Fellowship through grant DGE-1256260.
Competing interests: The authors have declared that no competing interests exist.
Affirmative action policies are those designed to increase representation of minoritized identity groups in spheres where they have faced historical exclusion or discrimination. The identity groups in question typically pertain to gender or race/ethnicity, and common spheres of exclusion include employment, housing, and education [1]. Affirmative action has been at the forefront of the American social consciousness in recent decades. Proponents see affirmative action as necessary to pave the way for equality, while opponents view it as reverse discrimination [2]. In the United States, the debate over affirmative action has been ongoing since the policy’s inception in 1961 [3].
Sixty years later, important questions remain, two of which motivate our work. First, how can we know the effect that an affirmative action policy will have? To date, affirmative action policy assessment overwhelmingly uses retrospective data analysis [4]. That is to say, the most common way to assess a policy has been to implement it and then gather data for several years. One then uses statistical tools to assess whether or not a demographic shift has occurred, and if it is attributable to affirmative action. Policymakers might tweak their policies, wait to see the effect, and repeat, resulting in a lengthy, iterative process.
Our second motivating question is: what does it mean for an affirmative action policy to be successful? As we will later explain, affirmative action should help a particular minoritized group achieve a critical mass of representation. While the idea of critical mass is now used in legal proceedings and in policy decisions, the term remains ill-defined. We will propose a coarse metric that could serve as a lower bound on critical mass. Our definition is specific enough to be operationalizable but flexible enough to allow institutional variation. It is crucial to keep in mind that the ultimate goals of affirmative action should be to increase equity and diversity. Critical mass is not an end unto itself, but nonetheless, it is a tool that campuses can use in assessing diversity. Our proposed definition is especially suited for a quantitative modeling framework.
Our work addresses the aforementioned questions by creating a tool for prospective rather than retrospective analysis. We build a forecasting model in the form of a mathematical object known as a Markov chain. We then validate the model and use it to predict racial/ethnic demographics on public campuses. An advantage of a predictive modeling approach is that it provides a test bed for policy experimentation in situations where, as mentioned before, the actual experiment would take years to assess. Our modeling framework allows us not only to predict the effect of various affirmative action admissions policies, but equally, of other policies that might diversify a student body and yet that are less legally contentious. For instance, an educational institution could ask a question such as “if we increase by a few percentage points the probability that Black students accept our admissions offers, what effect might we expect on the overall demographic make-up of the student body in coming years?” and use our model to make a prediction.
More specifically, our study aims to create a predictive model of racial/ethnic demographics in the student bodies of undergraduate programs at public colleges and universities. We use the University of California as our case study because of the large number of students they serve, because of their data transparency practices, and because, as we will explain, the state of California has played a key role in the history of affirmative action. Additionally, University of California schools do not practice race-conscious admissions due to legal limitations. Thus, we can assess the importance of recruitment for underrepresented groups throughout the admissions process in lieu of accounting for race/ethnicity in the admissions process. Still, our model and methods could be straightforwardly applied to other institutions, and to axes of diversity other than race/ethnicity.
The rest of this paper is organized as follows. In Legal History, we provide a review of selected United States Supreme Court cases and California laws that contribute to our understanding of affirmative action and critical mass. In Mathematical Model, we construct our Markov chain description of the pipeline for college application, admission, matriculation, retention, and graduation. We then calibrate the model using publicly available data from the University of California and validate it by using historical measurements. In Model Predictions, we use the model to predict the racial/ethnic makeup of the undergraduate student body in coming years at the University of California-Berkeley (UCB) and the University of California-Los Angeles (UCLA). Without policy changes, the campus demographics of minoritized groups at UCB will remain largely unchanged over the next ten years, with a slight increase in the proportion of Hispanic/Latinx students and a decrease in the proportion of Black students. For UCLA, we predict increases in the proportion of Hispanic/Latinx students and, to a lesser degree, Black students. Finally, in Critical Mass, we introduce our own definition of the term, and apply it to UCB campus data. If no policy changes are made over the next decade, we predict that the Latinx population on campus will move towards critical mass but not achieve it, and that the Black student population will decrease, moving further below critical mass.
In Regents of the University of California v. Bakke, 438 U.S. 265 (1978), the Supreme Court considered the case of Alan Bakke, a White man who was twice rejected from medical school at the University of California-Davis (UCD). At the time, UCD reserved 16 out of 100 of their admitted spots for qualified minorities. Bakke’s admissions ratings were higher than any of the minority students’ in both years of applying. Bakke claimed he was denied admissions unfairly, and solely based on his race. The court ruled 5-4 in favor of Regents that Title VI of the Civil Rights Act of 1964 does not prohibit race-based admissions programs, and that the Equal Protection Clause of the 14th amendment permits race to be a factor in admissions. However, they also ruled 5-4 that the Equal Protection Clause does prohibit the use of racial quotas of the type UCD used. Therefore, the court instructed the university to admit Bakke.
Another landmark case was Gratz v. Bollinger, 539 U.S. 244 (2003). When admitting undergraduate students, the University of Michigan (UM) considered grades, test scores, relationship with alumni, geography, leadership qualities, and more. UM scored each applicant on a 150 point scale, where 100 points guaranteed admission. They also considered race/ethnicity in their decisions, adding 20 points for students whom they considered to be from underrepresented minority groups. In 1995, White in-state students Jennifer Gratz and Patrick Hamacher were both rejected for admission. In 1997, they filed a class action lawsuit against the university on the grounds of racial discrimination. The court decided 6-3 in favor of the students that the university’s policy was not narrow enough to meet strict scrutiny. Strict scrutiny is a standard used to determine the constitutionality of a policy. To pass strict scrutiny, a policy must serve a compelling government interest and be narrowly tailored. The court found UM’s policy to fail strict scrutiny because it assumed that every applicant from a specific underrepresented minority group was from a similar background. This decision helped establish the use of strict scrutiny for evaluating affirmative action policies in public higher education.
Grutter v. Bollinger, 539 U.S. 306 (2003) also involved UM. Barbara Grutter, a White in-state student, applied to UM’s law school in 1997. Despite a high undergraduate grade point average and high admissions exam scores, she was denied admission. The law school had a stated policy of using race/ethnicity in their admissions decisions because having a critical mass of minority students is in the interest of the state. The court sided with UM, finding that their highly individualized application review meant that no acceptance was based on one sole factor, including race. The decision upheld the finding of Regents v. Bakke, namely, that there is a compelling interest in achieving a diverse student body. Pivotally, though, the court clarified for the first time that the goal of critical mass does not equate to a quota system. That is, a highly individualized use of race/ethnicity in admissions is constitutional.
The most recent landmark case for affirmative action in higher education is Fisher v. University of Texas, 579 U.S. __ (2016), commonly known as Fisher II. In Texas, a policy called the Top Ten Percent Plan guarantees admission to the University of Texas (UT) for any public high school student graduating in the top ten percent of their class. In-state applications that do not qualify under this policy are evaluated according to a holistic process which does include consideration of race/ethnicity. In 2008, White in-state student Abigail Fisher, who did not qualify for admission to UT under this policy, had her application rejected. Fisher sued the university, arguing that the use of race as an admissions consideration was a violation of the Equal Protection Clause. The court ruled 4-3 in favor of UT, finding that the university was allowed to use race/ethnicity as a factor in admission policies on condition that they would study how diversity was being achieved or maintained on campus. As in Grutter v. Bollinger, the university’s approach was upheld because it used an individualized assessment of each application, with race as one factor, and because there were no workable alternatives for achieving diversity, which had already been found to be a compelling interest.
Overall, the appropriate means of implementing affirmative action in undergraduate admissions remain ill-defined. From Regents v. Bakke, racial/ethnic quotas were deemed unconstitutional, and so the mechanisms of affirmative action needed to become broader in order for policies to persist. Eventually, the judiciary arrived at the term used in the most recent court cases, namely, critical mass. In Fisher II, the court stated that “critical mass is neither some absolute number of African-American or Hispanic students nor the percentage of African-Americans or Hispanics in the general population of the State…and [the] term remains undefined.” This lack of specificity is a challenge of contemporary implementations of critical mass. Any undergraduate institution can dictate its own interpretation of critical mass, and since diversity is already considered a compelling government interest, almost any definition of critical mass will pass the standard of strict scrutiny.
In summary, as legal precedent has forced affirmative action policies to become less well defined, the effectiveness of those policies has become more difficult to assess. While the Supreme Court has recognized that diversity is beneficial to college campuses, it is difficult to know if specific admissions practices and policies effectively support the promotion of diversity. The idea of critical mass exists as a tool to help universities achieve the goal of diversifying campus. Critical mass has, to date, lacked an operationalizable definition. Later, we will propose one that is relevant for quantitative modeling efforts. First, we narrow our attention to the State of California in order to elucidate the context for our modeling study.
Outside of federal policies and decisions, states and institutions have a say in how and to what extent their public systems use affirmative action policies. Because of our own focus on the University of California system, we now review key legal developments regarding affirmative action in public undergraduate admissions in California.
In 1995, the Regents of the University of California passed a resolution called SP-1 which eliminated the use of race, ethnicity, and gender in admissions decisions for institutions in the UC system. A year later, California voters amended their state constitution by passing a ballot initiative called Proposition 209. The amended constitution prohibits state and local agencies from giving preferential treatment to individuals or groups on the basis of their race, sex, ethnicity, or national origin in public education, employment, or contracting. As a result, schools in the UC system are not allowed to use affirmative action in their admissions policies [5].
Three years after Proposition 209 passed, California implemented a state-wide policy that guaranteed admissions to UC for top students in public schools. Under this policy, a student was guaranteed admission to (at least) one school in the UC system if they were in the top four percent of a California public high school’s graduating class or were in the top four percent of graduating students statewide. Students admitted under the “Four Percent Plan” did not get to choose the institution to which they were admitted. The plan remains in place, though in 2012 it was expanded from four percent to nine percent [5].
Meanwhile, in 2001, the Regents voted to rescind SP-1 and replaced it with Regents Policy 4401. This policy affirmed that all students would be treated equally in the admissions process regardless of their race, sex, color, ethnicity, or national origin. The policy also specified that each campus should seek to enroll a student body which demonstrates a high academic achievement or talent level, as well as encompassing the broad diversity of backgrounds represented in the state of California [6]. Though the Regents rescinded SP-1, the passage of Proposition 209 several years earlier still prohibits affirmative action policies at UC institutions. However, there has been movement in California to attempt to repeal Proposition 209. In particular, California Proposition 16 would have reverted the state constitution, thus allowing affirmative action policies to be used in undergraduate admissions. In November 2020, though, California voters rejected Proposition 16 and hence affirmative action policies remain banned in California. Therefore, it is useful to focus on factors that influence campus demographics other than the acceptance rates of various racial/ethnic groups. For example, an admissions office attempting to recruit more heavily from certain demographic groups would not be in violation of bans on affirmative action. The mathematical model we will construct will enable the assessment of the impact of such practices.
We now construct a model for predicting racial/ethnic demographics in a four-year undergraduate program. We will later apply this model to UCB and UCLA.
From the start, it is important to keep in mind several limitations of our modeling approach. Mirroring the limitations of the public data from which we compute model parameters, our model only accounts for students who are Asian American, Black, Hispanic/Latinx, or White. The modeling framework does not account for multiple racial/ethnic identities, nor any identities other than the aforementioned four. As for individuals who are Native American, Alaska Native, Pacific Islander, Native Hawaiian, Middle Eastern, multi-racial/ethnic, and others, their unfortunate omission from the model is related to issues of data availability and/or sample size. Future work should attempt to utilize more complete data that would remedy this erasure.
In addition to the data limitations above, a second set of limitations pertains to modeling assumptions that we make for simplicity. We account neither for international students nor students transferring into or out of the campuses we model. Instead, we focus on students entering a U.S. campus directly from a U.S. high school. Future models could incorporate other routes to matriculating at an institution. Similarly, we do not account for various possible complex paths through an undergraduate program, including students who take longer than six years to graduate. Our model should be viewed as a first step towards predictive modeling of undergraduate demographics. Future iterations of the model might relax some of the simplifying assumptions we have made here.
The remainder of this section is organized as follows. In Markov Chain Model Construction, we build our basic modeling framework, which tracks one demographic group from a single high school graduating class through college application, admission, matriculation, retention, and graduation. In Markov Chain Transition Rates, we explain how entries in our Markov chain transition matrix can be deduced from publicly available data. In Inferring Model Parameters, we forecast this public data ahead in time in order to be able to specify Markov chain transition rates in future years. In Model Simulation and Validation, we detail how to combine results from multiple demographic groups and multiple high school graduation class years in order to predict overall student demographic proportions on a campus. We also explain how to generate 95% confidence intervals on our model output, where these intervals arise from uncertainty in calculation of the Markov chain transition rates. Finally, we train and test our model on historical data to provide evidence of the efficacy of the modeling approach.
Students enter our modeling framework as graduating high school seniors who apply and are accepted to a post-secondary institution. Then, they choose whether or not to attend the institution. If they enroll, they graduate in four to six years, though there is a possibility of dropping out after each semester. We model this process with a Markov chain, a memoryless model that describes the evolution of the probability of the system being in any particular state. For a review of Markov chains, see [7]. In our model, there are 24 states representing students’ progression through college: a high school student applying, graduating, and being accepted (states 1–3); a college student in their first, second, third, fourth, or fifth year during the fall term, spring term, and in between years (states 4–18); a sixth-year college student in their fall or spring term (states 19–20); a college student graduating after four, five, or six years (states 21–23); and a college student dropping out (state 24). Because Markov chains describe probabilities, and because we know the sizes of relevant populations (applicant pools, student bodies, and so forth), we can also calculate expected values of the number of students in various states. In fact, our discrete Markov chain model is a special case of a more general class of models, namely Leslie matrix population models, typically used to model age structure in biological populations [8]. Therefore, it is as natural to specify the model’s state as population counts as it is to specify a probability distribution.
Given the set of all states $S = \{s_1, s_2, \ldots, s_{24}\}$, the probability of moving from state $s_i$ to $s_j$ is $p_{ij}$; these quantities are called transition probabilities. Once we specify the transition probabilities, we construct a transition matrix $P$: a 24 × 24 matrix whose $ij$th entry is the transition probability $p_{ij}$. One represents the state of the system as a vector and progresses the Markov chain via multiplication by $P$. Conveniently, the $ij$th entry of the matrix $P^n$ gives the probability of a student moving from state $s_i$ to state $s_j$ in $n$ steps. For our model, each step represents a calendar time of four months.
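To make the mechanics of stepping a chain concrete, the following minimal sketch uses a hypothetical four-state toy chain. The states, the probabilities, and the cohort size are illustrative assumptions and are not the 24-state model or its calibrated data. The sketch treats the state as a row vector, so that one step is the product of the vector with $P$, consistent with $p_{ij}$ denoting the probability of moving from $s_i$ to $s_j$.

```python
import numpy as np

# Hypothetical 4-state toy chain (illustrative only; not the paper's model or data).
# States: 0 = fall term, 1 = spring term, 2 = graduated, 3 = did not finish.
P = np.array([
    [0.0, 0.9, 0.00, 0.10],  # fall   -> spring (0.90) or did not finish (0.10)
    [0.0, 0.0, 0.95, 0.05],  # spring -> graduated (0.95) or did not finish (0.05)
    [0.0, 0.0, 1.00, 0.00],  # graduated stays graduated
    [0.0, 0.0, 0.00, 1.00],  # did-not-finish stays did-not-finish
])

# Each row is a probability distribution over next states.
assert np.allclose(P.sum(axis=1), 1.0)

# Start with all probability mass in the fall-term state.
x0 = np.array([1.0, 0.0, 0.0, 0.0])

# One step: multiply the row vector by P.
x1 = x0 @ P

# n steps: use the n-th matrix power; entry (i, j) is the n-step probability i -> j.
P2 = np.linalg.matrix_power(P, 2)
x2 = x0 @ P2

# Expected head counts follow by scaling with an assumed cohort size.
cohort_size = 1000
expected_counts = cohort_size * x2
```

In this toy example, `x2` places all probability on the two terminal states after two steps, and `expected_counts` converts those probabilities into expected numbers of students, mirroring the remark that the state can be specified either as a probability distribution or as population counts.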
Our model is an absorbing Markov chain, meaning that for some states in the model, the probability of leaving that state is zero; equivalently, $p_{ii} = 1$ and $p_{ij} = 0$ for all $j \neq i$. These states are absorbing states, and all non-absorbing states are transient states. Our model is an absorbing Markov chain because after a student graduates or drops out, they do not re-enroll at the college. We split the states into two distinct sets $T = \{s_1, s_2, \ldots, s_t\}$ and $F = \{s_{t+1}, s_{t+2}, \ldots, s_{t+f}\}$, where $T$ is the set containing all $t$ transient states and $F$ is the set containing all $f$ absorbing states. The original set of states is $S = T \cup F$ and contains all $t + f = 24$ states in the Markov chain.
For our analysis, we put the transition matrix $P$ into canonical form

$$P = \begin{pmatrix} Q & R \\ 0 & I \end{pmatrix}, \tag{1}$$

where $Q$ is a $t$-by-$t$ matrix, $R$ is a nonzero $t$-by-$f$ matrix, $0$ is the $f$-by-$t$ zero matrix, and $I$ is the $f$-by-$f$ identity matrix. Now consider the form of the matrix $P^n$, which owing to the upper block triangular form of $P$ is

$$P^n = \begin{pmatrix} Q^n & * \\ 0 & I \end{pmatrix}. \tag{2}$$

Here, the asterisk is a placeholder for a $t$-by-$f$ matrix. It is a standard calculation to show that, due to the absorbing nature of our Markov chain, $Q^n \to 0$ as $n \to \infty$, meaning that over time, the probability of a student remaining in a transient state approaches zero. This limit reflects that in our model, the college process eventually terminates for every student.
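The block structure can be checked numerically on a small example. The sketch below assembles a canonical-form matrix from hypothetical blocks $Q$ and $R$ (two transient and two absorbing states with made-up probabilities, not the calibrated model) and confirms that the upper-left block of $P^n$ is $Q^n$. It also uses two standard facts about absorbing chains that are not stated in the text above: the starred block equals $(I + Q + \cdots + Q^{n-1})R$, and its limit is $(I - Q)^{-1}R$, whose rows give the probabilities of ending in each absorbing state.

```python
import numpy as np

# Hypothetical blocks for a toy chain with t = 2 transient and f = 2 absorbing states.
Q = np.array([[0.0, 0.9],     # transient -> transient probabilities
              [0.0, 0.0]])
R = np.array([[0.00, 0.10],   # transient -> absorbing probabilities
              [0.95, 0.05]])
t, f = R.shape

# Assemble the canonical form P = [[Q, R], [0, I]].
P = np.block([[Q, R],
              [np.zeros((f, t)), np.eye(f)]])

n = 5
Pn = np.linalg.matrix_power(P, n)
Qn = np.linalg.matrix_power(Q, n)

# The starred block of P^n is the finite geometric sum (I + Q + ... + Q^(n-1)) R.
star = sum(np.linalg.matrix_power(Q, k) for k in range(n)) @ R

# Long-run absorption probabilities via the fundamental matrix N = (I - Q)^(-1).
N = np.linalg.inv(np.eye(t) - Q)
B = N @ R
```

Here `Pn[:t, :t]` matches `Qn`, which has already decayed to zero for this toy chain, and each row of `B` sums to one, reflecting that every student eventually reaches an absorbing state.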
Prior to calculating initial conditions and transition rates, we describe the states of the Markov chain and the transitions between them. As mentioned earlier, our model comprises 24 states. For convenience, we use shorthand to denote each one. For the 23 states that are not drop-out, we pair a number from zero through six to describe the academic year (where zero represents senior year of high school) and a letter from the set $\{\mathrm{A}, \mathrm{B}, \mathrm{C}, \mathrm{G}\}$ to correspond to the term or standing within that year. Specifically, A represents the fall term, B represents the spring term, C represents the summer term (or the transition from one year to the next), and G corresponds to graduation. For example, a high school senior in the summer before college is in state 0C, a third-year college student in their fall term would be in state 3A, and a student who graduated after six years would be in state 6G. We denote the remaining state as DNF, meaning the student did not finish their college education. Thus, we have the complete set of states

$$S = \{\mathrm{0A}, \mathrm{0B}, \mathrm{0C}, \mathrm{1A}, \mathrm{1B}, \mathrm{1C}, \ldots, \mathrm{5A}, \mathrm{5B}, \mathrm{5C}, \mathrm{6A}, \mathrm{6B}, \mathrm{4G}, \mathrm{5G}, \mathrm{6G}, \mathrm{DNF}\}. \tag{3}$$

As mentioned in our discussion of modeling limitations, we assume that after graduating or dropping out, a student neither re-enters the applicant pool nor re-enrolls in college. We separate the states into transient and absorbing sets:

$$T = \{\mathrm{0A}, \mathrm{0B}, \mathrm{0C}, \ldots, \mathrm{5C}, \mathrm{6A}, \mathrm{6B}\}, \qquad F = \{\mathrm{4G}, \mathrm{5G}, \mathrm{6G}, \mathrm{DNF}\}. \tag{4}$$
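As a bookkeeping aid, the state labels can be generated programmatically and mapped to matrix indices. The sketch below is one possible encoding under the assumption that states are ordered as in the earlier enumeration (high school states first, then college terms, then graduation states, then drop-out); the variable names are hypothetical and chosen for illustration.

```python
# Transient states: years 0 through 5 each have fall (A), spring (B), and summer (C).
transient = [f"{year}{term}" for year in range(6) for term in "ABC"]
# Sixth year has only fall and spring terms.
transient += ["6A", "6B"]

# Absorbing states: graduation after four, five, or six years, and did-not-finish.
absorbing = ["4G", "5G", "6G", "DNF"]

# Full ordered state list, with transient states first (canonical ordering).
states = transient + absorbing

# Map each label to its zero-based row/column index in the transition matrix.
index = {label: i for i, label in enumerate(states)}
```

With this ordering, `index["0A"]` is the starting state and the last four indices correspond to the absorbing block, which lines up with the canonical form in which transient states precede absorbing ones.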
We can represent the Markov chain model as a graph where nodes are the states of the model and directed weighted edges represent the transition rates between states; see Fig 1. A student begins in state 0A and then moves through the graph according to the probabilities on each directed edge. Blue nodes represent transient states, green nodes represent the absorbing states of graduating, and the red node represents the absorbing state of dropping out.