I have a confession to make: I spent a LOT of time procrastinating on my college applications this past season.

What was I doing instead? I was trying to predict them before their release.

In honor of college decisions being released recently, I’m sharing the two techniques I used to predict my decisions with X% accuracy.

I started with a statistical approach.

Lincoln’s Statistics & LLMs Approach: Fit a normal distribution to the applicant pool’s SAT/ACT scores and unweighted GPAs, then calculate the standard score (z-score) of any other applicant on each measure. I also tried using the SAT’s Nationally Representative and User Percentiles instead of the SAT scores from my applicant pool dataset, but this did not affect the results significantly. Use LLMs to score extracurricular activities on a scale from 0.0 (least compelling) to 1.0 (most compelling). For example, published research in a scientific journal or being an ISEF finalist may score a 1.0. Finally, group applicants with similar scores across these factors and use each group’s outcomes to estimate the probability of acceptance.
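To make the steps above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the pool data is made up, the extracurricular scores stand in for whatever an LLM would return, and the "similar applicants" rule (within a fixed z-score window) is just one simple way to do the grouping.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical applicant pool (made-up numbers):
# (SAT, unweighted GPA, LLM extracurricular score, accepted?)
POOL = [
    (1250, 3.3, 0.2, False),
    (1300, 3.5, 0.3, False),
    (1360, 3.6, 0.4, False),
    (1400, 3.7, 0.5, True),
    (1450, 3.8, 0.6, False),
    (1480, 3.8, 0.6, True),
    (1520, 3.9, 0.8, True),
    (1560, 4.0, 0.9, True),
]

def standard_score(value, values):
    """Z-score of `value` relative to the sample `values`."""
    return (value - mean(values)) / stdev(values)

def percentile(value, values):
    """Share of a normal distribution fitted to `values` that falls below `value`."""
    return NormalDist(mean(values), stdev(values)).cdf(value)

def acceptance_rate_near(sat, gpa, ec, pool=POOL, width=0.75, ec_width=0.25):
    """Acceptance rate among pool members with similar scores.

    "Similar" is an assumed rule: SAT and GPA z-scores within `width`
    of the target, and extracurricular score within `ec_width`.
    Returns None when nobody in the pool is similar enough.
    """
    sats = [p[0] for p in pool]
    gpas = [p[1] for p in pool]
    z_sat = standard_score(sat, sats)
    z_gpa = standard_score(gpa, gpas)
    similar = [
        accepted
        for s, g, e, accepted in pool
        if abs(standard_score(s, sats) - z_sat) <= width
        and abs(standard_score(g, gpas) - z_gpa) <= width
        and abs(e - ec) <= ec_width
    ]
    return sum(similar) / len(similar) if similar else None

rate = acceptance_rate_near(1450, 3.8, 0.6)
```

Notice that the result depends entirely on who happens to land in the group: a slightly different window can change the estimate a lot.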

While this approach may sound good, it has a few issues:

I cannot advocate for this approach.


Recently, I’ve been learning ML, which leads me to my second approach.

Lincoln’s Machine Learning Approach: Train a model on SAT/ACT, unweighted GPA, application round (Regular or Early), and outcome (Accepted, Waitlisted, or Rejected) for a specific school to predict outcome probabilities.
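Here is a minimal sketch of what training such a model can look like. It assumes a multinomial (softmax) logistic regression as a stand-in model, written in plain Python so it runs anywhere. The training rows, the feature scaling, and the names `features`, `train`, and `predict_proba` are all made up for illustration.

```python
import math

CLASSES = ["Accepted", "Waitlisted", "Rejected"]

# Hypothetical training rows for one school: (SAT, unweighted GPA, round, outcome)
ROWS = [
    (1550, 4.0, "Early", "Accepted"),
    (1500, 3.9, "Early", "Accepted"),
    (1480, 3.9, "Regular", "Accepted"),
    (1450, 3.8, "Early", "Accepted"),
    (1440, 3.8, "Regular", "Waitlisted"),
    (1400, 3.7, "Early", "Waitlisted"),
    (1380, 3.7, "Regular", "Waitlisted"),
    (1350, 3.6, "Regular", "Rejected"),
    (1300, 3.5, "Early", "Rejected"),
    (1250, 3.3, "Regular", "Rejected"),
]

def features(sat, gpa, round_):
    """Scaled inputs plus a bias term; the scaling constants are rough guesses."""
    early = 1.0 if round_ == "Early" else 0.0
    return [(sat - 1400) / 100, (gpa - 3.7) / 0.2, early, 1.0]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(weights, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def train(rows, epochs=300, lr=0.1):
    """Fit one weight vector per outcome with stochastic gradient descent."""
    weights = [[0.0] * 4 for _ in CLASSES]
    for _ in range(epochs):
        for sat, gpa, round_, outcome in rows:
            x = features(sat, gpa, round_)
            probs = softmax(class_scores(weights, x))
            for k, cls in enumerate(CLASSES):
                # Gradient of cross-entropy loss: predicted minus actual
                error = probs[k] - (1.0 if cls == outcome else 0.0)
                for j, xj in enumerate(x):
                    weights[k][j] -= lr * error * xj
    return weights

def predict_proba(weights, sat, gpa, round_):
    """Outcome probabilities for one applicant, keyed by class name."""
    probs = softmax(class_scores(weights, features(sat, gpa, round_)))
    return dict(zip(CLASSES, probs))

WEIGHTS = train(ROWS)
strong = predict_proba(WEIGHTS, 1550, 4.0, "Early")
weak = predict_proba(WEIGHTS, 1250, 3.3, "Regular")
```

With real data you would also want to hold out some students for testing, since ten made-up rows say nothing about how well a model like this generalizes.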

This approach yields a continuous model that is more representative of outcomes than the statistical approach.

This lets us create a heatmap of acceptance probabilities by application round:

Heatmap of acceptance probabilities for UMass Amherst's early action round where green is high probability of acceptance and red is low

Heatmap of acceptance probabilities for UMass Amherst's regular decision round where green is high probability of acceptance and red is low

Note: The model was trained on in-state data from a competitive high school, covering a little over 900 students from the past few years. If you’re an out-of-state student, this is likely to look much different.

As you can see, it’s continuous. And pretty fun to think about!
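If you want to build a heatmap like this yourself, it boils down to evaluating a model at every point on a grid of SAT and GPA values, once per application round. Here is a rough sketch. The `accept_probability` function is a hypothetical stand-in with made-up coefficients, not a trained model, and the text rendering is only there to avoid a plotting dependency.

```python
import math

def accept_probability(sat, gpa, early):
    """Hypothetical stand-in for a trained model's P(Accepted); coefficients are made up."""
    score = (sat - 1400) / 100 + (gpa - 3.7) / 0.2 + (0.5 if early else 0.0)
    return 1 / (1 + math.exp(-score))

def probability_grid(sats, gpas, early):
    """Rows are GPAs (highest first); columns are SAT scores (lowest first)."""
    return [
        [accept_probability(sat, gpa, early) for sat in sats]
        for gpa in sorted(gpas, reverse=True)
    ]

def render(grid, shades=" .:-=+*#"):
    """Draw the grid as text, with denser characters for higher probabilities."""
    lines = []
    for row in grid:
        cells = (shades[min(int(p * len(shades)), len(shades) - 1)] for p in row)
        lines.append("".join(cells))
    return "\n".join(lines)

SATS = list(range(1200, 1601, 50))
GPAS = [3.0, 3.2, 3.4, 3.6, 3.8, 4.0]

early_grid = probability_grid(SATS, GPAS, early=True)
regular_grid = probability_grid(SATS, GPAS, early=False)

print(render(early_grid))
print()
print(render(regular_grid))
```

Swapping the text rendering for a real plotting library and a green-to-red color scale gets you the kind of image shown above.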

While this approach doesn’t include extracurricular activities, it can be extended to factor them in by adding an extracurricular score as another input alongside SAT/ACT and GPA. You could generate that score with LLMs or another model, but I can’t recommend doing so.

Send me an email if you have any ideas!