
Decode the “Yes”: How Data Reveals What Makes Candidates Accept Offers


Logistic Regression in Action: Predicting Job Offer Acceptance

Welcome back to our Predictive Analytics Series. In Part 1, we introduced Logistic Regression and explored what makes it ideal for binary outcomes like yes/no decisions. Now in Part 2, let’s walk through a real dataset, perform a logistic regression, and interpret the results — just like you would in Excel or Python.


🔍 Problem Statement

You’re part of an HR analytics team tasked with improving job offer acceptance rates. You want to understand what factors influence whether a candidate accepts a job offer.


Target (Dependent Variable): Offer Accepted? (Yes = 1, No = 0)


Predictors (Independent Variables):

  • Salary Offered (in $1000s)

  • Time to Offer (in days)

  • Number of Interview Rounds

  • Distance from Office (in km)


📊 Sample Dataset (30 Candidates)

A realistic HR dataset of 30 simulated candidates and their features has been generated.
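
The article's underlying dataset isn't reproduced here, so below is a minimal sketch of how a comparable 30-candidate dataset could be simulated in Python. The column names (salary_offered_k, time_to_offer_days, interview_rounds, distance_km, offer_accepted) and the value ranges are assumptions; the acceptance labels are generated from a logistic relationship that borrows the coefficients reported later in this post, purely to give the toy data a plausible structure.

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 30

# Simulated candidate features (assumed column names and ranges; not the article's actual data)
df = pd.DataFrame({
    "salary_offered_k": rng.integers(45, 80, n),    # salary offered, in $1000s
    "time_to_offer_days": rng.integers(2, 15, n),   # days from final interview to offer
    "interview_rounds": rng.integers(1, 5, n),      # number of interview rounds
    "distance_km": rng.integers(1, 40, n),          # commute distance in km
})

# Generate the target from a logistic relationship similar in spirit to the article's model
logit = (-2.25 + 0.08 * df["salary_offered_k"]
         - 0.25 * df["time_to_offer_days"]
         - 0.60 * df["interview_rounds"]
         - 0.03 * df["distance_km"])
p_accept = 1 / (1 + np.exp(-logit))
df["offer_accepted"] = (rng.random(n) < p_accept).astype(int)

print(df.head())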



🧮 Model Output (Python Example)

Logistic Regression Results:
====================================
Intercept                -2.25 (adjusted)
Salary Offered           +0.08 (p = 0.004)
Time to Offer            -0.25 (p = 0.001)
Number of Rounds         -0.60 (p = 0.045)
Distance to Office       -0.03 (p = 0.702)
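
For reference, here is a minimal sketch of how output like the table above could be produced in Python with statsmodels, continuing from the simulated DataFrame df in the earlier sketch. Because that data is simulated, the fitted coefficients and p-values will differ from the figures reported above.

import statsmodels.api as sm

# Predictors plus an intercept term, using the simulated DataFrame `df` from above
X = sm.add_constant(df[["salary_offered_k", "time_to_offer_days",
                        "interview_rounds", "distance_km"]])
y = df["offer_accepted"]

model = sm.Logit(y, X).fit()   # maximum-likelihood fit
print(model.summary())         # coefficients, standard errors, p-values
print(model.params)            # log-odds coefficients, analogous to the table above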

📌 Interpretation of Results

  • Salary Offered: Each $1k increase raises the odds of accepting by about 8.3% (statistically significant); the sketch after this list shows how these percentages follow from the coefficients.

  • Time to Offer: Each extra day of delay lowers the odds of acceptance by about 22% (statistically significant).

  • Rounds: Each additional interview round lowers the odds of acceptance by roughly 45% (significant at the 5% level, p = 0.045).

  • Distance: Not statistically significant (p > 0.05).
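
The percentages above come from exponentiating the log-odds coefficients. A minimal sketch of that conversion, using the coefficient values reported in the output table:

import numpy as np

# Log-odds coefficients as reported above
coefs = {"Salary Offered": 0.08, "Time to Offer": -0.25,
         "Number of Rounds": -0.60, "Distance to Office": -0.03}

for name, b in coefs.items():
    odds_ratio = np.exp(b)                  # multiplicative change in odds per unit
    pct_change = (odds_ratio - 1) * 100     # e.g. exp(0.08) ≈ 1.083 -> +8.3%
    print(f"{name}: odds ratio {odds_ratio:.3f} ({pct_change:+.1f}% odds per unit)")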



✅ Final Model Equation (in log odds)

Logit(P) = -2.25 + 0.08*Salary - 0.25*Time - 0.60*Rounds - 0.03*Distance

To get actual probability, apply the sigmoid:

P = 1 / (1 + e^(-Logit(P)))

🧪 Example Application:

Let’s say a candidate has the following:

  • Salary = $62k

  • Time to Offer = 4 days

  • Rounds = 2

  • Distance = 6 km


Plug into the equation:

Logit(P) = -2.25 + (0.08*62) - (0.25*4) - (0.60*2) - (0.03*6)
          = -2.25 + 4.96 - 1.00 - 1.20 - 0.18
          = 0.33

Then apply sigmoid:

P = 1 / (1 + e^(-0.33)) ≈ 0.582

➡️ There’s a 58.2% chance this candidate will accept.
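
The same calculation in Python, plugging the reported coefficients into the equation as a plain arithmetic check:

import math

# Coefficients from the final model equation above
intercept, b_salary, b_time, b_rounds, b_dist = -2.25, 0.08, -0.25, -0.60, -0.03

# Example candidate: $62k salary, 4 days to offer, 2 rounds, 6 km commute
logit = intercept + b_salary * 62 + b_time * 4 + b_rounds * 2 + b_dist * 6
prob = 1 / (1 + math.exp(-logit))           # sigmoid

print(f"Logit(P) = {logit:.2f}")            # 0.33
print(f"P(accept) = {prob:.3f}")            # ≈ 0.582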




📉 Model Accuracy

Using a 0.5 threshold for classification (reproduced in the scikit-learn sketch after this list):

  • Accuracy: 83.3% — the percentage of total correct predictions (both accepted and rejected)

  • Precision: 81% — of all predicted “yes” offers, how many were actually accepted

  • Recall: 85% — of all actually accepted offers, how many were correctly predicted
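
If you are working in Python, these metrics can be computed with scikit-learn from the fitted model in the earlier sketches; since that data is simulated, it will not reproduce the exact figures above.

from sklearn.metrics import accuracy_score, precision_score, recall_score

# Classify at the 0.5 threshold using the statsmodels fit (`model`, `X`, `y`) from earlier
y_pred = (model.predict(X) >= 0.5).astype(int)

print("Accuracy :", round(accuracy_score(y, y_pred), 3))
print("Precision:", round(precision_score(y, y_pred), 3))
print("Recall   :", round(recall_score(y, y_pred), 3))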



🔍 What Does the Confusion Matrix Tell Us?

Let’s break it down:


                 Predicted No    Predicted Yes
  Actual No      TN              FP
  Actual Yes     FN              TP

  • True Positives (TP): Offers predicted as accepted and were indeed accepted.

  • True Negatives (TN): Offers predicted as rejected and were indeed rejected.

  • False Positives (FP): Offers predicted as accepted but were actually rejected.

  • False Negatives (FN): Offers predicted as rejected but were actually accepted.


This matrix helps identify where the model is making errors — and what kind. For HR, this could mean understanding which candidates are mistakenly being assumed to accept or reject.
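
A minimal sketch of extracting these four counts with scikit-learn, again using the fitted model from the earlier sketches:

from sklearn.metrics import confusion_matrix

# Predictions at the 0.5 threshold from the statsmodels fit (`model`, `X`, `y`) above
y_pred = (model.predict(X) >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
print(f"TN={tn}  FP={fp}  FN={fn}  TP={tp}")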



📈 Understanding Model Metrics

  • Accuracy: Proportion of total correct predictions. A high accuracy means the model does well overall, but can mask poor performance on minority classes.

  • Precision: Out of all predicted positives, how many were truly positive. This is critical when the cost of false positives is high.

  • Recall: Out of all actual positives, how many were correctly predicted. Important when the cost of false negatives is high (e.g., missing high-potential candidates).



🧠 Practical Takeaways for HR

  • Speed up the hiring process. Each day matters.

  • Limit interview fatigue. Two rounds may be the sweet spot.

  • Salary remains king. Pay competitively to secure top talent.

  • Commute may not be a dealbreaker. But worth tracking in bigger datasets.


Even when some variables are not statistically significant, they may still hold practical importance and should be monitored in larger datasets. That said, strategic HR decisions should prioritize statistically significant predictors like Salary, Time to Offer, and Interview Rounds — based on this model.



🔚 Wrap-Up

This hands-on case brings the power of logistic regression to life. By understanding how different factors shape binary outcomes, you can move from guesswork to data-driven action.


In future parts, we’ll tackle churn modeling and marketing responses using logistic regression.

🎓 These advanced models aren’t covered in our 2-day courses, but they’re here to inspire your data journey. You can start with foundational regression through our introductory courses.


Follow us to continue your journey from insights to action!


 
 
 
