
Decode the “Yes”: How Data Reveals What Makes Candidates Accept Offers


Logistic Regression in Action: Predicting Job Offer Acceptance

Welcome back to our Predictive Analytics Series. In Part 1, we introduced Logistic Regression and explored what makes it ideal for binary outcomes like yes/no decisions. Now in Part 2, let’s walk through a real dataset, perform a logistic regression, and interpret the results — just like you would in Excel or Python.


🔍 Problem Statement

You’re part of an HR analytics team tasked with improving job offer acceptance rates. You want to understand what factors influence whether a candidate accepts a job offer.


Target (Dependent Variable): Offer Accepted? (Yes = 1, No = 0)


Predictors (Independent Variables):

  • Salary Offered (in $1000s)

  • Time to Offer (in days)

  • Number of Interview Rounds

  • Distance from Office (in km)


📊 Sample Dataset (30 Candidates)

A realistic HR dataset of 30 simulated candidates and their features has been generated.
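
The article's underlying dataset isn't reproduced here, so below is a minimal sketch of how a comparable 30-candidate dataset could be simulated in Python. The column names (salary_offered_k, time_to_offer_days, interview_rounds, distance_km, offer_accepted) and the value ranges are assumptions; the acceptance labels are generated from a logistic relationship that borrows the coefficients reported later in this post, purely to give the toy data a plausible structure.

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 30

# Simulated candidate features (assumed column names and ranges; not the article's actual data)
df = pd.DataFrame({
    "salary_offered_k": rng.integers(45, 80, n),    # salary offered, in $1000s
    "time_to_offer_days": rng.integers(2, 15, n),   # days from final interview to offer
    "interview_rounds": rng.integers(1, 5, n),      # number of interview rounds
    "distance_km": rng.integers(1, 40, n),          # commute distance in km
})

# Generate the target from a logistic relationship similar in spirit to the article's model
logit = (-2.25 + 0.08 * df["salary_offered_k"]
         - 0.25 * df["time_to_offer_days"]
         - 0.60 * df["interview_rounds"]
         - 0.03 * df["distance_km"])
p_accept = 1 / (1 + np.exp(-logit))
df["offer_accepted"] = (rng.random(n) < p_accept).astype(int)

print(df.head())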



🧮 Model Output (Python Example)

Logistic Regression Results:
====================================
Intercept                -2.25 (adjusted)
Salary Offered           +0.08 (p = 0.004)
Time to Offer            -0.25 (p = 0.001)
Number of Rounds         -0.60 (p = 0.045)
Distance to Office       -0.03 (p = 0.702)
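
For reference, here is a minimal sketch of how output like the table above could be produced in Python with statsmodels, continuing from the simulated DataFrame df in the earlier sketch. Because that data is simulated, the fitted coefficients and p-values will differ from the figures reported above.

import statsmodels.api as sm

# Predictors plus an intercept term, using the simulated DataFrame `df` from above
X = sm.add_constant(df[["salary_offered_k", "time_to_offer_days",
                        "interview_rounds", "distance_km"]])
y = df["offer_accepted"]

model = sm.Logit(y, X).fit()   # maximum-likelihood fit
print(model.summary())         # coefficients, standard errors, p-values
print(model.params)            # log-odds coefficients, analogous to the table above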

📌 Interpretation of Results

  • Salary Offered: Each $1k increase raises the odds of accepting by about 8.3% (statistically significant); the sketch after this list shows how these percentages follow from the coefficients.

  • Time to Offer: Each extra day of delay lowers the odds of acceptance by about 22% (statistically significant).

  • Rounds: Each additional interview round lowers the odds of acceptance by roughly 45% (significant at the 5% level, p = 0.045).

  • Distance: Not statistically significant (p > 0.05).
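
The percentages above come from exponentiating the log-odds coefficients. A minimal sketch of that conversion, using the coefficient values reported in the output table:

import numpy as np

# Log-odds coefficients as reported above
coefs = {"Salary Offered": 0.08, "Time to Offer": -0.25,
         "Number of Rounds": -0.60, "Distance to Office": -0.03}

for name, b in coefs.items():
    odds_ratio = np.exp(b)                  # multiplicative change in odds per unit
    pct_change = (odds_ratio - 1) * 100     # e.g. exp(0.08) ≈ 1.083 -> +8.3%
    print(f"{name}: odds ratio {odds_ratio:.3f} ({pct_change:+.1f}% odds per unit)")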



✅ Final Model Equation (in log odds)

Logit(P) = -2.25 + 0.08*Salary - 0.25*Time - 0.60*Rounds - 0.03*Distance

To get actual probability, apply the sigmoid:

P = 1 / (1 + e^(-Logit(P)))

🧪 Example Application:

Let’s say a candidate has the following:

  • Salary = $62k

  • Time to Offer = 4 days

  • Rounds = 2

  • Distance = 6 km


Plug into the equation:

Logit(P) = -2.25 + (0.08*62) - (0.25*4) - (0.60*2) - (0.03*6)
          = -2.25 + 4.96 - 1.00 - 1.20 - 0.18
          = 0.33

Then apply sigmoid:

P = 1 / (1 + e^(-0.33)) ≈ 0.582

➡️ There’s a 58.2% chance this candidate will accept.
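
The same calculation in Python, plugging the reported coefficients into the equation as a plain arithmetic check:

import math

# Coefficients from the final model equation above
intercept, b_salary, b_time, b_rounds, b_dist = -2.25, 0.08, -0.25, -0.60, -0.03

# Example candidate: $62k salary, 4 days to offer, 2 rounds, 6 km commute
logit = intercept + b_salary * 62 + b_time * 4 + b_rounds * 2 + b_dist * 6
prob = 1 / (1 + math.exp(-logit))           # sigmoid

print(f"Logit(P) = {logit:.2f}")            # 0.33
print(f"P(accept) = {prob:.3f}")            # ≈ 0.582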




📉 Model Accuracy

Using a 0.5 threshold for classification (reproduced in the scikit-learn sketch after this list):

  • Accuracy: 83.3% — the percentage of total correct predictions (both accepted and rejected)

  • Precision: 81% — of all predicted “yes” offers, how many were actually accepted

  • Recall: 85% — of all actually accepted offers, how many were correctly predicted
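
If you are working in Python, these metrics can be computed with scikit-learn from the fitted model in the earlier sketches; since that data is simulated, it will not reproduce the exact figures above.

from sklearn.metrics import accuracy_score, precision_score, recall_score

# Classify at the 0.5 threshold using the statsmodels fit (`model`, `X`, `y`) from earlier
y_pred = (model.predict(X) >= 0.5).astype(int)

print("Accuracy :", round(accuracy_score(y, y_pred), 3))
print("Precision:", round(precision_score(y, y_pred), 3))
print("Recall   :", round(recall_score(y, y_pred), 3))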



🔍 What Does the Confusion Matrix Tell Us?

Let’s break it down:


                 Predicted No    Predicted Yes
  Actual No      TN              FP
  Actual Yes     FN              TP

  • True Positives (TP): Offers predicted as accepted and were indeed accepted.

  • True Negatives (TN): Offers predicted as rejected and were indeed rejected.

  • False Positives (FP): Offers predicted as accepted but were actually rejected.

  • False Negatives (FN): Offers predicted as rejected but were actually accepted.


This matrix helps identify where the model is making errors — and what kind. For HR, this could mean understanding which candidates are mistakenly being assumed to accept or reject.
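
A minimal sketch of extracting these four counts with scikit-learn, again using the fitted model from the earlier sketches:

from sklearn.metrics import confusion_matrix

# Predictions at the 0.5 threshold from the statsmodels fit (`model`, `X`, `y`) above
y_pred = (model.predict(X) >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
print(f"TN={tn}  FP={fp}  FN={fn}  TP={tp}")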



📈 Understanding Model Metrics

  • Accuracy: Proportion of total correct predictions. A high accuracy means the model does well overall, but can mask poor performance on minority classes.

  • Precision: Out of all predicted positives, how many were truly positive. This is critical when the cost of false positives is high.

  • Recall: Out of all actual positives, how many were correctly predicted. Important when the cost of false negatives is high (e.g., missing high-potential candidates).



🧠 Practical Takeaways for HR

  • Speed up the hiring process. Each day matters.

  • Limit interview fatigue. Two rounds may be the sweet spot.

  • Salary remains king. Pay competitively to secure top talent.

  • Commute may not be a dealbreaker. But worth tracking in bigger datasets.


Even when some variables are not statistically significant, they may still hold practical importance and should be monitored in larger datasets. That said, strategic HR decisions should prioritize statistically significant predictors like Salary, Time to Offer, and Interview Rounds — based on this model.



🔚 Wrap-Up

This hands-on case brings the power of logistic regression to life. By understanding how different factors shape binary outcomes, you can move from guesswork to data-driven action.


In future parts, we’ll tackle churn modeling and marketing responses using logistic regression.

🎓 These advanced models aren’t covered in our 2-day courses, but they’re here to inspire your data journey. You can start with foundational regression through our introductory courses.


Follow us to continue your journey from insights to action!


 
 
 
