Hiring Algorithms – who knows how they work?

Having read The Algorithm by Hilke Schellmann, I am left with real questions about how we are using AI technology to hire and assess employees. The problem is that many companies simply let the tools run without supervision; very few people are looking under the hood to see what the machines are actually doing. Here are some examples of what she has found:

Résumé screeners can end up predicting job success from meaningless correlations. One system even linked larger shoe sizes with better performance, disproportionately disadvantaging women. In another case, “softball” was scored negatively and “baseball” positively, an obvious gender proxy. A big driver of this is applicants stuffing CVs with keywords to impress the AI: once everyone lists the same skills, those skills carry no signal, so the algorithm latches onto whatever incidental patterns remain rather than real capability.
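
To make that failure mode concrete, here is a deliberately simplified sketch in Python. It is not any vendor’s actual system; the historical data, keywords, and scoring rule are all invented. It just shows how a naive “learn from past hires” screener, once every CV lists the same core skills, ends up weighting whatever incidental words distinguish the historically successful group.

```python
# Toy résumé screener, invented purely for illustration.
# It weights each keyword by how much more often it appears among past
# "successful" hires than among everyone else, then scores new CVs with
# the sum of those weights.
from collections import Counter

# Hypothetical history: keyword sets from past applicants plus an outcome
# label (1 = rated a successful hire). Everyone lists the same core skills,
# so the only words that vary are incidental ones like hobbies.
history = [
    ({"python", "sql", "teamwork", "baseball"}, 1),
    ({"python", "sql", "teamwork", "baseball"}, 1),
    ({"python", "sql", "teamwork", "baseball"}, 1),
    ({"python", "sql", "teamwork", "softball"}, 1),
    ({"python", "sql", "teamwork", "softball"}, 0),
    ({"python", "sql", "teamwork", "softball"}, 0),
]

hits, misses = Counter(), Counter()
for keywords, outcome in history:
    for kw in keywords:
        (hits if outcome else misses)[kw] += 1

n_hits = sum(outcome for _, outcome in history)
n_misses = len(history) - n_hits

# Weight = frequency among successes minus frequency among the rest.
weights = {
    kw: hits[kw] / n_hits - misses[kw] / n_misses
    for kw in set(hits) | set(misses)
}
print(weights)
# core skills all get weight 0.0; "baseball" +0.75, "softball" -0.75
# (dict order may vary)

def score(cv_keywords):
    """Sum the learned keyword weights for a new CV."""
    return sum(weights.get(kw, 0.0) for kw in cv_keywords)

print(score({"python", "sql", "teamwork", "baseball"}))   # 0.75, rewarded
print(score({"python", "sql", "teamwork", "softball"}))   # -0.75, penalised
```

The genuine skills end up with zero weight because every applicant has them; the hobby words do all the work, and here they are standing in for gender, not ability.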

One client’s adaptive hiring test had a coding flaw that mis-scored answers: candidates who were wrong were marked as correct, pushed on to harder questions, and often ended up rejected. In another case, a simple data-row error shifted every candidate’s results by one line, mismatching names and scores, so for three months every hiring decision was wrong. These are just two of the many basic but serious programming mistakes she has uncovered.
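
The row-shift error is easy to picture. Schellmann does not publish the vendor’s code, so the sketch below is a hypothetical reconstruction, but it shows how pairing two exported columns by position rather than by a candidate ID silently attaches every score to the wrong name.

```python
# Hypothetical reconstruction of an off-by-one results error: names and
# scores exported as two separate lists, with one stray header row left in
# the scores file.
names = ["Aoife", "Brian", "Ciara", "Dara"]
scores_export = ["score", 88, 54, 91, 67]   # header row never stripped

# Buggy join by position: Aoife gets the header, Brian gets Aoife's score,
# and every candidate after that inherits the previous person's result.
buggy = list(zip(names, scores_export))
print(buggy)   # [('Aoife', 'score'), ('Brian', 88), ('Ciara', 54), ('Dara', 91)]

# Safer: key every score to a candidate ID and join on the key, so a missing
# or extra row fails loudly instead of silently shifting the data.
candidates = {"c1": "Aoife", "c2": "Brian", "c3": "Ciara", "c4": "Dara"}
scores = {"c1": 88, "c2": 54, "c3": 91, "c4": 67}
joined = {candidates[cid]: scores[cid] for cid in candidates}
print(joined)   # {'Aoife': 88, 'Brian': 54, 'Ciara': 91, 'Dara': 67}
```

A mistake that trivial is invisible in a dashboard that only shows names beside confident-looking scores, which is presumably how it could go unnoticed for months.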

Other tools claim they can analyse a candidate’s social media to reveal their “real persona,” predicting traits such as teamwork, openness, and emotional stability. This kind of AI-driven personality profiling goes beyond résumé screening, mining behavioural signals to judge culture fit, often without candidates knowing they are being assessed. And as every user knows, Netflix’s and Amazon’s recommendation algorithms frequently get things wrong. That is Amazon’s problem if it misreads my preferences and loses a sale. In hiring, the stakes are far higher: if an AI system misjudges my personality and rejects me even though I am well qualified, the consequences are far more serious than a lost sale.

When Schellmann took an English-fluency test, she gamed the system by answering entirely in German, yet it still rated her English as “competent”, exposing serious flaws. She ran similar experiments on platforms that analyse a candidate’s speech and tone: by repeating “I love teamwork” in answer to every question she still scored a 71% match, and using an AI-generated voice she scored 79%, with the system failing to detect that the voice was fake! All of this points to significant hidden weaknesses, and figuring out how to guard against them can feel as though you would need a PhD just to understand how the AI works.

We also have to be conscious of disability: if an interviewee has a speech-related disability that the software cannot interpret, the system will often score them unfairly low, and because AI video interviews are largely black boxes, there is no way to establish exactly why that happens or how to fix it.

Other tools, like Visier’s flight-risk indicator, predict which employees are likely to leave within the next year by analysing patterns in HR data. The tool draws on hundreds of variables, such as promotions, performance, engagement, meetings, commute time, and pay, to generate a “flight risk score.” Critics argue that it is unreliable, sometimes less accurate than a coin toss, because it is built on incomplete or inappropriate data and can lead to unfair bias or discrimination. Many of the personal factors that actually drive someone’s decision to stay or leave, such as family, finances, or relationships, lie outside ethical data boundaries. Managers told that someone is a “flight risk” may withhold promotions or development. One case study showed how wrong this can be: a manager was labelled “toxic” because of high team turnover, when in reality they were encouraging internal transfers. The data made them look problematic, but they were actually a model leader promoting growth and mobility.
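
Vendors do not publish their models, but the general shape of such a score is not mysterious: some weighted combination of HR signals squashed into a probability. The sketch below is an invented illustration, not Visier’s model; the features, weights, and numbers are assumptions. The point is that however the weights are tuned, the output can only reflect the proxies that are fed in.

```python
import math

# Invented "flight risk" score for illustration only; these features and
# weights are NOT Visier's model. Structurally it is just a weighted sum of
# HR signals pushed through a logistic function to give a 0-1 risk score.
WEIGHTS = {
    "years_since_promotion": 0.6,
    "engagement_score":     -0.9,   # higher engagement lowers the score
    "commute_minutes":       0.02,
    "meetings_per_week":     0.05,
}
BIAS = -3.0

def flight_risk(employee: dict) -> float:
    z = BIAS + sum(w * employee.get(feature, 0.0) for feature, w in WEIGHTS.items())
    return 1 / (1 + math.exp(-z))   # squash into the 0..1 range

employee = {
    "years_since_promotion": 2,
    "engagement_score": 1.0,
    "commute_minutes": 45,
    "meetings_per_week": 12,
}
print(f"flight risk: {flight_risk(employee):.0%}")   # ~23% on these made-up inputs
# None of the things that most often decide whether someone actually leaves
# (a partner relocating, money worries, a difficult relationship) appear in
# the inputs, so the confident-looking percentage is built entirely on proxies.
```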

Emerging tools like vocal biomarkers and EEG-equipped headsets purport to detect emotions, stress, or mental health states from brain waves or speech patterns. Although marketed as breakthroughs, there is limited scientific evidence that they work reliably or ethically. The idea that AI can “reverse engineer” brain activity from someone’s voice blurs the line between innovation and pseudoscience. Unlike objective medical biomarkers found in blood or imaging, voice data is highly variable and its interpretation is subjective and easy to get wrong.

When you ask major employers how they validate their AI selection tools, most admit they don’t, instead pointing to adoption by “seventy of the Fortune 250” as justification. Employment lawyers warn this is dangerously naïve, especially given the liability if these systems prove faulty or discriminatory. If a test counts as an illegal pre-employment medical exam, every candidate who completes it could have a claim; for employers screening millions of applicants a year, that is a huge potential exposure. The bigger issue is that properly vetting such complex algorithms requires specialist expertise and deep analysis, resources most companies simply lack.
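
Some basic scrutiny is possible, though. One widely used first step, long pre-dating AI, is an adverse-impact check under the four-fifths (80%) rule from the US Uniform Guidelines on Employee Selection Procedures: if one group’s pass rate is less than 80% of the highest group’s, the tool needs a much closer look. The sketch below applies that rule to invented pass counts; it is nowhere near a full validation, but it shows the kind of question employers could be asking of their vendors.

```python
# Adverse-impact check using the four-fifths (80%) rule. The applicant and
# pass counts are invented for illustration; a real validation would also
# need evidence of job-relatedness, reliability, and ongoing monitoring.
applicants = {"group_a": 400, "group_b": 400}   # candidates screened by the tool
passed     = {"group_a": 120, "group_b": 70}    # candidates the tool passed

rates = {group: passed[group] / applicants[group] for group in applicants}
highest = max(rates.values())

for group, rate in rates.items():
    impact_ratio = rate / highest
    flag = "possible adverse impact" if impact_ratio < 0.8 else "ok"
    print(f"{group}: pass rate {rate:.0%}, impact ratio {impact_ratio:.2f} -> {flag}")
```

While there are huge advantages to AI, which I have covered in other articles, it is also important to step back and ask ourselves: who actually knows how the AI works?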
