Assessment Validation

If you are a human resources professional, you are already familiar with the term “validation.” If you are not, here is a brief overview.

Assessment validation is simply an objective study that demonstrates that the assessment actually measures what it says it does. It is not a stamp of approval from a government agency or from a professional. It is a study of how the assessment works for its intended purpose.

There are three types of validation set forth by the Equal Employment Opportunity Commission (EEOC), a United States federal agency. In 1978, the EEOC created guidelines to ensure that knowledge gained from testing is applied impartially, protecting minority applicants from discriminatory employment procedures. The three types are as follows:

Criterion Validity: If data demonstrates that a test is significantly correlated with a vital measure of job performance, the test is said to demonstrate criterion validity. For example, if all the good salespeople scored well on a sales-related assessment, then that test would demonstrate criterion validity.

Construct Validity: The term construct is a technical term for personality traits such as empathy and detail orientation. Construct validity is demonstrated if a test measures traits that have been found to influence successful performance of a job. For example, if a trait called organization on an assessment affects how well administrative assistants do their jobs, then the assessment is likely to have construct validity.

Content Validity: Content validity is demonstrated if the questions that make up an assessment are representative of content that is required to perform a particular activity or task. A test made up of programming terms given to a candidate for a programmer position would demonstrate content validity.
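
To make the criterion validity idea concrete, here is a minimal sketch of checking how strongly assessment scores track a job performance measure. It uses a plain Pearson correlation, which is one common way to express "significantly correlated"; the function name `pearson_r` and all of the numbers are hypothetical illustrations, not data from our assessments.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical data (illustrative only): one assessment score and one later
# sales performance figure per salesperson, in the same order.
assessment_scores = [55, 62, 70, 74, 81, 88]
sales_performance = [40, 48, 52, 65, 70, 79]

r = pearson_r(assessment_scores, sales_performance)
print(f"correlation between score and performance: {r:.2f}")
```

A value near 1 means high scorers tended to be high performers; a value near 0 means the score says little about performance. A real study would also need enough people for the result to be statistically meaningful.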

From a legal standpoint, our assessments and our skill and knowledge tests fall well within all these guidelines and comply with ADA, EEOC, anti-discrimination, and other federal and state regulations.

Predictive Validity

There is another popular type of validity that is more specific than the EEOC guidelines. It is called predictive validity. It follows some of the same requirements as the criterion and construct guidelines. Predictive validity requires that the employer have an applicant complete the assessment but not mark it or use any information from it. After the person has worked for the employer for approximately 6 to 9 months, the employer rates whether he or she thinks the person turned out to be successful. Only then does the employer have the assessment marked. The standards of a successful person for the position are then noted.

The Wimbush Assessments are validated using a slight variation of the predictive method that still falls within the EEOC guidelines. Our clients use the results immediately. Then, 6 months to a year after the person was hired, we see what happened. By then, most of our clients have forgotten what the results were. So the way we do our validation seems objective. We also profile the unsuccessful employee based on who was fired and who quit. The ones who quit could have had good reasons, so we use discernment on those.

Interviews Versus Assessments

Back in our early days, these were some of the problems our clients were having with their hiring processes. They would hire someone they thought was a superstar who turned out to be average. They would hire salespeople who couldn’t sell. They would hire customer service people who frustrated customers, or who couldn’t handle a customer’s concern and needed a supervisor to step in. Or they thought someone would get along with the team, but the person didn’t. These clients were honest enough to take a hard look at the people they had hired. When they started using our assessments, they told us the assessments were like x-rays that could look right through candidates to who they really were. The advantage to them was that some really poor candidates were exposed, and they received help on what to examine further. Our support service, already included in the price of our assessments, is still offered today.

Our Assessment Accuracy and History of Improvements

Through 30+ years of validating our assessments, we have become able to predict which employees will succeed on the job with up to 90% success.

Our accuracy began to increase rapidly after some more adjustments, and in 1998 the results improved to 70%, which is an acceptable industry average for assessments. At that point we had met the acceptable validation standard for an assessment. However, we were never satisfied with the 30% failures.

Then in 2003 we made a breakthrough that shocked us. We already knew that every candidate fills in assessments differently. Some exaggerate and some are honest. Some are humble and some think too highly of themselves. And certain personality types answer attitude questions quite differently from others. So we had at least three variables in the way candidates answer assessment questions. We thought that with modern-day computers we just might be able to allow for these factors and get a better result.

We researched ways of programming computers with the most modern software and were able to achieve much greater accuracy. The breakthrough in 2003 was valid and valuable.

After months of setting up our computer software, we were able to simultaneously evaluate hundreds of assessments of people who had been hired and on whom clients had given us feedback. We were able to see the effect of embedded lie questions on the bottom-line predictions. We could see how some questions worked with others and how changing the weightings (influences) produced better computer-generated results. Some questions that had seemed to work well by themselves proved worthless to the bottom-line results. When you think of comparing one change to 72,000 answers from 300 candidates and over 30 traits, the computations run into the billions. This could never be done without computers; not even a team of researchers could do it in a lifetime.

In 2007 I made another breakthrough regarding predictive research rather than hindsight research. For example, buying stocks or real estate looks so easy in hindsight. Even looking at the past behavior of stocks seems a good method of knowing what stocks to invest in until you try it for real. For me it was like looking in a stock chart book for 1995 and deciding which were the best stocks to buy, then looking in the 1996 book and realizing how little I knew. This is not a perfect analogy for pre-employment testing, but I did something similar.

Instead of doing research on everyone our clients had given us feedback on, I did it on only 80% and kept the other 20% hidden. This is a single “blind” testing method, similar to making your prediction from the 1995 stock book alone, without peeking at the 1996 book. The first time I did this, I found the accuracy levels were not as good as I had thought. It caused me to look even harder into the basic predictable factors of the assessments.
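
The 80/20 idea can be sketched in a few lines. This is only an illustration under stated assumptions: the records, the single score cutoff used as the "prediction rule", and the function names `split_holdout`, `best_cutoff`, and `accuracy` are all hypothetical, and the real research looked at many traits rather than one score.

```python
import random

def accuracy(records, cutoff):
    """Share of records where 'score >= cutoff' matches the known outcome."""
    hits = sum((score >= cutoff) == succeeded for score, succeeded in records)
    return hits / len(records)

def best_cutoff(research):
    """Pick the cutoff that classifies the research (visible) set best."""
    candidates = sorted({score for score, _ in research})
    return max(candidates, key=lambda c: accuracy(research, c))

def split_holdout(records, holdout_fraction=0.2, seed=0):
    """Shuffle a copy of the records and hide a fraction of them."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_hidden = round(len(shuffled) * holdout_fraction)
    return shuffled[n_hidden:], shuffled[:n_hidden]  # (research, hidden)

# Hypothetical feedback records: (assessment score, did the hire succeed?).
# Two outcomes are deliberately flipped so the data is not perfectly clean.
NOISY_SCORES = {44, 72}
records = [(s, (s >= 60) != (s in NOISY_SCORES)) for s in range(30, 100, 2)]

research, hidden = split_holdout(records)
cutoff = best_cutoff(research)            # tuned on the 80% only
hidden_accuracy = accuracy(hidden, cutoff)  # checked once on the hidden 20%

print(f"accuracy on research set: {accuracy(research, cutoff):.0%}")
print(f"accuracy on hidden set:   {hidden_accuracy:.0%}")
```

The point of the design is that the hidden 20% plays no part in choosing the rule, so its accuracy is an honest estimate of how the rule would do on new candidates.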

In the first two months of 2008 I was finally able to crack this problem, using many more samples my clients had given me in response to my yearly request for feedback. It also allowed me to validate and use several questions that had been on the assessments for years. But the biggest breakthrough came from the use of the “double blind” testing method.

The Details of the Double Blind Research

We select people who took our assessments before they were hired, who have been on the job for at least 6 months and up to two years, and whose results our clients can report to us without doubt.

Then we take 30% of that group and set it aside. It is not used in any of the research. The idea is that once we have worked out a promising marker, it should accurately tell us how that 30% turned out in reality. We do this in two stages, hence the term double blind. Instead of testing the full 30% at once, we test the marker on the first 15%. If it succeeds there, we then try it on the second 15%. If the first test fails, we go back to the drawing board. Only when the marker predicts both groups accurately do we consider it a valid marker.
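
The two-stage check described above can be sketched as follows. The 15% blinds come from the text; everything else is an assumption made for illustration: the records, the example marker, the function names, and in particular the 0.9 pass threshold, which is a placeholder for whatever "accurately" means in a given study.

```python
def marker_accuracy(marker, group):
    """Share of people in the group whose outcome the marker predicts."""
    hits = sum(marker(score) == succeeded for score, succeeded in group)
    return hits / len(group)

def set_aside_blinds(records, blind_fraction=0.15):
    """Split already-shuffled records into research data and two blinds."""
    n = round(len(records) * blind_fraction)
    return records[2 * n:], records[:n], records[n:2 * n]

def two_stage_blind_check(marker, blind_one, blind_two, required=0.9):
    """Accept a marker only if it predicts both blind groups accurately.

    The second blind is not touched if the first one fails, so it stays
    unseen for the next attempt. The 0.9 threshold is an assumed value.
    """
    if marker_accuracy(marker, blind_one) < required:
        return False  # back to the drawing board
    return marker_accuracy(marker, blind_two) >= required

# Hypothetical, pre-shuffled records: (assessment score, did the hire succeed?)
records = [
    (72, True), (41, False), (65, True), (58, False), (80, True),
    (47, False), (90, True), (35, False), (61, True), (55, False),
    (77, True), (44, False), (68, True), (52, False), (83, True),
    (39, False), (74, True), (49, False), (66, True), (57, False),
]

research, blind_one, blind_two = set_aside_blinds(records)

def candidate_marker(score):
    """Hypothetical marker, assumed to have been developed on `research` only."""
    return score >= 60

accepted = two_stage_blind_check(candidate_marker, blind_one, blind_two)
print("marker accepted" if accepted else "marker rejected")
```

Keeping the second 15% untouched until the first test passes means there is always a group the marker has never been adjusted against.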

Competent Producers

Just because people have high integrity and great attitudes does not mean they are competent producers. From my observation, most of those looking for jobs are either competent or have a good attitude, and a great hire is both. An even bigger percentage of the unemployed and employed are borderline incompetent or have borderline attitude problems. If these borderline people are hired, their minuses usually outweigh their pluses, and they stay around for a long time without helping the organization prosper. They are difficult to let go because they usually don’t give an employer a black and white reason to take such an action. From my more than 30 years of coaching business leaders, I have found that only the best employees make a company profitable. The borderline or worse, if in big enough numbers, will drive a company into bankruptcy. Even if you say, “If we have a good product or service and it is well marketed, it will make up for the borderline employees,” you still need to realize that it takes good employees to make good products, deliver good service, or come up with marketing ideas.

To validate the accuracy of predicting an applicant’s ability to produce, we look at two main factors. One is the right personality profile for the position, and the other is the person’s general competence. Again, a person can have the right personality for a position but not be competent, and in many cases an applicant has the right personality for the job but is incompetent at it. Using the double blind method, we were able to predict accurately 90% of those in the blinds, provided the candidate neither exaggerated nor was too brutally honest when filling in the questionnaire. The number falls to 70% when the exaggeration is severe. The use of skill and knowledge tests can greatly help competency predictions.

Our Simple Approach

In order to give our clients accurate and usable predictions, we believe in keeping everything simple. Instead of complicated personality analysis, we use the simple quadrant approach. It is not only simple; it has proven to be more accurate during computer analysis. Because we do it this way, someone who exaggerates when filling in the questionnaire does so on each quadrant. We make adjustments for this, but in the end we only consider how one quadrant compares to the other quadrants.

For example, let us say we are looking for good engineers. We know from experience they need to be in or very close to the Analyzer/Logical quadrant. In the wrong quadrant they will not enjoy doing their work, quality will suffer, and turnover will be higher. If the applicants are brutally honest, they will be brutally honest on all the quadrants, but their true quadrant, Analyzer/Logical, will still be higher than the others.
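
Here is a small sketch of why comparing quadrants to each other holds up when a candidate is uniformly harsh or generous. It assumes, purely for illustration, that exaggeration or brutal honesty shifts every quadrant score by the same amount. Only the Analyzer/Logical name comes from the text; the other quadrant labels, the scores, and the function names are hypothetical.

```python
def dominant_quadrant(scores):
    """Name of the quadrant with the highest score."""
    return max(scores, key=scores.get)

def relative_profile(scores):
    """Each quadrant's score measured against the candidate's own average."""
    mean = sum(scores.values()) / len(scores)
    return {name: value - mean for name, value in scores.items()}

# Hypothetical raw scores for one engineering candidate.
honest = {
    "Analyzer/Logical": 62,
    "Quadrant B": 48,   # placeholder label, not from the source
    "Quadrant C": 41,   # placeholder label, not from the source
    "Quadrant D": 39,   # placeholder label, not from the source
}

# Assumed model of exaggeration: every quadrant is inflated by 20 points.
exaggerated = {name: value + 20 for name, value in honest.items()}

# Assumed model of brutal honesty: every quadrant is deflated by 15 points.
brutally_honest = {name: value - 15 for name, value in honest.items()}

for label, scores in [("honest", honest),
                      ("exaggerated", exaggerated),
                      ("brutally honest", brutally_honest)]:
    print(label, "->", dominant_quadrant(scores))
```

Under this assumption the raw numbers change but the relative profile does not, so the true quadrant still comes out on top. Real answers are rarely shifted this evenly, which is why adjustments are still needed.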

I finally figured out how to evaluate the accuracy of the quadrants by using computers. I did this in late 2007, before I received the huge amount of feedback that my clients sent soon afterward. I then compared how the computer-evaluated quadrants would have worked on those people. It was amazingly accurate.

Yes, these assessments do what we say they will do, and the accuracy rates for competency and attitude are close to 90%. Results are less accurate only for candidates who exaggerate heavily or answer inconsistently, because for them we can only give a range of how they are likely to turn out.