ATD Blog
Keys for Résumé Wordsmithing: Matching Keywords, Partial Keywords, and Keyword Logic
Tue May 29 2012
Conceptually, job keywords make sense to everyone. However, when you dig into the details of how computers identify keywords, many challenges arise. I am going to explain the logic to clear up the confusion and hopefully start an ongoing discussion about keywords. I will use layman’s terms rather than the vocabulary of computational linguistics.
Computers Look for Different Keywords than Humans
Much of the advice about job keywords suggests focusing on industry keywords and functional keywords. While this advice is still valid, a new era has emerged in which computers look for keywords in a résumé before a hiring manager reviews it.
Unlike humans, computers do not try to decipher meaning from individual words (e.g., does “manage” mean managing people or managing products). Instead, they apply complex mathematical formulas to determine the words and phrases that can precisely and compactly represent the content of the job description. Then, these phrases are searched for in the résumé. Based on the search, a complex ranking system is used to compare one candidate’s résumé to another’s. Complex? Yes!
A good way to see how a computer identifies keywords is to work through a sample job description. Let’s focus on just three lines of the Requirements section:
Requirements:
Bachelors degree in a relevant scientific discipline or equivalent.
At least 2 years of relevant experience as a CRA in the biotech / pharmaceutical industry.
3+ years CRA experience is preferred; Knowledge of GCP and ICH guidelines.
Computers identify keywords by determining how often phrases are used across other job descriptions. The computer then looks for those phrases in a candidate’s résumé and ranks the candidate based on what it finds. To understand the process, we can break it into four parts.
1. Identifying the Keyword Phrases
The computer begins by analyzing the job description to identify all the phrases in it. What is a “phrase”? A “phrase” is one or more words in succession from the job description. Phrases can be single words like “CRA,” from our example, or longer strings of words like “Bachelors degree in a relevant scientific discipline.”
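To make the idea of a “phrase” concrete, here is a minimal Python sketch that lists every run of consecutive words in a line of the job description. The function name `extract_phrases`, the five-word cap, and the choice to stop phrases at sentence punctuation are my own assumptions for illustration, not a description of how any particular screening system works.

```python
import re

def extract_phrases(text, max_words=5):
    """Return every run of 1..max_words consecutive words in each sentence."""
    phrases = []
    # Assumption: phrases do not cross sentence-ending punctuation.
    for sentence in re.split(r"[.;!?]+", text):
        # Keep letters, digits, and "+" (so "3+" survives); drop "/" and commas.
        words = re.findall(r"[A-Za-z0-9+]+", sentence)
        for start in range(len(words)):
            last = min(start + max_words, len(words))
            for end in range(start + 1, last + 1):
                phrases.append(" ".join(words[start:end]))
    return phrases

line = ("At least 2 years of relevant experience as a CRA "
        "in the biotech / pharmaceutical industry.")
phrases = extract_phrases(line)
print(len(phrases), "phrases, for example:", phrases[:5])
```

Even one line produces dozens of candidate phrases, which is why the next step is needed to decide which ones matter.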
2. Determining the Frequency of the Phrases
With the phrases identified, the computer next counts how many times each phrase is found in all the other job descriptions. The more often a phrase is found, the higher its frequency score; the less often it is found, the lower the score. For instance, let’s look at the first two lines of our sample job description.
Requirements:
Bachelors degree in a relevant scientific discipline or equivalent.
At least 2 years of relevant experience as a CRA in the biotech / pharmaceutical industry.
If we had 10 other job descriptions and counted the frequency of the phrases, we might end up with something like this:
Phrase | Frequency
Bachelors | 10
Bachelors degree | 10
Bachelors degree in | 10
Bachelors degree in a | 8
relevant | 10
relevant scientific | 7
relevant scientific discipline | 5
relevant scientific discipline or equivalent | 4
At | 10
At least | 8
At least 2 | 3
At least 2 years | 3
At least 2 years of | 3
relevant | 10
relevant experience | 10
relevant experience as | 10
relevant experience as a | 10
CRA | 2
CRA in | 1
CRA in the | 1
CRA in the biotech | 1
CRA in the biotech pharmaceutical | 1
The computer starts with a single word and counts its frequency (i.e., how many times it was found in all job descriptions). Then it adds the next word, and the next, getting a new count for each longer phrase.
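The counting step can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the three-description corpus is made up, the helper names `phrases_of` and `document_frequency` are hypothetical, and I count each phrase once per job description (rather than once per occurrence), which is one reasonable reading of “frequency” here.

```python
from collections import Counter

def phrases_of(text, max_words=4):
    """Set of all runs of 1..max_words consecutive words (case-insensitive)."""
    # Simplification: punctuation is replaced by spaces, so phrases may cross sentences.
    words = text.lower().replace("/", " ").replace(".", " ").split()
    return {
        " ".join(words[i:j])
        for i in range(len(words))
        for j in range(i + 1, min(i + max_words, len(words)) + 1)
    }

def document_frequency(job_descriptions, max_words=4):
    """Count how many job descriptions contain each phrase."""
    counts = Counter()
    for description in job_descriptions:
        # A set is passed in, so each phrase adds 1 per description at most.
        counts.update(phrases_of(description, max_words))
    return counts

# Made-up corpus for illustration only.
corpus = [
    "Bachelors degree in nursing. At least 5 years of relevant experience as a nurse.",
    "Bachelors degree in a relevant scientific discipline. Relevant experience as a CRA.",
    "Bachelors degree in accounting. Relevant experience as a controller.",
]
df = document_frequency(corpus)
for phrase in ("bachelors degree", "relevant experience as", "cra"):
    print(phrase, "->", df[phrase])
```

Generic phrases such as “bachelors degree” show up in every description, while “cra” shows up in only one.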
Once the frequencies are determined, the computer decides which phrases are the keywords for a job. With the information above, we could claim that the phrases that show up less frequently are the most important for this job, and the phrases that show up more frequently are too generic. For instance, if a phrase appears in 10 job descriptions, we may decide it is not important (this is the case with “Bachelors degree”). However, the phrase “CRA in the biotech pharmaceutical” appears only once, so it is distinctive to this job. Therefore, we could assert that “any phrase with a count of three or less is a keyword phrase.”
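The “count of three or less” rule is easy to express in code. The sketch below applies it to a handful of counts copied from the table above; the function name `select_keywords` and its `max_count` parameter are hypothetical, and a real system would likely use a more elaborate cutoff.

```python
# A few phrase counts taken from the frequency table in this post.
frequency = {
    "Bachelors degree": 10,
    "relevant experience as": 10,
    "At least": 8,
    "relevant scientific discipline": 5,
    "At least 2 years of": 3,
    "CRA": 2,
    "CRA in the biotech pharmaceutical": 1,
}

def select_keywords(frequency, max_count=3):
    """Keep phrases found in at most max_count job descriptions, rarest first."""
    by_rarity = sorted(frequency.items(), key=lambda item: item[1])
    return [phrase for phrase, count in by_rarity if count <= max_count]

keywords = select_keywords(frequency)
print(keywords)
```

With these counts, the rule keeps “CRA in the biotech pharmaceutical,” “CRA,” and “At least 2 years of,” and drops the generic phrases.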
But these don’t look like keywords. They appear to be incomplete phrases or gibberish because the added words in a phrase are what make it less frequent. The computer is not looking for grammatical or commonly used phrases. For example, you may believe “2 years of experience” is the keyword, but a computer may say “At least 2 years of” is the keyword because “2 years of experience” shows up in too many job descriptions.
3. Searching Résumés for Keyword Phrases
Once the computer has a set of keyword phrases, it searches a résumé for them. If it finds a match or partial match, it gives the match a score. Let’s use “at least 2 years experience” as the keyword phrase. If the résumé had “I have more than 2 years experience in…”, we would get a partial match, with “2 years experience” being the overlap. Changing the tense of a word, or adding or removing a plural or possessive, will also result in a partial match. Partial matches are not bad. It is unlikely any résumé will match the job description exactly without copying it word for word. Therefore, the goal is to eliminate any missing keywords and fill your résumé with matched and partially matched keywords.
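One simple way to picture a partial match is to find the longest run of words that the keyword phrase and the résumé share, then score it as a fraction of the phrase. The Python sketch below does exactly that; the function names, the scoring fraction, and the ending of the sample résumé sentence are my own assumptions. It compares whole words only, so it does not handle changes of tense or plurals, which a real system would need some form of word normalization to catch.

```python
def longest_overlap(phrase, resume_text):
    """Longest run of consecutive words shared by the phrase and the résumé."""
    p = phrase.lower().split()
    r = resume_text.lower().split()
    best = []
    for i in range(len(p)):
        for j in range(len(r)):
            k = 0
            # Extend the run while the words keep matching.
            while i + k < len(p) and j + k < len(r) and p[i + k] == r[j + k]:
                k += 1
            if k > len(best):
                best = p[i:i + k]
    return best

def match_score(phrase, resume_text):
    """1.0 for an exact match, 0.0 for no overlap, a fraction for partial matches."""
    return len(longest_overlap(phrase, resume_text)) / len(phrase.split())

keyword = "at least 2 years experience"
resume = "I have more than 2 years experience in clinical monitoring"
print(longest_overlap(keyword, resume), match_score(keyword, resume))
```

Here the overlap is “2 years experience,” three of the five words in the keyword phrase, so the partial match scores 0.6.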
4. Assigning a Résumé Rank
Once the computer has a list of all the matched and partially matched keywords, the computer assigns a rank or value. The rank is weighted based on the matches and the frequency of the keyword phrase. A keyword phrase that is less frequently found will get a higher weight than a keyword phrase that is more frequently found. An exact match will get a higher weight than a partial match. Within the partial match, the closer to the exact phrase you can get, the higher the rank. The computer takes all of these into account and assigns a weighting.
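As a rough sketch of how these weights could combine, the Python below multiplies each keyword’s rarity (one divided by its frequency) by the quality of the match and adds up the results. The formula, the function name `resume_score`, and the two sample résumés are assumptions chosen to illustrate the idea; actual systems use their own, more complex weightings.

```python
def resume_score(keyword_frequency, match_quality):
    """Sum of rarity * match quality over all keyword phrases.

    keyword_frequency: phrase -> number of job descriptions containing it
    match_quality: phrase -> 0.0 (missing) up to 1.0 (exact match)
    """
    total = 0.0
    for phrase, count in keyword_frequency.items():
        rarity = 1.0 / count  # rarer phrases carry more weight
        total += rarity * match_quality.get(phrase, 0.0)
    return total

# Keyword phrases and counts from the example in this post.
keyword_frequency = {"CRA in the biotech": 1, "CRA": 2, "At least 2 years of": 3}

# Hypothetical match results for two résumés.
resumes = {
    "resume_a": {"CRA": 1.0, "At least 2 years of": 0.5},
    "resume_b": {"At least 2 years of": 1.0},
}

ranked = sorted(
    resumes,
    key=lambda name: resume_score(keyword_frequency, resumes[name]),
    reverse=True,
)
print(ranked)
```

In this sketch, the résumé with an exact match on the rarer phrase “CRA” ranks above the one that only matches the more common phrase.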
For every résumé that comes in, a rating can be assigned.
Does it work?
Using our example, let’s do a simple test to see if the process works. Let’s assume we get hundreds of résumés. If 10 résumés have “CRA” or “CRA in the biotech” versus 100 résumés that have “Bachelor’s Degree,” a hiring manager could quickly narrow the applicant list to just 10. While the hiring manager may miss out on a strong candidate who does not have this term, they do avoid having to read through hundreds of résumés.
There are plenty of arguments about why this may not result in the best hiring decisions, but in today’s economy, where employees are required to do more with less, these systems are here to stay.