In this post, I am going to talk about automated spelling correction. Let’s say you are writing a document on your computer, and instead of typing “morning”, you accidentally type “mornig”. If you have automated spelling correction enabled, you will probably see “mornig” change to “morning” automatically. How does this work? How does your computer know that when you typed “mornig”, you actually meant “morning”? That is what we are going to look at.
Spelling mistakes could turn out to be real words!
Before we go through how spelling correction works, let’s think about the complexity of this problem. In the previous example, “mornig” was not a real word, so we knew it had to be a spelling mistake. But what if you misspelled “college” as “collage”, or “three” as “tree”? In these cases, the word you typed incorrectly happens to be an actual word itself! Correcting these types of errors is called real word spelling correction. On the other hand, if the error is not a real word (like “mornig” instead of “morning”), correcting it is called non-word spelling correction. Real word spelling correction is the harder of the two, because every word you type could be an error, even if it is spelled correctly as a word in its own right. For example, the sentence “The tree threes were tail” makes no sense because every word except “the” and “were” is an error, even though they are all actual words. The intended sentence is “The three trees were tall”. In this post, I am going to talk about non-word spelling correction and a basic approach to it.
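To make the distinction concrete, here is a minimal sketch of how a checker might detect non-word errors: look each word up in a vocabulary and flag anything missing. The `VOCABULARY` set and the `find_non_words` function are hypothetical names made up for this illustration, and the word list is a toy one. A real checker would load a full dictionary.

```python
import re

# Hypothetical toy vocabulary (an assumption for this example only).
VOCABULARY = {
    "the", "three", "threes", "tree", "trees", "were", "tall", "tail",
    "good", "morning", "college", "collage",
}

def find_non_words(text, vocabulary=VOCABULARY):
    """Return the lowercased tokens of `text` that are not in `vocabulary`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in vocabulary]

# A non-word error is caught by a simple lookup.
print(find_non_words("Good mornig"))                # ['mornig']

# Real word errors slip through: every token is a valid word.
print(find_non_words("The tree threes were tail"))  # []
```

The second call returns an empty list even though the sentence is wrong, which is exactly why real word errors need more than a dictionary lookup.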