Admittedly, it is anachronistic: the written word is still the core of search engine optimisation. Google still cannot reliably interpret images or analyse video content unless that content is accompanied by suitable text. Text is, and until further notice remains, the only source of content information for Google’s bots.
The perfect Google text has always been a heatedly discussed topic. The importance of search terms, and of their combination into a search phrase, has never been disputed. Google collects and categorises content based on keywords so that it can assign that content to a suitable search query. Specialists are divided, however, on questions such as how often a keyword should ideally appear in a text. The measure used for this is “keyword density”: the number of keyword occurrences in a text relative to its total number of words, expressed as a percentage. In the early days of search engine optimisation, a density of more than 3% was still considered ideal. Texts with an even higher keyword density were rare, because they ran against people’s reading habits and were bursting with word repetitions. Cramming a keyword into a text as often as possible became known as “keyword stuffing”.
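The keyword density described above can be sketched in a few lines of Python. This is only an illustration: the function name `keyword_density`, the simple word-splitting rule and the sample sentence are assumptions made for this example, not part of any SEO tool.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Return the share of words in `text` that equal `keyword`, in percent."""
    # Assumption: a "word" is any run of letters, digits or underscores.
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return 100.0 * hits / len(words)


# Hypothetical, deliberately "stuffed" sample sentence.
sample = "Cheap shoes online: buy cheap shoes and more shoes today"
print(f"{keyword_density(sample, 'shoes'):.1f}%")
```

In this sample, three of the ten words are the keyword, which is far above the 3% mentioned above and reads accordingly.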
As Google’s algorithm became better at interpreting the keywords that shape a text, the required keyword density could be reduced further and further. As early as 2013, many search engine experts suggested that the concept of keyword density should be dropped from the SEO vocabulary entirely. Google now handles the semantic interpretation of texts very well. Since the presentation of the new Google algorithm last year, it has been clear that the search engine giant strives for perfection. Google is getting better and better at recognising words and formulations with similar meanings. Synonyms can now be used without any disadvantage in the ranking. The good news for all friends of good language: the result of this development is readable texts. The focus of SEO texts is now on quality and added value for the reader.
Keyword density as a discontinued model, WDF*IDF as the new star
Apart from the debate over “ideal” values, keyword density has always had serious weaknesses. One of them is that it says nothing about how often a particular keyword occurs in relation to other words. A text written for keyword X with a density of three percent might contain keyword Y in a comparably high concentration. Many texts are therefore not clearly and optimally focused on a single keyword. This leads to fuzziness.
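The weakness is easy to reproduce. The following sketch lists the densest words of a text; the helper name `top_densities` and the sample text are hypothetical and serve only to show that two different words can reach the same density in one text.

```python
import re
from collections import Counter


def top_densities(text: str, top: int = 3) -> list[tuple[str, float]]:
    """Return the `top` most frequent words with their density in percent."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return []
    counts = Counter(words)
    return [(word, 100.0 * n / len(words)) for word, n in counts.most_common(top)]


# Hypothetical text intended for the keyword "hotel".
sample = "Hotel Berlin offers rooms. Book a hotel in Berlin. Berlin hotel deals"
for word, density in top_densities(sample, top=2):
    print(f"{word}: {density:.1f}%")
```

Both “hotel” and “berlin” account for the same share of the words here, so the density figure alone cannot tell which keyword the text is actually about.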
It is more precise and efficient to evaluate keywords using the formula WDF*IDF. WDF stands for “Within Document Frequency”: the frequency of a specific word is calculated in relation to all other words appearing in the same text. The more often a keyword appears in a text, the higher its WDF value. This value is multiplied by the “Inverse Document Frequency” (IDF), which describes the significance of the keyword in relation to the total number of documents available in a database, in our case Google. The more documents contain a given keyword, the more difficult it is to achieve relevance for it.
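A minimal sketch of the calculation might look as follows. It assumes the commonly cited textbook variants WDF = log2(frequency + 1) / log2(total words in the document) and IDF = log10(number of documents / number of documents containing the term); commercial SEO tools may use different weightings. The four-document corpus is a hypothetical stand-in for the search engine’s database, and all function names are made up for this example.

```python
import math
import re


def tokenize(text: str) -> list[str]:
    """Split a text into lower-case words (simplifying assumption)."""
    return re.findall(r"\w+", text.lower())


def wdf(term: str, doc_words: list[str]) -> float:
    """Within Document Frequency, assumed as log2(freq + 1) / log2(length)."""
    if len(doc_words) < 2:
        return 0.0  # log2(1) is 0, so avoid dividing by zero
    return math.log2(doc_words.count(term) + 1) / math.log2(len(doc_words))


def idf(term: str, corpus_words: list[list[str]]) -> float:
    """Inverse Document Frequency, assumed as log10(N / documents with term)."""
    containing = sum(1 for words in corpus_words if term in words)
    if containing == 0:
        return 0.0  # term does not occur anywhere in the corpus
    return math.log10(len(corpus_words) / containing)


def wdf_idf(term: str, doc_words: list[str], corpus_words: list[list[str]]) -> float:
    return wdf(term, doc_words) * idf(term, corpus_words)


# Hypothetical mini corpus standing in for the full document database.
corpus = [
    "running shoes for trail running",
    "buy shoes online",
    "trail maps and hiking routes",
    "shoes shoes shoes",
]
corpus_words = [tokenize(doc) for doc in corpus]
page = corpus_words[0]

for term in ("running", "shoes"):
    print(term, round(wdf_idf(term, page, corpus_words), 3))
```

In this toy corpus, “shoes” appears in three of the four documents and therefore receives a low IDF, while “running” appears only in the analysed page and scores higher. This mirrors the point above: the more documents contain a keyword, the harder it is to achieve relevance for it.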
In SEO practice, the WDF*IDF formula has proven to work much better than the traditional measurement of keyword density. For most keywords, a clear correlation can be established between WDF*IDF and the ranking of the page on Google. It should be kept in mind that keywords are only one among many hundreds of ranking signals. That a correlation is visible nonetheless shows even more clearly how important the right keywords are for a website.
For the website operator, content optimisation according to the WDF*IDF formula does of course mean additional expense. An exact calculation of the ideal value requires not only determining appropriate keywords for the offer or product, but also analysing how competitors use them. This task should ideally be handled by an SEO agency, since it relies on professional tools. The effort is worthwhile, however, because the method allows a precise estimate of the costs and benefits of SEO measures.