Voice Search Optimization (Why We’re Getting Voice Search All Wrong)
Voice search is the latest and greatest craze to hit the SEO world. If you’ve been keeping up on your SEO reading, you have very likely come across an article testing out a voice search device, be it Google Home, Amazon Echo, or something similar. But this latest phenomenon to enter the search universe has brought with it concern over how to optimize for the new search medium. The SEO community has almost uniformly settled on one approach to voice search optimization, but what if we’re all wrong? Just in case you were wondering, I think we’re way off base here as an industry, and in this two-part series, I’ll tell you why.
In Part I of my voice search analysis, I’ll highlight the meta-problems that present themselves when trying to equalize oral and written language. In the upcoming installment, I’ll get into the nitty-gritty and show how the two forms of language are structurally incompatible, emphasizing how dialect, culture, and the overall linguistic structure of the two forms of communication would make dual optimization essentially impossible.
The Basic and Faulty Premise in Optimizing for Voice Search
I never thought of myself as a language geek before, that is, until I started to see what the SEO community was saying about voice search. How are we supposed to optimize for voice search now that the medium is becoming more and more common? C’mon, you can say it… long-tail keywords and less formal content that is not as hyper-focused on concentrated keyword phrases. I must have read a dozen articles on the topic that said the same thing and didn’t even think twice about it. That is until I had a “Yoda moment” and remembered my training… my training as an English teacher within the urban environment.
One of the first things you learn about teaching students in the urban setting is that their speech patterns are very much culturally influenced and differ vastly from what would be considered “standard English” (should such a thing actually exist). In other words, and it’s something I saw each and every day in the classroom, the way people speak is not the way they communicate in writing. The two are almost distinct entities.
The Problem with Long Tail Keywords and Less Formal Content
Considering my struggle as a teacher in the inner-city, and seeing the disconnect between written and oral language on a daily basis, a light bulb went off almost instantly. There is a problem with saying that voice search can be optimized for by targeting long tail keywords: oral language is not only quantitatively different from written language, but qualitatively different as well. Oral language is so essentially different from its pen-and-paper counterpart that just saying it’s a bit less formal is a monstrous understatement.
We in the SEO community have oversimplified a linguistic difference that can fill up a library and constitutes a study that began in the 1960s. We, through no fault of our own, have boiled down a topic worthy of a doctoral thesis into more long tail keywords and more informal content. However, the differences between a query like universe origins and how did the universe get here go far beyond their inequality in length and formality.
The most notable difference in the above searches is that there is a different featured snippet for each search, which is extremely telling.
Google shows a different featured snippet when oral type language is employed than when a more traditional search is executed
Also, you’ll notice in the image below that the first search, the more traditional search, produced a series of Related Questions. However, what is most striking is that when oral language was employed (i.e. the search for how did the universe get here), Google included results related to God’s existence/role in creation, whereas the traditional search did not. Google was evidently a bit unclear about my intent in the second search and, as such, included some results that go in a somewhat different direction to compensate.
Within the context of a voice-like search Google removed the Related Questions feature and inserted results that indicated it was unsure of user intent
So then, what is this great divide that separates verbal and written communication?
The Real Differences Between Traditional Search and Voice Search
There are, in all honesty, a lot of significant differences between how we communicate verbally relative to how we communicate on paper, too many to discuss here and now. Let me then begin with some of the “meta” differences, the foundational sort of distinctions between the two methods of communication.
Writing Is Unnatural
Writing is like taking steroids…it’s not really natural. It sounds like an odd sort of thing to say, but being able to read and write is not the natural state of humankind.
Once you start thinking about it, this is very much an obvious point: even illiterate people can communicate verbally. All a baby needs to do is hang around some people speaking long enough and language acquisition happens on its own. However, such is not the case with writing; just ask any 2nd grader. This in and of itself is indicative of the major differences between communicating verbally and communicating via the written word.
Writing and Speaking Use Different Parts of the Brain
To qualify this notion that speaking and writing have their own orbits, a 2015 study out of Johns Hopkins University found that each exercise uses a different part of the brain. The study found that it is conceivable to sustain brain damage that would limit speaking but that would simultaneously leave the part of the brain that regulates writing unaffected (and vice versa).
Speaking and Writing Exist Within Different Contexts
Simply stated, there are different situations where one form of communication may be more appropriate and effective than the other. Speech in particular exists within a conversational context. Speech is designed to be interactive, and as such it has various structural divergences when compared to writing (which I’ll discuss later on and show why this has a major potential SEO impact). Speaking to the context in which speech is designed to be employed, a study out of the University of Illinois dating back to 1977 (like I said, this is not a new issue and is one that occupies decades of research) is quoted as saying “the two modes [speech and writing] are by no means interchangeable: Some situations and purposes call for spoken communication and others for written”. This sort of incongruity is not in line with the notion that, with a few small adjustments to the length of the keywords we’re tracking, we can bridge the gap between traditional and voice search.
Why Understanding Language Matters for Your SEO
You can’t just adjust and make some alterations to your SEO strategy in order to successfully optimize for voice search. Because it is verbal communication, voice search is an entirely different beast than the search you and I have known all of these years. You can’t cater to voice search by making some slight adjustments to your SEO course. You need to consider an entirely new way of going about search and optimize for that, which practically speaking is problematic.
We’re operating under a false notion that there is some sort of linguistic “string theory” that can unite verbal and written communication into one seamless fabric. A 2014 study from the University of Leeds perhaps said it best when it concluded that, “Spoken English varies more than written English, and more than people realize…. There is no straightforward relationship between spoken and written English.” (Download the University of Leeds study.)
The idea that we’re going to easily optimize for both forms of search is not true. Unless Google determines a way to comprehend and translate between the two forms of communication, simultaneous optimization is going to be quite difficult indeed (if at all possible…more on that shortly). The question for us is, does the SEO community really appreciate this difference?
Appreciating Just How Different Voice Search Is
You might ask, “Can’t we solve this ‘voice search discrepancy’ by training ourselves over time to talk the way we write?” In other words, why can’t we bridge the divide between the two methods of search simply by taking a more formal oral tone when doing a voice search? Well, this may help in certain cases, for certain queries, but it’s not a systemic solution, simply because it is impossible to speak the way we write. Speaking, as I mentioned earlier, is hardwired into our brains in a way that writing is not. Simply put, you can’t just solve this very complex issue by trying to talk the way you would write (just as you can’t solve it with long tail keywords).
Online Search Is a Language of Its Own
To take this notion one step further, and to really complicate the idea of bridging the optimization gap between oral and written search, some prominent linguists consider online search to be a form of language unto itself. In a lecture given to English language learners in Serbia, famed linguist David Crystal points out that each venue of electronic communication is in a sense a dialect of its own. Specifically, Crystal notes, “every internet domain [medium] that you’re dealing with influences the way in which you use the language.” The way we communicate linguistically when we interact with a search engine is different than any other form of communication. Where else, other than within search, would you ever say things like pizza New York downtown or Pop Rocks Mikey myth?
Voice search, on the other hand, is intended to be completely natural. That’s the whole point… search the way you talk. It’s not as if the task of optimizing for both voice and traditional search is merely the process of integrating informal English into a standard English environment, which is extremely problematic in its own right (as I’ll show in the second installment of this series). Rather, the ideal of optimizing for voice search while simultaneously optimizing for traditional search is the attempt to merge oral language with a very specific, very nuanced, and unique language… search language. Thus the gap between the two optimization efforts is not just wide, but dauntingly expansive. Attempting to create harmony between non-standardized language and standardized language is one thing; trying to create a level playing field where two diverse linguistic entities can co-exist is not really possible.
Highlighting the Differences Between Voice and Traditional Search
Before I wrap this all up, there is one other construct that is in a way unique to oral language: its social structure. As I touched upon earlier, oral language involves the immediacy of two parties interacting with each other. This creates a social and psychological dynamic that has an enormous impact on how oral language is inherently structured.
It’s for this reason that developmental and educational studies highlight the social nature of oral communication (as shown in this downloadable analysis of oral language development of students with special needs). This will be discussed extensively in the next installment of this study, but for now let’s deal with it in a general sense.
For the sake of exploring this notion, let’s just say I was conversing with someone about job loss in Canada. In doing so I might ask, How many Canadians are let go each year? If I were writing the same question down to an abstract audience, my brain may substitute the term let go in favor of the more direct and less wordy lose, as in How many Canadians lose their jobs each year? In fact, in formal writing I would be encouraged to do so.
Now, I’m not saying you will do this, I’m saying you might do this, because when we speak, we often use euphemisms (not to say we don’t when writing, but as previously mentioned doing so is often discouraged). Our brains do this automatically, and they do so for good reason… when we speak it’s usually to another party. In this case, the term let go is a bit softer and more sensitive. This is the reason why we may ask someone when did your relative pass away and not when did they die.
My point here is simple: our brains take different things into consideration when speaking than when writing due to the social associations we have with speech. This meta-difference has deep practical effects. In our case here, we actually get different search results.
For the search How many Canadians lose their jobs each year we get:
A search that employs a standard use of language produces only relevant results
Everything looks fine in the first search that employed a standard form of English. However, once a euphemism common to oral language was employed, Google had no idea what we were talking about, as shown below.
Deviating from the direct language of standard English, employing euphemisms common to oral language had a deep impact on Google’s search results
The featured snippet when searching for how many Canadians are let go each year relates to immigration, and Google even asks if we meant to search for something else entirely. The top results have to do with immigration again, as well as buying and selling used goods (the results towards the bottom of the SERP were more of the same).
Lastly, let’s try a more formal search term, the way we would normally conduct written search, and run a query for yearly employment loss in Canada:
The search results for the query “yearly employment loss in Canada” are relevant, with three of them matching the results for the query “How many Canadians lose their jobs each year”
Here we get results similar to when we searched for How many Canadians lose their jobs each year. In fact, three of the results match the search How many Canadians lose their jobs each year (the last match, How Ontario lost 300,000… was on the SERP when I searched How many Canadians lose their… just at a position not shown in the earlier image). Google does a much better job when not having to factor in characteristics unique to how the brain functions within the context of oral language. Though, I’ll still point out that many of the actual results were not parallel even though they were on topic.
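If you want to put a rough number on this sort of comparison rather than eyeballing screenshots, the idea can be sketched in a few lines of Python. This is only a minimal sketch, assuming you have collected the result titles or URLs for each query by hand. The function name serp_overlap and the placeholder result lists are hypothetical and are not the actual results from the searches above.

```python
def serp_overlap(results_a, results_b):
    """Compare two lists of search results (titles or URLs).

    Returns the results that appear in both lists (in results_a order)
    and the Jaccard similarity: shared results / all distinct results.
    """
    set_a, set_b = set(results_a), set(results_b)
    shared = [r for r in results_a if r in set_b]
    union = set_a | set_b
    jaccard = len(set_a & set_b) / len(union) if union else 0.0
    return shared, jaccard


# Hypothetical placeholder data, NOT the real SERPs discussed above.
written_style_results = ["result-1", "result-2", "result-3", "result-4"]
formal_style_results = ["result-2", "result-3", "result-4", "result-5"]
voice_style_results = ["result-6", "result-7", "result-8", "result-9"]

shared, score = serp_overlap(written_style_results, formal_style_results)
print(shared, round(score, 2))

shared, score = serp_overlap(written_style_results, voice_style_results)
print(shared, round(score, 2))
```

A score near 1 would suggest Google treats the two phrasings as the same request, while a score near 0 would suggest it reads them as different requests, which is the pattern the euphemism search points to.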
To sum it all up, there is a deep meta-linguistic difference that is undeniably present when you compare traditional search to voice search. (Just how far-reaching these differences are in practice will be discussed in Part II of this series.)
Summing It All Up – The Real Problem Voice Search Optimization Presents
Succinctly speaking, the difference between oral and written language relates to the very substance of the two forms of communication. In other words, we’re not dealing with two variations of the same linguistic material, but two distinct and separate substances. Speaking is different than writing on a historical/societal level, on a contextual level, and perhaps most importantly on a neurological level. Relegating voice search optimization to long tail keywords and a less formal tone in order to create search harmony is certainly an oversimplification and most likely a fantasy. The truth is, unless Google can bring synchronicity to the system by being able to “translate” voice search queries to match their written counterparts, the SEO community could be facing an uphill battle should voice search become as popular as many think it will.
Fundamentally speaking, the linguistic constructs that go into voice search are on a meta-level simply incompatible with traditional search… Wait until you see how it all plays out when we look at the actual structure of oral vs. written language, including how regional dialects and cultural influences throw a serious wrench into creating a harmony between the two forms of search… stay tuned!
Darrell creates SEO content for Similarweb, drawing on his deep understanding of SEO and Google patents.