Hollywood’s Love Affair With Fictional Languages

In Avatar, the fictional language Na’vi is built on a painstakingly detailed world.

Fictional language scripts
Erik Carter / The Atlantic

For big fans of James Cameron’s Avatar, the 13-year wait between the original and this year’s sequel probably felt near interminable. But die-hard fans might have counted with a bit more agony, and would say it’s actually been vomrra zìsìt, or “15 years.”

I’m not implying that Avatar rots the brain. Rather, the blue-skinned Na’vi people, who inhabit the planet Pandora in Cameron’s universe, have four digits per hand. As a result, their language—painstakingly built from scratch for the movies—uses base-eight counting instead of the human base-10. Fifteen in Na’vi actually means eight plus five (as opposed to 10 plus five in English), making it the equivalent of our 13.
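The arithmetic above can be checked with a minimal sketch. It assumes only what the paragraph states, that Na’vi numerals are base eight; the function names are illustrative and not part of any Na’vi reference material.

```python
def octal_to_decimal(octal_digits: str) -> int:
    """Read a string of base-eight digits as an ordinary base-10 integer."""
    return int(octal_digits, 8)


def decimal_to_octal(n: int) -> str:
    """Write a base-10 integer as a string of base-eight digits."""
    return format(n, "o")


# "15" in base eight is one eight plus five ones.
print(octal_to_decimal("15"))  # 13
# The 13-year wait, counted the Na'vi way.
print(decimal_to_octal(13))    # 15
```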

During those “15” years, Paul Frommer—the business professor and linguist who developed a complete Na’vi language for the first movie, including its octal counting system—created a distinct dialect for the reef-dwelling clan introduced in Avatar: The Way of Water. Only a few snippets are audible in the three-hour sequel, and Frommer is waiting to release more details to the small but passionate community of Na’vi speakers here on planet Earth—he wants to give them the opportunity to puzzle over its lexical and syntactical variations first.

Commissioning an entirely new language, which felt special for the first Avatar, is becoming a staple for immersive science-fiction and fantasy worlds. We’ve seen the invention of Dothraki and High Valyrian for HBO’s Game of Thrones, spoken and sign languages for the recent Dune remake, and bloodsucker-speak for Vampire Academy, to name only a handful. These languages are as functional as English, with internally consistent rules. In turn, neuroscientists have been able to harness them to better understand how the human brain processes constructed and natural languages, giving us new clues into what, exactly, constitutes a language to begin with.

Whereas the sounds and syntax of natural languages evolve over hundreds of years of unscripted conversation, many invented languages of similar complexity are quite literally scripted, pieced together on short timelines of months or even weeks. It all raises the question: Just how does someone build a fictional language?

The earliest recorded constructed language, or “conlang,” was created in the 12th century by a German nun, Hildegard of Bingen. Scholars still puzzle over the purpose of Bingen’s lingua ignota, preserved in a glossary of about 1,000 words, but its categories and hierarchies, with God and angels on top, suggest religious motivations.

The documented history of sustained, systematic language construction really begins several hundred years later. In the 1600s, as the ideas that would eventually produce the Enlightenment were gaining momentum, philosophers sought to create an ultrarational mode of communication. “The purpose was to find the truth of the universe by finding a language in which you could only express the truth,” says Arika Okrent, a linguist who wrote the landmark history In the Land of Invented Languages.

To create a universally true language would require the categorization of every possible thing and idea. That’s exactly what the British polymath John Wilkins set out to do when he created his “philosophical language,” among the most famous of these attempts, in which he broke down the universe into its most basic units of meaning and laid them out in a monstrous conceptual map. When it came time to link written words to those concepts, Wilkins sought a “real character” composed of symbols that were not surrogates for words or sounds, but that produced meaning through their form. Each word was essentially a coordinate for a concept’s location on Wilkins’s map. The philosophical language translation for “shit,” as Okrent tracks down in her book, is a stringing together of the scriptural representations of category XXXI, or motion (“ce”); subcategory IV, or purgation (“p”); sub-subcategory 9, or gross parts (“uhw”); and finally, the opposite of vomiting (“s”)—all of which combine to form cepuhws, or “shit.”
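The way a Wilkins word acts as a coordinate can be sketched in a few lines. This is only an illustration built from the four steps Okrent describes for this one word; the data structure and function name are assumptions, not a reconstruction of Wilkins’s full system.

```python
# Each step narrows the concept's position on the map and contributes
# one piece of the written word. The labels and symbols are the ones
# quoted for this single example; nothing else about the scheme is assumed.
PATH_TO_CONCEPT = [
    ("category XXXI: motion", "ce"),
    ("subcategory IV: purgation", "p"),
    ("sub-subcategory 9: gross parts", "uhw"),
    ("opposite of vomiting", "s"),
]


def spell_word(path):
    """Join the symbol for each step on the map into a single word."""
    return "".join(symbol for _, symbol in path)


print(spell_word(PATH_TO_CONCEPT))  # cepuhws
```

The sketch also hints at the drawback Okrent points to: a speaker must already know the full path to a concept before the word for it can be spelled.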

Efforts like Wilkins’s were brilliant, even beautiful, and laid the foundation for modern taxonomy. But their high standard for conceptual precision made the actual languages unusable because “you have to know what you want to say before you can put your words together,” Okrent told me. Intellectuals soon lost interest.

Two centuries later, the search for another mode of ideal communication began, more practical but no less lofty in its ambition: a common language that would serve as a vehicle for international peace. Esperanto, invented in 1887 by a Polish ophthalmologist, is the most famous example and is still in use. But that quest, too, was abandoned after horrors throughout the 20th century made clear that linguistic divides were not the root of humanity’s enmity and bloodshed.

The model for modern language creation lies not in philosophy or international relations, but in the work of the Lord of the Rings author, J. R. R. Tolkien—that is, in fantasy. Tolkien, a philologist who helped work on the Oxford English Dictionary, did not design languages for his fictional world and histories, but built a universe around the multiple tongues he had been making since about 1910. (A common misconception is that Elvish is a language; rather, it is a language family, something like Romance or Sino-Tibetan.) The next well-known effort was Star Trek’s Klingon, designed in 1984 to sound as alien as possible by creating sound combinations not found in any human language. Yet “conlanging” remained fringe, even among nerds and linguists, for decades. As Okrent wrote in the opening line of her book, which came out in 2009, Klingon speakers inhabited “the lowest possible rung on the geek ladder.”

The debut of 10-foot-tall blue aliens in theaters that same year morphed conlang condescension into fascination. In 2005, Cameron had sent out a request for a linguist to construct a unique Na’vi language, and Frommer, then teaching business communication at the University of Southern California but with a linguistics Ph.D., got the job. There were a few constraints: He needed to incorporate 30 or so words Cameron had already come up with and make the language learnable by humans.

Frommer also suspected that a subset of enthusiastic viewers would want to explore the language, and so he designed it to stand up to scrutiny. He began with the basic sounds and sound system of Na’vi, some parts of which were inspired by various human languages, “but others are very unfamiliar,” like the fng and tskx consonant clusters, Frommer told me. Then he considered the formation of words and their relation to basic grammar, also known as morphology. The prefixes that indicate a noun’s plurality aren’t limited to indicating “one” or “many,” but have different forms for “one,” “two,” “three,” and “four or more”; verbs have infixes (as opposed to prefixes or suffixes), insertions into the middle of a word to modify its meaning. Last came syntax, how words combine into phrases and sentences, with some innovations Frommer has not encountered in any natural language. For instance, expressing “I am here” in Na’vi requires a transitive verb—tok, literally “to occupy a space”—signifying how “your existing in the place has changed the place,” he said: Oel fìtsengit tok (oel tok fìtsengit or tok oel fìtsengit are also acceptable; Na’vi word order is very flexible).

As important as these technical aspects, Frommer said, is “how your language is going to conform to and be appropriate to the environment in which it’s spoken, and to the individuals who speak it.” The octal counting system is one example, constructed to match how Na’vi would naturally have evolved given the physical characteristics of the beings speaking it. Frommer also created various idioms to reflect the Pandoran planet, such as na loreyu ’awnampi, which means that someone is shy but literally translates to “like a touched helicoradian”—a reference to large, spiraling plants (well, technically plant-animals) that coil up when brushed against.

Similar principles apply to nonverbal communication: The deaf actor CJ Jones created an underwater sign language for the reef-dwelling Na’vi in The Way of Water by imagining “how the Na’vi would communicate underneath the water,” he said in an interview with IGN. “And so I decided to create, use the feeling, and get into their soul.” Creating languages for Cameron’s films, then, required conjuring a sort of avatar—imagining the Na’vi people and their environment by putting oneself in their bodies and world.

Only two years after the first Avatar, the debut of HBO’s Game of Thrones series—with complete languages such as Dothraki and High Valyrian and which, across all seasons, generated an estimated $285 million in profits an episode—firmly established invented languages as a benchmark for immersive, well-constructed fantasy and science-fiction worlds. The show’s creators went to David Peterson, a linguist who co-founded the Language Creation Society, hoping he could flesh out the snippets of language found in George R. R. Martin’s novels.

Peterson, joined by the linguist Jessie Sams in 2019, creates conlangs professionally for television and film—“the top of the chain of the artistic conlang movement,” Okrent told me—and is responsible for the fictional languages in productions including The Witcher, Paper Girls, and Motherland: Fort Salem. Ideally they’d have six months for each project, Sams told me, but they sometimes have been given as few as 10 days. The success of several movies and shows with their own languages, combined with communities of fans facilitated by the internet, is part of what makes this business possible. “People look at it as not only an important way to build characters and build world, but to help build a stronger, better fan base,” Sams said.

Peterson embraces a kind of simulation to create his languages. He sets up a simple protolanguage that might exist in a given fictional universe, and then traces the kind of natural evolution that might take place in its sounds, lexicon, and grammar—as if the language’s path “recommends itself to you,” Peterson told me.

Every decision shapes and is shaped by the language’s overall structure; vocabulary and idioms should reflect the environment and history in which they will be used. “A fleshed-out history is what separates languages that are good from languages that are excellent,” Peterson wrote in his 2015 book. Locating a language in space and time, then—fitting it to an embodied communicator and physical environment, as well as to a point in its evolution—may be the key to its success.

When executed well, fantasy and science-fiction languages don’t only mimic the structure and evolution of natural forms of spoken communication; fMRI scans reveal that the brain seems to treat them the same as real languages.

Evelina Fedorenko, a cognitive scientist at MIT, has for years studied how the brain behaves when an individual speaks different languages, and has discovered that “some basic features of the neural mechanism for language are similar” across languages, she told me. The same neural machinery fires when the brain processes any of the 45 languages, across 12 language families, that her lab has studied. Fedorenko also previously found that the parts of the brain that process literal language are not active when people engage in cognitive activities often metaphorically described as “languages,” like solving math problems, listening to music, or programming.

Her lab wanted to apply the same method to conlangs to see if, for example, a fluent Na’vi speaker listening to the language would use the same parts of the brain as a Mandarin speaker listening to Mandarin would. Constructed languages haven’t evolved over hundreds of years via organic conversations, after all: “So the question then arose, does the brain treat it as a language?” says Saima Malik-Moraleda, who worked on a study of Esperanto, Klingon, Dothraki, High Valyrian, and Na’vi. “Or will it be like computer languages, where it’s processed in these other networks?” Their research, which has not yet been published, found that it was the former—all five of the languages studied activated the brain’s language network.

“It seems like languages provide us with mappings between forms and meanings,” Fedorenko said. English and Na’vi might lead the brain to associate words with objects and ideas in an attempt to communicate those meanings to others, whereas a line of code or sheet music helps with problem-solving or aesthetic expression; in other words, language and thought are not necessarily the same. Perhaps John Wilkins’s philosophical language, if difficult to use, had struck near the essence of language; centuries later, mapping symbols to an abstract meaning-space is similar to how cutting-edge AI translation programs work as well.

Beautiful languages are created without Hollywood’s backing, of course; Peterson is fond of Rikchik, a language created for seven-armed inhabitants imagined to live on a planet in the Alpha Centauri system. And in turn, enjoyment can arise even without a technically sophisticated conlang. The Game of Thrones novels were best sellers without fleshed-out Dothraki; the languages in Star Wars, one of the most successful franchises ever, are mostly gibberish, even if Han Solo claims to understand Chewbacca’s bestial warbling.

I’ll even admit to crying multiple times while watching Avatar: The Way of Water, in which, even as a fictional language makes the Na’vi world feel complete, most of the dialogue is in English. Perhaps, if someone had completed the quest to create a language of universal truths, or one that would foster universal empathy—or if you and I both had the distinct neural braids of the Na’vi people—I could precisely explain in only a few words why I connected with the blue-skinned CGI characters and make you cry as well. As is, you’ll have to watch all 192 minutes of the film before agreeing with or ridiculing me.