My name is Steve Pinker, and I’m Professor of Psychology at Harvard University. And
today I’m going to speak to you about language. I’m actually not a linguist, but a
cognitive scientist. I’m not so much interested in language as an object in its own right,
but as a window to the human mind. Language is one of the fundamental topics
in the human sciences. It’s the trait that most conspicuously distinguishes humans
from other species, it’s essential to human cooperation; we accomplish amazing things
by sharing our knowledge or coordinating our actions by means of words. It poses profound
scientific mysteries such as, how did language evolve in this particular species? How does
the brain compute language? But also, language has many practical applications not surprisingly
given how central it is to human life. Language comes so naturally to us that we’re
apt to forget what a strange and miraculous gift it is. But think about what you’re
doing for the next hour. You’re going to be listening patiently as a guy makes noise
as he exhales. Now, why would you do something like that? It’s not that I can claim that
the sounds I’m going to make are particularly mellifluous, but rather I’ve coded information
into the exact sequences of hisses and hums and squeaks and pops that I’ll be making.
You have the ability to recover the information from that stream of noises allowing us to
share ideas. Now, the ideas we are going to share are about
this talent, language, but with a slightly different sequence of hisses and squeaks,
I could cause you to be thinking thoughts about a vast array of topics, anything from
the latest developments in your favorite reality show to theories of the origin of the universe.
This is what I think of as the miracle of language, its vast expressive power, and it’s
a phenomenon that still fills me with wonder, even after having studied language for 35
years. And it is the prime phenomenon that the science of language aims to explain.
Not surprisingly, language is central to human life. The Biblical story of the Tower of
Babel reminds us that humans accomplish great things because they can exchange information
about their knowledge and intentions via the medium of language. Language, moreover,
is not a peculiarity of one culture, but it has been found in every society ever studied
by anthropologists. There are some 6,000 languages spoken on Earth,
all of them complex, and no one has ever discovered a human society that lacks complex language.
For this and other reasons, Charles Darwin wrote, “Man has an instinctive tendency
to speak, as we see in the babble of our young children, while no child has an instinctive
tendency to bake, brew or write.”
Language is an intricate talent and it’s not surprising that the science of language
should be a complex discipline. It includes the study of how language itself
works, including grammar, the assembly of words, phrases and sentences; phonology, the
study of sound; semantics, the study of meaning; and pragmatics, the study of the use of language
in conversation. Scientists interested in language also
study how it is processed in real time, a field called psycholinguistics; how it is
acquired by children, the study of language acquisition; and how it is computed in the
brain, the discipline called neurolinguistics.
Now, before we begin, it’s important not to confuse language with three other things
that are closely related to language. One of them is written language. Unlike spoken
language, which is found in all human cultures throughout history, writing was invented a
very small number of times in human history, about 5,000 years ago.
And alphabetic writing where each mark on the page stands for a vowel or a consonant,
appears to have been invented only once in all of human history by the Canaanites about
3,700 years ago. And as Darwin pointed out, children have no instinctive tendency to write,
but have to learn it through instruction and schooling.
A second thing not to confuse language with is proper grammar. Linguists distinguish
between descriptive grammar - the rules that characterize how people do speak - and prescriptive
grammar - rules that characterize how people ought to speak when writing careful
prose. A dirty secret from linguistics is that not
only are these not the same kinds of rules, but many of the prescriptive rules of language
make no sense whatsoever. Take one of the most famous of these rules, the rule not to
split infinitives. According to this rule, Captain Kirk made
a grievous grammatical error when he said that the mission of the Enterprise was “to
boldly go where no man has gone before.” He should have said, according to these
editors, “to go boldly where no man has gone before,” which immediately clashes
with the rhythm and structure of ordinary English. In fact, this prescriptive rule
was based on a clumsy analogy with Latin, where you can’t split an infinitive because it’s
a single word, as in facere, “to do.” Julius Caesar couldn’t have split an infinitive
if he wanted to. That rule was translated literally over into English where it really
should not apply. Another famous prescriptive rule is that,
one should never use a so-called double negative. Mick Jagger should not have sung, “I can’t
get no satisfaction,” he really should have sung, “I can’t get any satisfaction.”
Now, this is often promoted as a rule of logical speaking, but “can’t” and “any”
is just as much of a double negative as “can’t” and “no.” The only reason that “can’t
get any satisfaction” is deemed correct and “can’t get no satisfaction” is deemed
ungrammatical is that the dialect of English spoken in the south of England in the 17th
century used “can’t” “any” rather than “can’t” “no.”
If the capital of England had been in the north of the country instead of the south
of the country, then “can’t get no,” would have been correct and “can’t get
any,” would have been deemed incorrect.
There’s nothing special about a language that happens to be chosen as the standard
for a given country. In fact, if you compare the rules of languages and so-called dialects,
each one is complex in different ways. Take for example, African-American vernacular English,
also called Black English or Ebonics. There is a construction in African-American English where
you can say, “He be workin,” which is not an error or bastardization or a corruption
of Standard English, but in fact conveys a subtle distinction, one that’s different
than simply, “He workin.” “He be workin,” means that he is employed; he has a job, “He
workin,” means that he happens to be working at the moment that you and I are speaking.
Now, this is a tense difference that can be
made in African-American English that is not made in Standard English, one of many examples
in which the dialects have their own set of rules that is just as sophisticated and complex
as the one in the standard language. Now, a third thing not to confuse language
with is thought. Many people report that they think in language, but cognitive psychologists
have shown that there are many kinds of thought that don’t actually take place in the form
of sentences.
(1.) Babies (and other mammals) communicate without speech
For example, we know from ingenious experiments that non-linguistic creatures, such as babies
before they’ve learned to speak, or other kinds of animals, have sophisticated kinds
of cognition, they register cause and effect and objects and the intentions of other people,
all without the benefit of speech. (2.) Types of thinking go on without language--visual
thinking We also know that even in creatures that do
have language, namely adults, a lot of thinking goes on in forms other than language, for
example, visual imagery. Suppose you look at the top two three-dimensional figures in this
display and I ask you: do they have the same shape or a different shape? People
don’t solve that problem by describing those strings of cubes in words, but rather by taking
an image of one and mentally rotating it into the orientation of the other, a form of non-linguistic
thinking. (3.) We use tacit knowledge to understand
language and remember the gist For that matter, even when you understand
language, what you come away with is not in itself the actual language that you hear.
Another important finding in cognitive psychology is that long-term memory for verbal material
records the gist or the meaning or the content of the words rather than the exact form of
the words. For example, I like to think that you retain
some memory of what I have been saying for the last 10 minutes. But I suspect that
if I were to ask you to reproduce any sentence that I have uttered, you would be incapable
of doing so. What sticks in memory is far more abstract than the actual sentences, something
that we can call meaning or content or semantics.
In fact, even when it comes to understanding a sentence, the actual words are the tip of
a vast iceberg of a very rapid, unconscious, non-linguistic processing that’s necessary
even to make sense of the language itself. And I’ll illustrate this with a classic
bit of poetry, the lines from the shampoo bottle. “Wet hair, lather, rinse, repeat.”
Now, in understanding that very simple snatch
of language, you have to know, for example, that when you repeat, you don’t wet your
hair a second time because it’s already wet, and when you get to the end of it and you
see “repeat,” you don’t keep repeating over and over in an infinite loop; repeat here
means “repeat just once.” Now this tacit knowledge of what the writers of the
language had in mind is necessary to understand language, but it, itself, is not language.
(4.) If language is thinking, then where did
it come from? Finally, if language were really thought,
that would raise the question of where language came from, if we were incapable of thinking
without language. After all, the English language was not designed by some committee
of Martians who came down to Earth and gave it to us. Rather, language is a grassroots
phenomenon. It’s the original wiki, which aggregates the contributions of hundreds of
thousands of people who invent jargon and slang and new constructions, some of them
get accumulated into the language as people seek out new ways of expressing their thoughts,
and that’s how we get a language in the first place.
Now, this is not to deny that language can affect thought; linguistics has long been interested
in what has sometimes been called the linguistic relativity hypothesis, or the Sapir-Whorf
hypothesis, named after the two linguists who first formulated it: the idea
that language can affect thought. There’s a lot of controversy over the status of the
linguistic relativity hypothesis, but no one believes that language is the same thing as
thought and that all of our mental life consists of reciting sentences.
Now that we have set aside what language is not, let’s turn to what language is, beginning
with the question of how language works. In a nutshell, you can divide language into
three topics. There are the words that are the basic components
of sentences that are stored in a part of long-term memory that we can call the mental
lexicon or the mental dictionary. There are rules, the recipes or algorithms that
we use to assemble bits of language into more complex stretches of language, including syntax,
the rules that allow us to assemble words into phrases and sentences; morphology, the
rules that allow us to assemble bits of words, like prefixes and suffixes, into complex words;
and phonology, the rules that allow us to combine vowels and consonants into the smallest words.
And then all of this knowledge of language has to connect to the world through interfaces
that allow us to understand language coming from others and to produce language that others
can understand: the language interfaces.
Let’s start with words. The basic principle of a word was identified
by the Swiss linguist, Ferdinand de Saussure, more than 100 years ago when he called attention
to the arbitrariness of the sign. Take for example the word “duck.” The word
“duck” doesn’t look like a duck or walk like a duck or quack like a duck, but I can
use it to get you to think the thought of a duck because all of us at some point in
our lives have memorized that brute force association between that sound and that meaning,
which means that it has to be stored in memory in some format. In a very simplified form,
an entry in the mental lexicon might look something like this: there is a symbol for
the word itself, there is some kind of specification of its sound and there’s some kind of specification
of its meaning. Now, one of the remarkable facts about the
mental lexicon is how capacious it is. Using dictionary sampling techniques where you say,
take the top left-hand word on every 20th page of the dictionary, give it to people
in a multiple choice test, correct for guessing, and multiply by the size of the dictionary,
you can estimate that a typical high school graduate has a vocabulary of around 60,000
words, which works out to a rate of learning of about one new word every two hours starting
from the age of one. When you think that every one of these words is as arbitrary as a
telephone number or a date in history, you’re reminded of the remarkable capacity of
human long-term memory to store the meanings and sounds of words.
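Both the simplified lexical entry and the arithmetic behind that estimate can be made concrete. Here is a minimal sketch in Python; the entry’s contents, the test scores, and the dictionary size are invented placeholders, and the correction for guessing is the standard multiple-choice formula:

    # A (very simplified) entry in the mental lexicon: an arbitrary,
    # memorized pairing of a symbol with a sound and a meaning.
    duck_entry = {
        "word": "duck",
        "sound": "/dʌk/",                         # specification of its sound
        "meaning": "a waterfowl with short legs"  # specification of its meaning
    }

    def corrected_proportion(raw_correct, n_items, n_choices):
        """Correct a multiple-choice score for guessing:
        p_known = (p_raw - chance) / (1 - chance)."""
        p_raw = raw_correct / n_items
        chance = 1 / n_choices
        return max(0.0, (p_raw - chance) / (1 - chance))

    dictionary_size = 200_000  # headwords in the sampled dictionary (assumed)
    p_known = corrected_proportion(raw_correct=38, n_items=100, n_choices=5)
    print(f"Estimated vocabulary: {p_known * dictionary_size:,.0f} words")

    # Rate of learning: roughly 60,000 words acquired between ages 1 and 18.
    hours_per_word = (17 * 365 * 24) / 60_000
    print(f"About one new word every {hours_per_word:.1f} hours")  # ~2.5 calendar hours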
But of course, we don’t just blurt out individual words, we combine them into phrases and sentences.
And that brings up the second major component of language; namely, grammar.
Now the modern study of grammar is inseparable from the contributions of one linguist, the
famous scholar, Noam Chomsky, who set the agenda for the field of linguistics for the
last 60 years. To begin with, Chomsky noted that the main
puzzle that we have to explain in understanding language is creativity or, as linguists often
call it, productivity: the ability to produce and understand new sentences.
Except for a small number of clichéd formulas, just about any sentence that you produce or
understand is a brand new combination produced for the first time perhaps in your life, perhaps
even in the history of the species. We have to explain how people are capable of doing
it. It shows that when we know a language, we haven’t just memorized a very long list
of sentences, but rather have internalized a grammar or algorithm or recipe for combining
elements into brand new assemblies. For that reason, Chomsky has insisted that linguistics
is really properly a branch of psychology and is a window into the human mind.
A second insight is that languages have a syntax which can’t be identified with their
meaning. Now, the only quotation that I know of, of a linguist that has actually made
it into Bartlett’s Familiar Quotations, is the following sentence from Chomsky, from
1956: “Colorless green ideas sleep furiously.” Well, what’s the point of that sentence?
The point is that it is very close to meaningless. On the other hand, any English speaker can
instantly recognize that it conforms to the patterns of English syntax. Compare, for
example, “furiously sleep ideas green colorless,” which is also meaningless, but we perceive
as a word salad. A third insight is that syntax doesn’t consist
of a string of word by word associations as in stimulus response theories in psychology
where producing a word is a response which you then hear and it becomes a stimulus to
producing the next word, and so on. Again, the sentence, “colorless green ideas sleep
furiously,” can help make this point. Because if you look at the word by word transition
probabilities in that sentence, for example, colorless and then green; how often have you
heard colorless and green in succession? Probably zero times. Green and ideas, those two words
never occur together, ideas and sleep, sleep and furiously. Every one of the transition
probabilities is very close to zero, nonetheless, the sentence as a whole can be perceived as
a well-formed English sentence. Language in general has long distance dependencies.
The word in one position in a sentence can dictate the choice of the word several positions
downstream. For example, if you begin a sentence with “either,” somewhere down
the line, there has to be an “or.” If you have an “if,” generally, you expect
somewhere down the line there to be a “then.” There’s a story about a child who says
to his father, “Daddy, why did you bring that book that I don’t want to be read to
out of, up for?” Where you have a set of nested or embedded long distance dependencies.
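The computational moral of such nested dependencies is that they cannot be verified by word-to-word transitions alone; you need a memory like a stack. A toy sketch in Python, with a deliberately tiny and simplified set of opener/closer pairs:

    # Toy checker for nested long-distance dependencies such as
    # "either ... or" and "if ... then". A stack tracks the pending
    # closers; word-by-word transition statistics cannot do this.

    PAIRS = {"either": "or", "if": "then"}  # opener -> required closer (simplified)

    def dependencies_satisfied(words):
        stack = []
        for w in words:
            if w in PAIRS:
                stack.append(PAIRS[w])      # expect this closer later
            elif stack and w == stack[-1]:
                stack.pop()                 # innermost expectation met
        return not stack                    # every opener found its closer

    print(dependencies_satisfied("either he walks or he rides".split()))                 # True
    print(dependencies_satisfied("if either you bake or you brew then we eat".split()))  # True (nested)
    print(dependencies_satisfied("either he walks and he rides".split()))                # False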
Indeed, one of the applications of linguistics
to the study of good prose style is that sentences can be rendered difficult to understand if
they have too many long distance dependencies because that could put a strain on the short-term
memory of the reader or listener while trying to understand them.
Rather than a set of word by word associations, sentences are assembled in a hierarchical
structure that looks like an upside down tree. Let me give you an example of how that works
in the case of English. One of the basic rules of English is that a sentence consists
of a noun phrase, the subject, followed by a verb phrase, the predicate.
A second rule in turn expands the verb phrase. A verb phrase consists of a verb followed
by a noun phrase, the object, followed by a sentence, the complement, as in “I told him
that it was sunny outside.”
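Those two rules are already recursive, since a sentence can appear inside a verb phrase. Here is a minimal sketch of phrase structure rules as a generator; the word lists are invented for illustration, and a depth cap keeps the recursion from running away:

    import random

    # Toy phrase structure grammar. Note the recursion: S appears inside VP,
    # so sentences can embed sentences without limit.
    GRAMMAR = {
        "S":  [["NP", "VP"]],
        "VP": [["V", "NP"], ["V", "NP", "S"]],
        "NP": [["the man"], ["the ball"], ["she"]],
        "V":  [["told"], ["saw"], ["said"]],
    }

    def expand(symbol, depth=0):
        if symbol not in GRAMMAR:                 # a word: nothing to expand
            return [symbol]
        rules = GRAMMAR[symbol]
        rule = rules[0] if depth > 3 else random.choice(rules)  # cap the depth
        return [w for part in rule for w in expand(part, depth + 1)]

    print(" ".join(expand("S")))   # e.g. "she told the man she saw the ball"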
Now, why do linguists insist that language must be composed out of phrase structure
rules? (1.) Rules allow for open-ended creativity
Well for one thing, that helps explain the main phenomenon that we want to explain,
namely the open-ended creativity of language.
(2.) Rules allow for expression of unfamiliar meaning
It allows us to express unfamiliar meanings. There’s a cliché in journalism for example,
that when a dog bites a man, that isn’t news, but when a man bites a dog, that is
news. The beauty of grammar is that it allows us to convey news by assembling familiar
words into brand-new combinations. Also, because of the way phrase structure rules work, they
produce a vast number of possible combinations.
(3.) Rules allow for production of vast numbers of combinations
Moreover, the number of different thoughts that we can express through the combinatorial
power of grammar is not just humongous, but in a technical sense, it’s infinite. Now
of course, no one lives an infinite number of years, and therefore no one can show off their
ability to understand an infinite number of sentences, but you can make the point in the
same way that a mathematician can say that someone who understands the rules of arithmetic
knows that there are an infinite number of numbers, namely if anyone ever claimed to
have found the longest one, you can always come up with one that’s even bigger by adding
a one to it. And you can do the same thing with language.
Let me illustrate it in the following way. As a matter of fact, there has been a claim
that there is a world’s longest sentence.
Who would make such a claim? Well, who else? The Guinness Book of World Records. You
can look it up. There is an entry for the World’s Longest Sentence. It is 1,300
words long. And it comes from a novel by William Faulkner. Now I won’t read all
1,300 words, but I’ll just tell you how it begins.
“They both bore it as though in deliberate flatulent exaltation…” and it runs on
from there. But I’m here to tell you that in fact, this
is not the world’s longest sentence. And I’ve been tempted to obtain immortality
in Guinness by submitting the following record breaker: “Faulkner wrote, ‘They both bore
it as though in deliberate flatulent exaltation.’” But sadly, this would not be immortality
after all but only the proverbial 15 minutes of fame because based on what you now know,
you could submit a record breaker for the record breaker, namely, “Guinness noted that
Faulkner wrote…” or “Pinker mentioned that Guinness noted that Faulkner wrote…”, or “Who
cares that Pinker mentioned that Guinness noted that Faulkner wrote…”
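The record-breaker argument is, in effect, a loop: any candidate for the longest sentence can be embedded in a slightly longer one. A three-line sketch of the construction:

    # There is no longest sentence: any candidate can be embedded in a longer one.
    sentence = "Faulkner wrote, 'They both bore it as though in deliberate flatulent exaltation.'"
    for frame in ["Guinness noted that", "Pinker mentioned that", "Who cares that"]:
        sentence = f"{frame} {sentence}"
        print(len(sentence.split()), "words:", sentence)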
Phrase structure also gives rise to ambiguity, when the same string of words can be bundled
into phrases in two different ways. Take for example the following wonderfully ambiguous
sentence that appeared in TV Guide.
“On tonight’s program, Conan will discuss sex with Dr. Ruth.”
Now this has a perfectly innocent meaning in which the verb “discuss” involves
two things, namely the topic of discussion, “sex,” and the person with whom it’s being
discussed, in this case, Dr. Ruth. But it has a somewhat naughtier meaning if you
rearrange the words into phrases according to a different structure in which case “sex
with Dr. Ruth” is the topic of conversation, and that’s what’s being discussed.
Now, phrase structure not only can account for our ability to produce so many sentences,
but it’s also necessary for us to understand what they mean. The geometry of branches
in a phrase structure is essential to figuring out who did what to whom.
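The two readings can be sketched as two different tree geometries over the very same string of words, here represented crudely as nested lists:

    # Same words, two phrase structures, two meanings.
    # Reading 1: [discuss [sex] [with Dr. Ruth]] -> the topic is sex; Dr. Ruth is the partner.
    # Reading 2: [discuss [sex with Dr. Ruth]]   -> the topic is "sex with Dr. Ruth".
    innocent = ["discuss", ["sex"], ["with", "Dr. Ruth"]]
    naughty = ["discuss", ["sex", ["with", "Dr. Ruth"]]]

    def leaves(tree):
        """Flatten a tree back into its words: identical for both parses."""
        return [w for node in tree
                for w in (leaves(node) if isinstance(node, list) else [node])]

    assert leaves(innocent) == leaves(naughty)  # one string, two structures
    print(" ".join(leaves(innocent)))           # discuss sex with Dr. Ruth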
Another important contribution of Chomsky to the science of language is the focus on
language acquisition by children. Now, children can’t simply memorize sentences, because knowledge
of language isn’t just one long list of memorized sentences; somehow they must
distill out or abstract out the rules that go into assembling sentences based on what
they hear coming out of their parents’ mouths when they are little. And the talent of
using rules to produce combinations is in evidence from the moment that kids begin to
speak. Children create sentences unheard from adults
At the two-word stage, which you typically see in children who are 18 months or a bit
older, kids are producing the smallest sentences that deserve to be counted as sentences, namely
two words long. But already it’s clear that they are putting them together using
rules in their own mind. To take an example, a child might say, “more outside,” meaning,
take them outside or let them stay outside. Now, adults don’t say, “more outside.”
So it’s not a phrase that the child simply memorized by rote, but it shows that already
children are using these rules to put together new combinations.
Another example, a child having jam washed from his fingers said to his mother 'all gone
sticky'. Again, not a phrase that you could ever have copied from a parent, but
one that shows the child producing new combinations.
Past tense rule An easy way of showing that children assimilate
rules of grammar unconsciously from the moment they begin to speak is the use of the past
tense rule. For example, children go through a long stage
in which they make errors like, “We holded the baby rabbits” or “He teared the paper
and then he sticked it,” cases in which they overgeneralize the regular rule of forming
the past tense, adding “ed” to irregular verbs like “hold,” “stick” or “tear.”
And it’s easy to get children to flaunt this ability to apply
rules productively in a laboratory demonstration called the Wug Test. You bring a kid into
a lab. You show them a picture of a little bird and you say, “This is a wug.” And
you show them another picture and you say, “Well, now there are two of them; there
are two…” and children will fill in the gap by saying “wugs.” Again, a form they
could not have memorized because it’s invented for the experiment, but it shows that they
have productive mastery of the regular plural rule in English.
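That result is just what you would expect if the child has internalized a rule rather than a list. A rough sketch of the regular plural as a productive rule; the spelling conditions are simplified, and the test words are nonsense forms like “wug”:

    # The regular English plural as a productive rule: it applies even to
    # nouns the speaker has never heard, like the invented word "wug".
    def regular_plural(noun):
        if noun.endswith(("s", "x", "z", "ch", "sh")):
            return noun + "es"   # "wish" -> "wishes"
        return noun + "s"        # default case: "wug" -> "wugs"

    for novel in ["wug", "blicket", "gorch"]:
        print(novel, "->", regular_plural(novel))  # no memorized form needed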
And famously, Chomsky claimed that children solved the problem of language acquisition
by having the general design of language already wired into them in the form of a universal
grammar: a spec sheet for what the rules of any language have to look like.
What is the evidence that children are born with a universal grammar? Well, surprisingly,
Chomsky didn’t propose this by actually studying kids in the lab or kids in the home,
but through a more abstract argument called, “The poverty of the input.” Namely,
if you look at what goes into the ears of a child and look at the talent they end up
with as adults, there is a big chasm between them that can only be filled in by assuming
that the child has a lot of knowledge of the way that language works already built in.
Here’s how the argument works. One of
the things that children have to learn when they learn English is how to form a question.
Now, children will get evidence from parents’ speech about how the question rule works, such
as sentences like, “The man is here,” and the corresponding question, “Is the
man here?” Now, logically speaking, a child getting
that kind of input could posit two different kinds of rules. There’s a simple word
by word linear rule. In this case, find the first “is” in the sentence and move
it to the front. “The man is here,” “Is the man here?” Now there’s a more
complex rule that the child could posit, called a structure-dependent rule, one that looks
at the geometry of the phrase structure tree. In this case, the rule would be: find
the first “is” after the subject noun phrase and move that to the front of the sentence.
A diagram of what that rule would look like is as follows: you look for the “is”
that occurs after the subject noun phrase and that’s what gets moved to the front
of the sentence. Now, what’s the difference between the simple
word-by-word rule and the more complex structure-dependent rule? Well, you can see the difference
when it comes to forming the question from a slightly more complex sentence like, “The
man who is tall is in the room.” The word-by-word rule would grab the first “is” and
produce the error “Is the man who tall is in the room?”, whereas the structure-dependent
rule correctly yields “Is the man who is tall in the room?” But how is the child supposed
to learn that? How did all of us end up with the correct structure-dependent version of the
rule rather than the far simpler word-by-word version of the rule?
“Well,” Chomsky argues, “if you were actually to look at the kind of language that
all of us hear, it’s actually quite rare to hear a sentence like ‘Is the man who
is tall in the room?’, the kind of input that would logically inform you that the word-by-word
rule is wrong and the structure-dependent rule is right. Nonetheless, we all grow
up into adults who unconsciously use the structure-dependent rule rather than the word-by-word
rule. Moreover, children don’t make errors like ‘Is the man who tall is in the room?’;
as soon as they begin to form complex questions, they use the structure-dependent rule. And
that,” Chomsky argues, “is evidence that structure-dependent rules are part of the
definition of universal grammar that children are born with.”
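The contrast between the two rules can be sketched as two little programs. In this toy version the subject noun phrase is handed to the structure-dependent rule already bracketed, since actually finding the subject is the parsing problem itself:

    # Two candidate question-formation rules for English.

    def linear_rule(words):
        """Word-by-word rule: move the FIRST "is" to the front."""
        i = words.index("is")
        return [words[i]] + words[:i] + words[i + 1:]

    def structure_dependent_rule(subject, rest):
        """Structural rule: move the "is" that follows the subject noun
        phrase. The subject arrives pre-bracketed, because finding it
        requires a parse of the sentence."""
        assert rest[0] == "is"
        return ["is"] + subject + rest[1:]

    print(" ".join(linear_rule("the man is here".split())))
    # -> "is the man here"  (both rules agree on simple sentences)

    words = "the man who is tall is in the room".split()
    print(" ".join(linear_rule(words)))
    # -> "is the man who tall is in the room"  (the error children never make)

    print(" ".join(structure_dependent_rule(
        "the man who is tall".split(), "is in the room".split())))
    # -> "is the man who is tall in the room"  (correct)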
Now, though Chomsky has been fantastically influential in the science of language that
does not mean that all language scientists agree with him. And there have been a number
of critiques of Chomsky over the years. For one thing, the critics point out, Chomsky
hasn’t really shown principles of universal grammar that are specific to language itself
as opposed to general ways in which the human mind works across multiple domains, language
and vision and control of motion and memory and so on. We don’t really know that universal
grammar is specific to language, according to this critique.
Secondly, Chomsky and the linguists working with him have not examined all 6,000 of the
world’s languages and shown that the principles of universal grammar apply to all 6,000. They’ve
posited it based on a small number of languages and the logic of the poverty of the input,
but haven’t actually come through with the data that would be necessary to prove that
universal grammar is really universal. Finally, the critics argue, Chomsky has not
shown that more general-purpose learning models, such as neural network models, are incapable
of learning language together with all the other things that children learn, and therefore
has not proven that there has to be specific knowledge of how grammar works in order for the
child to learn grammar.
Another component of language governs the sound pattern of language, the ways that the
vowels and consonants can be assembled into the minimal units that go into words. Phonology,
as this branch of linguistics is called, consists of formation rules that capture what is a
possible word in a language according to the way that it sounds. To give you an example,
the sequence “bluk” is not an English word, but you get a sense that it could be:
someone could coin a new term of English
that we pronounce “bluk.” But when you hear the sound ****, you instantly know that
not only isn’t it an English word, it really couldn’t be an English word. ****, by
the way, comes from Yiddish, and it means, roughly, to sigh or to moan. Oi. That’s to
****. The reason that we recognize that it’s not
English is that it has sounds like **** and sequences like ****, which aren’t part of
the formation rules of English phonology. But together with the rules that define
the basic words of a language, there are also phonological rules that make adjustments to
the sounds, depending on which other sounds a word appears with. Very few of us realize,
for example, that in English the past tense suffix “ed” is actually pronounced
in three different ways. When we say “he walked,” we pronounce the “ed” like
a “t”: walked. When we say “jogged,” we pronounce it as a “d”: jogged. And
when we say “patted,” we stick in a vowel: pat-ted. The same suffix,
“ed,” is readjusted in its pronunciation according to the rules of English phonology.
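That three-way adjustment is conditioned on the verb’s final sound. Here is a rough sketch keyed to spelling rather than real phonetics; an actual phonological rule would look at the voicing of the final segment:

    # The regular past-tense suffix "-ed" surfaces three ways, depending on
    # the stem's final sound. This approximates the rule from spelling.
    VOICELESS_ENDINGS = ("k", "p", "f", "s", "sh", "ch", "x")  # simplified

    def past_tense_sound(verb):
        if verb.endswith(("t", "d")):
            return "id"   # "patted": a vowel gets inserted
        if verb.endswith(VOICELESS_ENDINGS):
            return "t"    # "walked": voiceless, so the suffix is voiceless
        return "d"        # "jogged": voiced, so the suffix is voiced

    for v in ["walk", "jog", "pat"]:
        print(v + "ed", "is pronounced with /" + past_tense_sound(v) + "/")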
Now, when someone acquires English as a foreign
language, or acquires any foreign language, they carry over the rules of phonology
of their first language and apply them to their second language. We have a word for it;
we call it an “accent.” When a language user deliberately manipulates the rules of
phonology, that is, when they don’t just speak in order to convey content but pay
attention to which phonological structures are being used, we call it poetry and rhetoric.
So far, I’ve been talking about knowledge of language, the rules that go into defining
what are possible sequences of language. But those sequences have to get into the brain
during speech comprehension and they have to get out during speech production. And
that takes us to the topic of language interfaces.
And let’s start with production.
This diagram here is literally a human cadaver that has been sawn in half. An anatomist
took a saw and cut straight down, allowing us to see the human vocal tract in cross
section. And that can illustrate how we get our knowledge of language out into the world
as a sequence of sounds.
Now, each of us has at the top of our windpipe
or trachea, a complex structure called the larynx or voice box; it’s behind your Adam’s
apple. And the air coming out of your lungs has to go past two cartilaginous flaps
that vibrate and produce a rich, buzzy sound source, full of harmonics. Before that vibrating
sound gets out to the world, it has to pass through a gauntlet of chambers of the vocal
tract. The throat behind the tongue, the cavity above the tongue, the cavity formed
by the lips, and when you block off airflow through the mouth, it can come out through
the nose. Now, each one of those cavities has a shape
that, thanks to the laws of physics, will amplify some of the harmonics in that buzzy
sound source and suppress others. We can change the shape of those cavities when we
move our tongue around. When we move our tongue forward and backward, for example,
as in “eh,” “aa,” “eh,” “aa,” we change the shape of the cavity behind the
tongue, change the frequencies that are amplified or suppressed and the listener hears them
as two different vowels. Likewise, when we raise or lower the tongue,
we change the shape of the resonant cavity above the tongue as in say, “eh,” “ah,”
“eh,” “ah.” Once again, the change in the mixture of harmonics is perceived as
a change in the nature of the vowel. When we stop the flow of air and then release
it, as in “t,” “ca,” “ba,” we hear a consonant rather than a vowel; or we can
merely restrict the flow of air, as in “f,” “ss,” producing a chaotic, noisy
sound. Each one of those sounds that gets sculpted by different articulators is perceived
by the brain as a qualitatively different vowel or consonant.
Now, an interesting peculiarity of the human vocal tract is that it obviously co-opts structures
that evolved for different purposes, for breathing and for swallowing and so on. And it’s
an interesting fact, first noted by Darwin, that the larynx over the course
of evolution has descended in the throat so that every particle of food going from the
mouth through the esophagus to the stomach has to pass over the opening into the larynx
with some probability of being inhaled leading to the danger of death by choking. And in
fact, until the invention of the Heimlich Maneuver, several thousand people every year
died of choking because of this maladaptive design of the human vocal tract.
Why did we evolve a mouth and throat that leaves us vulnerable to choking? Well, a
plausible hypothesis is that it’s a compromise that was made in the course of evolution to
allow us to speak. By opening up a variety of possibilities for altering the
resonant cavities, for moving the tongue back and forth and up and down, we expanded the
range of speech sounds we could make and improved the efficiency of language, but suffered the
compromise of an increased risk of choking, showing that language presumably had some
survival advantage that compensated for the disadvantage of choking.
What about the flow of information in the other direction, that is from the world into
the brain, the process of speech comprehension?
Speech comprehension turns out to be an extraordinarily complex computational process, which we’re
reminded of every time we interact with a voicemail menu on a telephone or use
dictation on our computers. For example, one writer, using a state-of-the-art speech-to-text
system, dictated the following words into his computer. He dictated “book tour,”
and it came out on the screen as “back to work.” Another example, he said, “I
truly couldn’t see,” and it came out on the screen as, “a cruelly good MC.” Even
more disconcertingly, he started a letter to his parents by saying, “Dear mom and
dad,” and what came out on the screen, “The man is dead.”
Now, dictation systems have gotten better and better, but they still have a way to go
before they can duplicate a human stenographer.
What is it about the problem of speech understanding that makes it so easy for a human, but
so hard for a computer? Well, there are two main contributors. One of them is the fact
that each phoneme, each vowel or consonant, actually comes out very differently depending on what
comes before and what comes after, a phenomenon sometimes called co-articulation.
Let me give you an example. The place called Cape Cod has two “c” sounds.
Each of them symbolized by the letter “C,” the hard “C.” Nonetheless, when you
pay attention to the way you pronounce them, you notice that in fact, you pronounce them
in very different parts of the mouth. Try it. Cape Cod, Cape Cod… “c,” “c”.
In one case, the “c” is produced way back in the mouth; the other it’s produced
much farther forward. We don’t notice that we pronounce “c” in two different
ways depending on whether it comes before an “a” or an “ah,” but that difference
produces a difference in the shape of the resonant cavity in our mouth, which produces a very
different wave form. And unless a computer is specifically programmed to take that variability
into account, it will perceive those two different “c’s” as the different sounds that,
objectively speaking, they really are: “c-eh,” “c-oa.” They really are different sounds, but our
brain lumps them together. The other reason that speech recognition is
such a difficult problem is the absence of segmentation. Now, we have an
illusion when we listen to speech that it consists of a sequence of sounds corresponding to words.
But if you actually were to look at the wave form of a sentence on an oscilloscope,
there would not be little silences between the words the way there are little bits of
white space in printed words on a page, but rather a continuous ribbon in which the end
of one word leads right to the beginning of the next.
It’s something that we’re aware of when
we listen to speech in a foreign language, when we have no idea where one word ends and
the other one begins. In our own language, we detect the word boundaries simply because
in our mental lexicon, we have stretches of sound that correspond to one word that tell
us where it ends. But you can’t get that information from the wave form itself.
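The listener’s trick of consulting the lexicon to find boundaries can be sketched as dictionary-driven segmentation of an unsegmented stream; the miniature lexicon here is invented for the example:

    # Speech has no pauses between words; a lexicon lets the listener
    # recover the boundaries. Recursive dictionary-based segmentation:
    LEXICON = {"mares", "eat", "oats", "and", "does", "little", "lambs", "ivy"}

    def segment(stream):
        """Return one way to split the stream into lexicon words, or None."""
        if not stream:
            return []
        for end in range(len(stream), 0, -1):   # try the longest prefix first
            word, rest = stream[:end], stream[end:]
            if word in LEXICON:
                tail = segment(rest)
                if tail is not None:
                    return [word] + tail
        return None

    print(segment("mareseatoatsanddoeseatoats"))
    # -> ['mares', 'eat', 'oats', 'and', 'does', 'eat', 'oats']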
In fact, there’s a whole genre of wordplay that takes advantage of the fact that word
boundaries are not physically present in the speech wave. Novelty songs like “Mairzy doats
and dozy doats and liddle lamzy divey, a kiddley divey too, wooden shoe?” Now,
it turns out that this is actually a grammatical sequence of words in English: “Mares eat
oats and does eat oats and little lambs eat ivy; a kid’ll eat ivy too, wouldn’t you?”
When it is spoken or sung normally, the boundaries between words are obliterated and so the same
sequence of sounds can be perceived either as nonsense or if you know what they’re
meant to convey, as sentences. Another example familiar to most children,
Fuzzy Wuzzy was a bear, Fuzzy Wuzzy had no hair. Fuzzy Wuzzy wasn’t very fuzzy,
was he? And the famous doggerel: I scream, you scream, we all scream for ice cream.
We are generally unaware of how ambiguous language is. In context, we effortlessly
and unconsciously derive the intended meaning of a sentence, but a poor computer not equipped
with all of our common sense and human abilities and just going by the words and the rules
is often flabbergasted by all the different possibilities. Take a sentence as simple
as “Mary had a little lamb.” You might think that that’s a perfectly simple, unambiguous
sentence. But now imagine that it is continued with “with mint sauce.” You realize
that “have” is actually a highly ambiguous word. As a result, computer translations
can often deliver comically incorrect results.
According to legend, one of the first computer systems that was designed to translate from
English to Russian and back again did the following: given the sentence, “The spirit
is willing, but the flesh is weak,” it translated it back as “The vodka is agreeable, but
the meat is rotten.” So why do people understand language so much
better than computers? What is the knowledge that we have that has been so hard to program
into our machines? Well, there’s a third interface between language and the rest of
the mind, and that is the subject matter of the branch of linguistics called Pragmatics,
namely, how people understand language in context using their knowledge of the world
and their expectation about how other speakers communicate.
The most important principle of Pragmatics is called “the cooperative principle,”
namely; assume that your conversational partner is working with you to try to get a meaning
across truthfully and clearly. And our knowledge of Pragmatics, like our knowledge of syntax
and phonology and so on, is deployed effortlessly, but involves many intricate computations.
For example, if I were to say, “If you could pass the guacamole, that would be awesome.”
You understand that as a polite request meaning: give me the guacamole. You don’t
interpret it literally as a rumination about a hypothetical state of affairs; you just assume that
the person wanted something and was using that string of words to convey the request
politely. Often comedies will use the absence of pragmatics
in robots as a source of humor. As in the old “Get Smart” situation comedy, which
had a robot named Hymie, and a recurring joke in the series would be that Maxwell Smart
would say to Hymie, “Hymie, can you give me a hand?” And then Hymie would
remove his hand and pass it over to Maxwell Smart, not understanding that “give me a
hand,” in context, means help me rather than literally transfer the hand over to me.
Or take the following example of Pragmatics
in action. Consider the following dialogue: Martha says, “I’m leaving you.” John
says, “Who is he?” Now, understanding language requires finding the antecedents of
pronouns, in this case who the “he” refers to, and any competent English speaker knows
exactly who the “he” is, presumably John’s romantic rival even though it was never stated
explicitly in any part of the dialogue. This shows how we bring to bear on language understanding
a vast store of knowledge about human behavior, human interactions, human relationships. And
we often have to use that background knowledge even to solve mechanical problems like who
a pronoun like “he” refers to. It’s that knowledge that’s extraordinarily difficult,
to say the least, to program into a computer.
Language is a miracle of the natural world because it allows us to exchange an unlimited
number of ideas using a finite set of mental tools. Those mental tools comprise a large
lexicon of memorized words and a powerful mental grammar that can combine them. Language
thought of in this way should not be confused with writing, with the prescriptive rules
of proper grammar or style or with thought itself.
Modern linguistics is guided by the questions, though not always the answers, suggested by
the linguist Noam Chomsky, namely: How is the unlimited creativity of language
possible? What are the abstract mental structures that relate words to one another? How do children
acquire them? What is universal across languages? And
what does that say about the human mind? The study of language has many practical applications
including computers that understand and speak, the diagnosis and treatment of language disorders,
the teaching of reading, writing, and foreign languages, the interpreting of the language
of law, politics and literature. But for someone like me, language is eternally
fascinating because it speaks to such fundamental questions of the human condition. [Language]
is really at the center of a number of different concerns of thought, of social relationships,
of human biology, of human evolution, that all speak to what’s special about the human
species. Language is the most distinctively human talent.
Language is a window into human nature, and most significantly, the vast expressive
power of language is one of the wonders of the natural world. Thank you.