Posts Tagged ‘meaning’

Learning

21 Dec 2009

Consider how we humans learn language. Even with formal education, it takes a child about 15 years, starting from infancy, to be able to read and understand general news articles in the New York Times. Over this period, one would probably hear or read on the order of 10 billion words at the least. Even so, most high schoolers will need many additional years of schooling before they can comprehend technical material.

So, how can anyone expect a computer to understand something like medical text after training on only about 100 million words of data? A computer, of course, runs on nanosecond cycles while the human brain operates on millisecond cycles, but we have had about 50,000 generations to evolve our language software, while the electronic computer has had only about 10 generations.
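To put the gap described above in rough numbers, the figures quoted in this post can be compared directly. This is only a back-of-the-envelope sketch: the constant and function names are illustrative, and the inputs are the order-of-magnitude estimates from the text, not measurements.

```python
# Estimates quoted in the post (rough orders of magnitude, not measurements).
HUMAN_WORDS = 10_000_000_000   # words heard or read over about 15 years
MACHINE_WORDS = 100_000_000    # words of training data mentioned in the post
HUMAN_GENERATIONS = 50_000     # generations spent evolving human language
COMPUTER_GENERATIONS = 10      # generations of electronic computers


def ratio(a: int, b: int) -> float:
    """Return how many times larger a is than b."""
    return a / b


exposure_gap = ratio(HUMAN_WORDS, MACHINE_WORDS)
generation_gap = ratio(HUMAN_GENERATIONS, COMPUTER_GENERATIONS)

print(f"Human exposure is about {exposure_gap:,.0f}x the training data")
print(f"Humans have had about {generation_gap:,.0f}x as many generations")
```

On these estimates, a human learner sees roughly a hundred times more words than the machine, and has had thousands of times more generations to adapt.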

The bottom line here is that language learning is difficult, and it requires sifting through immense amounts of data. There probably is no magic technological shortcut here, but we have now reached the stage where our systems can routinely handle the volumes of data that would support semantic capabilities equivalent to an 8th-grade education. Decent commercial language processing tools are also now available.

Consequently, we are making major progress on semantic dictionaries, but have to be realistic about the work still ahead of us. Expect no overnight miracles from us or anyone else, especially when these are based on measly samples of data. There is still no royal road to semantics.

In Chapter 8 of Lewis Carroll’s “Through the Looking-Glass,” our intrepid logical adventurer Alice is talking to the White Knight, who wants to sing to her. He says, “The name of the song is called ‘HADDOCKS’ EYES.’”

It turns out, of course, that the name of the song is really “THE AGED AGED MAN,” though the song is actually called “WAYS AND MEANS.” The confusion here about naming is quite understandable to anyone who has ever ordered TenderSweet™ clams at HoJo’s and discovered that they are neither tender nor sweet.

All of this would be hilarious except that we have to build semantic dictionaries that must deal extensively with the meaning of names in text. This problem will take a while to discuss adequately, so please tune in tomorrow.