
According to WordNet, the word BANK has multiple senses, and so any occurrence of it in a text document is ambiguous. For example, we can have a river BANK, a financial BANK, a fog BANK, or an aeronautical BANK. The intended sense in a particular document has to be determined by looking at the context of occurrence. So, to determine the actual meaning of BANK in a document, we have to ask in effect whether the document is talking about streams of water, financial meltdowns, marine navigation, or aircraft in flight.

Now the number of different possible contexts is probably huge. One cannot hope to recognize them all; but for disambiguation of words, we need only fairly general contexts to distinguish the word senses of prime interest to us. Furthermore, given a large sample of our target text, we can employ statistical methods to identify the most important of such contexts.

This is essentially what SemanticHacker is all about. The dimensions of one of our semantic dictionaries define thousands of contextual reference points for the interpretation of terms. For example, if the words stream, water, flow, erosion, and grass are in a document, then with the ODP 2009 dictionary, we find that the top match dimension is 1461 (Top/Science/Environment/Water_Resources) with a weight of 0.5138. In this context, the word BANK would probably mean “river bank.”
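
To make the idea concrete, here is a minimal sketch in Python of how context words might be mapped to a top dimension and then to a sense of BANK. It is not the SemanticHacker implementation: the dictionary, the dimension names, the weights, and the function names are all invented for illustration, and a real semantic dictionary would have thousands of dimensions.

```python
from collections import defaultdict

# Toy stand-in for a semantic dictionary: word -> {dimension: weight}.
# Dimension names and weights are invented, not taken from ODP 2009.
TOY_DICTIONARY = {
    "stream": {"Water_Resources": 0.6, "Computing": 0.4},
    "water": {"Water_Resources": 0.9},
    "flow": {"Water_Resources": 0.5, "Finance": 0.2},
    "erosion": {"Water_Resources": 0.8},
    "grass": {"Gardening": 0.7, "Water_Resources": 0.1},
    "loan": {"Finance": 0.9},
    "interest": {"Finance": 0.7},
    "deposit": {"Finance": 0.8},
}

# Hypothetical mapping from a top dimension to a sense of BANK.
SENSE_BY_DIMENSION = {
    "Water_Resources": "river bank",
    "Finance": "financial bank",
}


def signature(words):
    """Sum the per-word dimension weights and normalize them to total 1."""
    totals = defaultdict(float)
    for word in words:
        for dim, weight in TOY_DICTIONARY.get(word.lower(), {}).items():
            totals[dim] += weight
    norm = sum(totals.values())
    if norm == 0:
        return {}
    return {dim: total / norm for dim, total in totals.items()}


def top_dimension(words):
    """Return the highest-weighted dimension, or None if no word is known."""
    sig = signature(words)
    if not sig:
        return None
    return max(sig, key=sig.get)


def disambiguate_bank(context_words):
    """Guess the sense of BANK from the top dimension of its context."""
    return SENSE_BY_DIMENSION.get(top_dimension(context_words), "unknown")


print(disambiguate_bank(["stream", "water", "flow", "erosion", "grass"]))
print(disambiguate_bank(["loan", "interest", "deposit"]))
```

With these toy weights, the first call picks the water dimension and the second the finance dimension. The normalization step is only one plausible way to turn raw weights into a comparable score.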

Actually, we don’t need to make this explicit association. With a search engine user interface, one just needs a way of describing the context of ambiguous search terms, perhaps by listing contextual words. Then all a semantic search engine has to do is find a document containing the search term and having the same described context in its semantic signature. This is of course a part of our API for search.
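
As a rough illustration of this kind of search, the Python sketch below keeps only documents that contain the search term and ranks them by how closely their semantic signature matches the signature of the user's context words. It does not show our actual search API: the toy dictionary, the cosine similarity measure, and all function names are assumptions made for the example.

```python
import math

# Hypothetical word -> {dimension: weight} table; all values are invented.
TOY_DICTIONARY = {
    "water": {"Water_Resources": 1.0},
    "stream": {"Water_Resources": 0.8, "Computing": 0.2},
    "loan": {"Finance": 1.0},
    "money": {"Finance": 0.9},
}


def signature(words):
    """Accumulate dimension weights over all known words."""
    sig = {}
    for word in words:
        for dim, weight in TOY_DICTIONARY.get(word.lower(), {}).items():
            sig[dim] = sig.get(dim, 0.0) + weight
    return sig


def cosine(a, b):
    """Cosine similarity between two sparse signatures (0.0 if either is empty)."""
    dot = sum(a[dim] * b.get(dim, 0.0) for dim in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(term, context_words, documents):
    """Return documents containing `term`, best context match first."""
    query_sig = signature(context_words)
    hits = []
    for doc in documents:
        tokens = doc.lower().split()
        if term.lower() in tokens:
            hits.append((cosine(query_sig, signature(tokens)), doc))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [doc for _, doc in hits]


DOCS = [
    "the bank approved the loan and lent money",
    "the stream cut into the bank as water rose",
    "water levels in the stream fell",
]

for doc in search("bank", ["water", "stream"], DOCS):
    print(doc)
```

Here the user never states which sense of BANK is meant. Listing "water" and "stream" as context is enough to rank the river document ahead of the financial one, and the third document is skipped because it does not contain the search term at all.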