Be sure to stop by the TextWise booth at the 2012 Cloudforce Tour: Powering the Social Enterprise in Chicago on Thursday, May 3. Let us show you how we can unify all your sources for answering customer questions, including your knowledge bases and community forums, with just One Click!

TextWise is pleased to be previewing our latest Categorization and
Concept Tagging technology. It’s available now for you to try.

This enhanced service uses our new Semantic Gist technology to
provide significantly more relevant Category and Concept Tags.

In our testing, relevancy improved 10% for Category labels and 9%
for Concept tags, a dramatic increase in accuracy.

Semantic Gist technology for category and concept tag generation
will become the default API configuration in a month or so, but is
available for your testing and feedback now. We’d love to hear any
comments you have after trying it out.

Instructions for accessing the preview are available here:
http://textwise.com/api/documentation/general-docs/service-configurations

If you have any questions, comments or praise, please contact us at
support@textwise.com.

TextWise Booth 1517 Aug 30-Sept 2, 2011
Stop by the TextWise Booth 1517 to see our One-Click Findability App at Dreamforce 2011 in San Francisco. With TextWise semantic search technology, customers can reduce the time a call center agent spends on a call by 25 percent and increase the number of calls deflected from the call center to customer self-service.
TextWise facilitates linking of customer queries to resolutions with its patented contextual approach. Providing context to queries results in more relevant answers to customer questions. Built using Force.com, the social enterprise platform for employee apps, One-Click Findability is immediately available for a test drive and deployment on AppExchange at http://www.salesforce.com/appexchange/.
The TextWise One-Click Findability App is unique in that it supports all verticals and offers access for both call center agents and customer self-service. The app offers improved searching to quickly find resolutions for customers across repositories. It allows for the viewing of result sets from different sources of information, regardless of whether that information is contained within Salesforce knowledge bases or in other repositories. These results can be viewed as federated result sets for unified repositories, or faceted result sets for a single repository. Finally, the One-Click Findability App enables call center agents to specify and access continuously-updated external content from the web through RSS feeds.
Dreamforce is the industry’s leading global cloud computing event. The event is focused on inspiring customer, partner and developer success with social, mobile and open cloud computing. Attendees will learn how to maximize their current investments and explore new offerings across Salesforce Chatter, Database.com, Force.com, Sales Cloud, Service Cloud and more.


The TextWise One-Click Findability application is now available on the Salesforce App Exchange.  If you manage a contact center and/or customer self-service using Salesforce then try this app. Our patented semantic technology uses queries of any length and complexity and directly matches the query to relevant information with one click.  One-Click Findability provides automatic matching of customer queries/cases to the most relevant information in Knowledge, Solutions and/or other information repositories.

Reduce the cost of running your call center with One-Click Findability:

  • Save time and resources
  • Generate consistently relevant answers to customer queries with one-click matching
  • Reduce the amount of time agents spend on calls looking for answers
  • Expedite call deflection through self-service
  • Remove the guesswork from keyword searches
  • Eliminate repetitive keyword query attempts to find answers in your knowledge repositories

One-Click Findability provides a host of valuable features: Self-service and/or agented access; Support of all verticals; Federated and/or faceted result sets from multiple repositories; Incorporation of external web content.

Visit textwise.com and click on the One-Click Findability link to learn more.

 

 

Aristotle lived about 2,400 years ago, well before the advent of the World Wide Web. Yet his ideas drive the still-emerging Semantic Web. In fact, we could probably do a better job as modern information scientists if we paid a bit more attention to the ancient Greek philosopher.

In his writing called “Categories,” Aristotle addressed the problem of meaning in language and developed a logical framework for semantics. In this work, he invented the theory of subjects and predicates, which modern grammar and formal logic have adopted. This was in effect RDF version 0.0.0.

Aristotle also talked about using taxonomies (from the Greek τάξις + νόμος) to define the meanings of concepts, introducing “genus” and “species” as essential relationships. This approach was adopted by Linnaeus in the 18th Century to catalog the great diversity of life on earth; and more than a hundred years later, formal taxonomies made their way into library science.

Of special interest to us here is Aristotle’s classification of the predicates associated with definitions of meaning. He defined five types: genus, species, difference, property, and accident. The first two are already familiar to information scientists as IS-A relationships. A difference predicate relates to a defining characteristic for a concept. A property is an important characteristic for a concept, but not sufficient to define it. An accident is a true predicate that makes no contribution to meaning.

For example,

(genus/species) Angelina Jolie is an American movie star.
(difference) She is the daughter of American actor Jon Voight.
(property) She trained with Lee Strasberg.
(accident) She visited Costa del Sol.

In automated building of semantic dictionaries, our problem is with accidental predicates. Such predicates have only a weak relationship to a subject and tend to lead to noisy inferred associations. We probably do not want to retrieve a news item about Angelina Jolie given a query about Costa del Sol.

Unfortunately, many and perhaps most predicates in text data are accidental. Current data-driven semantic learning systems make no distinction between accidental and essential predicates, so there are opportunities for major improvements. One possible approach is to employ the techniques of text summarization to identify the most important “predicates” in our data and thus bias our statistics away from accidents toward properties and differences. Aristotle would be amused.
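
To make the idea concrete, here is a minimal sketch of salience-weighted association counting. It is not a description of any TextWise system: the stopword list, the toy salience score (mean document frequency of a sentence's terms, standing in for a real summarizer), and the function names are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list for the sketch.
STOPWORDS = {"a", "an", "and", "as", "in", "is", "of", "the"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def sentence_salience(sentences):
    """Toy stand-in for a summarizer: score each sentence by the mean number
    of sentences its terms appear in, scaled so the top sentence scores 1.0."""
    token_lists = [tokenize(s) for s in sentences]
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    raw = [
        sum(doc_freq[t] for t in tokens) / len(tokens) if tokens else 0.0
        for tokens in token_lists
    ]
    top = max(raw, default=0.0)
    return [r / top if top else 0.0 for r in raw]


def weighted_associations(sentences, subject):
    """Accumulate subject-term associations, weighting each co-occurrence by
    the salience of the sentence it came from instead of counting it as 1."""
    assoc = defaultdict(float)
    for sentence, weight in zip(sentences, sentence_salience(sentences)):
        tokens = set(tokenize(sentence))
        if subject in tokens:
            for term in tokens - {subject}:
                assoc[term] += weight
    return dict(assoc)
```

With plain co-occurrence counting, a one-off mention such as a visit to Spain would contribute a full count of 1; here it contributes less because the sentence carrying it shares little vocabulary with the rest of the text. A real system would presumably use a much stronger salience model than this one.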

The February issue of Scientific American had an article on the latest thinking about the Whorfian Hypothesis, which states that language strongly influences how humans think. This was a hot idea about sixty years ago, but eventually fell out of academic favor because of the lack of hard empirical evidence. Now that evidence is starting to show up, which has some implications for computational semantics.

The standard view on language and meaning has recently emphasized universality. This is to say that the understanding of language is hardwired in our heads, and so any competent human should qualify as an expert in the algorithmic delineation of meaning. The Whorfian hypothesis throws us a curve here in that we now have to consider language along with culture in our models of thought. A single well-crafted taxonomy or other semantic construct will not fit all.

We see something of this problem on the World Wide Web. As Jimmy Wales noted this past week, the content of the Web, and Wikipedia in particular, is largely created by twenty- and thirty-something males and so is dominated by their interests. A set of semantic categories derived from the Web in general will certainly be insufficient for understanding text on finance or on medicine and may be challenged even when dealing with the pages frequented by twenty- and thirty-something females.

This does not mean that a given semantic scheme is invalid. Each scheme, however, is limited by the vocabulary it covers and by the kinds of distinctions that it makes. That should be good news for those of us who make our living in computational semantics.

Watson, IBM’s Jeopardy computer, is showing everyone that it’s the 900-pound gorilla of trivia and is likely to beat its human opponents. Watson could still do something stupid, but its formidable performance says much about the effectiveness of current natural language processing technology and computational resources.

Although Watson has a knowledge base of millions of documents gleaned from the Web, its weakness is that it really does not understand any of this data. It is just an extremely smart entity extraction system: Watson uses the terms of a Jeopardy clue as a basis for selecting a particular entity as an answer, which of course then has to be phrased as a question. It has to figure out what kind of entity to look for and what kind of context that entity would be found in.

In a sense, this is a simple kind of semantic search because it involves scanning the entire knowledge base of documents and scoring contexts statistically. The entities of the right kind in the highest-scoring contexts are then the prime candidates for an answer; and Watson can use their statistics to derive a level of confidence that a given candidate is the right answer. This approach relies heavily on brute computational power.
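
The general pattern described above can be sketched in a few lines. This is only an illustration of context scoring and confidence estimation, not Watson's actual method: the overlap score, the normalization into a confidence share, and the assumption that candidate entities of the right type have already been extracted from each passage are all simplifications introduced here.

```python
import re
from collections import defaultdict


def terms(text):
    """Return the set of lowercase alphabetic terms in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank_candidates(clue, passages):
    """Rank candidate entities for a clue.

    passages is a list of (text, entities) pairs, where entities are the
    candidate answers of the expected type found in that text (assumed to
    come from a separate entity extraction step).
    Returns (entity, confidence) pairs, best first; confidence is each
    entity's share of the total context score.
    """
    clue_terms = terms(clue)
    scores = defaultdict(float)
    for text, entities in passages:
        # Context score: fraction of the clue's terms found in the passage.
        overlap = len(clue_terms & terms(text)) / len(clue_terms) if clue_terms else 0.0
        for entity in entities:
            scores[entity] += overlap
    total = sum(scores.values())
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(entity, score / total if total else 0.0) for entity, score in ranked]
```

A player built on this sketch would answer only when the top confidence clears some threshold, which is where the gamesmanship comes in.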

As can be seen in the Jeopardy competition, brute power can be quite effective. On most of the straightforward questions, the kind one might expect Google to do well on, Watson can simply outsearch its opponents. It can grab enough right answers in this way to make up for its frequent wrong answers on more subtle questions requiring a deeper understanding. This is as much gamesmanship as it is intelligence.

Now imagine how overwhelming Watson could be if it actually developed some understanding and gave far fewer wrong answers. The first step in this direction is in fact quite easy: develop a large set of semantic categories corresponding to how humans understand language. Indexing a knowledge base by such predefined categories would have the immediate effect of simplifying the search process so that documents do not always have to be analyzed at the lowest linguistic level. That should allow the searches to be broader, much like allowing a chess computer to analyze more moves ahead.
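
As a rough sketch of what indexing by predefined categories buys, the following builds an inverted index from category labels to documents, so that a query only has to examine documents sharing at least one of its categories. The category labels and function names are hypothetical, and the step that assigns categories to documents and queries is assumed to exist elsewhere.

```python
from collections import defaultdict


def build_category_index(doc_categories):
    """Map each category label to the set of document ids tagged with it.

    doc_categories: dict of document id -> iterable of category labels
    (assumed to be produced by some separate categorization step).
    """
    index = defaultdict(set)
    for doc_id, categories in doc_categories.items():
        for category in categories:
            index[category].add(doc_id)
    return index


def candidate_docs(index, query_categories):
    """Return the documents that share at least one category with the query.

    Only these candidates need the expensive low-level linguistic analysis.
    """
    result = set()
    for category in query_categories:
        result |= index.get(category, set())
    return result
```

The expensive per-document analysis then runs over the candidate set rather than the whole knowledge base, which is what would let the search go broader for the same computational budget.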

We of course are in the business of semantic dictionaries, which provide a quick way of assigning semantic categories to text documents. Hey, Watson. If you are listening, give us a call.

TextWise and Innography® have announced a strategic partnership to incorporate TextWise semantic search into Innography’s intellectual property business intelligence solution. The new functionality available with Innography® Fall ’10™ enables Innography customers to perform contextual semantic search using specific patent numbers or long blocks of text as the query.

“TextWise facilitates near effortless querying for patent searchers and we’re pleased to be working with Innography as the leader in IP business intelligence,” says Connie Kenneally, President of TextWise. “The incorporation of our technology into Innography’s latest release mitigates the need to perform repetitive searches that rely solely on identifying keywords to generate relevant search results. In contrast, our semantic search performs well with very long queries and one does not have to find the perfect keyword combinations to get the most relevant results.”

A TextWise search is performed by incorporating the full context of either a paragraph, claim, abstract or any longer piece of text to generate more relevant matches to similar information contained in the US patent database. “TextWise takes an innovative approach to facilitating rapid identification of patents for monetizing IP assets when coupled with Innography’s intellectual property business intelligence solution,” said Doug Miller, Chief Marketing Officer at Innography. “Our customers have expressed great interest in patent semantic search, so we are very pleased to be offering this new functionality as an option with their Innography subscriptions.”

The target areas where this joint offering would be most advantageous are:
• Expediting clearance of products against infringement on other offerings
• Rapid idea screening for patenting
• Lead identification for patent licensing, sales and patent acquisitions
• NPE defense / invalidity protection

A demonstration of Innography Fall ’10™, including the TextWise patent semantic search feature, can be found at: http://innography.com/assets/files/fall10demo/fall10-demo.html.

About Innography
Innography® delivers a comprehensive, online Intellectual Property Business Intelligence (IPBI) application that enables companies of all types and sizes to achieve the optimal return on their IP investments. By correlating patent and trademark data with financial, litigation and other key business information, Innography instantly generates a variety of unique visualizations to help organizations reduce the time it takes to perform IP research and reduce associated legal expenditures. This enables corporations to get products to market faster, uncover new and more lucrative revenue sources, keep better track of competitors, manage litigation claims, and stay on top of additional IP-associated functions. Visit www.innography.com to view a brief online product demo or call 1.512.306.8688 for more information.

Last month Anthony Vito presented a five-minute “lightning talk” on our implementation of Semantic Signatures using the SemanticHacker API for CRM. Specifically, this was an example using Salesforce.com.


The Value of Semantic Discovery in CRM, a lightning talk presented by Anthony Vito, TextWise, at Smart Content: The Content Analytics Conference, October 19, 2010, http://smartcontentconference.com

Back in the 1960s and 1970s, the Whorfian hypothesis was a hot subject on college campuses. This was the idea that one’s native language, its syntax and semantics, strongly shaped one’s worldview. For example, Eskimos speaking Inuit supposedly had thirty different words for snow and so had a more complex relationship with their environment than someone speaking English with only one word for snow.

The problem of course is that skiers can make plenty of distinctions about kinds of snow even in English. Although the Whorfian hypothesis was theoretically attractive, it did not square in the end with our actual experience with language. That pretty much took the steam out of the hypothesis, but now in the 21st century, empirical support has been accumulating for a weaker version of it. This was the subject of an article in The New York Times Magazine (http://nyti.ms/boqzs5).

The weak Whorfian hypothesis rejects the idea that language establishes an absolute limit on thinking. Thus we can learn about distinctions in types of snow if we really need them. The structure of a language, however, definitely can bias our thinking; and this could have consequences in practical matters like the ranking of retrieved documents. The choice of a particular semantic framework like RDF may therefore affect the performance of an information system in unexpected ways.

So far, experimental results on language and thought have focused on highly specific biases in areas of language like giving spatial directions, assigning gender to nouns, and dividing the spectrum into colors. It seems plausible, though, that this should generalize to the overall semantic problem of dividing up meaning into some kind of compact space. There is more than one way to skin a cat here, and there are probably advantages and disadvantages in each possibility.

A dogmatist might be tempted to argue here that RDF with certain standard taxonomies is the right way and everything else is wrong, but that is probably overreaching. We are not yet savvy enough about semantics to carve tablets in stone about its implementation. At present, one can say only whether a given scheme is optimal in some formal sense; but if it makes no obvious sense to people, then something more comprehensible might be better in the long run even if it is less than optimal.

The weak Whorfian hypothesis forces us to be more honest. If each semantic scheme introduces its own biases, then we need to experiment to see how different approaches work out for a given target application. Given that humans operate with more than one linguistic framework, we should not be so quick to assume that machines can do better at semantics with just a single framework.