Testing our SemanticHacker WordPress plugin has some similarities to testing foof, our Firefox extension, in that we are testing within another application. As with Firefox extensions, WordPress plugin testing must cover multiple operating systems and multiple versions of Firefox, and it adds the need to test on additional browsers. Because WordPress has been releasing frequent updates, we’ve had to focus on how to quickly verify our plugin after each WordPress upgrade. As a result, we have two major types of testing for our WordPress plugin: testing a new release of the plugin and verifying our plugin in a new WordPress release.
Regardless of which type of test sequence we’re on, there are some things that we always have to test. We need to validate all supported browser and OS combinations and we need to test all functionality of the SemanticHacker plugin. This functionality includes the ability to use text in a blog post to find relevant content links, tags, webpage links, and products.
When testing a new release of our WordPress plugin, we have two user paths to test: an upgrade from an older plugin release and a fresh install of the new version of the plugin. We run our tests along both paths on every version of WordPress that we support. Of course, if there are new features or bug fixes, we add test cases to cover them.
When there is a new WordPress release, we also consider two paths by which our plugin can appear in that version of WordPress. One is that an existing instance of WordPress with the SemanticHacker plugin is upgraded to the new version. The other is that our plugin is installed fresh on the version being tested. All tests are run on the new version of WordPress following both paths. Assuming the new WordPress release passes our tests, we add that version to our list of supported WordPress releases. At the same time, we decide whether any older versions on the list are now used too little to be worth continued testing.
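The combinations described above (WordPress version, install path, and browser/OS pair) can be enumerated mechanically. The following is only a minimal sketch of that idea: the version labels, browser/OS pairs, and the function name `build_test_matrix` are placeholders invented for illustration, not our actual support list or tooling.

```python
from itertools import product

# Placeholder values for illustration only; not a real support list.
WORDPRESS_VERSIONS = ["wp-A", "wp-B"]
INSTALL_PATHS = ["fresh_install", "upgrade"]  # the two paths described above
BROWSER_OS = [("Firefox", "Windows"), ("Firefox", "Linux"), ("Safari", "macOS")]


def build_test_matrix(wp_versions, install_paths, browser_os):
    """Return one test run per (WordPress version, install path, browser, OS)."""
    return [
        {"wordpress": wp, "path": path, "browser": browser, "os": os_name}
        for wp, path, (browser, os_name) in product(
            wp_versions, install_paths, browser_os
        )
    ]


matrix = build_test_matrix(WORDPRESS_VERSIONS, INSTALL_PATHS, BROWSER_OS)
print(len(matrix))  # 2 versions * 2 paths * 3 browser/OS pairs = 12 runs
```

Listing the runs explicitly makes it easy to see how quickly the matrix grows when a new WordPress version is added, which is one reason to retire little-used versions.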
Posted in qa by Maurice Forrester
Every product we test at TextWise presents its own unique challenges. We recently released a new version of foof, our Firefox add-on that replaces ads with relevant content. Testing foof required us to consider issues such as Firefox versions, operating systems, the wide variety of sites and pages on the web, along with other Firefox extensions that users may have installed.
We’ve written previously about testing on multiple combinations of browsers and operating systems in our blog post “Will this work using PlanetWeb 2.6 on my Dreamcast?” Testing a Firefox add-on makes this a little simpler because we don’t have to consider non-Firefox browsers, but we do still have to consider multiple Firefox versions and multiple operating systems. In addition, we have to consider the potential impact of other Firefox add-ons. It’s not feasible to test every potential combination of add-ons, so we test with the most popular extensions that have the greatest potential to impact our product. For foof, that means paying particular attention to popular add-ons that touch the rendering of the page. Fortunately, the Mozilla site allows sorting of add-ons by popularity and by category, making it relatively simple to identify add-ons like Stylish or Greasemonkey that use scripts to alter the look of the page.
Testing of any browser-based application needs to take into account the wide variety of sites and pages that are available on the web. We first use our internally hosted test pages and then supplement by browsing to external sites. Our internal test pages include both our own hand-built pages, which cover a variety of test cases, and local versions of pages that have caused problems in the past. A tool like wget comes in handy for creating local versions of these problem pages, and having them in-house means we are not repeatedly hitting someone else’s server during testing. Only after we’ve tested successfully on our internal pages do we test on external pages, focusing on the web’s most popular sites. Alexa is one source for web traffic data that can be used to identify sites that should be tested to ensure the browser extension will serve the needs of typical users. Firefox’s error console and the Firebug add-on allow testers to monitor for problems that may not be immediately visible on the test page.
Testers need to remember that there are two basic types of installation that could occur when the browser extension is released: a fresh installation and an upgrade. In addition, there can be multiple paths for installation and upgrade depending, for example, on whether a user goes to our foof site or to the add-ons on the Mozilla site. All of these installation options need to be tested. An added complication with Firefox extensions is that it’s not possible to test installation and upgrade from the Mozilla site until after release. We test installation and upgrading using our internal site and then do additional testing once the application hits the Mozilla download site. Of course, we also look at all the foof configuration options. And what gets installed also needs to be uninstalled, so we verify that foof will uninstall cleanly (not that anyone would want to do that!).
Firefox add-on testing, as with any other product, does not end with the release. We know that we cannot test every possible combination of foof settings, web pages, operating system, Firefox version, and additional plugins. A forum on the foof site and email links both allow users to contact us with problems. Any reported problems can then feed back into our test cases, which are living documents that we update regularly.
Posted in qa by Maurice Forrester
One of the challenges of creating and maintaining applications for the web is keeping up with today’s many web browsers and their differing under-the-hood technologies and functionality. New versions of browsers and operating systems are released frequently, for reasons ranging from feature enhancements to security fixes. There is a wide variety of web browsers available today, each offering something a bit different from the others. Operating system vendors have their own browsers, some of which are cross-platform and work on other operating systems; then there are the third-party browsers, and we haven’t even explored the mobile browser realm yet… Creating and maintaining a set of browser and OS combinations as a company standard against which applications can be developed and tested has become key for us.
Our standard is based on browser and OS usage statistics from W3Schools, broken down by brand and version. By collecting this data and observing trends over time, we can decide when it’s appropriate to either start or discontinue supporting a browser, OS, or combination of the two. Our process is to evaluate our browser/OS support matrix each time a new major or minor version of a browser or OS is released, and at least once every 6 months even if no browser or OS updates have occurred. Evaluating the statistics is important even without updates, because some browsers may fall below the percentage of use needed for support, while others may have gained enough usage or popularity to be supported.
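The periodic evaluation described above amounts to comparing usage shares against a cutoff. Here is a minimal sketch of that comparison. The 5% threshold, the browser and OS names, the usage figures, and the function name `evaluate_support` are all invented for illustration; they are not our actual criteria or data.

```python
def evaluate_support(usage_share, currently_supported, threshold=5.0):
    """Sort browser/OS combinations into add, keep, and drop lists.

    usage_share maps (browser, os) to a usage percentage.
    threshold is a hypothetical cutoff, not an actual policy value.
    A supported combination missing from usage_share is treated as
    below the cutoff and lands in the drop list.
    """
    above = {combo for combo, share in usage_share.items() if share >= threshold}
    return {
        "add": sorted(above - currently_supported),
        "keep": sorted(above & currently_supported),
        "drop": sorted(currently_supported - above),
    }


# Made-up figures for illustration only.
usage = {
    ("BrowserA 3", "OS1"): 40.0,
    ("BrowserB 7", "OS1"): 3.0,
    ("BrowserC 2", "OS2"): 6.5,
}
supported = {("BrowserA 3", "OS1"), ("BrowserB 7", "OS1")}

result = evaluate_support(usage, supported)
print(result)
```

In practice the decision also depends on trends over time rather than a single snapshot, so a result like this is better read as a list of candidates to review than as a final answer.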
It’s also important to be able to test those combinations to ensure compatibility. Rather than bearing the expense of having every possible combination in-house, we use a web-based service that specializes in providing tools to help us test. The service we use is called BrowserCam, which lets us take “snapshots” of our applications in various browser/OS combinations on the web and gives us remote access to those machines for interactive testing. And to answer the original question, we have no idea: PlanetWeb 2.6 on Dreamcast is not one of our supported combinations.
Posted in General, qa by Jay Baker
When is a semantic dictionary good? It really depends on the application, since more specialized content requires more specialized dictionary dimensions. Typically, validation of a given application will involve extensive benchmark testing, often entailing human judgments of the effectiveness of particular statistical characterizations of content.
TextWise does all of this in its product development process, but one would not want to go through an elaborate validation procedure to test the consequences of every small change. As it turns out, there are quick statistical ways to check whether a change is likely to be good or bad. This is no substitute for actual detailed validation at some point, but it allows one to experiment with new ideas at a fairly low cost.
A digital photography metaphor is apt here. While one cannot use statistics to identify a prize-winning shot, it is certainly possible to detect major problems without human judgments. For example, areas of maximally white pixels indicate blown highlights, which typically detract from the quality of an image. Similarly, problems with white balance, dynamic range, focus, and other conditions are also readily detectable.
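To make the photography example concrete, here is a minimal sketch of such a check. It assumes a grayscale image given as rows of 0 to 255 brightness values and simplifies “areas” of white pixels to an overall fraction. The function names and the 5% limit are illustrative assumptions, not an established rule.

```python
def blown_highlight_fraction(pixels, max_value=255):
    """Return the fraction of pixels at maximum brightness.

    pixels is a grayscale image given as a list of rows of integers.
    """
    total = sum(len(row) for row in pixels)
    if total == 0:
        return 0.0
    white = sum(1 for row in pixels for value in row if value >= max_value)
    return white / total


def has_blown_highlights(pixels, limit=0.05, max_value=255):
    """Flag an image when more than `limit` of its pixels are maximally white.

    The default limit is an arbitrary illustrative choice.
    """
    return blown_highlight_fraction(pixels, max_value) > limit


# Tiny made-up image: 3 of its 6 pixels are fully white.
image = [
    [255, 255, 10],
    [20, 255, 30],
]
print(blown_highlight_fraction(image))
print(has_blown_highlights(image))
```

A check like this says nothing about whether the photo is any good; it only flags one kind of obvious defect, which is the role quick statistical checks play for a semantic dictionary.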
With any huge data object like a semantic dictionary, it is difficult to construct a benchmark that will cover every aspect of it thoroughly. Statistical testing provides an overall sanity check on quality. Otherwise, one would just be buying and selling pigs in a poke.
Posted in Semantic Signatures, semantics by Clinton Mah