What Do 64 Interim Releases Have to Do with Today’s Quality Control?

This blog post was prompted by a discussion in our forum. One of our customers, who has been using List & Label since 1995 (which goes back further than my own experience!), sent us a message telling us that, for fear of errors, he usually waits up to a year after a release before implementing a new version. Here, I'd like to explain why I don't think that's a good idea, and give you some insight into our quality assurance. We want you to feel good about using List & Label in your applications on a daily basis.

Looking Back – 27 Years Ago

In 1995, our quality assurance hardly existed. We didn't have a department dedicated to it, the release procedure wasn't fixed, and there were no automated tests to be found. The product was only a few years old, and combit was significantly smaller than today – it was a different time. I still recall version 5, where we finally ended up at minor version 64, i.e. there were 64 interim releases, a new one every few days. The changelog wasn't public, so nobody knew what we had actually changed. That was – from today's point of view – not a good situation. It became quite obvious that things couldn't go on like this.

As a consequence, we professionalized very quickly. The last time I documented our approach was about two years ago. Since then, we've continuously improved our quality control, and now it's time for an update again.

combit Quality Assurance 2022

Our QA is now a separate department, managed by my co-worker Daniel Kenner. Every single process is standardized and recorded in detailed flow charts (checklists). The entire build process, including all automated tests, runs fully automatically every night. A fresh setup, including all tested modules, is available every morning.

Below is a quick overview of the various tests List & Label has to go through every night, and also before a release.

Static Code Analysis
We try to eliminate as many errors as possible right at the beginning, while we're still coding. The earlier a bug is found, the easier (and cheaper) it is to fix. In addition to the common analyzer libraries from Microsoft, we use SonarQube for static code analysis. Here's an insight into a recent analysis of List & Label's .NET code:


With SonarQube we cover both our .NET code and the TypeScript code of our web components.
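To illustrate the kind of finding such an analysis reports, here's a small, made-up TypeScript example (not taken from our code base): static analyzers like SonarQube typically flag the dead store and the identical branches below long before the code ever reaches a test run.

```typescript
// Hypothetical example of code a static analyzer would flag.
export function formatPageLabel(page: number, total: number): string {
  let label = "";            // flagged: useless assignment, the value is never read
  label = `Page ${page}`;

  if (total > 1) {
    return `${label} of ${total}`;
  } else {
    return `${label} of ${total}`;   // flagged: both branches are identical
  }
}

// Cleaned-up version after addressing the findings:
export function formatPageLabelFixed(page: number, total: number): string {
  return `Page ${page} of ${total}`;
}
```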

Tests of All Web Components
In addition, our web components, the Web Report Designer and the Web Report Viewer, are subject to thorough testing with Cypress. This library makes it possible to cover almost all frontend and backend functionality with tests.

In this process, there’s currently 90 different test units that are being evaluated, with each test consisting of up to 20 assertions.

Unit Tests of Our .NET Component with NUnit
A whole range of unit tests is performed at code level, and we use NUnit for this. The tests run automatically with every build on our TeamCity build server and check the behavior of the .NET programming interface using various assertions:

The list goes on, of course. In the event of any deviation, the build is marked as invalid, which also prevents the modules from being sent to you, e.g. via our support. Here, as with all the other tests mentioned below, the same rule applies: a module that has not successfully passed all tests must not be handed over to customers.

Automated Printing Tests
In my blog post linked above, I already listed these tests; meanwhile, there's a lot more to them. They cover both the printing engine and the PDF generation, our main export format. A total of 1,236 printouts are currently generated for each build and compared with stored references. This number is constantly growing: with every new bug that shows up, we add another test in order to rule out a regression of that exact issue in the future.

Depending on the target format, we use different tools for the automated comparisons: PDF files are compared with DiffPDF, a commercial tool. For the preview files we use our own comparison tool, which is “hidden” inside our DLLs and accessible via an API; we use it to compare two LL files at the vector level. Not even the smallest change is missed – as soon as any output shifts by just 1/1000 mm, it gets noticed immediately. This happens from time to time, e.g. when we integrate a new PDF library and the rendering needs minimal adjustment. The failed tests are then reviewed manually, evaluated and, if necessary, approved.
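Conceptually, the harness boils down to “generate, compare against the stored reference, fail the build on any difference”. The following Node/TypeScript sketch only illustrates that idea – the directory layout and the compare-pdf command line are placeholders, not our actual tooling:

```typescript
// Conceptual sketch of a reference-comparison step; paths and the
// "compare-pdf" CLI are placeholders for the real tools (e.g. DiffPDF).
import { readdirSync } from "node:fs";
import { join } from "node:path";
import { spawnSync } from "node:child_process";

const generatedDir = "build/output";      // printouts produced by the nightly build
const referenceDir = "test/references";   // approved reference files

const failures: string[] = [];

for (const file of readdirSync(generatedDir).filter(f => f.endsWith(".pdf"))) {
  // Assumed CLI contract: exit code 0 means "no visual difference".
  const result = spawnSync("compare-pdf", [
    join(generatedDir, file),
    join(referenceDir, file),
  ]);
  if (result.status !== 0) {
    failures.push(file);
  }
}

if (failures.length > 0) {
  // Any deviation marks the build as failed; the diffs are reviewed manually.
  console.error(`Reference comparison failed for: ${failures.join(", ")}`);
  process.exit(1);
}
console.log("All printouts match their stored references.");
```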

An (unfortunately) very recent example of this approach: we had updated the rendering library for SVG files for version 28. Without any notice in its release notes, the return value of a function had changed, so that the sizing of SVG files delivered wrong results. This is especially obvious with strongly rectangular SVGs – they are displayed too small. A customer pointed out the problem to us in the forum. Of course, we had covered the SVG output with a series of tests: a multi-page test uses a few dozen test SVGs from the W3C, covering every imaginable SVG feature we support. However, there was a problem with the tests:

Almost all of the images are square – and in exactly this constellation the error was barely visible. On the left are the faulty images, on the right the corrected ones. By now there are, of course, tests with rectangular SVGs as well, but they came too late for this case. All we could do was react as quickly as possible and send the customer a new DLL with a fix 37 minutes after the problem was reported.

Automated UI Tests
The user interface is another subject of constant testing. For this, we use Ranorex, another commercial tool. Every night it runs through a complete release checklist for the designer, which is additionally worked through manually whenever a service pack is released. This includes inserting, editing and deleting various object types, output to various export formats, a test of the property editors, the hotspot functionality of charts and cross tables, and much more. The following screenshot gives you a glimpse of how Ranorex works in cooperation with TeamCity:


A full test run currently takes about 30 minutes – an eternity for an automatically clicking engine, which just shows the depth of the functionality covered here. And the same rule applies: no module that fails a test is delivered to our customers.

Manual Whitebox and Blackbox Tests
But of course, it’s not only machines working at combit, but also a whole range of real QA professionals, who are constantly trying to come up with all conceivable (and sometimes absurd) ideas during manual tests in order to “break the feature”. This is an essential part of the process a feature has to go through, before it receives a final release mark. Ahead of every single service pack release, the most important parts of the program are manually re-tested according to detailed test plans in addition to the automated tests mentioned above.

Conclusion

We’ve come a long way since 1995 in regards to professionalism at all levels. If I ever would have waited a year before importing a new version in the past, I’d be convinced now that we’ll always deliver the best possible quality on a daily basis. The previously customary “I’ll wait until the tenth service pack” wouldn’t work any longer. As a rule, we only release one service pack per quarter year. The bugs we find are getting more and more exotic, although they don’t stop popping up. But you’re probably used to that from your own software. I’d be happy if this blog post gave you some ideas to extend your own QA. And if, in return, you have any suggestions or ideas as to what else we should implement into the process, I’d be happy to hear your comments.
