QNX Software Systems is hosting a one-hour webinar at 10 am PT (1 pm ET) on Tuesday, May 13, 2008. The event is entitled “In-Field Debugging: Diagnosing Software Problems While Maintaining System Availability.” The webcast will be useful for system designers and architects, software engineers, and managers in all embedded markets.
Software bugs that make it to market not only cause incorrect system behavior and low system availability but also result in unhappy (and fewer) customers. Unfortunately, conventional debugging methods can themselves interfere with the availability, performance, and correct behavior of the affected system. The event will examine debugging and information-gathering techniques that can maintain system availability while generating artifacts that help diagnose and resolve software failures. Topics include non-invasive system tracing techniques, kernel instrumentation, software watchdogs, debug partitions, and postmortem debugging.
A modern embedded system may employ hundreds of software tasks, all of them sharing system resources and interacting in complex ways. This complexity can undermine reliability for the simple reason that the more code a system contains, the greater the probability that coding errors will make their way into the field. (By some estimates, a million lines of code will ship with at least 1,000 bugs, even if the code is methodically developed and tested.) Coding errors can also compromise security, since they often serve as entry points for malicious hackers.
No amount of testing can fully eliminate these bugs and security holes, because no test suite can anticipate every scenario that a complex software system may encounter. Consequently, system designers and software developers must adopt a “mission-critical mindset” and employ software architectures that can contain software errors and recover from them quickly. Just as important, developers must employ tools and debugging techniques that help maintain system integrity during the problem-solving process. The tools can’t introduce changes that adversely or unpredictably affect system behavior, particularly if the system is actively providing service to users. And once the developer has fixed a software component, the tools and underlying operating system should make it easy to upload and monitor the fixed version, again without affecting overall system behavior and availability.