The five most feared words in the IT support person’s vocabulary are “This. Page. Can’t. Be. Displayed.”
And yet, the growth of Service Oriented Architecture (SOA) based enterprises in the past eight years means that these dreaded words show up more and more, as services from different developers and vendors are consumed by larger, upstream platforms and integrated to provide new capabilities.
In this kind of environment, “This Page Can’t Be Displayed” is a cry for help: the first indication of a problem. For enterprise support personnel, that message is often the first step in a long journey complete with Sherlock Holmes-style sleuthing to try to find which service along an orchestrated chain is the bad actor. And, unfortunately, when an application is being attacked or gets hacked, support personnel may not even have an error message to go on.
In both cases, the major roadblock for support and incident response staff is that application developers or development shops often use their own error codes. And they are far from intuitive. Here are some of the ‘unique’ codes that I uncovered researching this article:
- 3085 error
- error code 1383004
- database connection error
- adnpstpa (my personal favorite)
Your help desk and security operations staff have to be clairvoyants or have photographic memories or incredible code-breaking skills to decipher all of the possible error codes. And this doesn’t even consider that new services and applications are constantly being integrated, each with new and different codes.
And this error code problem is about to go from bad to really, really, really bad as Internet of Things (IoT) technologies are added to the mix.
Why? I see two main problems that the IoT will exacerbate. First: the multi-vendor, distributed nature of a SOA enterprise is going to be amplified exponentially (or nearly so) by connected and “smart” devices, many designed for consumers rather than businesses, but dropped into enterprise IT environments anyway.
Second: the distance between the device reseller and the device’s third-party support staff is growing.
Let’s look at the example of a reseller and installer of an HVAC system. That company’s objectives are simple: it wants to install the HVAC system and related hardware and software, then move on to the next (paying) project. Training of local IT support staff isn’t high on its priority list – and forget about harmonizing error codes.
Supporting that building, which involves both physical maintenance of the equipment and logical maintenance of the supporting hardware and software (including information security), will be the job of yet another third-party contractor – likely in the building management space. Their staff will monitor numerous facilities using thousands, or even tens of thousands, of devices. Just identifying a faulty device will be a challenge. Identifying whether the device has simply failed or is the target of a software-based attack (and, thus, an indicator of a possible security breach) will be almost impossible.
Major vendors will tell you the solution to this problem is to simplify: buying a suite of products and services built on a single platform – their platform.
But the promise of the Internet of Things is really premised on a very different idea: interoperability between connected devices and services based on open architectures and open source code. Taking a “walled garden” approach, then, will prevent a company from realizing many of the benefits that go along with the IoT.
What is the solution? I think one answer is an error code standard that developers of IoT devices can integrate into their products, akin to how existing IT infrastructure firms embraced and extended SNMP (Simple Network Management Protocol) to allow for the monitoring of network devices across an enterprise.
Simply put: we need an SNMP for IoT devices and applications that can at least give support and security staff indications of which device, application and service has an error, characterize the error and rate its severity.
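To make that concrete, here is a rough sketch of what a common error record might carry. No such standard exists today, so everything in it is my own assumption for illustration: the field names, the four severity levels, the `ErrorReport` and `most_urgent` names, and the sample device identifiers are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale; a real standard would define its own."""
    INFO = 1
    WARNING = 2
    ERROR = 3
    CRITICAL = 4


@dataclass(frozen=True)
class ErrorReport:
    """One vendor-neutral error record (all field names are assumptions)."""
    device: str        # which device raised the error
    application: str   # which application on that device
    service: str       # which service in the chain
    category: str      # coarse characterization, e.g. "connectivity"
    severity: Severity
    vendor_code: str = ""  # the vendor's own opaque code, kept for reference

    def to_json(self) -> str:
        # Serialize with the severity spelled out so a human can read it.
        record = asdict(self)
        record["severity"] = self.severity.name
        return json.dumps(record, sort_keys=True)


def most_urgent(reports):
    """Return the reports ordered from highest to lowest severity."""
    return sorted(reports, key=lambda r: r.severity, reverse=True)


if __name__ == "__main__":
    reports = [
        ErrorReport("hvac-unit-17", "thermostat-fw", "telemetry",
                    "connectivity", Severity.WARNING, vendor_code="3085"),
        ErrorReport("door-lock-04", "access-control", "auth",
                    "authentication", Severity.CRITICAL, vendor_code="adnpstpa"),
    ]
    for report in most_urgent(reports):
        print(report.to_json())
```

The vendor's own code still travels with the record, but support staff no longer need to decode it to learn which device is affected, what kind of problem it is and how urgently it needs attention.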
Ideally, this would be developed as a cross-industry standard managed by a group such as OASIS or Project Haystack, or promoted as a federal government standard through NIST. So far, there is little progress to show in this regard. Event and status messaging and alerting in the IoT sphere are still the province of various private IoT platforms from large vendors (Microsoft, Oracle, SAP) and smaller ones (Axeda/Thingworx, Swarm, etc.). In fact, there are dozens of such platforms – an indicator both of the activity within IoT and of the possibility for chaos.
Regardless of which standard wins out, the longer we go without one, the greater the complexity problem becomes. The time to begin the process is now: starting the dialogue and saving future security staff, help desk and support personnel from a nightmare of unintelligible error messages.