Too many of our systems have been optimized for cost efficiency to the point of brittleness: they fail constantly.
The other day I had an issue with a UPS delivery: the driver had handed a package of mine to a neighbor; that neighbor had apparently just moved in, and their name was not yet on the doorbell. This shouldn't have been a big deal, but the process of retrieving that package taught me a lot about the state of our large-scale, privately run infrastructure systems.
I’ll include the mini saga of what happened for context below, but feel free to skip this section to get to the important bit.
———
The note I received contained only the name of my neighbor, but no location. I used UPS's online package tracking to find out more details, but that name was all the system provided.
I eventually managed to get in touch with customer service, and they confirmed there was only the neighbor's name and nothing else. Could we contact the driver, who would know, I inquired: concrete knowledge is usually situated at the working level, after all. No. I'll send you a form to start an investigation; please fill it out and we'll get started. It'll take 7–10 days.
The form I received was not in fact a form, but a link to the lost-package claims page on the UPS website. I assumed the package wasn't actually lost, just in need of being found, but I filled it out nonetheless, as instructed. After I completed it, it simply threw up a dialog box informing me that the case needed further information and could not be filed, and that I should please call a different customer service hotline.
I reached it quite quickly, only to learn that I should please fill in the form again, and could otherwise call technical customer service in case I was having technical issues. (I was not.)
Getting slightly annoyed, I sent a (friendly) message to customer service with the background and a note that I appeared to be stuck in a loop whenever I followed instructions, and received a reply telling me to fill out the form (which pointed me back to the hotline). Eventually, I reached a second-level customer support person who wrote in actual sentences, not copy-and-paste text blocks. Progress! After two exchanges they asked me to — you guessed it — fill out the claims form, or even better, have the sender fill it out.
———
There were no humans in the loop anywhere: the humans were all distinctly outside the loop, tasked with pointing strays like me back into it. The loop — the whole process — strove to be fully automated.
It probably worked well as long as everything went according to plan. But the moment something went off-script, as the real world tends to, it all broke. The system was highly efficient for the majority of cases that followed the script, and failed completely and systematically for any case outside the aspirational norm.
This system — just like many others like it — is highly efficient but brittle. It lacks the resilience that makes a system work under real world conditions.
A large-scale system needs resilience, redundancy, and elegant ways to escalate issues to someone who is equipped — with the skills, the mandate, the tools — to solve them.
Any system that is not equipped to handle issues that are to be expected — and then some — is a failed system.
There's a school of thought that I find quite interesting, which goes something like this: a system's real job is what it does. Not what it nominally tries to do, but what it actually does: the real outcome. (Stafford Beer summed this up as "the purpose of a system is what it does.")
In this case, that means the system is not set up to get packages to their recipients, but to hand a package to some person and then create a webpage indicating who that person was. That is vastly different from delivering packages, or tracking their whereabouts. The system is designed to do the former, not the latter. And that is a real issue.
More and more companies are trying to replace customer service with chatbots. I'm sure McKinsey has one of their famous $400K slide decks that boils down to: just replace customer service staff with chatbots, it's fine, line goes up! But of course it's not fine at all. All you do is provide worse service and make it harder for people to tell you when things don't work as intended.
This also means that many companies are flying blind: they simply no longer have the backchannels and feedback loops to learn what doesn't work and how their systems fail. That is incredibly valuable information! But customer complaints go down (on paper only — there's simply no mechanism left to capture them), so everyone's happy, right? Right?!?
Chatbots are a great addition to the customer service toolkit: done well, they can be an elegant, powerful way to navigate FAQs and other large bodies of information. But they cannot replace humans: humans empowered to solve problems and make decisions. This is such low-hanging fruit.
Right now, companies at scale are accelerating in the wrong direction: less human customer service, more chatbots and AI. This won't end well; I fully expect them to reverse course eventually. Until then, it's going to be a bumpy ride. Do you want to talk to our AI assistant?