At their Pixel 2 event at the beginning of the month, Google released a whole slew of new products. Besides new phones, there were updated versions of their smart home hub, Google Home, and some entirely new types of product.
I don’t usually write about product launches, but this event has me excited about new tech for the first time in a long while. Why? Because some aspects stood out as markers of a larger shift in the industry: the new role of artificial intelligence (AI) as it seeps into consumer goods.
Google has been reframing itself from a mobile-first to an AI-first company for the last year or so. (For full transparency I should add that I’ve worked with Google occasionally in the recent past, but everything discussed here is, of course, publicly available.)
We now see this shift of focus play out as it manifests in products.
Here’s Google CEO Sundar Pichai at the opening of Google’s Pixel 2 event:
We’re excited by the shift from a mobile-first to an AI-first world. It is not just about applying machine learning in our products, but it’s radically re-thinking how computing should work. (…) We’re really excited by this shift, and that’s why we’re here today. We’ve been working on software and hardware together because that’s the best way to drive the shifts in computing forward. But we think we’re in the unique moment in time where we can bring the unique combination of AI, and software, and hardware to bring the different perspective to solving problems for users. We’re very confident about our approach here because we’re at the forefront of driving the shifts with AI.
AI as a platform: Google has it.
First things first: I fully agree – there’s currently no other company that’s as well positioned to drive the development of AI, or to benefit from it. In fact, back in May 2017 I wrote that “Google just won the next 10 years.” That was when Google had only hinted at their capabilities in terms of new features, but had also announced they were building AI infrastructure for third parties to use. AI as a platform: Google has it.
Before diving into some structural thoughts, let’s look at two specific products they launched:
Google Clips is a camera you can clip somewhere, and it will automatically take photos when certain conditions are met: a particular person’s face is in the frame, say, or they are smiling. It’s an odd product for sure, but here’s the thing: it’s fully machine-learning-powered facial recognition, and the computing happens on the device. That’s remarkable both as a technical achievement and as an approach. Google has become a highly centralized company (the bane of cloud computing, I’d lament), yet Google Clips works at the edge, decentralized. This is powerful, and I hope it inspires a new generation of IoT products that embrace decentralization.
Google’s new in-ear headphones offer live translation. That’s right: these headphones should allow for multi-language, human-to-human live conversations. (This happens in the cloud, not locally.) How well this works in practice remains to be seen, and surely you wouldn’t want to run a work meeting through them. But even if it eases travel-related helplessness just a bit, it’d be a big deal.
So as we see these new products roll out, the actual potential becomes much more graspable. There’s a shape emerging from the fog: Google may not really be AI first just yet, but they certainly have made good progress on AI-leveraged services.
The mental model I’m using for how Apple and Google compare is this:
Apple’s ecosystem focuses on integration: hardware (phones, laptops) and software (OS X, iOS) are both highly integrated, and services are built on top. This allows for consistent service delivery, for pushing the limits of hardware and software alike, and, most importantly for Apple’s bottom line, for selling hardware that’s differentiated by software and services: nobody else is allowed to make an iPhone.
Google started at the opposite side, with software (web search, then Android). Today, Google looks something like this:
Based on software (search/discovery, plus Android) now there’s also hardware that’s more integrated. Note that Android is still the biggest smartphone platform as well as basis for lots of connected products, so Google’s hardware isn’t the only game in town. How this works out with partners over time remains to be seen. That said, this new structure means Google can push its software capabilities to the limits through their own hardware (phones, smart home hubs, headphones, etc.) and then aim for the stars with AI-leveraged services in a way I don’t think we’ll see from competitors anytime soon.
What we’ve seen so far is the very tip of the iceberg: As Google keeps investing in AI and exploring the applications enabled by machine learning, this top layer should become exponentially more interesting: They develop not just the concrete services we see in action, but also use AI to build their new models, and open up AI as a service for other organizations. It’s a triple AI ecosystem play that should reinforce itself and hence gather more steam the more it’s used.
Google’s developer conference Google I/O has been taking place these last couple of days, and oh boy has there been some gems in CEO Sundar Pichai’s announcements.
Just to get this out right at the top: many analysts’ reactions to the announcements were a little meh: too incremental, not enough consumer-facing product news, they seemed to find. I was surprised to read and hear that. To me, this year’s I/O announcements were huge. I haven’t been as excited about the future of Google in a long, long time. Google’s future looks a lot brighter today than it did yesterday.
Let’s dig into why.
As a quick overview and refresher, here are some of the key announcements (with links to write-ups where available).
Let’s start with some announcements of a more general nature around market penetration and areas of focus:
Google has some new VR and AR products and collaborations in the making, both through their Daydream platform and stand-alone headsets
Impressive, but not super exciting; let’s move on to where the meat is: artificial intelligence (AI), and more specifically machine learning (ML). A year ago, Google announced it would become an AI-first company. And they’re certainly making good on that promise:
Google Lens super-charges your phone camera with machine learning to recognize what you’re pointing the camera at and give you context and contextual actions (Wired)
Google turns Google Photos up to 11 through machine learning (via Lens), including not just facial recognition but also smart sharing.
Copy & paste gets much smarter through machine learning
Google Home can differentiate several users (through machine learning?)
Google Assistant’s SDK allows other companies and developers to include Assistant in their products (and not just in English, either)
Cloud TPU is the new hardware that Google launches for machine learning (Wired)
Google uses neural nets to design better neural nets
This is incredible. Every aspect of Google, both backend and frontend, is impacted by machine learning. Including their design of neural networks, which are improved by neural networks!
So what we see there are some incremental (if, in my book, significant) updates in consumer-facing products. This is mostly feature level:
Better copy & paste
Better facial recognition in photos (ZDNet notes that the “error rate of computer vision algorithms is now better than the human error rate”)
Smarter photo sharing (“share all photos of our daughter with my partner automatically”)
Live translation and contextual actions based on photos (like pointing the camera at a wifi router to read the login credentials and log you into the router automatically).
Google Home can tell different users apart.
As features, these are nice-to-haves, not must-haves. However, they’re powered by AI. That changes everything. This is large-scale deployment of machine learning in consumer products. And not just consumer products.
Google’s AI-powered offerings also power other businesses now:
Assistant can be included in third-party products, much like Amazon’s Alexa. This increases reach, and also grows the datasets available to train the machine learning algorithms further.
The new Cloud TPU chips, combined with Google’s cloud-based machine learning framework around TensorFlow, mean that they’re now in the business of providing machine learning infrastructure: AI-as-a-Service (AIaaS).
It’s this last point that I find extremely exciting. Google just won the next 10 years.
The market for AI infrastructure—for AI-as-a-Service—is going to be mostly Google and Amazon (which already has a tremendous machine learning offering). The other players in that field (IBM, and maybe Microsoft at some point?) aren’t even in the same ballpark. There may be some newcomers, but it doesn’t look like any of the other big tech companies will be huge players in that field.
As of today, Google sells AI infrastructure. This is a mode that we know from Amazon (where it has been working brilliantly), but so far hadn’t really known from Google.
There haven’t been many groundbreaking consumer-facing announcements at I/O. However, the future has never looked brighter for Google. Machine learning just became a lot more real and concrete. This is going to be exciting to watch.
At the same time, now is the best time to think about the societal implications, risks, and opportunities inherent in machine learning at scale: we’re on it. In my own work, as well as in our community over at ThingsCon, we’ve been tracking and discussing these issues in the context of the Internet of Things for a long time. I see AI and machine learning as a logical continuation of that same debate. So in all my roles I’ll continue to advocate for a responsible, ethical, human-centric approach to emerging technologies.
Full disclosure: I’ve worked many times with different parts of Google, most recently with the Mountain View policy team. I do not, at the time of this writing, have a business relationship with Google. (I am, however, a heavy user of Google products.) Nothing I write here is based on any kind of information that isn’t publicly available.
TL;DR: Machine learning and artificial intelligence (AI) are beginning to govern ever-greater parts of our lives. If we want to trust their analyses and recommendations, it’s crucial that we understand how they reach their conclusions, how they work, which biases are at play. Alas, that’s pretty tricky. This article explores why.
As machine learning and AI gain importance and manifest in many ways large and small wherever we look, we face some hard questions: Do we understand how algorithms make decisions? Do we trust them? How do we want to deploy them? Do we trust the output, or focus on process?
Please note that this post explores some of these questions, connecting dots from a wide range of recent articles. Some are quoted heavily (like Will Knight’s, Jeff Bezos’s, Dan Hon’s) and linked multiple times over for easier source verification rather than going with endnotes. The post is quite exploratory in that I’m essentially thinking out loud, and asking more questions than I have answers to: tread gently.
In his very good and very interesting 2017 shareholder letter, Jeff Bezos makes a point about not over-valuing process: “The process is not the thing. It’s always worth asking, do we own the process or does the process own us?” This, of course, he writes in the context of management: His point is about optimizing for innovation. About not blindly trusting process over human judgement. About not mistaking existing processes for unbreakable rules that are worth following at any price and to be followed unquestioned.
Bezos also briefly touches on machine learning and AI. He notes that Amazon is both an avid user of machine learning as well as building extensive infrastructure for machine learning—and Amazon being Amazon, making it available to third parties as a cloud-based service. The core point is this (emphasis mine): “Over the past decades computers have broadly automated tasks that programmers could describe with clear rules and algorithms. Modern machine learning techniques now allow us to do the same for tasks where describing the precise rules is much harder.”
Algorithms as a black box: Hard to tell what’s going on inside (Image: ThinkGeek)
That’s right: With machine learning, we can learn to get desirable results but without necessarily knowing how to describe the rules that get us there. It’s pure output. No—or hardly any—process in the sense that we can interrogate or clearly understand it. Maybe not even instruct it, exactly.
Let’s keep this at the back of our minds now, we’ll come back to it later. Exhibit A.
Dan Hon makes a similar observation:

“Machine learning techniques – most recently and commonly, neural networks – are getting pretty unreasonably good at achieving outcomes opaquely. In that: we really wouldn’t know where to start in terms of prescribing and describing the precise rules that would allow you to distinguish a cat from a dog. But it turns out that neural networks are unreasonably effective (…) at doing these kinds of things. (…) We’re at the stage where we can throw a bunch of images [of cats] at a network and also throw a bunch of images of cars at a network and then magic happens and we suddenly get a thing that can recognize cars.”
Dan goes on to speculate:
“If my intuition’s right, this means that the promise of machine learning is something like this: for any process you can think of where there are a bunch of rules and humans make decisions, substitute a machine learning API. (…) machine learning doesn’t necessarily threaten jobs like “write a contract between two parties that accomplishes x, y and z” but instead threatens jobs where management people make decisions.”
“Neural networks work the other way around: we tell them the outcome and then they say, “forget about the process!”. There doesn’t need to be one. The process is inside the network, encoded in the weights of connections between neurons. It’s a unit that can be cloned, repeated and so on that just does the job of “should this insurance claim be approved”. If we don’t have to worry about process anymore, then that lets us concentrate on the outcome. Does this mean that the promise of machine learning is that, with sufficient data, all we have to do is tell it what outcome we want?”
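To make Dan’s point concrete, here is a minimal sketch (the task and all numbers are made up for illustration): a single perceptron learns the logical AND rule purely from examples. Nobody writes an “if both inputs are 1” rule anywhere; after training, that rule exists only implicitly, encoded in three floating-point weights.

```python
# A toy illustration of "the process is inside the network":
# a single perceptron learns AND from examples alone. The learned
# rule lives entirely in three numbers (w0, w1, b).

def train_perceptron(examples, epochs=20, lr=0.1):
    w0, w1, b = 0.0, 0.0, 0.0  # weights start at zero
    for _ in range(epochs):
        for (x0, x1), target in examples:
            pred = 1 if w0 * x0 + w1 * x1 + b > 0 else 0
            err = target - pred
            # Nudge the weights toward the desired outcome
            w0 += lr * err * x0
            w1 += lr * err * x1
            b += lr * err
    return w0, w1, b

# Examples of the desired outcome -- no rules, just input/output pairs
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w0, w1, b = train_perceptron(examples)

def predict(x0, x1):
    return 1 if w0 * x0 + w1 * x1 + b > 0 else 0
```

The interesting part: inspecting w0, w1, and b tells you little about why the model behaves as it does, and this toy has only three parameters, where production networks have millions.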
The AI, our benevolent dictator
Now if we answered Dan’s question with YES, this is where things get tricky, isn’t it? It puts us on a potentially pretty slippery slope.
In political science, a classic question is what the best form of government looks like. While a discussion about what “best” means—freedom? wealth? health? agency? for all or for most? what are the tradeoffs?—is fully legitimate and should be revisited every so often, it boils down to this long-standing conflict:
Can a benevolent dictator, unfettered by external restraints, provide a better life for their subjects?
Does the protection of rights, freedom and agency offered by democracy outweigh the often slow and messy decision-making processes it requires?
Spoiler alert: Generally speaking, democracy won this debate a long time ago.
(Of course there are regions where societies have held on to the benevolent dictatorship model; and the recent rise of the populist right demonstrates that populations around the globe can be attracted to this line of argument.)
The reason democracy—a form of government defined by process!—has surpassed dictatorships both benevolent and malicious is that, overall, humans strive to have agency and to express it freely, rather than be governed by an unfettered, unrestricted ruler of any sort.
Every country that chooses democracy over a dictator sacrifices efficiency for process: A process that can be interrogated, understood, adapted. Because, simply stated, a process understood is a process preferred. Being able to understand something gives us power to shape it, to make it work for us: This is true both on the individual and the societal level.
Messy transparency and agency trumps blackbox efficiency.
Let’s keep that in mind, too. Exhibit B.
Who makes the AI?
Andrew Ng, who was heavily involved in Baidu’s (and before that, Google’s) AI efforts, emphasizes the potential impact of AI to transform society: “Just as electricity transformed many industries roughly 100 years ago, AI will also now change nearly every major industry — healthcare, transportation, entertainment, manufacturing — enriching the lives of countless people.”
“I want all of us to have self-driving cars; conversational computers that we can talk to naturally; and healthcare robots that understand what ails us. The industrial revolution freed humanity from much repetitive physical drudgery; I now want AI to free humanity from repetitive mental drudgery, such as driving in traffic. This work cannot be done by any single company — it will be done by the global AI community of researchers and engineers.”
While I share Ng’s assessment of AI’s potential impact, I have to be honest: his raw enthusiasm for AI sounds a little scary to me. Free humanity from mental drudgery? Not to wax overly nostalgic, but mental drudgery—even boredom!—has proven quite important for humankind’s evolution and has played a major role in its achievements. Plus, the idea that engineers are the driving force seems risky at the very least: it’s a pure form of stereotypical Silicon Valley think, almost a cliché. I’m willing to give him the benefit of the doubt and assume that by “researchers” he also meant to include anthropologists, philosophers, political scientists, and all the other valuable perspectives from the social sciences, humanities, and related fields.
Don’t leave something as important as AI to a bunch of tech bros (Image: Giphy)
Something as transformative as this should not, in the 21st century, be driven by a tiny group of people with very homogenous backgrounds. Diversity is key, in professional backgrounds and ways of thinking as much as in gender, ethnic, regional and cultural backgrounds. Otherwise, algorithms are bound to encode and help enforce unhealthy policies.
Engineering-driven, tech-deterministic, non-diverse, expansionist thinking delivers sub-optimal results. File under exhibit C.
Bezos writes about the importance of making decisions fast, which often requires making them with incomplete information: “most decisions should probably be made with somewhere around 70% of the information you wish you had. If you wait for 90%, in most cases, you’re probably being slow. Plus, either way, you need to be good at quickly recognizing and correcting bad decisions. If you’re good at course correcting, being wrong may be less costly than you think, whereas being slow is going to be expensive for sure.”
This, again, he writes in the context of management—presumably by and through humans. How will algorithmic decision-making fit into this picture? Will we want our algorithms to start deciding—or issuing recommendations—based on 100 percent of information? 90? 70? Maybe there’s an algorithm that figures out through machine learning how much information is just enough to be good enough?
Who is responsible for making algorithmically-made decisions? Who bears the responsibility for enforcing them?
If algorithmic load optimization (read: overbooking) tells airline staff to remove a passenger from a plane and it ends in a dehumanizing debacle, whose fault is that?
Teacher of Algorithm by Simone Rebaudengo and Daniel Prost
More Dan Hon! Dan takes this to its logical conclusion (s4e11): “We’ve outsourced deciding things, and computers – through their ability to diligently enact policy, rules and procedures (surprise! algorithms!) give us a get out of jail free card that we’re all too happy to employ.” It is, by extension, a jacked-up version of “it’s policy, it’s always been our policy, nothing I can do about it.” Which is, of course, the oldest and laziest cop-out there ever was.
He continues: “Algorithms make decisions and we implement them in software. The easy way out is to design them in such a way as to remove the human from the loop. A perfect system. But, there is no such thing. The universe is complicated, and Things Happen. While software can deal with that (…) we can take a step back and say: that is not the outcome we want. It is not the outcome that conscious beings that experience suffering deserve. We can do better.”
I wholeheartedly agree.
To get back to the airline example: In this case I’d argue the algorithm was not at fault. What was at fault is that corporate policy said this procedure has priority, and this was backed up by an organizational culture that made it seem acceptable (or even required) for staff to have police drag a paying passenger off a plane with a bloody lip.
Algorithms blindly followed, backed up by corporate policies and an unhealthy organizational culture: Exhibit D.
In the realm of computer vision, there have been a lot of advances through (and for) machine learning lately. Generative adversarial networks (GANs), in which one network tries to fool another, seem particularly promising. I won’t pretend to understand the math behind GANs, but Quora has us covered:
“Imagine an aspiring painter who wants to do art forgery (G), and someone who wants to earn his living by judging paintings (D). You start by showing D some examples of work by Picasso. Then G produces paintings in an attempt to fool D every time, making him believe they are Picasso originals. Sometimes it succeeds; however as D starts learning more about Picasso style (looking at more examples), G has a harder time fooling D, so he has to do better. As this process continues, not only D gets really good in telling apart what is Picasso and what is not, but also G gets really good at forging Picasso paintings. This is the idea behind GANs.”
So we’ve got two algorithmic networks sparring with one another, and both of them learn a lot, fast.
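For readers who want a feel for that dynamic, here is a deliberately crude numeric caricature of the forger/judge loop (my own construction, not a real GAN: no neural networks, no gradients, and the “paintings” are just numbers). A forger G produces a value, a judge D draws a boundary between genuine and forged work, and each adapts to the other in turn:

```python
# A heavily simplified caricature of the adversarial training loop.
# The "real Picasso" is just the number 4.0; the forger produces a
# number, and the judge places a decision boundary between real and fake.

real_value = 4.0   # stand-in for genuine paintings
g = 0.0            # forger's current output, far from the real thing
threshold = 0.0    # judge's decision boundary

for step in range(100):
    # Judge adapts: the best simple boundary sits midway between
    # the genuine work and the current forgeries.
    threshold = 0.5 * (real_value + g)
    # Forger adapts: move output toward the boundary it must cross
    # to be judged "real".
    g += 0.2 * (threshold - g)

# After many rounds, forgeries sit almost exactly on the real value,
# and the judge's boundary has tightened right up against it.
```

The point of the toy is only the feedback loop: as the judge gets stricter, the forger gets better, and vice versa, which is exactly the dynamic the Quora analogy describes.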
Impressive, if maybe not lifesaving results include so-called style transfer. You’ve probably seen it online: This is when you can upload a photo and it’s rendered in the style of a famous painter:
Collection Style Transfer refers to transferring images into artistic styles. Here: Monet, Van Gogh, Ukiyo-e, and Cezanne. (Image source: Jun-Yan Zhu)
Maybe more intuitively impressive, this type of machine learning can also be applied to changing parts of images, or even videos:
This is the kind of algorithmic voodoo that powers things like Snapchat “world lenses” and Facebook’s recent VR announcements (“Act 2”).
Wait, how did we get here? Oh yes, output v process!
Machine learning requires new skills (for creators and users alike)
What about skill sets required to work with machine learning, and to make machines learn in interesting, promising ways?
Google has been remaking itself as a machine learning first company. As Christine Robson, who works on Google’s internal machine learning efforts puts it: “It feels like a living, breathing thing. It’s a different kind of engineering.”
Technology Review features a stunning article absolutely worth reading in full: In The Dark Secret at the Heart of AI, author Will Knight interviews MIT professor Tommi Jaakkola who says:
“Deep learning, the most common of these approaches, represents a fundamentally different way to program computers. ‘It is a problem that is already relevant, and it’s going to be much more relevant in the future. (…) Whether it’s an investment decision, a medical decision, or maybe a military decision, you don’t want to just rely on a ‘black box’ method.’”
And machine learning doesn’t just require different engineering. It requires a different kind of design, too. From Machine Learning for Designers (ebook, free O’Reilly account required): “These technologies will give rise to new design challenges and require new ways of thinking about the design of user interfaces and interactions.”
Machine learning means that algorithms learn from—and increasingly will adapt to—their own performance, user behaviors, and external factors. Processes (however opaque) will change, as will outputs. Quite likely, the interface and experience will also adapt over time. There is no end state, only constant evolution.
Technologist & researcher Greg Borenstein argues that “while AI systems have made rapid progress, they are nowhere near being able to autonomously solve any substantive human problem. What they have become is powerful tools that could lead to radically better technology if, and only if, we successfully harness them for human use.”
Borenstein concludes: “What’s needed for AI’s wide adoption is an understanding of how to build interfaces that put the power of these systems in the hands of their human users.”
Computers beating the best human chess and Go players have given us centaur chess, in which humans and computers play side by side in a team: while computers beat humans at chess, these hybrid teams of humans and computers playing in tandem beat computers hands-down.
In centaur chess, software provides analysis and recommendations, a human expert makes the final call. (I’d be interested in seeing the reverse being tested, too: What if human experts gave recommendations for the algorithms to consider?)
How does this work? Why is it doing this?
Now, all of this isn’t particularly well understood today. Or more concretely, the algorithms hatched that way aren’t understood, and hence their decisions and recommendations can’t be interrogated easily.
Will Knight shares the story of a self-driving experimental vehicle that was “unlike anything demonstrated by Google, Tesla, or General Motors, and it showed the rising power of artificial intelligence. The car didn’t follow a single instruction provided by an engineer or programmer. Instead, it relied entirely on an algorithm that had taught itself to drive by watching a human do it.”
What makes this really interesting is that it’s not entirely clear how the algorithms learned:
“The system is so complicated that even the engineers who designed it may struggle to isolate the reason for any single action. And you can’t ask it: there is no obvious way to design such a system so that it could always explain why it did what it did (…) It isn’t completely clear how the car makes its decisions.”
Knight stresses just how novel this is: “We’ve never before built machines that operate in ways their creators don’t understand.”
We know that it’s possible to attack machine learning with adversarial examples: inputs intentionally designed to cause a model to make a mistake (or, when slipped into training data, to train the algorithm incorrectly). Even without a malicious attack, algorithms simply don’t always get the full—or right—picture: “Google researchers noted that when its [Deep Dream] algorithm generated images of a dumbbell, it also generated a human arm holding it. The machine had concluded that an arm was part of the thing.”
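A toy sketch shows how little it can take (entirely made-up numbers and a linear stand-in model, nothing like a production classifier): given a model’s weights, an attacker nudges each input feature slightly in the direction that hurts the score most, flipping the decision while changing the input only a little. This is the intuition behind the “fast gradient sign” family of attacks:

```python
# Toy linear "cat vs dog" classifier: score = w . x, positive means "cat".
# The adversarial input steps each feature against the sign of the
# corresponding weight, which is the direction that lowers the score fastest.

w = [0.5, -1.0, 0.8]   # fixed, pre-trained weights (illustrative values)
x = [1.0, 0.2, 0.5]    # an input the model confidently labels "cat"

def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def adversarial(x, w, eps):
    # Shift each feature by eps against the weight's sign.
    return [xi - eps * (1 if wi > 0 else -1) for xi, wi in zip(x, w)]

x_adv = adversarial(x, w, eps=0.4)
# score(w, x) is positive ("cat"); score(w, x_adv) is negative ("dog"),
# even though each feature moved by only 0.4.
```

The fragility illustrated here is the same one the Deep Dream dumbbell story hints at: the model never learned the concept we thought it did, only a scoring surface that can be walked across.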
This—and this type of failure mode—seems relevant. We need to understand how algorithms work in order to adapt, improve, and eventually trust them.
Consider, for example, two areas where algorithmic decision-making could directly decide over life and death: the military and medicine. Speaking of military use cases, David Gunning of DARPA’s Explainable Artificial Intelligence program explains: “It’s often the nature of these machine-learning systems that they produce a lot of false alarms, so an intel analyst really needs extra help to understand why a recommendation was made.” Life or death might literally depend on it. What’s more, if a human operator doesn’t fully trust the AI output, that output is rendered useless.
Should we have a legal right to interrogate AI decision making? Again, Knight in Technology Review: “Starting in the summer of 2018, the European Union may require that companies be able to give users an explanation for decisions that automated systems reach. This might be impossible, even for systems that seem relatively simple on the surface, such as the apps and websites that use deep learning to serve ads or recommend songs. The computers that run those services have programmed themselves, and they have done it in ways we cannot understand. Even the engineers who build these apps cannot fully explain their behavior.”
It seems likely that this could currently not even be enforced, that the creators of these algorithmic decision-making systems might not even be able to find out what exactly is going on.
There have been numerous attempts to explore this, usually through visualizations. This works, to a degree, for machine learning and related fields. However, machine learning is often used to crunch multi-dimensional data sets, and we simply have no great way of visualizing those in a way that makes them easy to analyze (yet).
This is worrisome to say the least.
But let me play devil’s advocate for a moment: what if the outcomes really are that good, so much better than human-powered analysis or decision-making? Might it be simply irresponsible not to use them? Knight gives the example of a program at Mount Sinai Hospital in New York called Deep Patient that was “just way better” at predicting certain diseases from patient records.
If this prediction algorithm has a solid track record of successful analysis, but neither developers nor doctors understand how it reaches its conclusions, is it responsible to prescribe medication based on its recommendation? Would it be responsible not to?
Philosopher Daniel Dennett, who studies consciousness, takes it a step further: an explanation by an algorithm might not be good enough. Humans aren’t great at explaining themselves, so if an AI “can’t do better than us at explaining what it’s doing, then don’t trust it.”
It follows that an AI would need to provide a much better explanation than a human in order for it to be trustworthy. Exhibit E.
Now where does that leave us?
Let’s assume that the impact of machine learning, algorithmic decision-making, and AI will keep increasing. A lot. Then we need to understand how algorithms work in order to adapt, improve, and eventually trust them.
Machine learning allows us to get desirable results without necessarily knowing how they come about (exhibit A).

It’s essential for a society to be able to understand and shape its governance, and to have agency in doing so. In AI just as in governance: transparent messiness is more desirable than opaque efficiency. Black boxes simply won’t do; we cannot have black boxes govern our lives (exhibit B).

Something as transformative as this should not, in the 21st century, be driven by a tiny group of people with very homogenous backgrounds. Diversity is key, in professional backgrounds and ways of thinking as much as in gender, ethnic, regional, and cultural backgrounds; otherwise, algorithms are bound to encode and help enforce unhealthy policies. Engineering-driven, tech-deterministic, non-diverse, expansionist thinking delivers sub-optimal results (exhibit C).

Algorithms blindly followed, backed up by corporate policies and an unhealthy organizational culture, are bound to deliver horrible results (exhibit D).

Hence we need to be able to interrogate algorithmic decision-making. And if in doubt, an AI should provide a much better explanation than a human would for it to be trustworthy (exhibit E).
Machine learning and AI hold great potential to improve our lives. Let’s embrace it, but deliberately and cautiously. And let’s not hand over the keys to software that’s too complex for us to interrogate, understand, and hence shape to serve us. We must apply the same kinds of checks and balances to tech-based governance as to human or legal forms of governance—accountability & democratic oversight and all.