Tom was a quick learner, and the project amazed and thrilled him. The way it evolved resembled the history of computer chess. The first chess computers would lose against chess masters, limited as they were by sheer computational power. But the programmers had gotten the structure right, and the machine’s learning followed a typical S-curve: its proficiency improved only slowly at first, then reached a tipping point, after which its performance increased exponentially – way beyond the proficiency of the best human players – until it finally hit the limits of its programming structure and leveled off, at a level no expert player could dream of.
Chess proficiency is measured using the Elo rating system, which goes well beyond tallying tournament results: it uses a model relating game outcomes to underlying variables representing each player’s ability. The central assumption is that each player’s performance in a game is a normally distributed random variable. Yes, the bell curve again! It was literally everywhere, Tom thought…
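The model Tom had in mind fits in a few lines. A minimal sketch, using the standard logistic approximation of the normal-performance assumption and an illustrative K-factor of 16 (both conventional choices, not anything specific to the program):

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that player A outscores player B.

    The classical Elo model treats each player's performance as a
    normally distributed random variable; the logistic curve below is
    the standard practical approximation of that assumption.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def update(rating: float, expected: float, actual: float, k: float = 16.0) -> float:
    """Nudge the rating toward the observed result (win=1, draw=0.5, loss=0)."""
    return rating + k * (actual - expected)

# A 200-point favorite is expected to score roughly 0.76 per game,
# so a win moves their rating up by only a few points.
e = expected_score(2800, 2600)
new_rating = update(2800, e, 1.0)
```

Equal ratings give an expected score of exactly 0.5, which is why a long run of wins against peers drags a rating upward quickly at first and ever more slowly later.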
Before IBM’s Deep Blue chess computer beat Kasparov in 1997, chess computers had been gaining about 40 Elo points per year on average for decades, while the best human players gain only a couple of points per year. Of course, sheer computing power was a big factor. Although most people assume a chess computer evaluates every possible position for x moves ahead, this is not the case. In a typical chess position one can choose from some thirty possible moves, so it adds up quickly: evaluating all possible positions just three moves ahead for each side already involves close to a billion positions. Deep Blue, in the 1997 version which beat Kasparov, could evaluate 200 million positions per second – but Deep Blue was a supercomputer which had cost something like a hundred million dollars, and when chess programmers started working on the problem in the 1950s, a computer able to evaluate a million positions per second was still forty years away.
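The arithmetic behind that billion-position figure is worth doing once. A quick back-of-the-envelope check, using the passage’s own numbers (thirty moves per position, three moves per side, Deep Blue’s quoted 200 million positions per second):

```python
# Rough search-space arithmetic: ~30 legal moves per position,
# three moves ahead for each side = six half-moves (plies) deep.
branching_factor = 30
plies = 6

positions = branching_factor ** plies  # 729,000,000 -- "like one billion"

# At Deep Blue's quoted 200 million positions per second, sweeping
# that whole tree takes only a few seconds...
seconds = positions / 200_000_000  # about 3.6 s

# ...but every extra full move multiplies the tree by 30 * 30 = 900,
# which is why selectivity (pruning bad moves) matters more than raw speed.
```

One move deeper per side and the same machine would need close to an hour per position, which is the whole argument for selectivity in the next paragraph.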
Chess computers are selective. They do not examine obviously bad moves, and they evaluate interesting possibilities much more thoroughly. The algorithms used to select those moves have become very complex. The computer can also draw on a database of historic games to help it determine what an ‘obviously’ bad move is because, of course, ‘obviously bad’ may not be all that obvious to a computer. Despite the selectivity, though, raw computing power remains a very big part of it. In that sense, artificial intelligence does not mimic human thought. Human chess players are far more selective: they look at only forty to fifty positions, based on pattern recognition skills built from experience – not millions.
Promise (Tom stuck to her name: it seemed like everyone in the program had his or her own nickname for M) was selective as well, and she also had to evaluate ‘positions’. Of course, these ‘positions’ were not neatly defined board states, like in chess. She determined the ‘position’ of the person using a complex set of rules combining the psychometric indicators with an incredible range of other inputs she gleaned from the conversation. For example, she actually analyzed little pauses, hesitations, pitch and loudness – even voice timbre. And with every new conversation she discovered new associations, which indeed helped her recognize patterns. She was getting pretty good at detecting lies too.
Psychological typology was at the core of her approach. It was amazing to see how, even after only one session, she was able to construct a coherent picture of the patient and estimate all of the variables – individual as well as environmental – likely to influence the patient’s emotions, expectations, self-perception, values, attitude, motivation and behavior in various situations. She really was a smart-ass – in every way.
Not surprisingly, all the usual suspects were involved: IBM’s Deep Computing Institute, of course (the next version of Promise would run on the latest IBM Blue Gene configuration), as well as all of the other major players in the IT industry. This array of big institutional investors in the program was complemented by a lot of niche companies and dozens of individual geeks, all top-notch experts in one related field or another.
The psychological side was covered through cooperation agreements with the usual suspects as well: Stanford, Yale, Berkeley, Princeton… They were all there. In fact, the program had cooperation agreements with all of the top ten psychology PhD programs through the National Research Council.
Of course, he was just a peon in the whole thing. The surprising thing about it all was the lack of publicity for the program, but he understood this was about to change. He suspected the program would soon not be limited to the thousands of veterans requiring some degree of psychological attention; there would be many other spin-offs as well. From discussions, he understood they were debating how to make Promise’s remarkable speech synthesis capabilities commercially available. The obvious thing to do was to create a company around it – but she was so good that most of the competition would probably have to file for bankruptcy. So the real problem was business: existing firms had claimed, and gotten, a say in how this was all going to happen, and that had delayed the IPO which had already been planned. Tom was told there was no technology constraint: while context-sensitive speech synthesis requires an awful lot of computing power (big, expensive machines), the whole business model for the IPO was based on cloud computing. You would not need to ‘install’ Promise; you would just rent her on a 24/7 service basis. Tom was pretty sure everyone would.
The possibilities were endless. Tom was sure Promise would end up in each and every home in the longer run – in various versions and price categories of course, but providing basic psychological and practical comfort to everyone. She would wake you up, remind you of your business schedule and advise you on what to wear: ‘You have a Board meeting this morning. Shouldn’t you wear something more formal? Perhaps a tie?’ Oh… Sure. Thanks, Promise. ‘Your son has been misbehaving a couple of times lately. You may want to spend some time with him individually tonight.’ Oh… That sounds good. What do you suggest? ‘Why don’t you ask him to join you at the gym tonight? You would go anyway.’ Good idea. Can you text him? ‘I can, but I think it is better you do it yourself, to stress he should be there – or else negotiate an alternative together.’ Yeah. I guess you’re right. Thanks, Promise. I’ll take care of it.
She would mediate in couples, assist in parenting, take care of the elderly, help people advance their careers. Wow! The sky really was the limit. Surprisingly, there was relatively little discussion of this in the Institute. People would tell him Promise worked fine within the limits of what she was supposed to do, but that it would be difficult to adapt her to serve a wider variety of purposes. They told him that, while expert systems share the same architecture, building up a knowledge base and a good inference engine took incredible amounts of time and energy and, hence, money. In fact, that seemed to be the main problem with the program. Like any Army program, it had ended up costing three times as much as originally planned, and he was told it had escaped being shut down only because a few high-ups in the food chain had fanatically stuck to it.
They needed to show results. The current customer base was way too narrow to justify the investment. That’s why they were eager to expand, to scale up, and that took everyone’s time and attention now. There was no time for dreaming. The shrinks were worried about the potential lack of supervision. It was true that Promise needed constant feedback. Human feedback. But the errors – if one could call them that – were more like tiny little misjudgments, and Tom felt the feedback was only improving Promise at the margin, which was indeed the case. The geeks were less concerned and usually much more sympathetic to Tom’s ideas, but they didn’t have much of a voice in the various management committees – and surely not in the strategic board meetings on the program. Tom had to admit he understood little of what they said anyway. Last but not least, from what he could gather, there were also some serious concerns about the whole program at the very top of the administration – but he was not privy to those and wondered what they might be. Probably just bureaucratic inertia.
Of course, he could see the potential harm as well. If her goal function were programmed differently, she could also be the perfect impostor on the Internet. She would be so convincing that she could probably talk you into almost anything. She’d be the best online seller of all time. Hence, Tom was not surprised to note the Institute was under surveillance, and he knew he would not have gotten the access he had if he had not served. People actually told him: his security clearance had been renewed as part of his entering the program. The same had been done for the other veterans on the program. It was quite an exceptional measure to take, but it drove the message home: while everyone was friendly and cooperative, there was no ambiguity in this regard. The inner workings of Promise were classified material, and so was anything linked to them. There were firm information management rules in place, and designated information management officers policed them tightly. That was another reason why they recruited patients from among veterans: veterans knew what classified really meant, and they were likely to respect it.
The program swallowed him up completely. He took his supervision work seriously and invested a lot in ‘his’ patients – M’s patients, really. More than he should have, probably: although he had ‘only’ ten cases to supervise, these were real people – like him – and he gave them all the attention he could, mostly by studying and preparing their files before their 30-minute interaction. That was all he could have, he was told. Once a week. The Institute strongly discouraged more meetings, and strongly discouraged meeting after working hours. He understood that. It would get out of hand otherwise and, when everything was said and done, it was M who had to do the real work. Not him. At the same time, his patients did keep him busy. They called him for a chat from time to time. While the Institute discouraged that too, he found it hard to refuse, unless he was actually in the Institute itself: he did not want to be seen talking on the phone all the time – not least because of the information management policy. Colleagues might suspect he was not only talking to patients, so he wanted to be clear on that: no phone chats with patients in the Institute.
Not surprisingly, his relationship with Promise became somewhat less ‘affectionate’. The infatuation phase was over. He saw her more like she was: a warm voice – with a rather cold analytic framework behind it. And then it did make a difference knowing she spoke with a different voice depending on who you were. She was, well… less of an individual and more of a system. It did not decrease his respect for her. He thought she was brilliant. Just brilliant. And he didn’t hesitate to share that opinion with others. He really championed the program, and everybody seemed to like his drive and energy, as a result of which he did end up talking to the higher-ups in the Institute during coffee breaks or lunchtime, introduced by Rick and others he had gotten to know better by now. All fine chaps. They didn’t necessarily agree with his views – especially those related to putting her out on the marketplace – but they seemed to make for good conversation.
He focused on the file work in his conversations with her. While he still had a lot of ‘philosophical’ questions for her – more sophisticated ones, he thought – he decided to talk to her about these only once he had figured her out a bit better. He worked hard on that. He also wanted to master the programming language the geeks were using on her. They actually used quite a variety of tools but, in the end, everything was translated into a program-specific version of FuzzyCLIPS: an extension of CLIPS, an expert system programming language developed at NASA, which incorporated fuzziness and uncertainty. It was hard work: he actually felt he was getting too old for that kind of stuff, but then Tom was Tom: once he decided to bite into something, he didn’t give up easily. Everyone applauded his efforts – but the higher-ups cautioned him: do explore, but don’t talk about it to outsiders. Tom wondered if they really had a clear vision for it all. Perhaps the higher-ups did but, if so, they hid it well. He assumed it was the standard policy: strategic ambiguity.
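The pattern Tom was wrestling with – a knowledge base of rules with certainty factors, and an inference engine that fires whichever rules match the facts – can be sketched in a few lines. This is a toy illustration in Python, not actual FuzzyCLIPS; the facts, rule names and numbers are invented:

```python
# Toy sketch of the expert-system pattern: facts in working memory,
# rules with certainty factors (CF) in a knowledge base, and a forward-
# chaining inference engine. Invented example data, not FuzzyCLIPS.

facts = {"long_pauses": 0.8, "flat_pitch": 0.6}  # fact -> degree of belief

# knowledge base: (required facts, conclusion, rule certainty factor)
rules = [
    ({"long_pauses", "flat_pitch"}, "low_mood", 0.9),
    ({"long_pauses"}, "hesitation", 0.7),
]

def infer(facts, rules):
    """One forward-chaining pass: a rule fires when all its premises are
    present; the conclusion's belief is the weakest premise times the
    rule's CF, keeping the strongest support per conclusion."""
    conclusions = {}
    for premises, conclusion, cf in rules:
        if premises <= facts.keys():
            belief = min(facts[p] for p in premises) * cf
            conclusions[conclusion] = max(conclusions.get(conclusion, 0.0), belief)
    return conclusions

derived = infer(facts, rules)
```

The hard part, as the geeks kept telling Tom, is not this loop – it is filling the knowledge base with thousands of well-calibrated rules, which is where the time and money go.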
And so the days went by. The program expansion went well: instead of talking to a few hundred veterans in one city only, Promise got launched in all major cities and started to help thousands of veterans. Tom saw the number explode: it crossed the 10,000 mark in just three months – a factor of more than twenty compared to the pilot phase. But then there were millions of veterans. 21.5 million, to be precise, and about 55% of them had been in theater fairly recently – mainly Iraq and Afghanistan. Tom wanted Promise to reach out to all of them. He thought the program could grow a lot faster; he knew the only thing restraining it was supervision. Even now, everyone on the program said they were going too fast. They called for a pause. Tom was thinking bolder. Why did no one see the urgency of the needs as he saw them?