Something began in the 1970s that has been described as "the AI winter", but to call it that is to miss the point, because the social illness it represents involves much more than artificial intelligence (AI). AI research was one of many casualties that came about as anti-intellectualism revived itself and society fell into a diseased state.
One might call the "AI winter" (which is still going on) an "interesting work winter", and it pertains to much more of technology than AI alone, because it represented a sea change in what it meant to be a programmer. Before the disaster, technology jobs had an R&D flavor, like academia but with better pay and less of the vicious politics. After the calamitous 1980s and the replacement of R&D by M&A, work in interesting fields (e.g. machine learning, information retrieval, language design) became scarce, and over 90% of software development became mindless, line-of-business makework. At some point, technologists stopped being autonomous researchers and started being business subordinates, and everything went to hell. What little interesting work remained was available only in geographic "super-hubs" (such as Silicon Valley) where housing prices are astronomical compared to the rest of the country. Due to the emasculation of technology research in the U.S., economic growth slowed to a crawl, and the focus of the nation's brightest minds turned to the creation of asset bubbles (seen in 1999, 2007, and 2014) rather than the generation of long-lasting value.
Why did this happen? Why did the entrenched public- and private-sector bureaucrats who run the world (with, even among them, the locus of power increasingly shifting to the private-sector bureaucrats, who can't be voted out of office) lose faith in the research being done by people much smarter, and much harder-working, than they are? The answer is simple. It's not even controversial. The end of the Cold War? Nah, it began before that. At fault is the lowly perceptron.
Interlude: a geometric puzzle
This is a simple geometry puzzle. Below are four points at the corners of a square, colored (and numbered) like so:
0 1
1 0
Is it possible to draw a line that separates the red points (the 0s) from the green points (the 1s)?
The answer is that it's not possible. Any separating line would have to put two of the points on each side. Now draw a circle passing through all four points. Any line can intersect that circle at no more than two points, so the two points on either side of a separating line would have to be adjacent on the circle, and adjacent corners of the square have opposing colors. So no separating line exists. Another way to say this is that the classes (colors) aren't linearly separable.
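If you prefer brute force to geometry, a throwaway check (my own sketch, not part of the original argument) is to sample a large number of candidate lines and confirm that none of them puts the two 0s on one side and the two 1s on the other:

    import random

    points = {(0, 0): 0, (1, 0): 1, (0, 1): 1, (1, 1): 0}   # corner -> color (0 = red, 1 = green)

    def separates(a, b, c):
        # Does the line a*x + b*y + c = 0 put all 1s strictly on one side
        # and all 0s strictly on the other?
        sides = {cls: {a * x + b * y + c > 0 for (x, y), k in points.items() if k == cls}
                 for cls in (0, 1)}
        return len(sides[0]) == 1 and len(sides[1]) == 1 and sides[0] != sides[1]

    random.seed(0)
    candidates = ((random.uniform(-5, 5), random.uniform(-5, 5), random.uniform(-5, 5))
                  for _ in range(100_000))
    print(any(separates(a, b, c) for a, b, c in candidates))   # False: no sampled line separates the colors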
What is a perceptron?
"Perceptron" is a fancy name given to a mathematical function with a simple description. Let w be a known "weight" vector (if that's an unfamiliar term, a list of numbers) and x be an input "data" vector of the same size, with the caveat that x[0] = 1 (a "bias" term) always. The perceptron, given w, is a virtual "machine" that computes, for any given input x, the following:
- 1, if w[0]*x[0] + ... + w[n]*x[n] > 0,
- 0, if w[0]*x[0] + ... + w[n]*x[n] < 0.
In machine learning terms, it's a linear classifier. If there's a linear function that cleanly separates the "Yes" class (the 1 values) from the "No" class (the 0 values), it can be expressed as a perceptron. There's an elegant algorithm that, in the linearly separable case, finds a working weight vector, and it always converges.
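Here is a minimal sketch of that learning rule in Python. It is my own illustration, not anyone's canonical implementation; the toy AND dataset and the epoch cap are assumptions made for the example.

    def predict(w, x):
        # x[0] is assumed to be 1, so w[0] acts as the bias term.
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

    def train_perceptron(data, n_features, max_epochs=100):
        # data: list of (x, label) pairs, with x[0] == 1 and label in {0, 1}
        w = [0.0] * n_features
        for _ in range(max_epochs):
            errors = 0
            for x, label in data:
                y = predict(w, x)
                if y != label:
                    errors += 1
                    # Classic perceptron update: nudge w toward the misclassified point.
                    w = [wi + (label - y) * xi for wi, xi in zip(w, x)]
            if errors == 0:       # every point classified correctly: converged
                return w
        return None               # gave up (e.g. the data are not linearly separable)

    # A linearly separable toy problem: AND of two binary inputs.
    and_data = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]
    print(train_perceptron(and_data, 3))   # converges to something like [-2.0, 2.0, 1.0]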
A mathematician might say, "What's so interesting about that? It's just a dot product being passed through a step function." That's true. Perceptrons are very simple. A single perceptron can solve more decision problems than one might initially think, but it can't solve all of them. It's too simple a model.
Limitations
Let's say that you want to model an XOR ("exclusive or") gate, corresponding to the following function:
| in_1 | in_2 | out |
+------+------+-----+
|  0   |  0   |  0  |
|  0   |  1   |  1  |
|  1   |  0   |  1  |
|  1   |  1   |  0  |
+------+------+-----+
One might recognize that this is identical to the "brainteaser" above, with in_1 and in_2 corresponding to the x- and y-dimensions in the coordinate plane. This is the same problem. This function is nonlinear; it could be expressed as f(x, y) = x + y - 2xy, and that's arguably the simplest representation of it that works. A separating "plane" in the 2-dimensional space of the inputs would be a line, and there's no line separating the two classes. It's mathematically obvious that the perceptron can't do it. I showed this, above, using high-school geometry.
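A quick sanity check of that representation (a throwaway snippet of my own, nothing more):

    # Verify that f(x, y) = x + y - 2xy reproduces the XOR truth table.
    for x in (0, 1):
        for y in (0, 1):
            f = x + y - 2 * x * y
            print(x, y, f, f == (x ^ y))   # the final column is always True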
To a mathematician, this isn't surprising. Marvin Minsky pointed out the mathematically evident limitations of a single perceptron. One can model intricate mathematical functions with more complex networks of perceptrons and perceptron-like units, called artificial neural networks. They work well. One can also, using what are called "basis expansions", generate further dimensions from existing data in order to create a higher-dimensional space in which linear classifiers still work. (That's what people usually do with support vector machines, which provide the machinery to do so efficiently.) For example, adding xy as a third "derived" input dimension would make the classes (0s and 1s) linearly separable, as the sketch below shows. There's nothing mathematically wrong with doing that; it's something that statisticians do when they want to build complex models but still keep some of the analytic properties of simpler ones, like linear regression or nearest-neighbor modeling.
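To make that concrete, here is a tiny illustration (the specific weights are hand-picked by me and are just one choice that happens to work): with the derived feature x*y appended, a fixed linear rule reproduces XOR, even though no such rule exists on the raw two inputs.

    # Features: [1 (bias), x, y, x*y]; weights chosen by hand for illustration.
    w = [-0.5, 1.0, 1.0, -2.0]

    def classify(x, y):
        features = [1, x, y, x * y]
        score = sum(wi * fi for wi, fi in zip(w, features))
        return 1 if score > 0 else 0

    for x in (0, 1):
        for y in (0, 1):
            print(x, y, classify(x, y))   # prints the XOR table: 0, 1, 1, 0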
The limitations of the single perceptron do not invalidate AI. At least, they don't if you're a smart person. Everyone in the AI community could see the geometrically obvious limitation of a single perceptron, and not one of them believed that it came close to invalidating their work. It only proved that more complex models were needed for some problems, which surprised no one. Single-perceptron models might still be useful for computational efficiency (in the 1960s, computational power was about a billion times as expensive as it is now) or because the data don't support a more complex model; they just couldn't learn or model every pattern.
In the AI community, there was no scandal or surprise. That some problems aren't linearly separable is not surprising. However, some nerd-hating non-scientists (especially in business upper management) took this finding to represent more than it actually did.
They fooled us! A brain with one neuron can't have general intelligence!
The problem is that the world is not run, and most of the wealth in it is not controlled, by intelligent people. It's run by social-climbing empty suits who are itching for a fight and would love to take some "eggheads" down a notch. Insofar as an artificial neural network models a brain, a perceptron models a single neuron, which can't be expected to "think" at all. Yet the fully admitted limitations of a single perceptron were taken, by the mouth-breathing muscleheads who run the world, as an excuse to shit on technology and pull research funding because "AI didn't deliver". That produced an academic job market that can only be described as a pogrom, but it didn't stop there. Private-sector funding dried up as short-term, short-tempered management came into vogue.
To make it clear, no one ever said that a single perceptron can solve every decision problem. It's a linear model. That means it's restricted, intentionally, to a small subspace of possible models. Why would people work with a restricted model? Traditionally, it was for lack of data. (We're talking about the 1960s and '70s, when data lived on physical punch cards, a megabyte weighed something, and a disk drive cost more than a car.) If you don't have a lot of data, you can't build complex models. For many decision problems, the humble perceptron (like its cousins, logistic regression and support vector machines) did well, and, unlike other, more computationally intensive linear classification methods (such as logistic regression, which requires gradient descent, or a variant thereof, over the log-likelihood surface; or the support vector machine, which poses a quadratic programming problem that we didn't know how to solve efficiently until the 1990s), it could be trained with minimal computational expense, in a bounded amount of time. Even today, linear models are surprisingly effective for a large number of problems. For example, the first spam classifiers (Naive Bayes) operated using a linear model, and it worked well. No one was claiming that a single perceptron was the pinnacle of AI. It was something that we could build cheaply on 1970-era hardware and that could produce a working model on many important datasets.
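To illustrate why a Naive Bayes spam filter counts as a linear model, here is a simplified sketch. The word probabilities are made up by me for the example; the point is that, in log space, the decision rule becomes a weighted sum of word features plus a bias, which is exactly the kind of function a perceptron thresholds.

    import math

    # Hypothetical per-word probabilities: P(word | spam) and P(word | ham).
    p_spam = {"viagra": 0.30, "meeting": 0.01, "free": 0.20}
    p_ham  = {"viagra": 0.001, "meeting": 0.10, "free": 0.05}
    prior_spam, prior_ham = 0.5, 0.5

    # Each word's weight is a log-likelihood ratio; the priors supply the bias.
    weights = {w: math.log(p_spam[w] / p_ham[w]) for w in p_spam}
    bias = math.log(prior_spam / prior_ham)

    def looks_like_spam(words):
        # A linear function of the word features, thresholded at zero.
        score = bias + sum(weights.get(w, 0.0) for w in words)
        return score > 0

    print(looks_like_spam(["free", "viagra"]))    # True with these made-up numbers
    print(looks_like_spam(["meeting", "free"]))   # False with these made-up numbers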
Winter war
Personally, I don't think that the AI Winter was an impersonal, passive event like the change of seasons. Rather, I think it was part of a deliberate resurgence of anti-intellectualism in a major cultural war, one which the smart people lost. The admitted limitations of one approach to automated decision-making gave the former high-school bullies, now corporate fat cats, all the ammo they needed in order to argue that those "eggheads" weren't as smart as they thought they were. None of them knew exactly what a perceptron or an "XOR gate" was, but the limitation that I've described was morphed into "neural networks can't solve general mathematical problems" (arguably untrue) and that turned into "AI will never deliver". In the mean-spirited and anti-liberal political climate of the 1980s, this was all the excuse anyone needed to cut public funding. The private sector not only followed suit, but amplified the trend. The public cuts were a mix of reasonable fiscal conservatism and mean-spirited anti-research sentiment, but the business elites responded strongly to (and took to a whole new level) the mean-spirited aspect, flexing their muscles as elitism (thought vanquished in the 1930s to '50s) became "sexy" again in the Reagan Era. Basic research, which gave far too much autonomy and power to "eggheads", was slashed, marginalized, and denigrated.
The claim that "AI didn't deliver" was never true. What actually happened is that we solved a number of problems, once thought to require human intelligence, with a variety of advanced statistical means as well as insights from fields like physics, linguistics, ecology, and economics. Solving problems demystified them. Automated mail sorting, once called "artificial intelligence", became optical character recognition. This, perhaps, was part of the problem. Successes in "AI" were quickly spun off into new disciplines. Even modern practitioners of statistical methods are quick to say that they do machine learning, not AI. What was actually happening is that, while we were solving specific computational problems once thought to require "intelligence", we found that our highly specialized solutions did well on the problems they were designed for, and could be adapted to similar problems, but made very slow progress toward general intelligence. As it happens, we've learned in recent decades that our brains are even more complicated than we thought, with a multitude of specialized modules. That no single statistical algorithm can replicate all of them, working together in real time, shouldn't surprise anyone. Is this an issue? Does it invalidate "AI" research? No, because most of those victories, while they fell short of replicating a human brain, still delivered immense economic value. Google, although it eventually succumbed to the sociological fragility and failure that inexorably follow closed allocation, began as an AI company. It's now worth over $360 billion.
Also mixed in with the anti-AI sentiment is the religious aspect. It's still an open and subjective question what human intelligence really is. The idea that human cognition could be replicated by a computer offended religious sentiments, even though few would consider automated mail sorting to bear on unanswerable questions about the soul. I'm not going to go deep into this philosophical rabbit hole, because I think it's a waste of time to debate why people believe AI research (or, for a more popular example, evolution by natural selection) to offend their religious beliefs. We don't know what qualia are or where they come from. I'll just leave it at this. If we can use advanced computational techniques to solve problems that were expensive, painful, or impossible given the limitations of human cognition, we should absolutely do it. Those who object to AI on religious grounds fear that advanced computational research will demystify cognition and bring about the end of religion. Ignoring the question of whether an "end of religion" would be a bad thing, or what "religion" even is, there are two problems with this. First, if there is something to us that is non-material, we won't be able to replicate it mechanically, and there is no harm, to the sacred, in any of this work. Second, computational victories in "AI" tend to demystify themselves, and the subfield is no longer considered "AI". Instead, it's "optical character recognition" or "computer game-playing". Most of what we use on a daily basis (often behind the scenes, such as in databases) comes from research that was originally considered "artificial intelligence".
Artificial intelligence research has never told us, and will never tell us, whether it is more reasonable to believe in gods and religion or not to believe. Religion is often used by corrupt, anti-intellectual politicians and clerics to rouse sentiment against scientific progress, as if the automation of human grunt work were a modern-day Tower of Babel. Yet, to show what I mean by AI victories demystifying themselves, almost no one would hesitate to use Google, a web-search service powered by AI-inspired algorithms.
Why do the anti-intellectuals in politics and business wish to scare the public with threats of AI-fueled irreligion and secularism (as if those were bad things)? Most of them are intelligent enough to realize that they're making junk arguments. The answer, I think, is about raw political dominance. As they see it, the "nerds" with their "cushy" research jobs can't be allowed to (gasp!) have good working conditions.
The sad news is that the anti-intellectuals are likely to take the economy and society down with them. In the 1960s, when we were putting billions of dollars into "wasteful" research spending, the economy grew at a record pace. The world economy was growing at 5.7 percent per year, and the U.S. economy was the envy of the world. Now, in our spartan time of anti-intellectualism, anti-science sentiment, and corporate elitism, the economy is sluggish and the society is stagnant, all because the people in charge can't stand to see "eggheads" win.
Has AI "delivered"?
If you're looking to rouse religious fear and fury, you might make a certain species of fantastic argument against "artificial intelligence". The truth of the matter, however, is that while we've seen domain-specific superiority of machines over human intelligence in rote processes, we're still far from creating an artificial general intelligence, i.e. a computational entity that can exhibit the general learning capability of a human. We might never do it. We might not need to, and, I would argue, we should not bother if it is not useful.
In a way, "artificial intelligence" is a defined-by-exclusion category of "computational problems we haven't solved yet". Once we figure out how to make computers better at something than humans are, it becomes "just computation" and is taken for granted. Few believe they're using "an AI" when they use Google for web search, because we're now able to conceive of the computational work it does as mechanical rather than "intelligent".
If you're a business guy just looking to bully some nerds, however, you aren't going to appeal to religion. You're going to make the claim that all this work on "artificial intelligence" hasn't "delivered". (Side note: if someone uses "deliver" intransitively, as business bullies are wont to do, you should punch that person in the face.) Saying someone or something isn't "delivering" is a way to put false objectivity behind a claim that means nothing other than "I don't like that person". As for AI, it's true that artificial general intelligence has eluded us thus far, and continues to do so. It's an extremely hard problem: far harder than the optimists among us thought it would be, fifty years ago. However, the CS research community has generated a hell of a lot of value along the way.
The disenchantment might be similar to the question about "flying cars". We actually have them. They're called small airplanes. In the developed world, a person of average means can learn how to fly one. They're not even that much more expensive than cars. The reason so few people use airplanes for commuting is that it just doesn't make economic sense for them: the savings of time don't justify the increased fuel and maintenance costs. But a middle-class American or European can, if she wants, have a "flying car" right now. It's there. It's just not as cheap or easy to use as we'd like. With artificial intelligence, likewise, the research has brought forth a ridiculous number of victories and massive economic growth. It just hasn't brought forth an artificial general intelligence. That's fine; it's not clear that we need to build one in order to get the immense progress that technologists create when given autonomy and support.
Back to the perceptron
One hard truth I've learned is that any industrial effort will have builders and politicians. It's very rare that someone is good at both. In the business world, those unelected private-sector politicians are called "executives". They tend, for a variety of reasons, to put themselves into pissing contests with the builders (the "eggheads") who are actually making stuff. One time-tested way to show up the builders is to take something that is obviously true (leading the builders to agree with the presentation) but present it out of context in a way that is misleading.
The incapacity of the single perceptron at general mathematical modeling is a prime example of this. Not one AI researcher was surprised that such a simple model couldn't describe all patterns or equational relationships. That fact can be proven (as I did above) with high-school geometry. That a single perceptron can't model a key logical operation is, as shown above, obviously true. The builders knew it, and agreed. Unfortunately, what the builders failed to see was that the anti-intellectual politicians were taking this fact way out of context, using the known limitations of a computational building block to ascribe limitations (that did not exist) to more general structures. This led to the general dismantling of public, academic, and private support for technological research, an anti-intellectual and mean-spirited campaign that continues to this day.
That's why there are so few AI jobs.
