The problem with artificial intelligence is that it obeys our commands

There is a wide-ranging debate about the dangers artificial intelligence may bring, especially superintelligence, which Bill Gates touched on in a recent article. Gates argued that such an intelligence would be able to do everything the human mind can, but without any practical limits on the size of its memory or the speed at which it works, and that we would then be facing a profound change. For his part, Nick Bostrom, the Swedish philosopher at the University of Oxford, offers a now-classic thought experiment about the danger such intelligence might pose: he imagines a superintelligent robot programmed by its developers with the seemingly innocuous goal of making paper clips. In the end, the robot turns the entire world into a giant paper-clip factory.

A more disturbing example is already affecting billions of people around the world. YouTube aims to maximize watch time, and it employs AI-based content recommendation algorithms for that purpose. Two years ago, computer scientists and users began noticing that the site's algorithm seemed to pursue that goal [increasing watch time] by recommending extremist and conspiracy content. One researcher reported that after she watched footage from former US President Donald Trump's campaign rallies, YouTube went on to show her videos filled with rants about "white supremacy, Holocaust denial and other disturbing content."

Perhaps YouTube's engineers did not intend to radicalize people. A key aspect of the problem is that we often don't know what goals to give our AI systems, because we don't really know what we want.
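The mechanism can be seen in a few lines. Below is a deliberately crude sketch of a recommender whose only objective is watch time; the catalog and its numbers are invented for illustration and have nothing to do with YouTube's actual system:

```python
# A toy of the watch-time objective: a recommender that greedily maximizes
# predicted minutes watched will keep surfacing whatever holds attention
# longest, whatever it contains. Illustrative numbers only; this does not
# reflect YouTube's real algorithm.

catalog = {
    "news summary":         4.0,   # predicted minutes watched
    "cat video":            6.0,
    "conspiracy deep-dive": 12.0,  # engaging, but not what anyone intended
}

def recommend(catalog, n=3):
    # The objective mentions only watch time, never content quality,
    # so the most extreme-but-engaging item always ranks first.
    return sorted(catalog, key=catalog.get, reverse=True)[:n]

print(recommend(catalog))
```

Nothing in the objective tells the system that the top-ranked item is undesirable; the goal we wrote down is simply not the goal we meant.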

In a long and interesting article, Quanta Magazine discusses ways to avoid such pitfalls, noting that a number of researchers are developing an entirely new way to program beneficial smart machines. The approach follows the ideas of Stuart Russell, the computer science professor and author of "Artificial Intelligence: A Modern Approach," a textbook used in more than 1,500 universities around the world.


In Russell's view, asking a machine to optimize its reward function will produce misaligned AI, because it is impossible to include every goal, sub-goal, exception, and caveat in the reward function, to correctly weight each of them, or even to know which ones are right. And as machines grow more intelligent, giving goals to "autonomous" robots will become ever more dangerous: they will be ruthless in pursuit of their reward function and will try to stop us from switching them off.

Based on this, instead of machines pursuing goals of their own, the new thinking holds that they should seek to satisfy human preferences; their only goal should be to learn more about what those preferences are. In his latest book, "Human Compatible," Russell laid out his thesis in the form of three "principles of beneficial machines," machines whose developers expect them to achieve our goals rather than their own. The formulation recalls the Three Laws of Robotics that the famous science-fiction author Isaac Asimov set down in 1942.

[The American writer Asimov died in 1992 at the age of 72. He was famous for formulating three principles for robots: a robot must protect humans and never harm them in any way; it must obey humans in everything except where that conflicts with the first principle; and it must protect itself by every means unless doing so conflicts with the two previous principles.]

Russell's version of the three principles states the following:

1. The machine's sole purpose is the optimal realization of human preferences.

2. The machine must be fundamentally uncertain about what those preferences are.

3. The primary source of information about human preferences is human behaviour.

Over the past few years, Russell and his team have collaborated with research groups at Stanford University, the University of Texas, and elsewhere, hoping to develop innovative ways for AI systems to discover our preferences without our having to spell them out.

These labs are teaching robots how to learn the preferences of humans who have never stated them, and even to devise new behaviors that help clarify what those humans actually want.

How to understand humans

Years ago, Russell realized that the job of robots should not be to achieve goals such as maximizing viewing time or making paper clips, but simply to try to improve our lives.

In this regard, reinforcement learning is a subset of machine learning in which an AI system learns by trial and error: the AI tries various behaviors, observes a reward signal, such as its score in a game, and then works out which behaviors to reinforce in order to collect more reward.
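The trial-and-error loop can be made concrete with a minimal sketch. The following is a standard Q-learning example on an invented five-cell corridor, not code from Russell's work: the agent is told nothing about the task except a reward signal, yet it learns which action to reinforce in each state.

```python
import random

# Toy reinforcement learning: Q-learning on a five-cell corridor with a
# reward of +1 in the rightmost cell. The agent tries actions, observes
# the reward, and reinforces whatever earned more of it.
# Hypothetical example for illustration only.

N_STATES = 5           # cells 0..4; the reward sits in cell 4
ACTIONS = (-1, +1)     # step left or step right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
random.seed(0)

def greedy(s):
    # Best-known action in state s, breaking ties at random.
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

for _ in range(500):                       # episodes of trial and error
    s = 0
    for _ in range(20):                    # steps per episode
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(s)
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update: nudge Q toward reward + discounted future value.
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, x)] for x in ACTIONS) - Q[(s, a)])
        s = s2
        if r > 0:
            break                          # episode ends at the goal

# The learned policy should point right (+1) in every non-goal cell.
policy = {s: greedy(s) for s in range(4)}
print(policy)
```

Note that the agent never sees the words "go right"; the preference is implicit in the reward, which is exactly the arrangement Russell worries about when the reward is wrong.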

According to Quanta Magazine, in 1998 Russell, together with his then-student Andrew Ng, set out to create what is called "inverse reinforcement learning." Whereas reinforcement learning determines the best actions to take to achieve a goal, inverse reinforcement learning decodes the underlying goal from a given set of actions. Russell then went further, working with collaborators to develop cooperative inverse reinforcement learning, in which robot and human work together, through various "assistance games," to learn the human's real preferences.
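The inversion can be illustrated in miniature. The sketch below is a hypothetical toy, not the actual Ng–Russell algorithm: given demonstrations of an expert walking a corridor, it picks the goal under which those actions would have been most likely.

```python
import math

# Inverse reinforcement learning in miniature: instead of learning actions
# from a known reward, we recover the most plausible goal from observed
# actions. Invented toy example, not the Ng-Russell algorithm itself.

N_STATES = 5
# Demonstrations: (state, action) pairs from an expert walking a corridor.
demos = [(0, +1), (1, +1), (2, +1), (3, +1), (2, +1)]

def action_prob(state, action, goal, beta=5.0):
    """A noisily rational expert prefers steps that shrink the distance to its goal."""
    def score(a):
        nxt = min(max(state + a, 0), N_STATES - 1)
        return -abs(goal - nxt)
    z = sum(math.exp(beta * score(a)) for a in (-1, +1))
    return math.exp(beta * score(action)) / z

def infer_goal(demos):
    # Pick the goal under which the demonstrated actions were most likely.
    likelihood = {
        g: math.prod(action_prob(s, a, g) for s, a in demos)
        for g in range(N_STATES)
    }
    return max(likelihood, key=likelihood.get)

print(infer_goal(demos))   # → 4: the rightmost cell best explains the walk
```

The same machinery, run cooperatively with a human who knows it is being observed, is the seed of the assistance games described above.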

One of the games they developed, called the shutdown game, addresses one of the most obvious ways an autonomous robot could deviate from our true preferences: disabling its own off switch. In "Human Compatible," Russell argues that the shutdown problem is "at the heart of the problem of controlling intelligent systems. If we can't turn off a machine because it won't let us, we're really in trouble. If we can, then perhaps we can control it in other ways as well."
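The logic of the game can be checked numerically. In the sketch below, which uses invented numbers and a simplified version of the setup studied by Russell's group, a robot is uncertain about the human payoff u of its planned action; it may act anyway, switch itself off, or defer to a human who permits the action only when u is positive:

```python
import random

# The shutdown game in miniature: a robot holds a belief over the unknown
# human payoff u of its planned action. It can act anyway (worth E[u]),
# switch itself off (worth 0), or defer to a human who allows the action
# only when u > 0 (worth E[max(u, 0)]). Illustrative numbers only.

random.seed(1)
belief = [random.gauss(0.0, 1.0) for _ in range(100_000)]   # samples of u

act_anyway = sum(belief) / len(belief)                      # E[u]
switch_off = 0.0                                            # doing nothing
defer = sum(max(u, 0.0) for u in belief) / len(belief)      # E[max(u, 0)]

# Deferring dominates: leaving the off switch in human hands is never
# worse, and strictly better whenever the robot is genuinely uncertain.
print(act_anyway, switch_off, defer)
```

The point is the inequality, not the numbers: an uncertain robot gains by letting the human decide, so uncertainty about preferences is precisely what keeps the off switch usable.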

In a parallel vein, Scott Niekum's lab at the University of Texas runs preference-learning algorithms on actual robots. When Gemini, a two-armed robot in that lab, watches a human place a fork to the left of a plate in a table-setting demonstration, it cannot tell at first whether forks always go to the left of plates. New algorithms allow Gemini to learn the pattern after a few demonstrations.
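A bare-bones version of that inference looks like the following. This is a hypothetical sketch in the spirit of the Gemini example, not the lab's actual algorithm: after a few demonstrations, only the placement rules consistent with everything seen survive.

```python
# Learning a table-setting preference from a handful of demonstrations:
# keep the candidate rules consistent with every example, and prefer a
# specific rule over the catch-all. Hypothetical sketch, invented names.

# Each demonstration records where the fork ended up relative to the plate.
demos = [
    {"fork": "left"},
    {"fork": "left"},
    {"fork": "left"},
]

# Candidate hypotheses about the human's preference.
hypotheses = {
    "fork always left of plate":  lambda d: d["fork"] == "left",
    "fork always right of plate": lambda d: d["fork"] == "right",
    "fork position is arbitrary": lambda d: True,
}

# Discard any rule that a demonstration contradicts.
consistent = [name for name, rule in hypotheses.items()
              if all(rule(d) for d in demos)]

# Prefer the most specific surviving rule; "arbitrary" is only a fallback.
specific = [h for h in consistent if h != "fork position is arbitrary"]
belief = specific[0] if specific else "fork position is arbitrary"
print(belief)   # → "fork always left of plate"
```

One demonstration cannot distinguish a rule from a coincidence; a few of them can, which is exactly the behaviour attributed to Gemini above.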

Human behavior is not rational

Russell notes two kinds of challenges here. "One is the fact that our behavior is so far from rational that it can be very hard to reconstruct our true underlying preferences." AI systems will need to reason about the hierarchy of long-, medium-, and short-term goals, all the myriad preferences and commitments we carry. If robots are going to help us (and avoid making grave mistakes), they will have to navigate the murky webs of our unconscious beliefs and unarticulated desires.

The second challenge is that human preferences change. Our minds change over the course of our lives, and indeed from moment to moment, with mood or circumstance, which makes preferences very hard for a robot to pin down. And what about the vast number of humans who will be born in the future; how will machines take their preferences into account? In addition, our actions do not always live up to our ideals, and people can even hold conflicting values simultaneously.

Like robots, we too are still trying to work out our own preferences. Like the best possible AI, some of us, at least, are striving to understand what the good looks like. In that sense, AI systems, like humans, may be stuck forever in a vortex of questioning.

Also on Russell's list of concerns is a third major problem: the preferences of evil people. What is to stop a robot from serving its wicked owner's ends? AI systems tend to find ways around prohibitions, much as the wealthy find loopholes in tax law.

On a darker note, we might ask about the evil in all of us. Here Russell is optimistic. In his view, programmers can limit harmful choices, even if doing so requires additional algorithms and further research, and the same approach may be useful in "the way we raise our children." In other words, by teaching robots how to be good, we may find a way to teach ourselves goodness, too.