Ali Minai at Bounded Alignment:
The main point made in the paper is that, to be meaningful, general intelligence should refer to the kind of intelligence seen in biological agents, and AGI should be given a corresponding meaning in artificial ones. Importantly, any general intelligence must have three attributes – autonomy, self-motivation, and continuous learning – that make it inherently uncertain and uncontrollable. As such, it is no more possible to perfectly align an AI agent with human preferences than it is to align the preferences of individual humans with each other. The best that can be achieved is bounded alignment: behavior that is almost always acceptable – though not necessarily agreeable – to almost everyone who encounters the AI agent. This is the degree of alignment we expect from human peers, and it is typically developed through consent and socialization rather than coercion. A crucial point is that, while alignment may refer in the abstract to values and objectives, it can only be validated in terms of behavior, which is the only thing that can actually be observed.
More here.
