If you could really plug an AI’s intellectual knowledge into its motivational system, and get it to be motivated by doing things humans want and approve of, to the full extent of its knowledge of what those things are – then I think that would solve alignment. A superintelligence would understand ethics very well, so it would have very ethical behavior.
Setting aside the whole language of “motivation,” which I think is wildly inappropriate in this context, I would ask Alexander a question: Are professors of ethics, who “understand ethics very well,” the most ethical people?
The idea that behaving ethically is a function or consequence of understanding is grossly misbegotten. Many sociopaths understand ethics very well; indeed, their knowledge of what is generally believed to be good behavior is essential to their powers of manipulation. There is no correlation between understanding ethics and living virtuously.