Solving Sutton’s Transfer Problem
Richard Sutton’s four pillars of reinforcement learning (policy, value function, perception, and transition model) are elegant in their clarity, yet they also reveal a persistent weakness. Each is learned within the boundaries of a particular environment, and each tends to collapse when those boundaries change. The policy that works in one task rarely works in another; the value function is bound to a predefined reward; the perceptual mapping is tied to a narrow distribution of states; and the transition model is valid only for one set of dynamics. This is the transfer problem: the difficulty of carrying knowledge forward into new situations.

World Mind begins from a different ground. Rather than mapping states to actions or estimating returns against fixed rewards, it is built upon existential structures of disclosure, the layered ways in which beings…
