After multiple rounds of reinforcement learning (where the AI is rewarded for acting appropriately), the AI is tested in simulations. (discovermagazine.com)
In contrast, when participants could hold information in mind, signals associated with reinforcement learning were weaker, suggesting an increased role for working memory. (news.brown.edu)
This is because they were trained using reinforcement learning, which rewards them if their score goes up or they pick up spare health and ammo. (theverge.com)