Trust Region Policy Optimization (TRPO)

In TRPO, a “surrogate” objective function is maximized subject to a constraint on the size of the policy update:

$latex \max_{\theta} \hat{\mathbb{E}}_t \left[\frac{\pi_{\theta}(a_t|s_t)}{\pi_{\theta_{old}}(a_t|s_t)}\hat{A}_t \right]$

subject to the constraint

$latex \hat{\mathbb{E}}_t\left[ \mathrm{KL}\left[\pi_{\theta_{old}}(\cdot|s_t), \pi_{\theta}(\cdot|s_t)\right]\right] \le \delta.$

Here the ratio compares the new policy $latex \pi_{\theta}$ to the old policy $latex \pi_{\theta_{old}}$ on the sampled actions, $latex \hat{A}_t$ is an estimator of the advantage at timestep $latex t$, and $latex \delta$ bounds the average KL divergence between the old and new policies. This constrained problem can be approximately solved efficiently with the conjugate gradient algorithm, after making a linear approximation to the objective and a quadratic approximation to the constraint.
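To make the two quantities concrete, here is a minimal numpy sketch that evaluates the surrogate objective and the mean KL constraint for toy categorical policies. All names (`logits_old`, `delta`, the sampled batch, etc.) are illustrative assumptions, not part of any particular TRPO implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_actions = 5, 3  # illustrative batch of T timesteps, 3 discrete actions

# Hypothetical old and new policy logits at the sampled states
logits_old = rng.normal(size=(T, n_actions))
logits_new = logits_old + 0.1 * rng.normal(size=(T, n_actions))

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

pi_old = softmax(logits_old)  # pi_old(.|s_t), shape (T, n_actions)
pi_new = softmax(logits_new)  # pi_theta(.|s_t)

actions = rng.integers(n_actions, size=T)  # actions sampled under pi_old
advantages = rng.normal(size=T)            # advantage estimates A_hat_t

# Surrogate objective: mean over t of ratio_t * A_hat_t
ratio = pi_new[np.arange(T), actions] / pi_old[np.arange(T), actions]
surrogate = np.mean(ratio * advantages)

# Constraint: mean over t of KL(pi_old(.|s_t) || pi_theta(.|s_t)) <= delta
kl = np.mean(np.sum(pi_old * np.log(pi_old / pi_new), axis=-1))

delta = 0.01  # illustrative trust-region size
print(f"surrogate={surrogate:.4f}  mean_kl={kl:.5f}  feasible={kl <= delta}")
```

In a real implementation these quantities are differentiated with respect to $latex \theta$ and the constrained step is computed from their gradients; this sketch only shows what is being maximized and what is being bounded.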