Since there’s no guarantee the return distribution will be nice, we need to think of how to allow an agent to flexibly represent distributions. C51’s approach was to use a categorical distribution (hence the C).