“Our freedom to doubt was born of a struggle against authority in the early days of science. It was a very deep and strong struggle. Permit us to question - to doubt, that's all - not to be sure.” – Richard Feynman
by Nott, Zhang, Yau and Jasra (Journal of Computational and Graphical Statistics 2014)
Previously, Wang and Dunson (2011) proposed a fast inference algorithm for Dirichlet process mixture models that could be applied in an online setting, fancifully called Sequential Updating and Greedy Search (SUGS). The authors did not actually intend for the algorithm to run online: to remove the dependence on observation order, they recommended averaging over random orderings of the observations. SUGS makes inference cheap by taking the MAP of each latent cluster assignment's posterior, deterministically placing each observation in a single cluster.
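To make the greedy MAP step concrete, here is a minimal sketch of a SUGS-style sequential pass, not the authors' implementation: it assumes a hypothetical 1-D Gaussian likelihood with unit variance and a conjugate N(0, `prior_var`) prior on cluster means, and assigns each point to the single highest-scoring cluster.

```python
import math

def sugs_assign(xs, alpha=1.0, prior_var=10.0):
    """Greedy SUGS-style pass: assign each point to its MAP cluster.

    Illustrative model choices (not from the paper): 1-D Gaussian
    likelihood with unit variance, N(0, prior_var) prior on each mean.
    """
    clusters = []      # per-cluster sufficient statistics: (count, sum)
    assignments = []
    for x in xs:
        scores = []
        # score existing clusters: CRP count * posterior predictive density
        for n_k, s_k in clusters:
            var_hat = 1.0 / (1.0 / prior_var + n_k)   # posterior variance of mean
            mu_hat = var_hat * s_k                    # posterior mean
            pred = math.exp(-0.5 * (x - mu_hat) ** 2 / (1.0 + var_hat)) \
                   / math.sqrt(2 * math.pi * (1.0 + var_hat))
            scores.append(n_k * pred)
        # score a new cluster: alpha * prior predictive N(0, 1 + prior_var)
        pred_new = math.exp(-0.5 * x ** 2 / (1.0 + prior_var)) \
                   / math.sqrt(2 * math.pi * (1.0 + prior_var))
        scores.append(alpha * pred_new)
        k = max(range(len(scores)), key=scores.__getitem__)  # MAP: deterministic
        if k == len(clusters):
            clusters.append((1, x))                   # open a new cluster
        else:
            n_k, s_k = clusters[k]
            clusters[k] = (n_k + 1, s_k + x)          # update sufficient stats
        assignments.append(k)
    return assignments

# two well-separated groups -> [0, 0, 1, 1, 0]
print(sugs_assign([0.1, -0.2, 5.0, 5.1, 0.0]))
```

Because each assignment is a hard argmax, a single pass depends on the order of the observations, which is exactly why Wang and Dunson suggest averaging over random orderings.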
Suppose at time $t$, we have an approximate posterior for the previous time step:

$$p(z_{1:t-1}, \theta_{1:t-1} \mid x_{1:t-1}) \approx \prod_{t' < t} q_{t-1}(z_{t'}) \prod_{t' < t} q_{t-1}(\theta_{t'})$$

We can then update the previous time step's variational posterior by assuming the current step will have a similar form:

$$\prod_{t' \le t} q_t(z_{t'}) \prod_{t' \le t} q_t(\theta_{t'})$$

For the new terms, $q_t(z_t)$ and $q_t(\theta_t)$, first initialize $q_t(z_t)$ as
$$q_t(z_t = j) = \underbrace{q_{ij}}_{\text{Var. Latent Prior}} \int \underbrace{q_{t-1}(\theta_j)}_{\text{Var. Param Prior}} \underbrace{p(x_t \mid \theta_j)}_{\text{Obs. Likelihood}} \, d\theta_j$$

where $q_{ij}$ is defined using the truncated CRP (with truncation level $T$):
$$q_{ij} = \frac{1}{\alpha + t - 1} \begin{cases} \sum_{t'< t} q_{t-1}(z_{t'} = j) + \alpha/T, & j = 1, \dots, (i-1) \wedge T \\
\alpha \left(1 - \frac{(i-1) \wedge T}{T}\right), & j = \left((i-1) \wedge T\right) + 1 \end{cases}$$

Here $i = t$ indexes the observation, so $(i-1) \wedge T = \min(t-1, T)$ is the number of available clusters and the second case is the new-cluster slot. The expected counts in the first case sum to $t - 1$ and the concentration mass $\alpha$ is split as $\alpha/T$ per existing slot plus the remainder for a new cluster, so the entries sum to one.
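Assuming the normalized reading of $q_{ij}$ above (with $i = t$ and $\alpha/T$ mass added to each existing slot), a short Python sketch of the prior and the resulting assignment update might look like the following; `prev_qs` and `preds` are hypothetical names for the accumulated soft counts and the per-cluster marginal likelihoods $\int q_{t-1}(\theta_j)\,p(x_t \mid \theta_j)\,d\theta_j$.

```python
def crp_var_prior(t, prev_qs, alpha=1.0, T=10):
    """Truncated-CRP variational prior over clusters for observation i = t.

    prev_qs[j] = sum_{t' < t} q_{t-1}(z_{t'} = j), the expected counts so far.
    Returns a vector over the m = min(t-1, T) existing slots plus one
    new-cluster slot; the entries sum to 1.
    """
    m = min(t - 1, T)
    norm = alpha + t - 1
    q = [(prev_qs[j] + alpha / T) / norm for j in range(m)]
    q.append(alpha * (1 - m / T) / norm)  # remaining mass for a new cluster
    return q

def assignment_update(t, prev_qs, preds, alpha=1.0, T=10):
    """q_t(z_t = j) proportional to q_{ij} * marginal likelihood preds[j]."""
    prior = crp_var_prior(t, prev_qs, alpha, T)
    w = [p * l for p, l in zip(prior, preds)]
    z = sum(w)
    return [v / z for v in w]

# usage: at t = 4 with expected counts [2.0, 1.0, 0.0] over three slots
q = crp_var_prior(4, [2.0, 1.0, 0.0], alpha=1.0, T=10)
print(q)  # four entries (three existing slots + new cluster), summing to 1
```

Note that the prior is over soft expected counts rather than hard assignments, which is what distinguishes this update from the MAP step in SUGS.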