Variational inference is just treating inference as an optimization problem:
… likelihood function - probability of observing your data given parameters
… variational distribution - your approximation to the true posterior
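You usually choose the divergence to be the KL-divergence between the variational distribution and the true posterior. I.e. you end up maximizing a lower bound on the log marginal likelihood of the data (the evidence lower bound, or ELBO) under a stochastic model with a prior of your choosing.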
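
To make the optimization view concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption for illustration and not taken from the description above: a standard normal prior on a single latent z, a unit-variance Gaussian likelihood, a Gaussian variational distribution q(z) = N(m, s^2), and plain gradient ascent on the evidence lower bound (ELBO). Maximizing the ELBO is the same as minimizing the KL-divergence from q to the true posterior. This model is conjugate, so the exact posterior is known and the result can be checked against it.

```python
import numpy as np

# Toy model (assumed for illustration only):
#   prior       z ~ N(0, 1)
#   likelihood  x_i | z ~ N(z, 1)
# Variational distribution: q(z) = N(m, s^2), with s = exp(rho) so s stays positive.

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=20)  # synthetic data
n = x.size


def elbo(m, rho):
    """Closed-form ELBO: E_q[log p(x|z)] + E_q[log p(z)] + entropy of q."""
    s2 = np.exp(2 * rho)
    exp_loglik = -0.5 * n * np.log(2 * np.pi) - 0.5 * (np.sum((x - m) ** 2) + n * s2)
    exp_logprior = -0.5 * np.log(2 * np.pi) - 0.5 * (m ** 2 + s2)
    entropy = 0.5 * np.log(2 * np.pi * np.e * s2)
    return exp_loglik + exp_logprior + entropy


def elbo_grad(m, rho):
    """Analytic gradient of the ELBO with respect to (m, rho)."""
    s2 = np.exp(2 * rho)
    grad_m = np.sum(x) - (n + 1) * m
    grad_rho = 1.0 - (n + 1) * s2
    return grad_m, grad_rho


# Inference as optimization: gradient ascent on the ELBO.
m, rho = 0.0, 0.0
initial_elbo = elbo(m, rho)
lr = 0.01
for _ in range(2000):
    g_m, g_rho = elbo_grad(m, rho)
    m += lr * g_m
    rho += lr * g_rho

fitted_var = np.exp(2 * rho)

# Exact posterior for this conjugate model, used only as a sanity check.
exact_mean = np.sum(x) / (n + 1)
exact_var = 1.0 / (n + 1)

print("variational mean, variance:", m, fitted_var)
print("exact posterior mean, variance:", exact_mean, exact_var)
```

Here the variational family happens to contain the true posterior, so the optimum recovers it exactly. In realistic models it does not, the expectations have no closed form, and the gradient is usually estimated by sampling from q.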