You want to use some tasty tool, such as a hierarchical model without anyone getting cross at you for apostasy by doing it in the wrong discipline? Why not use whatever estimator works, and then show that it works on both frequentist and Bayesian grounds?
There is a basic result, due to Doob, which essentially says that the Bayesian learner is consistent, except on a set of data of prior probability zero. That is, the Bayesian is subjectively certain they will converge on the truth. This is not as reassuring as one might wish, and showing Bayesian consistency under the true distribution is harder. In fact, it usually involves assumptions under which non-Bayes procedures will also converge. […]
Concentration of the posterior around the truth is only a preliminary. One would also want to know that, say, the posterior mean converges, or even better that the predictive distribution converges. For many finite-dimensional problems, what’s called the “Bernstein-von Mises theorem” basically says that the posterior mean and the maximum likelihood estimate converge, so if one works the other will too. This breaks down for infinite-dimensional problems.
(Bernardo and de Valencia 2006), in the context of “Objective Bayes”, argues for frequentist methods as necessary.
Bayesian Statistics is typically taught, if at all, after a prior exposure to frequentist statistics. It is argued that it may be appropriate to reverse this procedure. Indeed, the emergence of powerful objective Bayesian methods (where the result, as in frequentist statistics, only depends on the assumed model and the observed data), provides a new unifying perspective on most established methods, and may be used in situations (e.g. hierarchical structures) where frequentist methods cannot. On the other hand, frequentist procedures provide mechanisms to evaluate and calibrate any procedure. Hence, it may be the right time to consider an integrated approach to mathematical statistics, where objective Bayesian methods are first used to provide the building elements, and frequentist methods are then used to provide the necessary evaluation.
I am not sure how this works. But it is important (Wang and Blei 2017).