The essence of Bayesian analysis is model building, but every so often we find a situation where we simply have no idea how to define a model, or are unwilling to do so. While there are fairly general model-building tools that can be used here (e.g. mixture models), it can also be useful to fall back on more empirical methods.
The simple idea behind the bootstrap is that, since our data set is a draw from its sampling distribution, the data themselves can be used directly as an estimate of that same distribution. Typically, the thing we're not willing to model is an intrinsic scatter. The classical procedure is

1. Generate a new data set of the same size as the real one by sampling rows of the data table with replacement.
2. Compute the statistic or estimate of interest from the resampled data.
3. Repeat many times; the spread of the resulting estimates stands in for the sampling distribution of the statistic.
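A minimal sketch of this procedure in Python (the data values, the choice of the median as the statistic, and the number of resamplings below are all placeholders):

```python
import numpy as np

def simple_bootstrap(data, statistic, n_resamples=1000, rng=None):
    """Resample the rows of `data` with replacement and return the
    distribution of `statistic` over the resampled data sets."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        resample = data[rng.integers(0, n, size=n)]  # draw n rows with replacement
        estimates.append(statistic(resample))
    return np.array(estimates)

# Example: the scatter in the median of some measured values
x = np.array([3.1, 2.7, 4.0, 3.3, 2.9, 3.8, 3.5])
medians = simple_bootstrap(x, np.median)
print(medians.mean(), medians.std())
```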
There is a certain amount of contention as to what the "statistic or estimate" is allowed to be. Most commonly, and most robustly defended in the literature, is the case of a simple function of the data, e.g. the mean or median, or an unweighted regression. When in doubt, remember that the validity of the bootstrap rests on your ability to say, with a straight face, "the measured values of $X$ that I'm bootstrapping are a fair representation of the underlying distribution of $X$."
In this variant, instead of resampling the rows of a data table, each data point is scattered according to its measurement error. This is often done in weighted regression problems, for example.
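A minimal sketch of this variant, assuming each measured value comes with an independent Gaussian error bar (the data, error bars, and choice of statistic below are placeholders):

```python
import numpy as np

def error_bootstrap(x, sigma, statistic, n_resamples=1000, rng=None):
    """Perturb each data point by its (Gaussian) measurement error and return
    the distribution of `statistic` over the perturbed data sets."""
    rng = np.random.default_rng() if rng is None else rng
    estimates = []
    for _ in range(n_resamples):
        perturbed = x + sigma * rng.standard_normal(len(x))  # scatter within the error bars
        estimates.append(statistic(perturbed))
    return np.array(estimates)

x = np.array([3.1, 2.7, 4.0, 3.3])
sigma = np.array([0.2, 0.3, 0.25, 0.2])
print(error_bootstrap(x, sigma, np.mean).std())
```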
Because the bootstrap interprets the data as a kernel estimate of the sampling distribution, in principle it can be fit into a Bayesian analysis. The most obvious route is to attach a weight to each data point encoding how "real" it is, with the weights summing to the number of data points. This is not widely done, since it's not obviously easier than the alternative of building a flexible hierarchical mixture model.
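For concreteness, one standard way to realize this weighting (often called the Bayesian bootstrap) is to draw the weights from a flat Dirichlet distribution, so that they are non-negative and sum to one (equivalently, after scaling by $n$, to the number of data points). A minimal sketch, with a weighted mean standing in for the statistic of interest:

```python
import numpy as np

def bayesian_bootstrap_means(x, n_draws=1000, rng=None):
    """Draw samples of the weighted mean of `x` under the Bayesian-bootstrap
    model: weights drawn from a flat Dirichlet over the data points."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    draws = []
    for _ in range(n_draws):
        w = rng.dirichlet(np.ones(n))  # weights are >= 0 and sum to 1
        draws.append(np.sum(w * x))    # weighted mean (n*w sums to n, as in the text)
    return np.array(draws)

x = np.array([3.1, 2.7, 4.0, 3.3, 2.9])
post = bayesian_bootstrap_means(x)
print(post.mean(), post.std())
```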
Brain food: in what limit would the distribution of estimates in the simple bootstrap above correspond to a Bayesian posterior?
Similar to (but pre-dating) the bootstrap, the jackknife procedure is

1. Remove one data point (or one subset of the data) at a time.
2. Recompute the statistic or estimate of interest from the remaining data.
3. Use the spread of these leave-one-out estimates to quantify the uncertainty (and/or bias) of the estimate from the full data set.
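A minimal sketch of the leave-one-out version, again with the median as a placeholder statistic:

```python
import numpy as np

def jackknife(data, statistic):
    """Return the leave-one-out estimates of `statistic`."""
    n = len(data)
    return np.array([statistic(np.delete(data, i, axis=0)) for i in range(n)])

x = np.array([3.1, 2.7, 4.0, 3.3, 2.9, 3.8, 3.5])
loo = jackknife(x, np.median)
# the classical jackknife variance estimate for the statistic
var = (len(x) - 1) / len(x) * np.sum((loo - loo.mean())**2)
print(loo, var)
```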
Note: as far as I can tell, our CMB colleagues use "jackknife" to refer to a different procedure.
The idea behind ABC is to provide a way forward for Bayesian analysis in cases where the likelihood function is too expensive to be practical or simply too difficult to write down. However, it does still use (and require) a generative model, which in principle contains the same information as the likelihood. To perform ABC, we need to be able to actually use the generative model, i.e. to create fake data sets using all the components of the model.
The simplest implementation is

1. Draw a set of model parameters from the prior.
2. Generate a mock data set from the model using those parameters.
3. If the mock data are "close enough" to the real data (e.g. if some distance between summary statistics of the mock and real data is below a threshold), keep the parameter values; otherwise, discard them.
4. Repeat until enough accepted samples have accumulated.
There are clearly some choices to be made here:

* which summary statistic(s) of the data to compare,
* what distance measure to use when comparing them, and
* how small that distance must be for a mock data set to count as "close enough".
The logic here is simple: by brute force, we're trying to generate a list of model parameter values that can produce a data set very like the one we have. By drawing from the prior to start with, then requiring samples to (almost) reproduce our data, we end up with samples whose density is proportional to the prior $\times$ likelihood. How efficient this is in practice ultimately depends on the choices above.
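A minimal sketch of the rejection scheme above, assuming we can draw parameters from the prior and simulate mock data; the Gaussian toy model, the mean as the summary statistic, and the tolerance are all placeholders:

```python
import numpy as np

def abc_rejection(observed, draw_prior, simulate, summary, distance, eps, n_accept, rng=None):
    """Accumulate `n_accept` parameter samples whose mock data lie within
    `eps` of the observed data, as measured by `distance` on the summaries."""
    rng = np.random.default_rng() if rng is None else rng
    s_obs = summary(observed)
    accepted = []
    while len(accepted) < n_accept:
        theta = draw_prior(rng)                   # 1. draw parameters from the prior
        mock = simulate(theta, rng)               # 2. generate a mock data set
        if distance(summary(mock), s_obs) < eps:  # 3. keep if close enough
            accepted.append(theta)
    return np.array(accepted)

# Toy example: unknown mean of a Gaussian with known unit variance
data = np.random.default_rng(1).normal(3.0, 1.0, size=50)
samples = abc_rejection(
    observed=data,
    draw_prior=lambda rng: rng.uniform(-10.0, 10.0),  # flat prior on the mean
    simulate=lambda mu, rng: rng.normal(mu, 1.0, size=len(data)),
    summary=np.mean,
    distance=lambda a, b: abs(a - b),
    eps=0.1,
    n_accept=500,
)
print(samples.mean(), samples.std())
```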
Hopefully, you can see a similarity between the procedure above and some of the stupider algorithms we've looked at for sampling posterior distributions. As you might guess, there are more intelligent ways to do this than drawing samples straight from the prior and rejecting most of them, and they look a lot like MCMC.
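For instance, here is a hedged sketch of one such scheme (often called ABC-MCMC), in which the usual Metropolis likelihood ratio is replaced by the requirement that the proposed parameters (almost) reproduce the data; the toy model and tuning parameters are again placeholders:

```python
import numpy as np

def abc_mcmc(observed, log_prior, simulate, summary, distance, eps,
             theta0, step, n_steps, rng=None):
    """ABC-MCMC: a Metropolis random walk in which the likelihood ratio is
    replaced by an indicator of whether the mock data match the real data."""
    rng = np.random.default_rng() if rng is None else rng
    s_obs = summary(observed)
    theta, chain = theta0, []
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal()  # symmetric proposal
        mock = simulate(prop, rng)
        # accept only if the mock data are close enough AND the usual
        # Metropolis test on the prior ratio passes
        if (distance(summary(mock), s_obs) < eps and
                np.log(rng.uniform()) < log_prior(prop) - log_prior(theta)):
            theta = prop
        chain.append(theta)
    return np.array(chain)

# Same toy problem as before: unknown Gaussian mean, unit variance, flat prior
rng = np.random.default_rng(1)
data = rng.normal(3.0, 1.0, size=50)
chain = abc_mcmc(
    observed=data,
    log_prior=lambda mu: 0.0 if -10.0 < mu < 10.0 else -np.inf,
    simulate=lambda mu, r: r.normal(mu, 1.0, size=len(data)),
    summary=np.mean,
    distance=lambda a, b: abs(a - b),
    eps=0.1,
    theta0=np.mean(data),
    step=0.5,
    n_steps=5000,
)
print(chain.mean(), chain.std())
```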