Which of the following data sets is most likely to be well-modeled by a Gaussian mixture model?
Student scores on a standardized test are likely to be normally distributed, but without more information there is no reason to expect the distribution to be anything other than unimodal, so a single Gaussian would already describe it. Here's an example of SAT scores.
The number of pregnancies per person is tempting, but probably not a good fit, for two reasons. If we don't separate males and females, the distribution is simply not going to be multimodal. If we do separate them, we have two issues: (a) the distribution for females doesn't actually follow a normal distribution (it often looks something like this), and (b) the model will do a poor job of distinguishing all of the males with 0 from the sizable number of females with 0. If anyone has ideas for other cohorts to use to fit a GMM to pregnancies, post a comment!
Speeds are a great fit, though! When the light is green (and for part of the yellow), the cars will be going at around the speed limit. When the light is red (and for the rest of the yellow), the cars will be at or close to a stop (speed near 0). A GMM with two Gaussian components will fit this data reasonably well.
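To make the two-cluster picture concrete, here is a minimal sketch that fits a two-component, one-dimensional GMM to synthetic speeds using plain expectation-maximization. Everything numeric in it is an assumption made for illustration: the 35 mph speed limit, the spreads, the 40/60 split between stopped and moving cars, and the helper name `fit_two_component_gmm` are all hypothetical and do not come from real traffic data.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_two_component_gmm(data, n_iter=100):
    """Fit a 1-D, two-component GMM with plain EM; returns (weights, means, stds)."""
    # Crude initialisation: put one mean at each extreme of the data.
    means = [min(data), max(data)]
    spread = (max(data) - min(data)) / 4.0 or 1.0
    stds = [spread, spread]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], stds[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against division by zero
            resp.append([pk / total for pk in p])

        # M-step: re-estimate weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            stds[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component

    return weights, means, stds


# Synthetic speeds in mph (made-up numbers, not measurements).
rng = random.Random(0)
stopped = [abs(rng.gauss(2.0, 1.5)) for _ in range(400)]  # red light: near 0
moving = [rng.gauss(35.0, 4.0) for _ in range(600)]       # green light: near the assumed limit
speeds = stopped + moving

weights, means, stds = fit_two_component_gmm(speeds)
for w, m, s in zip(weights, means, stds):
    print(f"weight={w:.2f}  mean={m:.1f} mph  std={s:.1f} mph")
```

With data like this, one fitted component should settle near 0 and the other near the assumed speed limit. Because speeds cannot be negative, the stopped cluster is only approximately Gaussian, which is one reason the fit is "reasonably" good rather than exact. In practice a library implementation such as scikit-learn's `GaussianMixture` would normally replace the hand-written EM loop.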