In some cases, the maximum likelihood method fails to yield a consistent estimator. We describe, with examples, why the ML method breaks down and explore how the usual MLE can be modified to achieve consistency. The doubly-smoothed maximum likelihood estimator (DSMLE), based on kernel smoothing and minimum distance estimation, is then proposed. We show how it works and prove its universal consistency. Computational aspects are discussed, along with basic guidelines for choosing the kernel and the tuning parameter. On this theoretical basis, the proposed method is applied to important statistical models such as normal mixture models and measurement error models.
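
For orientation, the display below is a schematic sketch of how a doubly-smoothed likelihood of this kind may be built; the kernel $K_h$, the kernel density estimate $\hat f_h$, and the smoothed model density $f_\theta^{*}$ are notation introduced here for illustration and are not the paper's exact definitions. The idea sketched is that the same kernel smooths both the data and the model, and the estimator maximizes the resulting smoothed log-likelihood, which amounts to minimizing a Kullback--Leibler-type distance between the two smoothed densities.
\[
  \hat f_h(x) \;=\; \frac{1}{n}\sum_{i=1}^{n} K_h(x - X_i),
  \qquad
  f_\theta^{*}(x) \;=\; \int K_h(x - y)\, f_\theta(y)\, dy,
\]
\[
  \hat\theta_{\mathrm{DS}}
  \;=\; \operatorname*{arg\,max}_{\theta} \int \hat f_h(x)\,\log f_\theta^{*}(x)\, dx .
\]
Smoothing the model with the same kernel as the data is what distinguishes this construction from maximizing an ordinary kernel-smoothed likelihood, and it is the feature that can restore consistency when the raw likelihood is unbounded, as in normal mixture models.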