
Table 5 CCCSMF accuracy of Random Allocation and Random-From-Train with and without resampling the test CSMF distribution.

From: Measuring causes of death in populations: a new metric that corrects cause-specific mortality fractions for chance

J | Random-From-Train (Same CSMF) | Random Allocation | Random-From-Train (Resampled CSMF)
5 | 0.980 | 0.075 | 0.092
15 | 0.964 | 0.028 | 0.027
25 | 0.953 | 0.016 | 0.016
35 | 0.945 | 0.010 | 0.007
50 | 0.933 | 0.006 | −0.005
  1. This table demonstrates the importance of resampling the CSMF distribution in the test set: if the test and train sets have the same CSMF distribution, then simple approaches like Random-From-Train, as well as state-of-the-art approaches like King-Lu [23], can appear to perform better than is justified, due to “overfitting”.
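
The effect described in the footnote can be reproduced qualitatively with a small simulation. The sketch below is illustrative only and is not the procedure used to produce Table 5. It assumes that CSMF accuracy is one minus the summed absolute CSMF error divided by 2(1 − min true CSMF), that the chance correction subtracts 1 − e^(−1) ≈ 0.632 and rescales by the remainder, and that both the training CSMF and the resampled test CSMF are drawn from a flat Dirichlet distribution. The function names, the sample size n, and the number of repetitions are hypothetical choices made for this example.

```python
import math
import random
from collections import Counter


def csmf(labels, J):
    """Cause-specific mortality fractions for integer cause labels 0..J-1."""
    counts = Counter(labels)
    n = len(labels)
    return [counts.get(j, 0) / n for j in range(J)]


def csmf_accuracy(true_csmf, pred_csmf):
    """1 - (sum of absolute CSMF errors) / (2 * (1 - smallest true CSMF))."""
    abs_err = sum(abs(t - p) for t, p in zip(true_csmf, pred_csmf))
    return 1.0 - abs_err / (2.0 * (1.0 - min(true_csmf)))


def cccsmf_accuracy(true_csmf, pred_csmf):
    """Chance-corrected CSMF accuracy; assumes a chance level of 1 - 1/e."""
    chance = 1.0 - math.exp(-1.0)
    return (csmf_accuracy(true_csmf, pred_csmf) - chance) / (1.0 - chance)


def flat_dirichlet(J, rng):
    """Draw a CSMF vector from Dirichlet(1, ..., 1) via normalized gammas."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(J)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_cccsmf_random_from_train(J, n=2000, reps=50, resample=True, seed=0):
    """Mean CCCSMF accuracy of Random-From-Train over simulated splits.

    Assumption for illustration: the training CSMF is itself a flat
    Dirichlet draw. With resample=False the test set reuses that CSMF.
    """
    rng = random.Random(seed)
    causes = list(range(J))
    scores = []
    for _ in range(reps):
        train_csmf = flat_dirichlet(J, rng)
        train = rng.choices(causes, weights=train_csmf, k=n)
        test_csmf = flat_dirichlet(J, rng) if resample else train_csmf
        test = rng.choices(causes, weights=test_csmf, k=n)
        # Random-From-Train: predict each test death by copying the label
        # of a randomly chosen training example.
        pred = rng.choices(train, k=n)
        scores.append(cccsmf_accuracy(csmf(test, J), csmf(pred, J)))
    return sum(scores) / reps


if __name__ == "__main__":
    for J in (5, 15, 25):
        same = mean_cccsmf_random_from_train(J, resample=False)
        resampled = mean_cccsmf_random_from_train(J, resample=True)
        print(J, round(same, 3), round(resampled, 3))
```

Under these assumptions, Random-From-Train should score far higher when the test set shares the training CSMF than when the test CSMF is resampled, mirroring the contrast between the first and last columns of the table. The exact values depend on n, the number of repetitions, and the random seed, so they are not expected to match the published figures.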