When a binary classifier is used to estimate prevalence, the raw share of positive predictions can be misleading because it inherits the classifier's errors. For instance, if a classifier flags 20% of items as positive with 50% precision, multiplying the two suggests a true prevalence of about 10%. However, that figure counts only the positives the classifier caught, so it assumes perfect recall, which is rarely the case. Adjusting for both recall and specificity gives a more accurate estimate. In a simulation, a classifier with a recall of 0.049 and a specificity of 0.875 flagged 0.087 of items as positive, but after adjustment the estimated prevalence was 0.498. This adjustment matters for tracking prevalence over time because precision varies with prevalence, whereas recall and specificity remain stable as prevalence changes. Trends can therefore be followed reliably without re-estimating precision at every time point.
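
The arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal illustration, not the article's own code: it assumes the adjustment is the standard inversion of positive_rate = p * recall + (1 - p) * (1 - specificity), often called the Rogan-Gladen estimator, and the function names `adjusted_prevalence` and `expected_positive_rate` are hypothetical.

```python
def expected_positive_rate(prevalence, recall, specificity):
    """Share of items a classifier flags as positive at a given prevalence."""
    true_positives = prevalence * recall
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives + false_positives


def adjusted_prevalence(positive_rate, recall, specificity):
    """Invert expected_positive_rate to estimate prevalence (assumed method)."""
    false_positive_rate = 1.0 - specificity
    denominator = recall - false_positive_rate  # same as recall + specificity - 1
    if abs(denominator) < 1e-12:
        # The classifier carries no information about the class in this case.
        raise ValueError("recall + specificity must differ from 1")
    estimate = (positive_rate - false_positive_rate) / denominator
    # Sampling noise can push the raw estimate outside [0, 1], so clip it.
    return min(1.0, max(0.0, estimate))


# Naive estimate from the first example: positive rate times precision.
naive = 0.20 * 0.50
print(round(naive, 3))  # 0.1

# Figures from the simulation described above.
rate = expected_positive_rate(0.498, recall=0.049, specificity=0.875)
print(round(rate, 3))  # 0.087
print(round(adjusted_prevalence(rate, recall=0.049, specificity=0.875), 3))  # 0.498
```

Plugging the rounded rate 0.087 into the same function gives 0.5 rather than 0.498; the gap comes only from rounding. The denominator here is negative (0.049 + 0.875 < 1), which means this particular classifier does worse than chance, yet the inversion still recovers the prevalence because it relies only on recall and specificity being known.
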
Source: towardsdatascience.com















