Predicting the success of IVF: external validation of the van Loendersloot's model

Biancardi, R; Reschini, M; Busnelli, A; Paffoni, A; Somigliana, E

Study question: Is the predictive model for IVF success proposed by van Loendersloot et al. valid in a different geographical and cultural context?
Summary answer: The model discriminates well, but less well than in the setting in which it was developed.
What is known already: Several independent groups have developed models to estimate the chance of pregnancy with IVF, but only four of them have been externally validated. One of these, the van Loendersloot model, deserves particular attention and further investigation for at least three reasons: 1) the area under the receiver operating characteristic curve (c-statistic) in the temporal validation setting was the highest reported to date (0.68); 2) the perspective of the model is clinically sound, since it includes variables obtained from previous failed cycles and thus adapts to any woman entering an IVF cycle; 3) the model lacks geographical external validation.
Study design, size, duration: Retrospective cohort study of women undergoing oocyte retrieval for IVF between January 2013 and December 2013 at the infertility unit of the Fondazione Ca’ Granda, Ospedale Maggiore Policlinico of Milan, Italy. Women were enrolled only for their first oocyte retrieval cycle performed during the study period. They were excluded if they had undergone previous IVF cycles in other centers. The main outcome was the cumulative live birth rate per oocyte retrieval.
Participants/materials, setting, methods: Seven hundred seventy-two women were selected. The variables included in the van Loendersloot model and their relative weights (beta) were used. The variable resulting from this combination (Y) was transformed into a probability. Discriminatory capacity was assessed using the c-statistic. Data are presented using both the original and the calibrated models obtained with logistic regression. Performance was evaluated by correlating the mean predicted chances of live birth in the five quintiles with the observed rates.
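The transformation of Y into a probability and the assessment of discrimination described above can be sketched in Python. This is only an illustration under stated assumptions: Y is taken to be a linear predictor on the logit scale, so the probability is 1 / (1 + exp(-Y)), and the c-statistic is computed as the proportion of concordant pairs. The function names and the toy values are hypothetical and are not taken from the study data.

```python
import math


def linear_predictor_to_probability(y):
    """Logistic transform of a linear predictor (assumed logit scale)."""
    return 1.0 / (1.0 + math.exp(-y))


def c_statistic(predicted, outcomes):
    """Proportion of (live birth, no live birth) pairs in which the woman
    with a live birth has the higher predicted chance; ties count as 0.5."""
    events = [p for p, o in zip(predicted, outcomes) if o == 1]
    non_events = [p for p, o in zip(predicted, outcomes) if o == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical toy values, for illustration only (not study data).
toy_y = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0]
toy_outcome = [0, 0, 1, 0, 1, 1]  # 1 = live birth, 0 = no live birth

toy_prob = [linear_predictor_to_probability(y) for y in toy_y]
toy_c = c_statistic(toy_prob, toy_outcome)
print(round(toy_c, 3))
```

A c-statistic of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination; the pairwise definition used here is equivalent to the area under the receiver operating characteristic curve.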
Main results and the role of chance: Two hundred eleven live births (27%) were obtained. The c-statistic was 0.64 (95% CI: 0.61–0.67, p < 0.001). The slope of the linear predictor (calibration slope), expressed as an odds ratio, was 1.81 (95% CI: 1.46–2.24, p < 0.001), corresponding to a beta of 0.630. The calibration intercept was +0.349 (p = 0.13). While a clear discrepancy between predicted and observed rates exists with the original model, the data appear properly distributed with the calibrated model. The Pearson coefficient of the correlation between the mean predicted chances of live birth in the five quintiles and the observed rates was 0.99 (p = 0.002).
Limitations, reasons for caution: The selection criteria for access to IVF adopted in our center might be too stringent, leading to the exclusion of women with still acceptable chances of live birth. The validity of the model in women with a very low chance of live birth could therefore not be tested.
Wider implications of the findings: The van Loendersloot model can be used in other contexts but requires substantial calibration. It may help in counselling couples about their chance of success, but it cannot be used to decline treatment. Further research is needed to improve the discriminatory performance of IVF predictive models.
Trial registration number: Not applicable.
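The calibration step and the quintile comparison reported above can be illustrated with a short Python sketch. It assumes that recalibration means fitting a logistic regression of the observed outcome on the original linear predictor (giving a calibration intercept and slope), and that quintiles are formed by ranking women on predicted probability. The function names, the Newton-Raphson fitting routine, and the toy values are hypothetical choices for illustration, not the authors' procedure or data.

```python
import math


def fit_recalibration(lp, outcomes, iterations=25):
    """Fit logit(P) = intercept + slope * lp by Newton-Raphson."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, o in zip(lp, outcomes):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g_a += o - p            # score for the intercept
            g_b += (o - p) * x      # score for the slope
            h_aa += w               # information matrix entries
            h_ab += w * x
            h_bb += w * x * x
        det = h_aa * h_bb - h_ab * h_ab
        if det == 0.0:
            raise ValueError("singular information matrix")
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


def quintile_table(predicted, outcomes, groups=5):
    """Mean predicted chance and observed rate in each ranked group."""
    ranked = sorted(zip(predicted, outcomes))
    n = len(ranked)
    rows = []
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_pred, obs_rate))
    return rows


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy values, for illustration only (not study data).
toy_lp = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
toy_outcome = [0, 0, 1, 0, 1, 0, 1, 1, 0, 1]

intercept, slope = fit_recalibration(toy_lp, toy_outcome)
slope_as_odds_ratio = math.exp(slope)  # slope expressed as an odds ratio
toy_prob = [1.0 / (1.0 + math.exp(-(intercept + slope * x))) for x in toy_lp]
rows = quintile_table(toy_prob, toy_outcome)
r = pearson([m for m, _ in rows], [o for _, o in rows])
```

In this sketch a calibration slope of 1 and an intercept of 0 would indicate that the original predictions need no adjustment; values away from these indicate that the predicted chances should be rescaled before use in a new population.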