Summary

This abstract presents an integrated database created for the TNW wind farm and evaluates the importance of various features for cone penetration test (CPT) prediction. The database combines data from the geological structural model, seismic attributes, and geotechnical CPTs. We train several regression and classification models on the database to evaluate their predictive power and to assess feature importance. We use leave-group-out cross-validation to assess the best-estimate CPT predictions and the prediction intervals. The evaluation shows that models based on random forests, gradient boosting, and artificial neural networks all achieve similar predictive accuracy, and that they rank the same features as most important. The most important feature is the soil unit defined in the geological structural model. Our models are less accurate in soil units whose dominant grain size varies. For the predictive models we have worked with, it is therefore essential that the soil units are correct, which requires integrated interpretation of the geological, geophysical, and geotechnical data. We also find that seismic attributes strongly linked to geotechnical soil properties appear more important than purely geophysical attributes.
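The leave-group-out evaluation described above can be illustrated with a minimal sketch. It is not the models or data used in this work: the records are synthetic, the grouping by CPT location is an assumption about how groups might be defined, and a per-soil-unit mean stands in for the random forest, gradient boosting, and neural network models. Dropping the soil unit and comparing held-out errors is a simple ablation, used here only to show how the importance of one feature could be probed. All names (`RECORDS`, `leave_group_out`, `fit_unit_means`, `cross_validate`) are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Synthetic, illustrative records: (cpt_location, soil_unit, cone_resistance_MPa).
# All values are made up for this sketch; they are not from the TNW database.
RECORDS = [
    ("CPT01", "sand", 20.0), ("CPT01", "sand", 22.0), ("CPT01", "clay", 2.0),
    ("CPT02", "sand", 18.0), ("CPT02", "clay", 3.0), ("CPT02", "clay", 2.5),
    ("CPT03", "sand", 21.0), ("CPT03", "clay", 1.5), ("CPT03", "sand", 19.0),
]


def leave_group_out(records):
    """Yield (held_out_group, train, test), holding out one CPT location per fold."""
    groups = sorted({r[0] for r in records})
    for g in groups:
        train = [r for r in records if r[0] != g]
        test = [r for r in records if r[0] == g]
        yield g, train, test


def fit_unit_means(train):
    """Baseline stand-in model: mean target per soil unit, plus a global mean."""
    by_unit = defaultdict(list)
    for _, unit, qc in train:
        by_unit[unit].append(qc)
    unit_means = {u: mean(v) for u, v in by_unit.items()}
    global_mean = mean(qc for _, _, qc in train)
    return unit_means, global_mean


def mean_abs_error(train, test, use_soil_unit=True):
    """Mean absolute error on the held-out group, with or without the soil unit."""
    unit_means, global_mean = fit_unit_means(train)
    errors = []
    for _, unit, qc in test:
        # Without the soil unit, the baseline can only predict the global mean.
        pred = unit_means.get(unit, global_mean) if use_soil_unit else global_mean
        errors.append(abs(qc - pred))
    return mean(errors)


def cross_validate(records, use_soil_unit=True):
    """Average the held-out error over all leave-group-out folds."""
    return mean(
        mean_abs_error(train, test, use_soil_unit)
        for _, train, test in leave_group_out(records)
    )


mae_with_unit = cross_validate(RECORDS, use_soil_unit=True)
mae_without_unit = cross_validate(RECORDS, use_soil_unit=False)
print(f"MAE with soil unit:    {mae_with_unit:.2f} MPa")
print(f"MAE without soil unit: {mae_without_unit:.2f} MPa")
```

Holding out whole groups rather than random samples keeps measurements from the same location out of both the training and test sets, so the error reflects prediction at an unseen location.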