Machine learning-based classification of petrofacies in fine laminated limestones


Characterization and development of hydrocarbon reservoirs depend on the classification of lithological patterns from well log data. In thin reservoir units, limited vertical data impedes the efficient classification of lithologies. We present a test case of petrofacies classification using machine learning models in a thin interval of finely laminated limestones using pseudo-well data created over outcrops (radiometric and unconfined compressive strength logs). We tested Gaussian naïve Bayes (GNB) and support vector machine (SVM) techniques to classify eight petrofacies types, divided into two groups. The objective was to observe the capacity of some well-known models to classify petrofacies with a high-frequency vertical variation of diagenetic heterogeneities in an extreme scenario within a thin sedimentary interval. The GNB was less effective (F1 score of 0.29), and the SVM achieved the best results in classifying the main facies patterns (F1 = 0.47). However, the GNB performed better when the analysis was focused on distinguishing the two main groups of petrofacies. The results demonstrate that high-frequency facies variations present a challenge to the automatic identification of lithofacies, mainly due to local variations in horizontal heterogeneities (on the mm- to cm-scale) created by depositional and diagenetic processes, which impact the flow in porous media.

Key words: reservoir modeling; machine learning; petrofacies classification; thin reservoir unit; laminated limestones


Successful hydrocarbon extraction from conventional and unconventional reservoirs depends on constructing reliable 3D models describing various properties of the sedimentary successions (e.g., lithology, petrophysics, geomechanics) based on measurements from wells and seismic data (Bonnell & Hurich 2008, Heidsiek et al. 2020, Jones et al. 2008, Yu et al. 2008, Zhang et al. 2006). The distribution of properties in an industry-standard 3D model draws on data from wellbores (lithology succession, petrophysics, fluid types) with a spatial resolution of centimeters to meters, and seismic reflection data (2D and 3D volumes) with a spatial resolution of tens to hundreds of meters (Burchette 2012, Grana et al. 2016, Ozkan et al. 2011, Raeesi et al. 2012, Worden et al. 2018). Heterogeneities formed by depositional and diagenetic processes present a challenge for reservoir modeling, especially for carbonate rocks (Burchette 2012, Worden et al. 2018). Populating a large 3D model using the 1D measurements from wellbores (cores and logging) is usually treated using geostatistical tools (Anna et al. 2009, Correia et al. 2016, Haese 2019, Ringrose et al. 2008). Reservoir modeling is improved by information from natural exposures (reservoir analogs), which helps in understanding the patterns formed by depositional processes and heterogeneities created by diagenesis (Adams et al. 2011, Enge et al. 2007, Questiaux et al. 2009). Reservoir analogs can provide large amounts of data, including geometries, mechanical and petrophysical properties, and fracture system properties (Anna et al. 2009, Bayer et al. 2015, Belayneh et al. 2006, Enge et al. 2007, Falivene et al. 2006, Heidsiek et al. 2020, Howell et al. 2014, Jones et al. 2008, Milad & Slatt 2019, Yan et al. 2020).

Static reservoir modeling depends on the power to predict intrinsic properties in regions far from the wellbore (Grana et al. 2016, Zhao et al. 2014), and well observations are not always sufficient to inform the predictions desired (Burchette 2012, Laubach et al. 2019), such as in thin reservoir units with great horizontal extensions. Furthermore, the manual labeling of properties of interest, either lithofacies (groups of rocks that share similar lithologic or physical characteristics), or petrofacies (groups of rocks that share similar petrographic or mineralogical characteristics) regarding variations found from site to site over the borehole data can also be ambiguous, expensive, and time-consuming (Edwards et al. 2017, Lineman et al. 1987). This challenge has led to the development of automated identification and classification computation tools that process well log observations (Hall 2016, Halotel et al. 2019, Merembayev et al. 2021, Silva et al. 2020) and core images (Chawshin et al. 2021, Lima et al. 2019, Thomas et al. 2011). These approaches allow fast and reliable identification/correlation of geological properties from well logs (Wu et al. 2018), to build sophisticated models (Ertekin & Sun 2019, Othman et al. 2021). Most of these new techniques are based on machine learning (ML) algorithms.

When modeling complex, thin reservoir units from noisy data drawn from few wells with limited well log records, more sophisticated classification algorithms such as neural networks are not applicable because they require large amounts of training data. In contrast, less complex algorithms are more robust when dealing with noisy or constrained training data. With fewer parameters or assumptions about the data, they tend to be less prone to overfitting than more complex models. Thus, we chose GNB and SVM models, among the less sophisticated ML techniques, for the classification task (Bishop 2006, Murphy 2012). Another aspect we considered is the opportunity to compare two algorithms with different approaches, one probabilistic (GNB) and the other discriminative (SVM), for the limited-data case studied.

Both of these models (SVM mostly) are well-established ML approaches to lithofacies classification (for SVM, see Al-Anazi & Gates 2010, Alexsandro et al. 2017, Xie et al. 2018, Deng et al. 2019, Sarkar & Majumdar 2020, Fadokun et al. 2020, Liu et al. 2020, Verma et al. 2021, Kumar et al. 2022, Gonzalez et al. 2023; for GNB, see Li & Anderson-Sprecher 2006, Horrocks et al. 2015, Babasafari et al. 2022, Nwaila et al. 2022, Nguyen et al. 2022). However, the literature presents very few studies that apply these methods to the task of classifying petrofacies (López & Thomas 2009, Duarte et al. 2023, Silva et al. 2020). The gap is even wider for data-limited settings: we found no studies in the literature that apply SVM and/or GNB to petrofacies classification under such conditions.

This work studied the capacity of ML techniques to treat a specific scenario comprising a thin unit of laminated limestones with a high-frequency vertical variation of petrofacies resulting from depositional and diagenetic processes. It represents a critical case study involving a relatively thin and continuous interval of carbonate rocks used to verify the challenge faced in the petrofacies distribution due to the horizontal extension of the sampled interval. We tested the performance of GNB and SVM in the automated identification of petrofacies in a thin limestone interval. The main goal was to treat the petrofacies classification problem in an extreme scenario of scarce data, a condition in which these algorithms have not yet been tested. The choice of these two algorithms also aimed to compare the two completely different approaches, one probabilistic (GNB) and the other discriminative (SVM), which can provide insights into the performance of different modeling techniques in the context of the specific reservoir units and data constraints.

The succession studied is the C6 unit, a stratigraphic interval of laminated limestones in the upper part of the Crato Formation of the Araripe Basin in northeastern Brazil (Fig. 1a-c). We used the petrofacies classification from previous works (Araujo et al. 2020, Ramos et al. 2020), which integrated lithological, depositional, and diagenetic characteristics. Petrofacies complement the lithofacies concept and can integrate properties like porosity and permeability (Bhattacharya et al. 2005, Cao et al. 2020, 2021, Jardim et al. 2011, Kadkhodaie & Kadkhodaie-Ilkhchi 2018). This succession was chosen because it exemplifies two assessment challenges in reservoir modeling: a thin vertical interval and the high-frequency vertical variation of petrophysical properties of the fine laminations (mm-scale). This type of high-frequency variation in depositional and diagenetic properties strongly influences reservoir quality (over the mm- to cm-scale) (Mikes et al. 2006, Creusen et al. 2007, Likuan et al. 2021), and its integration through up-scaling techniques for reservoir characterization represents a major challenge (Mikes et al. 2006, Heidsiek et al. 2020). The study tests the performance of the ML models on the extreme case, with limited data, and demonstrates that the vertical variation of the petrofacies is hard to resolve, even in a unit with apparently good horizontal continuity of lithofacies.

Figure 1
C6 Interval of laminated limestone from the Crato Formation, Araripe Basin. a) View of the interval of limestones studied in a quarry, Nova Olinda region. b) A vertical strip (dotted red lines) used to define the acquisition of data and samples emulating a vertical well (pseudo-well) (yellow stick = 1 m). c) A plug showing the fine laminations that were classified into thirteen petrofacies. The greyish and yellowish colors indicate variations caused by depositional and diagenetic processes (Araujo et al. 2020).

Geological Setting and Pseudo-Wells Dataset

The geological data used in this study were collected from outcrops in two quarries in Nova Olinda, Araripe Basin, Ceará State, Brazil (Fig. 1). Three pseudo-wells were built over vertical exposures with lateral continuity of tens of meters (Araujo et al. 2020). The C6 interval studied is a succession consisting of finely laminated limestones, composed predominantly of micritic calcite (mudstones), with local occurrences of dolomite and silica (Araujo et al. 2020, Miranda et al. 2018). The C6 interval presents a regional distribution and is part of the first post-rift sequence of the basin (Assine et al. 2014, Neumann et al. 2003). These rocks were previously studied as an analog for tight fractured reservoirs (Miranda et al. 2018, Santos et al. 2015) and as an analog for lithofacies found in the Brazilian pre-salt fields (Barra Velha Formation) (Catto et al. 2016). The C6 interval contains centimetric- to metric-scale vertical calcite veins, shear fractures and centimetric vertical stylolites. Furthermore, metric-scale joints are related to late exhumation (Miranda et al. 2018). The laminations are 3–5 mm thick, and the variation in depositional conditions (climate, salinity, and sedimentation rates) resulted in a varying content of organic matter (OM) in the laminae (Fig. 1). Variation in the sedimentation rate also influenced matrix cementation during the early stages of diagenesis (Araujo et al. 2020, Heimhofer et al. 2010, Osés et al. 2017), resulting in cemented sets of laminations with early silica formation. Early silicified sets of laminae also contain higher contents of OM and pyrite. These levels present a grey-greenish color (G2 group of laminations). The C6 succession also presents lamination sets with iron oxides, and a minor content of silica and OM, with a yellowish color (G1 group of laminations). The variation in porosity and mechanical strength between the two groups of laminations is pronounced.
The G1 group presents more dissolution features, higher porosity, and fewer deformational structures (Araujo et al. 2020).

The study used data from three vertical pseudo-wells created with high-resolution stratigraphic descriptions of the petrofacies from two quarries located one kilometer apart (Figs. 2a,b and 3; Araujo et al. 2020). The first and second pseudo-wells, located in panel 1 (PW1 and PW2 – Idemar Quarry), are separated by 9.95 m, and the third pseudo-well is located in panel 2 (PW3 – William Quarry). Petrofacies can be defined by integrating the rock sample observations and well logs (Cao et al. 2021, Jardim et al. 2011). This approach considers the characterization of lithofacies (lithology, syn-depositional structures) and other physical aspects like diagenetic features (cementing and concretions), petrophysics (porosity and permeability), and mechanical properties (Araujo et al. 2020, Cao et al. 2021, Watney et al. 1998, Gómez 2020, Jardim et al. 2011). The main advantage of the petrofacies approach for sedimentary succession classification is that it can constrain the distribution of parameters directly related to fluid flow. The petrofacies classification provided by Araujo et al. (2020) for the C6 interval integrates sedimentological, diagenetic, and syn-depositional features. The authors defined thirteen (13) petrofacies in two main groups, seven in the beige-yellowish G1 group and five in the grayish G2 group, as shown in Table I.

Table I
Description of the two petrofacies groups identified in the laminated limestone interval, with the main characteristics of depositional and diagenetic features (Araujo et al. 2020).

The succession shows a high-frequency alternation between the G1 and G2 petrofacies components. The G2 petrofacies dominate the basal part of the succession. Stratigraphic information was acquired with a vertical limit of 5 mm for the definition of the individual petrofacies, resulting in a high-resolution stratigraphic description in the pseudo-wells (Araujo et al. 2020). The pseudo-well data also included: 1) gamma ray logs acquired in situ using a portable gamma spectrometer, comprising the total gamma emissions and the potassium (K), thorium (Th), and uranium (U) contents (Araujo et al. 2020); and 2) in situ unconfined compressive strength (UCS) measurements acquired with a portable sclerometer (Schmidt Hammer N type). The sampling spacing for the gamma and UCS tests was 15 cm for the two pseudo-wells in panel 1 and 20 cm for the pseudo-well in panel 2 (Araujo et al. 2020). Figs. 2 and 3 show the composite well data with the gamma logs, which are related to the mineral composition of the strata, and the UCS data, which are related to their mechanical properties. Previous tests indicate that these logs could help to distinguish between the petrofacies because the gamma measurements are sensitive to variations in diagenetic features (primary and authigenic mineral content), and the UCS correlates well with properties like density and cementation index.

Figure 2
Composite well logs with petrofacies, Total GR, U, Th, K and UCS data of the two pseudo-wells acquired in panel 1 (a) PW1 and b) PW2 – Idemar Quarry). Spacing for the acquisition of geophysical data was 15 cm.
Figure 3
Composite well logs with petrofacies, Total GR, U, Th, K and UCS data of the pseudo-well acquired in panel 2 (PW3 – William Quarry). Spacing for the acquisition of geophysical data was 20 cm.

Because of the high resolution used to describe the vertical profiles (mm-scale), the sampling of gamma and mechanical strength (15 to 20 cm spacings) resulted in these measurements being captured and integrated for only eight petrofacies. That result demonstrated the first challenges of capturing information for this type of succession (Araujo et al. 2020). Thus, the research considered the eight petrofacies which were integrated with the physical logs for ML-based processing: GLLCV, YLL, GLLVUG, GLL, YLLCON, YLLLB, YLLCV, and YLLGP. Fig. 4a-c shows the distribution of each petrofacies in the total vertical succession sampled. A substantial imbalance is observed in the dataset, as YLL (52.31%) and GLL (23.08%) are the most frequent facies.
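The loss of petrofacies under coarse sampling can be illustrated with a minimal sketch. It assumes a hypothetical layered column (the labels reuse the petrofacies codes, but the thicknesses are invented for illustration and are not measurements from the C6 interval) and samples it at a regular 150 mm spacing, as a log measurement would. The function names are also hypothetical.

```python
# Hypothetical layered column: (facies label, thickness in mm), top to bottom.
# These thicknesses are illustrative only, NOT measurements from the C6 interval.
column = [
    ("YLL", 400), ("GLLCV", 20), ("GLL", 300), ("YLLLB", 10),
    ("YLL", 250), ("YLLCON", 60), ("GLL", 200), ("GLLVUG", 5),
]

def facies_at(depth_mm, layers):
    """Return the facies whose layer contains depth_mm, or None below the column."""
    top = 0
    for label, thickness in layers:
        if top <= depth_mm < top + thickness:
            return label
        top += thickness
    return None

def sample_column(layers, spacing_mm):
    """Sample the column at a regular spacing, starting at the top."""
    total = sum(thickness for _, thickness in layers)
    return [facies_at(depth, layers) for depth in range(0, total, spacing_mm)]

described = {label for label, _ in column}
captured = set(sample_column(column, spacing_mm=150))
missed = described - captured

print(sorted(captured))  # ['GLL', 'YLL']
print(sorted(missed))    # ['GLLCV', 'GLLVUG', 'YLLCON', 'YLLLB']
```

In this toy column, every layer thinner than the sampling spacing happens to fall between measurement points, so only the two thick, frequent facies receive log values. This mirrors, in simplified form, why the logged dataset contains fewer petrofacies than the field description.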

Figure 4
a) Distribution of the eight petrofacies from both groups G1 (yellow) and G2 (grey) tested in the vertical space of the pseudo-wells. b) Distribution of petrofacies in the vertical profiles of the three pseudo-wells. c) Distribution of petrofacies classes in the dataset.


Gaussian Naïve Bayes

Naïve Bayes (NB) is one of the most traditional ML algorithms for classification purposes. The method is based on Bayes' theorem and assumes that the features are conditionally independent given the class (Witten & Frank 2002).

The prediction of class $\hat{y}$ with the NB algorithm is given by the maximum a posteriori probability (MAP) estimate:

$$\hat{y} = \underset{y_k}{\arg\max} \left[ \ln P(y_k) + \sum_{i=1}^{m} \ln P(X_i \mid y_k) \right], \quad \text{with } k = 1, 2, \ldots, K \tag{1}$$

Assuming that the features $X$ follow a normal distribution, a particular case of NB, Gaussian naïve Bayes (GNB), is used, in which the conditional probability $P(X_i \mid y_k)$ is given by:

$$P(X_i = x \mid y_k) = \frac{1}{\sqrt{2\pi\sigma_k^2}} \, e^{-\frac{(x - \mu_k)^2}{2\sigma_k^2}} \tag{2}$$
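As an illustration of Eqs. 1 and 2, the following is a minimal pure-Python sketch of a GNB classifier. It is not the implementation used in the study: the function names, the variance floor, and the two-feature samples are assumptions made only for this example.

```python
import math
from collections import defaultdict

def fit_gnb(X, y):
    """Estimate the log prior and the per-feature mean and variance of each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        n_features = len(rows[0])
        means = [sum(r[i] for r in rows) / len(rows) for i in range(n_features)]
        # The 1e-9 floor is an assumption of this sketch; it avoids zero variance.
        variances = [
            max(sum((r[i] - means[i]) ** 2 for r in rows) / len(rows), 1e-9)
            for i in range(n_features)
        ]
        model[label] = (math.log(len(rows) / len(y)), means, variances)
    return model

def log_gaussian(x, mean, var):
    """Natural log of the Gaussian density in Eq. 2."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def predict_gnb(model, x):
    """MAP estimate of Eq. 1: log prior plus the sum of log likelihoods."""
    scores = {}
    for label, (log_prior, means, variances) in model.items():
        log_likelihood = sum(
            log_gaussian(xi, mu, var) for xi, mu, var in zip(x, means, variances)
        )
        scores[label] = log_prior + log_likelihood
    return max(scores, key=scores.get)

# Hypothetical two-feature samples, NOT data from the pseudo-wells.
X = [[1.0, 10.0], [1.2, 11.0], [0.8, 9.5], [3.0, 20.0], [3.2, 21.0], [2.9, 19.0]]
y = ["YLL", "YLL", "YLL", "GLL", "GLL", "GLL"]
model = fit_gnb(X, y)

print(predict_gnb(model, [1.1, 10.2]))  # YLL
print(predict_gnb(model, [3.1, 20.5]))  # GLL
```

Working in log space, as Eq. 1 does, avoids numerical underflow when many small likelihoods are multiplied.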

Support Vector Machine

SVM is a fundamentally discriminative classification model whose formulation is based on statistical learning theory (Cortes & Vapnik 1995, Vapnik & Chervonenkis 1971, Vapnik 1995).

For a binary classification problem with a linearly separable dataset, the SVM looks for the hyperplane (Eq. 3) that separates the classes with the maximum distance or margin (Eq. 4) with respect to a subset of training points called support vectors.

$$W^T X + b = 0 \tag{3}$$

$$\rho = \frac{2}{\|W\|} \tag{4}$$

The optimal hyperplane maximizes the margin $\rho$, which is equivalent to minimizing $\|W\|$, and can be found by solving the following optimization problem:

$$\min_{W, b} \; \frac{1}{2} W^T W + C \sum_{i=1}^{m} \xi_i \quad \text{s.t.} \quad y_i \left( W^T X_i + b \right) \geq 1 - \xi_i, \quad \xi_i \geq 0 \tag{5}$$

In this formulation, the parameter $C$ controls the trade-off between the width of the margin and the penalty for margin violations and, consequently, the generalization of the model. The slack variables $\xi_i$ measure how far each training point falls on the wrong side of its margin.

Figure 5 shows a cross-plot of the features obtained from the pseudo-wells. Most features indicate a reasonable separation between the wells. The separation of the features of PW3 from those of PW1 and PW2 is particularly notable, indicating a spatial dependence of the measured features and, therefore, lateral variation of geological properties. This will affect the performance of the models depending on the training and testing configurations. Besides the features mentioned before (total gamma ray counts, separate K, U and Th counts, and UCS measurements), we also use the depth of each measurement. Normally, layered rocks exhibit good lateral continuity of some parameters, and this was also observed in the C6 interval. Araujo et al. (2020) found three distinct mechanical zones in the C6 interval, and this information was used to improve the occurrence probability of the petrofacies through the pseudo-well sections.
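The quantities in Eqs. 3 to 5 can be made concrete with a short sketch that evaluates the margin, the slack values, and the soft-margin objective for a fixed candidate hyperplane. The points, labels, and hyperplane below are hypothetical, and the sketch only evaluates the objective; it does not solve the optimization problem.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def margin(W):
    """Margin width of Eq. 4: rho = 2 / ||W||."""
    return 2.0 / math.sqrt(dot(W, W))

def slacks(W, b, X, y):
    """Smallest slack values that satisfy the constraints of Eq. 5 for fixed W, b."""
    return [max(0.0, 1.0 - yi * (dot(W, xi) + b)) for xi, yi in zip(X, y)]

def primal_objective(W, b, X, y, C):
    """Value of the soft-margin objective in Eq. 5."""
    return 0.5 * dot(W, W) + C * sum(slacks(W, b, X, y))

# Hypothetical 2D points with labels in {-1, +1}; the last point lies inside the margin.
X = [[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0], [0.5, 0.0]]
y = [1, 1, -1, -1, 1]
W, b = [1.0, 0.0], 0.0  # a candidate hyperplane, not an optimized solution

print(margin(W))                             # 2.0
print(slacks(W, b, X, y))                    # [0.0, 0.0, 0.0, 0.0, 0.5]
print(primal_objective(W, b, X, y, C=1.0))   # 1.0
print(primal_objective(W, b, X, y, C=10.0))  # 5.5
```

Raising C from 1 to 10 leaves the margin term unchanged but multiplies the cost of the single margin violation, which is how a larger C pushes the optimizer toward narrower margins with fewer violations.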

Figure 5
A cross-plot showing the relationships between all features (Depth, K, U, Th, Total GR and UCS) collected from the three pseudo-wells PW1, PW2 and PW3.

ML Training and Evaluation

We applied the algorithms to three different classification sets, as shown in Fig. 6. In each classification set, two pseudo-wells were used for training and the third was used for testing. We used k-fold cross-validation to evaluate each model's performance on the training data and then applied the model to the test data. The SVM hyperparameters were tuned using GridSearchCV from scikit-learn. The GNB has no hyperparameters that required tuning and, therefore, did not need a hyperparameter tuning scheme. We determined the models' generalization capacity by comparing the accuracy, precision, recall, and F1 scores obtained on the training data with those obtained on the test data.
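The evaluation scheme can be sketched in plain Python as follows. This is an illustration only: the helper names are hypothetical, the folds are contiguous and unshuffled (whether the study shuffled or stratified its folds is not stated), and the sample count of 12 is arbitrary.

```python
def leave_one_well_out(wells):
    """Yield (training wells, test well) pairs, holding out one well at a time."""
    for test_well in wells:
        yield [w for w in wells if w != test_well], test_well

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def train_validation_splits(n_samples, k):
    """Yield (train indices, validation indices) for each of the k folds."""
    folds = k_fold_indices(n_samples, k)
    for i, validation in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation

# The three classification sets: each pseudo-well is the test well once.
sets = list(leave_one_well_out(["PW1", "PW2", "PW3"]))
print(sets)

# 5-fold cross-validation over a hypothetical training set of 12 samples.
splits = list(train_validation_splits(n_samples=12, k=5))
print([len(validation) for _, validation in splits])  # [3, 3, 2, 2, 2]
```

Cross-validation is run only inside the two training wells; the held-out well is touched once, after model selection, so that the test scores reflect generalization to an unseen location.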

Figure 6
1st set: PW2 and PW3 used for training and PW1 used for model evaluation. 2nd set: PW1 and PW3 used for training and PW2 used for model evaluation. 3rd set: PW1 and PW2 used for training and PW3 used for model evaluation (See the color code for petrofacies in Figures 2 and 3).

Besides the asymmetry of the petrofacies distribution in the three pseudo-wells, another important aspect is the absence of some petrofacies types in the training dataset that appear in the test dataset, as shown in Fig. 7. This occurs due to the complex vertical and horizontal distribution of the petrofacies and reflects real conditions found in sedimentary successions. For example, petrofacies GLLVUG appears only in PW1, YLLLB is found only in PW2, and YLLGP is present only in PW3.
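A simple check for classes that a model cannot learn in a given configuration is sketched below. The placement of GLLVUG, YLLLB, and YLLGP follows the text; the rest of each well's class membership is a simplified assumption for illustration and should be replaced by the actual counts in Fig. 7.

```python
# Petrofacies present in each pseudo-well. Only the placement of GLLVUG, YLLLB
# and YLLGP is taken from the text; the other entries are illustrative assumptions.
classes_by_well = {
    "PW1": {"YLL", "GLL", "GLLCV", "GLLVUG"},
    "PW2": {"YLL", "GLL", "YLLCON", "YLLCV", "YLLLB"},
    "PW3": {"YLL", "GLLCV", "YLLCON", "YLLCV", "YLLGP"},
}

def unseen_test_classes(classes_by_well, test_well):
    """Return the classes of the test well that never occur in the training wells."""
    training = set().union(
        *(classes for well, classes in classes_by_well.items() if well != test_well)
    )
    return classes_by_well[test_well] - training

for well in classes_by_well:
    print(well, sorted(unseen_test_classes(classes_by_well, well)))
# PW1 ['GLLVUG']
# PW2 ['YLLLB']
# PW3 ['YLLGP']
```

Any class reported by this check is guaranteed to be misclassified in that configuration, which sets an upper bound on the achievable recall before any model is trained.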

Figure 7
Petrofacies distribution in the training and test datasets for all classification sets (See Petrofacies codes in Table I).


This section describes the results of training data cross-validation using all the features of the dataset, and the combination of features that yielded the best classification performance. We also present the results for the three classification scenarios involving the three wells (Fig. 6) using the GNB and SVM models.

GNB Performance

We conducted an ANOVA to compare different combinations of features and the performance of the GNB model under 5-fold cross-validation. The results of this analysis are presented in Tables II and III.
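For readers unfamiliar with the procedure, the sketch below computes the one-way ANOVA F statistic for a single feature grouped by class, and enumerates the candidate feature subsets. The numeric values are hypothetical, and the exhaustive enumeration of subsets is an assumption of this sketch; the text does not state how the combinations were generated.

```python
from itertools import combinations

def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numeric values."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical values of one feature, grouped by class.
f_separating = anova_f([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]])   # class means differ
f_overlapping = anova_f([[1.0, 5.0, 9.0], [2.0, 5.0, 8.0]])  # class means coincide
print(f_separating, f_overlapping)  # 54.0 0.0

# All non-empty subsets of the six features used in the study.
features = ["Depth", "K", "U", "Th", "Total GR", "UCS"]
subsets = [c for r in range(1, len(features) + 1) for c in combinations(features, r)]
print(len(subsets))  # 63
```

A large F value indicates that the class means of a feature differ strongly relative to the scatter within each class, so the feature is likely to help the classifier. With six features there are only 63 subsets, so an exhaustive comparison under cross-validation is inexpensive.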

Table II
ANOVA performed on the features (Depth, K, U, Th, Total GR and UCS), sorted by decreasing p values.
Table III
Feature combinations and their respective mean accuracy, precision, recall, and F1 score in the cross-validation of GNB classification.

The cross-validation results for the GNB model based on the Depth, K, and U features are shown in Fig. 8.

Figure 8
5-fold cross-validation for the GNB model based on the Depth, K, and U features.

Figure 9 shows the means and standard deviations obtained by 5-fold cross-validation with the three training sets for both cases: using all features, and using the best configuration of features.

Figure 9
Means and standard deviations of the cross-validation (k-fold = 5) for the GNB models with all features and best features in the training sets.

Figure 10 shows the classifications obtained by each GNB model considering the test data, according to the configuration described in Fig. 7.

Figure 10
Results of the GNB models with the three test sets, for all features and for the best configuration of features.

Figure 11 shows the confusion matrices used to analyze the performance of the three model configurations, for petrofacies classification.

Figure 11
Confusion matrices of the GNB models for the three classification sets. Top: model that used all features. Bottom: model that used the best features. Petrofacies dictionary: 1-GLLCV, 2-YLL, 3-GLLVUG, 4-GLL, 5-YLLCON, 6-YLLLB, 7-YLLCV, and 8-YLLGP.

SVM Performance

We processed the SVM models using the same procedures for feature selection as used for the GNB models, and the five best mean results are shown in Table IV, sorted by F1 score.

Table IV
Feature combinations and respective means of accuracy, precision, recall, and F1 score, from SVM processing.

We used the results in Table IV to choose the best SVM model: the set processed with all features. The cross-validation results for this model are shown in Fig. 12.

Figure 12
Means and standard deviations of the 5-fold cross-validation obtained by SVM processing using all the features.

The performance metrics for the SVM and best GNB processing on the test data across all training datasets are shown in Figure 13, and Figure 14 shows the confusion matrices for the SVM and best GNB models.

Figure 13
Performance metrics for the SVM and best GNB models processed with the three test sets.
Figure 14
Confusion matrices for the best GNB model (top) and the SVM model (bottom). Petrofacies dictionary: 1-GLLCV, 2-YLL, 3-GLLVUG, 4-GLL, 5-YLLCON, 6-YLLLB, 7-YLLCV, and 8-YLLGP.

The facies classification results obtained with the models are shown in Figure 15, along with the actual petrofacies distribution: a) the original vertical petrofacies succession interpreted in the field; b) the “actual” classification used by this research, which presents a coarser vertical classification of the petrofacies succession due to the sampling spacing of the features; and c and d) the classifications produced by the GNB and SVM models. The overall predominance of the petrofacies in the three wells reveals that the C6 Interval in the PW3 site is dominated by petrofacies of the G1 group, which presented a problem for the classification. The models were evaluated on their capacity to reproduce the actual distribution of the eight petrofacies in the two groups (G1 and G2), as well as to classify the strata into the two groups in general (Fig. 15).

Figure 15
Description of the pseudo-wells and the classifications by the models. a) Original distribution of the thirteen petrofacies in the vertical profiles of the pseudo-wells. b) Actual distribution of the eight petrofacies integrated with the physical parameters used for classification. c) GNB results, and d) SVM results. See the color code for all the petrofacies in Figures 2 and 3.

Table V shows the proportion of the petrofacies classified regarding their relationship to the two groups against the actual proportion present in the three wells, according to the results obtained by the best GNB and SVM models.
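The group-level comparison can be sketched as follows. The mapping from petrofacies code to group by its first letter (Y for the yellowish G1 group, G for the greyish G2 group) is an assumption based on the codes used in this study, and the label sequences are hypothetical, not results from the models.

```python
def to_group(petrofacies):
    """Map a petrofacies code to its group, assuming Y... is G1 and G... is G2."""
    return "G1" if petrofacies.startswith("Y") else "G2"

def group_proportions(labels):
    """Fraction of samples that fall in each group."""
    groups = [to_group(p) for p in labels]
    return {g: groups.count(g) / len(groups) for g in ("G1", "G2")}

def group_accuracy(actual, predicted):
    """Share of samples whose predicted group matches the actual group."""
    hits = sum(to_group(a) == to_group(p) for a, p in zip(actual, predicted))
    return hits / len(actual)

# Hypothetical labels, NOT results from the study.
actual = ["YLL", "YLLCON", "GLL", "GLLCV", "YLL", "YLLCV", "GLL", "YLL"]
predicted = ["YLL", "YLL", "GLL", "GLL", "YLL", "GLL", "GLL", "YLLCON"]

exact_accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
print(exact_accuracy)                     # 0.5
print(group_accuracy(actual, predicted))  # 0.875
print(group_proportions(actual))          # {'G1': 0.625, 'G2': 0.375}
print(group_proportions(predicted))       # {'G1': 0.5, 'G2': 0.5}
```

In this toy case, most petrofacies errors stay within the correct group, so the group-level score is much higher than the petrofacies-level score. A comparison of proportions, as in Table V, summarizes the same effect per well.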

Table V
Comparison of proportions of the classified petrofacies within groups G1 and G2 provided by the models for the three pseudo-wells against the actual proportions of each group.


Figure 9 shows that the GNB model using the best features (best GNB) performed better on the first and third training datasets, which demonstrates the importance of feature selection in this case. For both models, the cross-validation results are similar across all training datasets. The positive impact of using the best features on the performance of the GNB model can be observed in the results obtained from the test data (Fig. 10). The classification accuracy on the first set increased by 8%, while there was no improvement in the other metrics. In the second set, the accuracy improved by just over 33% in the best GNB model, while the precision, recall, and F1 score more than doubled. The third classification set showed slight improvements in all metrics.

Furthermore, the best GNB model performed better on the test data than on the training data for the first and second classification sets, and worse on the third set. The training data used for the cross-validation between the first and second sets, PW1 and PW2, were from locations 9.95 m apart, and therefore there was better lateral continuity of geological characteristics between them. In the two other configurations, the data from PW3 were collected 1.0 km from the data from PW1 and PW2 (Fig. 6), and the results showed that the lateral variation in the geological characteristics within the C6 interval made petrofacies identification more difficult, as expected. This also explains the high standard deviations obtained in the cross-validation of the training set. Moreover, the limited amount of data available also affected the performance of the models with the training sets. The small data volume explains the improvements in the metrics, particularly accuracy, obtained from test data versus those obtained from cross-validation.

Analyzing the model performance showed that, in the first set, which was devised to determine the petrofacies distribution in PW1 based on data from PW2 and PW3, the training set was unbalanced because the petrofacies YLL has a frequency greater than the sum of all others. The test set comprised only four petrofacies, with the following frequencies: YLL (9), GLL (10), GLLCV (1), and GLLVUG (1) (Fig. 7). The frequency of petrofacies GLL in the test data was twice that of the training data. However, as it has different geological characteristics from the G1 Group, the model was able to differentiate them, despite the differences in sampling and representation. With these characteristics, the model training tends to generate a bias for the YLL and GLL petrofacies, which can be seen in the confusion matrix in Fig. 11. Regarding the petrofacies GLLCV and GLLVUG, the latter occurs only in the test data, making its identification impossible, while the former occurs exactly once in both sets, which makes its identification probability very low. However, due to their limited occurrence in the test data, the failure in the classification of these petrofacies did not have a great impact on the evaluation metrics, especially on the accuracy (Figs. 7 and 10).

In the second test set for classification in PW2, the spatial distribution of the pseudo-wells had the same characteristics used in the first configuration (Fig. 6). The arrangement of classes in the training and testing data was also very similar to the first configuration (Fig. 7). The frequency of the YLL class in both datasets is much higher than that of the other classes, which creates the same bias previously described. However, the GLL and YLLCV petrofacies were more frequent in the training data than in the test data. This tended to improve the model learning for these classes, thereby improving their performance in the subsequent classification task. These results explain why the model had better performance with the second training set, and why the improvement in the metric scores due to the use of the best features was more significant in this case (Fig. 10).

The third testing scenario proved more challenging: the training pseudo-wells (PW1 and PW2) show more lateral correlation due to their proximity, and they show significant geological lateral differences from the target pseudo-well (PW3), as expected, due to the distance between them. The training dataset (Fig. 7) was dominated by the YLL (20) and GLL (15) petrofacies. The other classes each occurred once, except for YLLCON petrofacies, which occurred three times. The test set was dominated by petrofacies YLL (14), while petrofacies GLL was absent and the other petrofacies presented a very low frequency in the PW3 set. Petrofacies YLLGP was absent from the training data. This scenario led to a higher bias effect due to the differences between the training and test sets. Thus, it is expected that the model tended to emphasize the occurrence of petrofacies YLL and GLL, which explains the overall inferior performance for petrofacies classification compared to the previous model tests. The processing showed a slight improvement in classification with the use of the best features.

Figure 11 shows that, for all classification configurations, the model that used all features was able to identify only the most frequent classes (YLL and GLL), resulting in a strongly biased classification. Using only the best features, this bias was reduced and the model could correctly classify the YLLCON petrofacies, which illustrates the improvement in model performance due to feature selection (Fig. 11).

Table IV shows that the best performance of the SVM model was obtained using all features. A comparison with Table III reveals similar results between the SVM and the best GNB. The SVM cross-validation results are also similar to those for the best GNB (Fig. 9). In the cross-validation task, the amount of data used for training in each fold was smaller than that used in the sets shown in Fig. 7, and a comparison of these results reveals that using less data results in more similar performances of the SVM and GNB models. The reverse is also true, as shown by the performance of the SVM and the best GNB models on the classification sets (Fig. 13). With the larger amount of training data available in the classification sets, the SVM model outperformed the GNB model on the first two classification sets, especially regarding the precision, recall, and F1 score metrics. On the third set, both models had very similar performances, with a slight advantage for the best GNB in terms of accuracy. These results are linked to the characteristics of the features used to build the models and their variations, which are controlled by the geological properties, as discussed above. As this application deals with unbalanced labels, the most suitable metric for evaluating the models is the F1 score. The SVM model obtained an average F1 score of 0.47, while the best GNB obtained 0.29.
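The reason the F1 score is preferred over accuracy for unbalanced labels can be shown with a small sketch. It uses a hypothetical test set and a trivial classifier that always predicts the dominant class. The macro average (an unweighted mean over classes) is an assumption of this sketch; the averaging scheme used in the study is not stated.

```python
def f1_for_class(actual, predicted, label):
    """F1 score of one class, returning 0.0 when the class is never predicted correctly."""
    tp = sum(a == label and p == label for a, p in zip(actual, predicted))
    fp = sum(a != label and p == label for a, p in zip(actual, predicted))
    fn = sum(a == label and p != label for a, p in zip(actual, predicted))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(actual, predicted):
    """Unweighted mean of the per-class F1 scores over the classes present in actual."""
    labels = sorted(set(actual))
    return sum(f1_for_class(actual, predicted, label) for label in labels) / len(labels)

def accuracy(actual, predicted):
    """Fraction of samples classified correctly."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical unbalanced test set: 8 YLL, 1 GLL, 1 YLLCON.
actual = ["YLL"] * 8 + ["GLL", "YLLCON"]
majority = ["YLL"] * 10  # a classifier that always predicts the dominant class

print(accuracy(actual, majority))            # 0.8
print(round(macro_f1(actual, majority), 3))  # 0.296
```

The trivial classifier reaches 80% accuracy while missing every minority class, whereas its macro F1 score stays low because each class contributes equally to the average.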

Figure 14 shows that, in general, the SVM model presented better results than the best GNB model for the first two classification sets. Notably, given the data imbalance, the SVM model was able to correctly classify the occurrences of the two petrofacies with the lower frequencies: GLLCV (1) in the first classification set, and the YLLCV (7) in the second. Furthermore, the model showed better performance in determining the most frequent petrofacies, classifying 7 out of 9 YLL (Group 1) petrofacies and 9 out of 11 GLL (Group 2) petrofacies in the first classification set, and 9 out of 11 and 5 out of 5, respectively, in the second. The third classification set was challenging for the models due to the distance between the pseudo-wells and the consequent lateral variation of geological properties, and because of the occurrence and proportion of petrofacies in PW3. Thus, the greater bias observed in the GNB model may have improved its performance, which could explain its better performance in the classification of petrofacies in the third set.

Previous classifications of the C6 interval laminated limestones considered a small number of lithofacies, or microfacies. Neumann et al. (1999) divided the C6 interval into five microfacies: 1 – parallel and wavy-parallel laminations with loop beddings, 2 – parallel laminations with peloids, 3 – parallel and wavy-parallel laminations with micro-slumps, 4 – parallel and wavy-parallel lamination, and 5 – parallel and wavy-parallel laminations with ostracods. Catto et al. (2016) considered mineralogy, microscopic textures, and the influence of microorganisms and suggested four microfacies types: 1 – planar laminated, 2 – crustiform, 3 – nodular, and 4 – rhythmic (interbedded sub-millimetric to millimetric lenses of micritic calcite, organic matter, and clay minerals). Osés et al. (2017) analyzed the preservation characteristics of microfossils in the laminated limestones of the C6 interval and pointed to the influence of sedimentation rate on the alteration of organic matter (OM) as a key factor in the early diagenesis of the laminated limestones. They proposed two dominant microfacies: 1 – BL (beige limestones) and 2 – GL (grey limestones). These authors suggested that GL differs from BL in its pyrite, clay mineral, and OM contents. They also observed that the GL microfacies presents more structures such as convoluted laminations, wavy laminations, and micro-faults. The classification of petrofacies adopted in the present research (Araujo et al. 2020) was built on the integration of sedimentological aspects with chemical properties, diagenetic features, and mechanical parameters (UCS) and aimed to provide information for reservoir characterization (dissolution zones, cemented zones). The differences observed between PW1/PW2 and PW3 are related to variations in depositional conditions and diagenetic processes (local and regional effects), including the sedimentation rates and the composition of sediments across the depositional system (content of clay minerals and OM).
The sedimentation rate influenced the eo-diagenesis and the compaction of the deposits (Osés et al. 2017). For example, the petrofacies classification (Araujo et al. 2020) considered the occurrence of laminations bearing horizontal gypsum veins in PW3, which were probably created by natural hydraulic fracturing linked to the proximity of fault zones, a localized diagenetic aspect (Araujo et al. 2020, Celestino et al. 2020). Based on the studies of the C6 interval discussed herein, one can argue that, regarding only the mineralogical composition of the succession, the C6 interval can be divided into roughly two main lithofacies types. Thus, the criteria used here were much more sophisticated and improved the characterization of reservoir properties, but proved much more challenging for the automatic classification process. The results showed that the limited data used were sufficient to help the models recognize the main differences between the two general lithofacies (Fig. 15) recognized by previous works. Interestingly, for the third set processed, both algorithms confused the petrofacies YLLCON (concretions) and YLLCV (convoluted laminations) from the G1 group with the GLL petrofacies of the G2 group (Fig. 15). These petrofacies are linked to the increase in the mechanical strength of the strata due to early silica cementation (Figs. 2 and 3). The basal section of PW3 presented higher values of UCS, which is linked to G2 petrofacies in PW1 and PW2. As the SVM, unlike the GNB, used the UCS feature, the SVM possibly tended to choose the “generic” GLL petrofacies because it had a higher frequency than the G2 group in general. This correlation between the mineralogical composition and the variation in the mechanical strength hindered the classification for part of the PW3 succession, which explains why the SVM model performed worst on the third classification set.
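The difference between scoring the eight petrofacies individually and scoring only the two general groups can be sketched as follows. The mapping by label prefix (YLL* to G1, GLL* to G2) is an assumption inferred from the labels mentioned in the text, the function names are hypothetical, and the sample labels are invented for illustration rather than taken from Fig. 15.

```python
def to_group(petrofacies):
    """Map a petrofacies label to its general group.

    Assumption: the label prefix encodes the group (YLL* -> G1, GLL* -> G2).
    """
    if petrofacies.startswith("YLL"):
        return "G1"
    if petrofacies.startswith("GLL"):
        return "G2"
    raise ValueError(f"unknown petrofacies label: {petrofacies}")


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels: two G1 petrofacies mistaken for GLL, plus two
# within-group confusions that only matter at the fine scale.
y_true = ["YLLCON", "YLLCV", "YLL", "GLL"]
y_pred = ["GLL", "GLL", "YLLCV", "GLLCV"]

fine_accuracy = accuracy(y_true, y_pred)
group_accuracy = accuracy(
    [to_group(label) for label in y_true],
    [to_group(label) for label in y_pred],
)

print("petrofacies-level accuracy:", fine_accuracy)
print("group-level accuracy:", group_accuracy)
```

Within-group confusions disappear once labels are collapsed, whereas confusions across groups, such as YLLCON or YLLCV predicted as GLL, remain errors at both scales.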

The efficiency of the models in classifying the two groups of petrofacies was also analyzed because, in terms of reservoir modeling, a coarse scale of parameterization can also be adopted, as proposed in previous literature about these deposits. The models showed similar performances in interpreting the original proportions of groups G1 and G2 (Table V) for the first set in terms of percentage changes (12% for the G1 group and 9% for G2), with the best GNB model tending to overestimate the G1 group and underestimate G2, and the SVM presenting opposite results. For the second set, the best GNB model underestimated G1 by 12% and overestimated G2 by 38%, while the SVM showed a similar pattern with 18% and 58%, respectively. For the third set analyzed, the proportions of each group were extremely unbalanced, with 96% for G1 and 4% for G2. In this case, the best GNB model found proportions of 61% and 39% for G1 and G2, and the SVM model found 52% and 48%, respectively. Consequently, the SVM model presented the lower performance for this set.
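For the third set, the gap between the observed and predicted group proportions can be checked directly from the percentages quoted above. The sketch below uses absolute differences in percentage points, which is a choice made here for illustration; the percentage changes reported for the first two sets may have been computed differently (for example, as relative changes).

```python
# Group proportions (%) quoted in the text for the third classification set.
observed = {"G1": 96, "G2": 4}
predicted = {
    "GNB": {"G1": 61, "G2": 39},
    "SVM": {"G1": 52, "G2": 48},
}


def deviation_points(pred, obs):
    """Absolute difference, in percentage points, for each group."""
    return {group: abs(pred[group] - obs[group]) for group in obs}


deviations = {
    model: deviation_points(proportions, observed)
    for model, proportions in predicted.items()
}

for model, dev in deviations.items():
    print(model, dev)
```

Under this measure the best GNB model deviates by 35 percentage points and the SVM model by 44, consistent with the lower SVM performance on this set.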

The cross-validation and classification sets produced generally similar results, which also compare favorably in terms of scoring metrics (especially in the case of the SVM model) with those of other studies. Dunham et al. (2020) conducted an experiment on classifying 9 lithofacies in a scarce-data scenario (439 points with a sampling interval of 150 cm), comparing GNB and SVM trained with semi-supervised approaches. The study demonstrated that the best accuracy results in cross-validation (5-fold) were 49.21% (GNB) and 50.41% (SVM), which are below the results presented in the present work. The results obtained by Silva et al. (2020) show that GNB had an accuracy slightly higher than 80%. However, the amount of available data (1,477 samples), the number of facies to be predicted (3), the preprocessing applied (SMOTE technique), and the method used (only cross-validation, without a blind test in a real well) can hinder an accurate and unbiased diagnosis of model performance and make that experiment much less challenging than the present work. The same arguments can be extended to the work conducted by López & Thomas (2009).
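As a rough illustration of how little training data each fold sees in such scarce-data settings, the sketch below splits 439 samples (the figure quoted for Dunham et al. 2020) into five folds. The contiguous, unshuffled split and the function name are assumptions made for simplicity; the cited study may have partitioned its data differently.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) index lists for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


splits = list(kfold_indices(439, 5))
train_sizes = [len(train) for train, _ in splits]
test_sizes = [len(test) for _, test in splits]

print("training samples per fold:", train_sizes)
print("test samples per fold:", test_sizes)
```

Each fold trains on roughly four fifths of the data, so any class with only a handful of samples may be nearly absent from some training folds.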


We used ML models (GNB and SVM) to automatically classify petrofacies in three pseudo-wells in a thin laminated limestone unit (~20 m thick) with kilometers of lateral continuity, considering a reservoir cell scale (1 km between wells) and a limited number of logs (radiometry and mechanical strength). Both models successfully classified eight petrofacies for the pseudo-wells PW1 and PW2, separated by 9.95 m. The classification for the third pseudo-well, PW3, located 1 km away, was less effective. Nevertheless, the models were relatively successful in defining the proportions of the two general lithological groups.

The results show that the selection of features used to build the GNB model was fundamental for maximizing its performance, whereas the SVM model performed best when using all available features. Thus, for the individual classification of petrofacies, the SVM model showed the best overall performance in the three configuration sets proposed. However, the GNB model performed better when analyzing the classification capacity for the two general groups of petrofacies (G1 and G2).

The study highlighted the main problems in classifying fine vertical variations in thin intervals using the chosen ML models. It can help future work understand how the increasing complexity of facies description affects ML classification in extreme scenarios such as the one treated here.


This study was supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Energi Simulation, and Petrobras, which funded the projects “Geomecarb” and “Pseudo-poços” through research agreements approved by the Agência Nacional do Petróleo, Gás Natural e Biocombustíveis (ANP).


