Application of artificial intelligence for technical screening of enhanced oil recovery methods

An artificial intelligence technique based on a five (5) layered feedforward-backpropagation algorithm is applied in this study for technical screening of enhanced oil recovery (EOR) methods. Explicit knowledge patterns associated with the field data are extracted by taking advantage of the robustness of fuzzy logic reasoning and the learning capability of neural networks. The field data from successful EOR projects include parameters such as depth, porosity, permeability, viscosity, oil API gravity and oil saturation. These parameters were used as input and predicted output in the training and validation processes, respectively. The developed model was then tested using a data set from Block T of the Angolan oilfield. A sensitivity analysis was performed between the Mamdani and the Takagi-Sugeno-Kang (TSK) approaches incorporated in the algorithm. The results of the sensitivity analysis show the robustness of the adaptive neuro-fuzzy inference system (ANFIS) approach in comparison with other approaches for the prediction of a suitable EOR technique. Five regression models (linear, exponential, logarithmic, power and polynomial) were applied to evaluate the accuracy of the model between the trained and the tested data sets. The simulation results show that hydrocarbon gas, polymer, combustion and CO2 are the suitable EOR techniques and could be used for further experimental and numerical studies.


Introduction
The current decline of oil production in Angolan oilfields and the prolonged downturn in the industry have made it necessary to recover more oil from existing hydrocarbon reservoirs rather than moving to deep and ultra-deep waters. Effective planning of the selection of a suitable technique is crucial to the success of EOR implementation, because implementing EOR techniques in new brown fields can be challenging and expensive in comparison with secondary recovery processes if not cautiously planned. This process involves integrating a set of parameters governing the technical and economic performance of a reservoir [1], including but not limited to environmental, commercial, political and governmental factors [2][3][4].
The application of AI in screening reservoir candidates for EOR was first published by Guerillot [14]. Subsequently, several works have been published to improve the quality and accuracy of the models. These models are based on fuzzy logic (FL) and expert system approaches [11,17], artificial neural networks (ANN) [18], least square support vector machines (LSSVM) [19] and, very recently, a combination of fuzzy logic (FL) and neuro-fuzzy (NF) approaches [1,10,20]. These and other recent works on screening techniques published in the literature are summarised by Ramos and Akanji [1].
In this study, a neuro-fuzzy (NF) approach based on a five-layered feedforward-backpropagation technique was employed in the technical selection of a suitable EOR technique for Block T in offshore Angola. The model combines the searching potential of fuzzy logic (FL) and the learning capability of neural networks (NN) to make a priori decisions [1]. 365 data set from multiple successful

Data gathering and Analysis
Data sets are analysed before running the simulation model to ensure that reliable and accurate results are obtainable from the simulation runs. A total of 365 data sets obtained from 365 worldwide successful EOR projects (Figure 1) are used in the training and validation (prediction) process. The data cover ten (10) different EOR techniques: miscible hydrocarbon gas, CO2, combustion, steam, polymer, surfactants, nitrates, microbial, hot water and acid gas [10,21].
The testing data from Block T in offshore Angola were collected from ten (10) fields (Table 1) and from several reports, including well test, geochemistry, fluid sampling, final well, thermodynamic, geological, DST and log interpretation reports.
Data analysis: Some EOR techniques (surfactants, nitrates, microbial, hot water and acid gas) had insufficient training/validation data, and some variables (pressure, thickness, salinity and temperature) had missing data; these techniques and parameters were therefore not investigated. Duplicated and inconsistent data from Angolan oilfields were removed to safeguard the quality of the data set and of the results obtainable from simulation. Each variable (i.e. API, depth, porosity, permeability, saturation and viscosity) was analysed for the different EOR techniques using cross plots. Data sources: [21]; Angolan oilfield data (Sonangol EP).
The first analysis was performed using scatter (cross) plots, which combine data from the successful EOR projects with data from the investigated oilfield (Block T). These plots are used to display the relationships between pairs of variables and to detect outliers [6]. The relationships screened in this study involve variables associated with the reservoir rock (reservoir depth, log permeability and porosity) and with the fluid properties (oil gravity, log oil viscosity and oil saturation). This investigation led to the analysis of the impact of the investigated parameters on the existing EOR methods (Figure 2).
The plots show that the six parameters investigated from the existing EOR projects characterise most of the EOR techniques. Some steam data points in the cross plots of permeability against porosity and of oil saturation against porosity (Figures 2b and 2c) are distant from the rest, but they cannot be treated as outliers because they come from successfully implemented EOR projects. The plots also indicate that steam is the least suitable EOR technique for the investigated block (Block T).
Similarly, box-plots were used to illustrate the distribution of the parameters for both the successful EOR projects and the investigated Block T. The data set from successful EOR projects was used to obtain the required statistics for each variable: minimum, maximum, average, 1st quartile, 2nd quartile (median) and 3rd quartile. Figure 3 gives an example for miscible CO2 as an EOR technique.
Following the box-plots and scatter plots, histograms were used to represent the distribution of the data sets. The intervals on the horizontal axis of each histogram were defined by the box-plot statistics (minimum, maximum, average, 1st quartile, median and 3rd quartile) determined from the successful EOR data set. In addition to these values, zero is added so that the first column covers the interval from zero to the minimum, and the last column covers values above the maximum. Figure 4 shows the distribution of the Angolan oilfield data for hydrocarbon gas within the defined intervals. Data falling in the first and last columns are considered out of range, or not suitable for the investigated technique. This procedure was also applied to the other EOR techniques (steam, combustion, CO2 and polymer).
These plots (cross plots, box-plots and histograms) provide a quick and efficient way to analyse the data and to judge the suitability of the parameters under investigation. However, they do not quantify the degree of uncertainty or assign a weight to each parameter; this requires a more sophisticated approach such as neuro-fuzzy modelling, simulation or laboratory tests [1].
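The box-plot statistics and the histogram intervals built from them can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function names `box_stats` and `histogram_counts` are hypothetical, the quartiles use the inclusive method of Python's `statistics` module, and a value equal to the maximum is counted as in range.

```python
import bisect
import statistics

def box_stats(values):
    """Box-plot statistics of one variable from the reference (successful EOR) data."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "mean": statistics.fmean(values), "q3": q3, "max": max(values)}

def histogram_counts(test_values, stats):
    """Count test values per interval defined by the reference statistics.

    The first count covers values below the reference minimum and the last
    count covers values above the reference maximum (both out of range).
    """
    edges = sorted(set(stats.values()))      # min ... max, duplicates merged
    counts = [0] * (len(edges) + 1)          # in-range bins plus two outer bins
    for v in test_values:
        if v < edges[0]:
            counts[0] += 1
        elif v > edges[-1]:
            counts[-1] += 1
        else:
            # A value equal to an edge goes to the bin starting at that edge;
            # the clamp keeps a value equal to the maximum in the last in-range bin.
            idx = min(bisect.bisect_right(edges, v), len(edges) - 1)
            counts[idx] += 1
    return edges, counts

# Hypothetical depths (ft), for illustration only.
reference_depths = [1000, 2000, 3000, 4000, 5000]
block_depths = [500, 1500, 3000, 5000, 6000]
edges, counts = histogram_counts(block_depths, box_stats(reference_depths))
```

Test values that land in the first or last count are the ones the text treats as out of range for the technique.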

Neuro-Fuzzy Model Development
The NF model adopted in this work is a five (5) layered feedforward-backpropagation neural network (Figure 5). The first and last layers are the input and output layers; the intermediate layers are the hidden layers, each made up of neurons [3,19]. The number of neurons in the input and output layers depends on the type of problem and on the number of input and output variables [3], while the number of hidden layers and of neurons in them is chosen according to the accuracy of the model [3].
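A forward pass through a five-layer neuro-fuzzy network can be sketched as below. The sketch assumes the commonly used ANFIS layout (fuzzification, rule firing, normalisation, first-order TSK consequents, summation) with Gaussian membership functions; the text does not spell out what each layer computes, so the layer roles, the function names and the rule parameters are all illustrative assumptions.

```python
import math

def gaussian(x, centre, sigma):
    """Gaussian membership degree of x."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)

def anfis_forward(inputs, rules):
    """Forward pass of a first-order TSK neuro-fuzzy network.

    Each rule is a dict with
      'mfs':    one (centre, sigma) pair per input, and
      'coeffs': one slope per input followed by a bias term.
    """
    # Layers 1 and 2: membership degrees, combined by product into firing strengths.
    firing = []
    for rule in rules:
        strength = 1.0
        for x, (centre, sigma) in zip(inputs, rule["mfs"]):
            strength *= gaussian(x, centre, sigma)
        firing.append(strength)

    # Layer 3: normalise the firing strengths.
    total = sum(firing)
    if total == 0.0:
        raise ValueError("no rule fires for these inputs")
    normalised = [w / total for w in firing]

    # Layer 4: weight each rule's linear consequent by its normalised strength.
    weighted = []
    for w, rule in zip(normalised, rules):
        *slopes, bias = rule["coeffs"]
        consequent = sum(p * x for p, x in zip(slopes, inputs)) + bias
        weighted.append(w * consequent)

    # Layer 5: sum the weighted consequents into a single crisp output.
    return sum(weighted)

# Hypothetical two-rule model with a single input.
rules = [
    {"mfs": [(0.0, 1.0)], "coeffs": [0.0, 1.0]},
    {"mfs": [(0.0, 1.0)], "coeffs": [0.0, 3.0]},
]
output = anfis_forward([0.0], rules)
```

During training, backpropagation would adjust the membership parameters and the consequent coefficients; only the forward evaluation is shown here.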
The modelling process consists of training, validation and testing. During the training process, a sensitivity analysis was performed by comparing the developed model based on the TSK (Takagi and Sugeno) approach against the Mamdani approach incorporated in the model [22][23][24][25]. The data for training and validation come from successful EOR projects and are split into two sets, 4/5 (80%) for training and 1/5 (20%) for validation, generating more than 1,350 runs as described in Ramos and Akanji [1]. During the training and validation stage, the weights are estimated so as to minimise the deviations between the actual and predicted outputs, whilst the testing data are used to check the performance of the model [3]. The objective function of the model is the root mean square error (RMSE), with a threshold of 0.01, and the number of epochs for each training case is capped at 2,000. The accuracy of the model was judged by the least RMSE, which also leads to the least nondimensional error index (NDEI) [10].
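The stopping rule described above (stop once the RMSE reaches 0.01, or after 2,000 epochs) can be sketched with a deliberately simple stand-in model. The sketch below fits a straight line by gradient descent instead of training the NF network; the learning rate, the function name `train` and the data are assumptions made only to show how the two stopping conditions interact.

```python
import math

def train(xs, ys, lr=0.1, rmse_threshold=0.01, max_epochs=2000):
    """Gradient descent on y = w*x + b with the two stopping conditions.

    Returns (w, b, last evaluated RMSE, number of epochs used).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    rmse = float("inf")
    for epoch in range(1, max_epochs + 1):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        rmse = math.sqrt(sum(e * e for e in errors) / n)
        if rmse <= rmse_threshold:          # objective reached: stop early
            return w, b, rmse, epoch
        # Gradient of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, rmse, max_epochs           # epoch cap reached

# Hypothetical noise-free data generated from y = 2x + 1.
w, b, final_rmse, epochs_used = train([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

In the study's setting the weights of the network play the role of `w` and `b`, and the run with the least RMSE is retained.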

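In their standard form, the RMSE and NDEI are
\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \quad (1) \]
\[ \mathrm{NDEI} = \frac{\mathrm{RMSE}}{\sigma(y)} \quad (2) \]
where y_i and \hat{y}_i are the actual and predicted outputs, \sigma(y) is the standard deviation of the target series, and i is the data point, which varies from 1 to N. The error computation is crucial for the training and validation process. It ensures the accuracy and performance of the model at the testing stage, thereby ascertaining the suitability of the EOR process or technique under investigation.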
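The RMSE and NDEI can be computed in a few lines. This is a minimal sketch: it assumes the usual definition of NDEI as the RMSE divided by the standard deviation of the target series, and it uses the population standard deviation, which the text does not specify. The function names and the numbers are illustrative.

```python
import math
import statistics

def rmse(actual, predicted):
    """Root mean square error between actual and predicted outputs."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def ndei(actual, predicted):
    """Nondimensional error index: RMSE scaled by the spread of the targets.

    Assumption: population standard deviation of the actual (target) series.
    """
    return rmse(actual, predicted) / statistics.pstdev(actual)

# Hypothetical values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
error = rmse(actual, predicted)
index = ndei(actual, predicted)
```

Because NDEI is dimensionless, it allows errors to be compared across variables with very different scales, such as depth and porosity.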
The NF model adopted in this investigation has some advantages over other models. It uses raw data in the training and validation process: it requires no normalisation and makes no assumption about the process, but matches the pattern from the reservoir field under investigation to the least-error data from the successful EOR projects [1]. It provides the degree of suitability of a typical EOR project prior to full field implementation, and it allows more oil properties and reservoir characteristics that could affect EOR projects to be distinguished [1]. To limit the effect of the cost function, which may lead to less robust predictions when raw data are used [3,26], NDEI was used for decision making in the testing process. The model performs well when run with enough training, validation and testing data; too little data may result in over-fitting and unexpected results. Once the NN has been successfully trained, it can be used to predict the suitability of the test data for the respective EOR technique under investigation. A detailed description of the model is given in [1,10].

Results and Discussion
The five (5) layered NF model based on feedforward-backpropagation neural networks was employed for the technical screening of reservoir candidates of Block T in offshore Angolan oilfields. First, the data from worldwide successful EOR projects were grouped by variable (depth, permeability, porosity, oil viscosity, oil saturation, oil gravity) and by EOR technique (steam, miscible hydrocarbon gas, CO2 and combustion). The data were analysed using combined scatter plots, box-plots and histograms. This analysis gives a quick look at the suitability of the EOR techniques: if the parameters of a set are not within the range of the successful EOR projects, that set can be left out of the NF simulation. Although it is not guaranteed, the RMSE used as the objective function is considered small and acceptable when the actual and predicted values are close to each other.
Training and validation process: The available data set from successful EOR projects was randomly divided into two sub-data sets of training (80%) and validation (20%). This data was used to construct and optimise the model parameters by using RMSE and NDEI [19].
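The random 80/20 division can be sketched as follows. The function name `split_train_validation`, the fixed seed and the use of record indices in place of real project records are assumptions made for illustration.

```python
import random

def split_train_validation(records, train_fraction=0.8, seed=0):
    """Shuffle the records and split them into training and validation sets."""
    rng = random.Random(seed)            # fixed seed so the split is repeatable
    shuffled = list(records)             # copy, leaving the input untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# 365 placeholder records standing in for the successful EOR projects.
projects = list(range(365))
training, validation = split_train_validation(projects)
```

With 365 records this yields 292 training and 73 validation records; repeating the split with different seeds is one way to generate multiple runs per variable.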
Two approaches employed in the model were used to perform the sensitivity analysis for the six parameters investigated (depth, porosity, permeability, viscosity, saturation and oil gravity) and the five EOR techniques (steam, miscible gas, CO2, combustion and polymer). In addition, three membership functions (MFs: triangular, trapezoidal and Gaussian) were employed in the training and validation process to determine the optimal model for testing. Both the TSK and the Mamdani approaches were tested, generating 1,350 runs, with 15 runs for each variable and 90 runs for each technique. In general, the results showed that the TSK approach is more accurate (see Tables 2-4). In these tables, the selected optimum model for the oil gravity of the steam EOR technique comes from Run 3 of the TSK approach with trapezoidal MFs, with an RMSE and NDEI of 0.000197 and 0.00038, respectively. This model is far better than those obtained from the Mamdani approach with either centre of gravity (COG) or mean of maximum (MOM) defuzzification, as highlighted in bold in Tables 3 and 4. The same process was performed for the other variables and techniques. The best validated data set from the model (the one with the least RMSE) is then used as the predicted output in the testing process.
Testing process: The testing process was performed using the data from Block T (Angola) as the actual output and the best validation data set (the one with the least RMSE) as the predicted output, while decision making was based on NDEI. The performance and accuracy of the model are investigated at this stage. Figure 6 shows the plots of the testing process for the steam EOR technique using the data set from the Angolan oilfield (Block T).
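The three membership function shapes used during training and validation (triangular, trapezoidal and Gaussian) can be sketched as below. These are the textbook definitions; the breakpoints in the example are hypothetical and are not the parameters fitted in the study.

```python
import math

def triangular(x, a, b, c):
    """Triangular MF: zero outside (a, c), rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Trapezoidal MF: zero outside (a, d), equal to 1 on the plateau [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

def gaussian(x, centre, sigma):
    """Gaussian MF: equal to 1 at the centre, decaying smoothly on both sides."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)

# Hypothetical example: membership of an oil gravity of 15 degrees API.
api = 15.0
degrees = {
    "triangular": triangular(api, 10.0, 20.0, 30.0),
    "trapezoidal": trapezoidal(api, 10.0, 15.0, 25.0, 30.0),
    "gaussian": gaussian(api, 20.0, 5.0),
}
```

The same input can receive quite different membership degrees under the three shapes, which is why the choice of MF is included in the sensitivity analysis.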
For the statistical error analysis of the model, four indicators are presented: mean, standard deviation (STDV), RMSE and NDEI. These are summarised in Figure 7, where the four values are plotted for the five (5) EOR techniques investigated in this study.
For the interpretation and analysis of the results, three scenarios were investigated: (1) the least RMSE combined with 20% < NDEI ≤ 30%; (2) the least RMSE combined with 10% < NDEI ≤ 20%; (3) the least RMSE combined with NDEI ≤ 10% [1]. This procedure was performed on the six variables investigated (API, depth, porosity, saturation, permeability and viscosity) for the five EOR techniques (miscible gas, steam, CO2, polymer and combustion). The results are summarised in Table 5.
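The three NDEI scenarios can be expressed as a small classification helper. This sketch assumes that each band is open at its lower bound and closed at its upper bound (the comparison symbols are not fully legible in the text), and that NDEI is given in percent; the function name `ndei_scenario` is hypothetical.

```python
def ndei_scenario(ndei_percent):
    """Return the scenario number (1, 2 or 3) for an NDEI in percent.

    Scenario 3: NDEI <= 10; scenario 2: 10 < NDEI <= 20;
    scenario 1: 20 < NDEI <= 30. Returns None above 30.
    """
    if ndei_percent < 0:
        raise ValueError("NDEI cannot be negative")
    if ndei_percent <= 10:
        return 3
    if ndei_percent <= 20:
        return 2
    if ndei_percent <= 30:
        return 1
    return None

# Hypothetical NDEI values (percent) for three variables of one technique.
scenarios = {name: ndei_scenario(value)
             for name, value in {"API": 4.2, "depth": 18.0, "porosity": 35.0}.items()}
```

A value of None marks a variable whose NDEI falls outside all three scenarios.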
Regression analysis: Scatter plots of predicted output versus actual output (experimental data) were generated for the model and fitted with five different regression methods: linear, exponential, logarithmic, polynomial and power. For the steam EOR technique, Figure 8 shows that the model matches the actual field data better for oil gravity, porosity and oil saturation: the values estimated by the model are close to the field data, that is, the fitted data lie along the unit slope line.
In Figure 6, the permeability data do not show a good match between the predicted and actual outputs; as illustrated in Figure 8, their distribution lies away from the unit slope line. This exercise was performed for all parameters and techniques investigated in this study. Equations 1 and 2, which were used in the model simulation, were also used to verify the code by hand calculation. The results of the analysis are presented in Tables 6 to 10.
The simulation and hand-calculated values are the same, showing that the model calculations are correct. Moreover, for most of the parameters the predicted and actual outputs of the model are distributed more closely around the unit slope line than the fits obtained from the other four regression techniques (polynomial, exponential, logarithmic and power). This means that the predicted output data are close to the actual output data (field data), which demonstrates the capability of the TSK-based model in predicting EOR techniques for heterogeneous reservoirs. The R-squared (R2) values, representing the percentage of variance explained by the model for the different regressions, are listed in Tables 11 and 12.
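Fitting the predicted-versus-actual scatter with several regression forms and comparing their R2 values can be sketched as follows. This is an illustrative sketch only: it covers four of the five forms (linear, exponential, logarithmic and power; the polynomial fit is omitted for brevity), it fits the non-linear forms by ordinary least squares on log-transformed data, which requires positive values, and all names and numbers are hypothetical.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, fitted):
    """Coefficient of determination on the original scale of y."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def regression_scores(actual, predicted):
    """R2 of four regression forms of predicted output against actual output."""
    log_x = [math.log(x) for x in actual]
    log_y = [math.log(y) for y in predicted]
    fitted = {}
    slope, intercept = linear_fit(actual, predicted)        # y = a + b*x
    fitted["linear"] = [intercept + slope * x for x in actual]
    slope, intercept = linear_fit(actual, log_y)            # y = a*exp(b*x)
    fitted["exponential"] = [math.exp(intercept + slope * x) for x in actual]
    slope, intercept = linear_fit(log_x, predicted)         # y = a + b*ln(x)
    fitted["logarithmic"] = [intercept + slope * lx for lx in log_x]
    slope, intercept = linear_fit(log_x, log_y)             # y = a*x**b
    fitted["power"] = [math.exp(intercept + slope * lx) for lx in log_x]
    return {name: r_squared(predicted, values) for name, values in fitted.items()}

# Hypothetical porosity values (%): a perfect match lies on the unit slope line.
scores = regression_scores([10.0, 20.0, 30.0, 40.0], [10.0, 20.0, 30.0, 40.0])
```

When the predicted and actual outputs agree, the linear fit has a slope close to one and the highest R2, which corresponds to the unit-slope behaviour discussed above.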

Conclusion
A comprehensive data analysis has been presented using scatter plots, box-plots and histograms. These plots were generated by combining the data set of the successful EOR techniques with the data set of the investigated oilfield (Block T in offshore Angola). The plots served as a useful tool for exploring the relationships between the reservoir parameters, and the range and distribution of the investigated parameters within the screening criteria data set.
After preliminary screening, a five-layer feedforward-backpropagation model based on the TSK and Mamdani approaches was used in the training, validation and testing of the successful EOR data against the oilfield data from Block T in offshore Angola. In general, the results showed that the TSK approach is more accurate.
The performance of the model was then assessed by fitting scatter plots of the predicted and actual outputs with five different regression methods: linear, exponential, logarithmic, polynomial and power law. The distribution of the parameters around the unit slope line shows the capability of the model in screening candidate reservoirs for their suitability for EOR applications.
The model was tested using the oilfield data from Block T in offshore Angola, and the results indicate that polymer, hydrocarbon gas and combustion are suitable techniques, with CO2 and steam being the least suitable. However, additional evaluations such as laboratory core analysis, reservoir simulation and field pilot tests are required in order to confirm the applicability of the selected EOR technique.