
Decay Farea

X_Novel · Sci-fi

Lie Machine

You are in the machine but

Which machine? You tell a lie

It must be made by machinery

Such as?

You mustn't do it. A game of life

Where is it?

A mechanical or electrical device that performs or assists in the performance of human tasks, whether physical or computational, laborious or for entertainment

These people in the lab, they've been solving problems

A computer

Game developers assume they're pushing the limits of the machine.

A vehicle operated mechanically; an automobile

This one adjusted the seat, put in the key, and then drove away.

Especially, the group that controls a political or similar organization

A person or organisation that seemingly acts like a machine, being particularly efficient, single-minded, or unemotional

The government has become a money-making machine.

An answering machine or, by extension, voice mail

I called you earlier, but all I got was the machine.

to shape or finish by machinery

(n) Any engine to aid human power in the application of force

(n) A combination of interrelated parts used for applying, storing, or transforming energy to do work. Machines consist of one or more assemblies, which are analyzed using techniques such as kinematics and dynamics

Any computer, including file or computer servers, diskfull workstations, or diskless workstations

Supernatural agency in a poem, or a superhuman being introduced to perform some exploit

A screw with a thread along the entire length of the shaft

make by machinery; "The Americans were machining while others still hand-made cars"

belonging to a machine

Related Terms

machine gun: A type of firearm that continuously fires bullets in rapid succession

machine guns: plural form of machine gun

machine independent: Not dependent on the type of computer system on which software is being executed

Many techniques have been tried to get a workable system of machine independent code.

machine instruction: Any of a set of discrete instructions, each with its own binary representation and associated assembly language mnemonic, that a CPU can execute; typically they will involve the movement, comparison or manipulation of data

machine instructions: plural form of machine instruction

machine language: The set of instructions that a particular computer is designed to execute; generated from a high-level language by an assembler, compiler or interpreter

Though machine language is efficient for computers, it is inefficient for programmers.

machine languages: plural form of machine language

machine learning: A field concerned with the design and development of algorithms and techniques that allow computers to learn

machine of government: A political machine

machine pistol: A lightweight submachine gun, capable of being fired with one hand

machine politician: a politician who belongs to a small clique that controls a political party for private rather than public ends

machine room: A dedicated room for computers and related data processing equipment, often having a raised floor, abundant and dedicated air conditioning, and specialized power supply and fire suppression systems

machine room: A limited-access room in a larger building dedicated to HVAC equipment and other machinery

machine rooms: plural form of machine room

machine screw: A screw designed for metal or a similar material, usually with no point and a relatively fine thread, intended for prepared, threaded holes

Machine learning (ML) refers to a system's ability to acquire and integrate knowledge through large-scale observations, and to improve and extend itself by learning new knowledge rather than by being programmed with that knowledge. ML techniques are used in intelligent tutors to acquire new knowledge about students, identify their skills, and learn new teaching approaches. They improve teaching by repeatedly observing how students react and by generalizing rules about the domain or the student. The role of ML techniques in a tutor is to independently observe and evaluate the tutor's actions. ML tutors customize their teaching by reasoning about large groups of students and tutor-student interactions, generated through several components. A performance element is responsible for making improvements in the machine, using perceptions of tutor/student interactions and knowledge about the student's reactions to decide how to modify the tutor so that it performs better in the future. ML techniques are used to identify student learning strategies, such as which activities students select most frequently and in which order. Analysis of student behavior leads to greater student learning outcomes by providing tutors with useful diagnostic information for generating feedback.
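As a toy illustration of the "performance element" idea above, the sketch below adapts a tutor's choice of teaching activity from observed student outcomes using a simple epsilon-greedy rule. This is not from the source text; the activity names, success rates, and the bandit-style update are invented for illustration only.

```python
import random

# Toy sketch (assumed, not from the source): a performance element that picks
# teaching activities and updates its estimates from observed student outcomes.
ACTIONS = ["worked_example", "multiple_choice_quiz", "open_problem"]

class TutorPerformanceElement:
    def __init__(self, actions, epsilon=0.1):
        self.epsilon = epsilon                    # exploration rate
        self.successes = {a: 0 for a in actions}  # observed correct responses
        self.trials = {a: 0 for a in actions}     # times each activity was used

    def choose_activity(self):
        # Mostly exploit the activity with the best observed success rate,
        # but occasionally explore so the estimates keep improving.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.successes[a] / self.trials[a]
                   if self.trials[a] else 0.0)

    def observe(self, activity, student_succeeded):
        # A perception of the tutor/student interaction updates the statistics
        # that drive future teaching decisions.
        self.trials[activity] += 1
        self.successes[activity] += int(student_succeeded)

tutor = TutorPerformanceElement(ACTIONS)
for _ in range(100):
    a = tutor.choose_activity()
    # Simulated student with made-up per-activity success probabilities.
    outcome = random.random() < {"worked_example": 0.7,
                                 "multiple_choice_quiz": 0.5,
                                 "open_problem": 0.4}[a]
    tutor.observe(a, outcome)
print(tutor.trials)  # the best-performing activity comes to dominate
```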


Machine Learning

ML comprises a variety of methods and algorithms, for example, artificial neural networks (ANN), Bayesian methods, case-based reasoning, decision trees, support vector machines, and so on.

From: Treatise on Estuarine and Coastal Science, 2011

Related terms:

Artificial Neural Network

Support Vector Machine

Solar Wind

Drought

Ecology

Groundwater

Microbiology


Data Mining and Knowledge Discovery

Sally I. McClean, in Encyclopedia of Physical Science and Technology (Third Edition), 2003

IV.G Engineering

Machine Learning has an increasing role in a number of areas of engineering, ranging from engineering design to project planning. The modern engineering design process is heavily dependent on computer-aided methodologies. Engineering structures are extensively tested during the development stage using computational models to provide information on stress fields, displacement, load-bearing capacity, etc. One of the principal analysis techniques employed by a variety of engineers is the finite element method, and Machine Learning can play an important role in learning rules for finite element mesh design for enhancing both the efficiency and quality of the computed solutions.

Other engineering design applications of Machine Learning occur in the development of systems, such as traffic density forecasting in traffic and highway engineering. Data Mining technologies also have a range of other engineering applications, including fault diagnosis (for example, in aircraft engines or in on-board electronics in intelligent military vehicles), object classification (in oil exploration), and machine or sensor calibration. Classification may, indeed, form part of the mechanism for fault diagnosis.

As well as in the design field, Machine Learning methodologies such as Neural Networks and Case-Based Reasoning are increasingly being used for engineering project management, an arena in which large-scale international projects require vast amounts of planning to stay within timescale and budget.


Models of Ecological Responses to Flow Regime Change to Inform Environmental Flows Assessments

J. Angus Webb, ... Julian D. Olden, in Water for the Environment, 2017

14.4.5 Machine Learning Approaches

Machine learning (ML) is a rapidly growing area of ecoinformatics that is concerned with identifying structure in complex, often nonlinear data and generating accurate predictive models. ML algorithms can be organized according to a diverse taxonomy that reflects the desired outcome of the modeling process. A number of ML techniques have been promoted in ecology as powerful alternatives to traditional modeling approaches. These include supervised learning approaches that attempt to model the relationship between a set of inputs and known outputs such as artificial neural networks (ANNs), cellular automata, classification and regression trees, fuzzy logic, genetic algorithms and programming, maximum entropy, support vector machines, and wavelet analysis (Olden et al., 2008). The growing use of these methods in recent years is the direct result of their ability to model complex, nonlinear relationships in ecological data without having to satisfy the restrictive assumptions required by conventional, parametric statistical approaches.

In one example, Kennard et al. (2007) used multiresponse artificial neural networks to model fish community composition of two coastal Australian rivers as a function of both environmental and hydrological attributes (Fig. 14.4). In addition to the flexibility of neural networks to model multiple response variables, neural networks have the advantage of modeling nonlinear associations with a variety of data types, require no specific assumptions concerning the distributional characteristics of the independent variables, and can accommodate interactions among predictor variables without any a priori specification (Olden et al., 2006). In addition to achieving high predictive performance, the artificial neural networks produced by Kennard et al. (2007) also demonstrated that landscape- and local-scale habitat variables and characteristics of the long-term flow regime were generally more important predictors of fish assemblage structure than variables describing the short-term history of hydrological events. Moreover, the modeling approach revealed the importance of interactions among environmental and hydrologic factors operating at multiple spatial and temporal scales.


Figure 14.4. Structure and link weights of the multiresponse neural network relating different fish species to long-term flow characteristics, short-term changes in hydrological environment, and landscape-scale attributes.

Source: Kennard et al. (2007). © Canadian Science Publishing or its licensors
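The following sketch shows what a multiresponse (multi-output) neural network of the kind discussed above might look like in code. It uses synthetic data and scikit-learn's MLPRegressor; it is not the model, data, or software of Kennard et al. (2007), and the predictor/response dimensions are invented.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative sketch only: one network predicting several response variables
# (e.g., abundances of several "species") from several environmental predictors.
rng = np.random.default_rng(0)
n_sites = 200
X = rng.normal(size=(n_sites, 5))                 # synthetic habitat/flow predictors
true_w = rng.normal(size=(5, 3))
Y = np.tanh(X @ true_w) + 0.1 * rng.normal(size=(n_sites, 3))  # 3 synthetic responses

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0),
)
model.fit(X[:150], Y[:150])                       # single network, multiple outputs
print("held-out R^2:", model.score(X[150:], Y[150:]))
```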

ML methods are powerful tools for prediction and explanation, and they will enhance our ability to model ecological systems. They are not, however, a solution to all ecological modeling problems. Over the past decade there has been considerable research on the development of methods for understanding the explanatory contributions of the independent variables in ANNs (Giam and Olden, 2015; Olden and Jackson, 2002). This was, in part, prompted by the fact that neural networks were considered a black box approach to modeling ecological data because of the perceived difficulty in understanding their inner workings. This is no longer the case. No one ML approach will be best suited to addressing all problems nor will ML approaches always be preferable to traditional statistical approaches. Although ML methods are generally more flexible with respect to modeling complex relationships and messy data sets, the models they produce are often more difficult to interpret, and the modeling process itself is often far from transparent. Although ML technologies strengthen our ability to model ecological phenomena, advances in understanding the fundamental processes underlying those phenomena are clearly critical as well.


Concluding Remarks

Srikanta Mishra, Akhil Datta-Gupta, in Applied Statistical Modeling and Data Analytics, 2018

9.2.3 One Model, or Many?

Machine-learning techniques such as random forest, support vector machine, and artificial neural networks are becoming increasingly popular for building input-output models as opposed to basic linear regression or its nonlinear/nonparametric variants. Often, the choice of which advanced model should be used is based on the analyst's preference. However, our experience suggests that no single method works best for all problems—making the a priori selection of a single modeling technique quite difficult. Sometimes, multiple competing models may arise when the goodness-of-fit measured in terms of training or test error is quite similar across the model set. Such an example is shown in Fig. 9.2, where the model fit expressed in terms of cross validation scaled RMSE is found to be very similar for four different models—ordinary kriging, quadratic fit with LASSO, multiple adaptive regression spline (MARS), or additivity and variance stabilization (AVAS). Here, the bars represent different experimental design strategies, that is, orange, Box-Behnken (BB); purple, augmented pairs (AP); green, maximum entropy (ME); and black, maximin LHS (MM).


Fig. 9.2. Model fits for multiple experimental design and response surface combinations.

After Schuetter, J., Mishra, S., Zhong, M., LaFollette, R., 2015. Data analytics for production optimization in unconventional reservoirs. In: Proc. Unconventional Resources Technology Conference. DOI: 10.15530/URTEC-2015-2167005.

Compounding this problem even further, we have the possibility that each modeling approach provides different insights regarding the relative importance of the predictors. This was demonstrated earlier in Fig. 8.13. Our recommended solution is to accept the multiplicity of models and combine them using the process of model aggregation discussed in Section 8.3.3. Aggregating over a large set of competing models provides more robust understanding and predictions compared with a single model, which may not be the most accurate for the problem at hand and may not capture the full range of variable interactions as part of model building.
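A minimal sketch of the "one model, or many?" workflow follows: several candidate models are compared by cross-validated RMSE and then aggregated by simple averaging of their predictions. The data are synthetic and the model choices (LASSO, random forest, neural network) are illustrative stand-ins, not the models or aggregation scheme used for Fig. 9.2.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic input-output data; in practice this would be the analyst's data set.
X, y = make_friedman1(n_samples=300, noise=0.5, random_state=0)

models = {
    "lasso": make_pipeline(StandardScaler(), LassoCV()),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "neural_net": make_pipeline(StandardScaler(),
                                MLPRegressor(hidden_layer_sizes=(32,),
                                             max_iter=5000, random_state=0)),
}

# Compare models by cross-validated RMSE (often very similar across models).
cv = KFold(n_splits=5, shuffle=True, random_state=0)
for name, m in models.items():
    rmse = -cross_val_score(m, X, y, cv=cv, scoring="neg_root_mean_squared_error")
    print(f"{name}: CV RMSE = {rmse.mean():.3f} +/- {rmse.std():.3f}")

# Simple model aggregation: fit all models and average their predictions.
for m in models.values():
    m.fit(X[:250], y[:250])
ensemble_pred = np.mean([m.predict(X[250:]) for m in models.values()], axis=0)
print("ensemble RMSE:", np.sqrt(np.mean((ensemble_pred - y[250:]) ** 2)))
```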


Introduction

Enrico Camporeale, ... Jay R. Johnson, in Machine Learning Techniques for Space Weather, 2018

Machine Learning and Space Weather

Space weather is the study of the effect of the Sun's variability on Earth, on the complex electromagnetic system surrounding it, on our technological assets, and eventually on human life. It will be more clearly introduced in Chapter 1, along with its societal and economic importance.

This book presents state-of-the-art applications of machine learning to the space weather problem. Artificial intelligence has been applied to space weather at least since the 1990s. In particular, several attempts have been made to use neural networks and linear filters for predicting geomagnetic indices and radiation belt electrons (Baker, 1990; Valdivia et al., 1996; Sutcliffe, 1997; Lundstedt, 1997, 2005; Boberg et al., 2000; Vassiliadis, 2000; Gleisner and Lundstedt, 2001; Li, 2001; Vandegriff, 2005; Wing et al., 2005). Neural networks have also been used to classify space boundaries and ionospheric high frequency radar returns (Newell et al., 1991; Wing et al., 2003), and total electron content (Tulunay et al., 2006; Habarulema et al., 2007). A feature that makes space weather very remarkable and perfectly posed for machine learning research is that the huge amount of data is usually collected with taxpayer money and is therefore publicly available. Moreover, the released datasets are often of very high quality and require only a small amount of preprocessing. Even data that have not been conceived for operational space weather forecasting offer an enormous amount of information to understand processes and develop models. Chapter 2 will dwell considerably on the nature and type of available data.

In parallel to the above-mentioned machine learning renaissance, a new wave of methods and results has been produced in the last few years, which is the rationale for collecting some of the most promising works in this volume.

The machine learning applications to space weather and space physics can generally be divided into the following categories:

Automatic event identification: Space weather data is typically imbalanced, with many hours of observations covering uninteresting/quiet times and only a small percentage covering useful events. The identification of events is still often carried out manually, following time-consuming and nonreproducible criteria. As an example, techniques such as convolutional neural networks can help to automatically identify interesting regions such as solar active regions, coronal holes, coronal mass ejections, and magnetic reconnection events, as well as to select features.

Knowledge discovery: Methods used to study causality and relationships within highly dimensional data, and to cluster similar events, with the aim of deepening our physical understanding. Information theory and unsupervised classification algorithms fall into this category.

Forecasting: Machine learning techniques capable of dealing with large class imbalances and/or significant data gaps to forecast important space weather events from a combination of solar images, solar wind, and geospace in situ data.

Modeling: This is somewhat different from forecasting and involves a higher level approach where the focus is on discovering the underlying physical and long-term behavior of the system. Historically, this approach tends to develop from reduced descriptions based on first principles, but the methods of machine learning can in theory also be used to discover the nonlinear map that describes the system evolution.

We will certainly see increasing applications of machine learning in space physics and space weather, falling into one of these categories. Yet, we also believe it is still an open question whether the amount and kind of data at our disposal today are sufficient to train accurate models.


Solar Wind Classification Via k-Means Clustering Algorithm

Verena Heidrich-Meisner, Robert F. Wimmer-Schweingruber, in Machine Learning Techniques for Space Weather, 2018

5 Model Selection, or How to Choose k

Machine learning techniques can typically be described on an abstract level as extracting a model from the available data set. This process is based on the assumptions that (1) the resulting model is representative not only for the training data set it was based on but for all data generated by the same (unknown) probability distribution (training and test data set are assumed to be independent identically distributed), and (2) the optimal model is a model that generalizes well, which means that the model describes only the essential structure of the training data. A model that generalizes well is expected to perform well not only on the training data set used to generate the model but also on unseen test data. This represents the central objective of model selection. Finding the right degree of model complexity that also generalizes well for a given problem can be a nontrivial task. A too-simple model fails to represent relevant structure in the data. A simple example of this so-called underfitting would be to fit a linear model to a third-order polynomial. A first-order model might represent an underlying trend of the polynomial reasonably well, but its complexity is too low to capture the true graph of the polynomial. But an overly complex model is undesirable as well. In real-world applications that usually show signatures of noise and uncertainty, an overly complex model tends to model the noise as well or even instead of the underlying structure of the data. This effect is called overfitting. Both overfitting and underfitting manifest in the case of k-means clustering with respect to inappropriate choices for the number of clusters k. If the data was generated from the optimal number of clusters k*, any k-means clustering result for k < k* or k > k* cannot be expected to represent the data as well. Thus, one should always take care to choose the best value for k.
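The under/overfitting example in the paragraph above can be made concrete with a few lines of code: data generated from a third-order polynomial are fitted with a model that is too simple (degree 1), about right (degree 3), and overly complex (high degree). This is an illustrative sketch with made-up noise levels, not an example from the chapter.

```python
import numpy as np

# Noisy samples from a third-order polynomial.
rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 30)
y = 2 * x**3 - x + 0.1 * rng.normal(size=x.size)

# Dense grid of the true (noise-free) curve for judging generalization.
x_test = np.linspace(-1, 1, 200)
y_test_true = 2 * x_test**3 - x_test

for degree in (1, 3, 15):
    coeffs = np.polyfit(x, y, degree)          # least-squares polynomial fit
    pred = np.polyval(coeffs, x_test)
    err = np.sqrt(np.mean((pred - y_test_true) ** 2))
    print(f"degree {degree:2d}: RMSE vs. true curve = {err:.3f}")
# Degree 1 underfits (misses the curvature); a high degree tends to chase the
# noise; degree 3 typically generalizes best.
```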

However, in real-world applications such as solar wind categorization, the true number of clusters contained in the data is typically unknown. Therefore, model selection tools are necessary to choose an appropriate approximation for k*. For this, first, a measure of the generalization error is required. The easiest way to estimate the generalization error is to apply the method of interest to previously unseen data points. We cannot simply use the already defined test data set X_test, because then our model selection procedure would no longer be independent from this data set and we would not be able to use this data set as a test bed to compare different approaches. Thus, instead, we need to set aside a subset of the available training data set X_train that is no longer used to train the model, but that is only used for testing its generalizing ability. However, which part of the training data set is used to estimate the generalization error can affect the results, and we would like to remove or at least reduce the influence of the particular partition into training and test data set. A commonly used concept to achieve this is n-fold cross-validation (Geisser, 1993; Kohavi et al., 1995; Devijver and Kittler, 1982). In n-fold cross-validation, the data set is randomly partitioned into n subsets. Then, from these n subsets, instead of only one, n pairs of training and test data sets are constructed in the following way: one of the n subsets is chosen as test data set X^{CV}_{i,test} and the remaining n − 1 subsets form the training data set X^{CV}_{i,train}. Because there are n different possible choices for the test data set, this leads to n pairs of training and test data sets (X^{CV}_{1,train}, X^{CV}_{1,test}), …, (X^{CV}_{n,train}, X^{CV}_{n,test}). For each of these, an error measure τ_i(k), i ∈ {1, …, n}, is computed, and the cross-validation error τ_CV(k) is then given as the average of the τ_i(k) over the n folds (Eq. 4 below).

Before we can apply n-fold cross-validation to our solar wind clustering data, we need to define an appropriate error measure τ_i(k), i ∈ {1, …, n}. A simple choice that is consistent with the optimization criterion for k-means clustering is

(3) \tau_i(k) = \frac{1}{N_{i,\text{test}}} \sum_{j=1}^{k} \sum_{x \in S_j} \lVert x - c_j \rVert^2

where N_{i,test} is the number of data points in the ith test data set X^{CV}_{i,test}.

Thus, we define (similar to, e.g., Arlot et al., 2010) the error of a clustering as the average squared inner-class distance per data point. Then (again as in Arlot et al., 2010), the n-fold cross-validation error τ_CV is given as

(4) \tau_{CV}(k) = \frac{1}{n} \sum_{i=1}^{n} \tau_i(k)

Choosing the best number of folds n for cross-validation represents a trade-off between low variance and low computational cost (Kohavi, 1996; Fushiki, 2011; Zhang, 1993). The variance can be reduced by increasing the number of folds or by repeating the cross-validation process for different partitions of the training data set. In the following, we use the common choice n = 10 (Kohavi, 1996), and we therefore partition our training data X_train into 10 subsets, form these into 10 pairs

(X^{CV}_{1,\text{train}}, X^{CV}_{1,\text{test}}), \ldots, (X^{CV}_{10,\text{train}}, X^{CV}_{10,\text{test}})

apply k-means clustering for all k ∈ {2, …, 20} to each training data set X^{CV}_{i,train}, determine a clustering error τ_i(k) for each corresponding test data set X^{CV}_{i,test}, and average over these to obtain the cross-validation error τ_CV(k). This process is then repeated for 50 random partitions of the training data set.

Fig. 2 shows the cross-validation error for k-means with k ∈ {2, …, 20}. The cross-validation error decreases with an increasing number of clusters. This is expected because more cluster centers distributed in the same hypervolume lead to shorter average distances to the closest cluster center. However, this does not mean that larger values of k are always more appropriate, because overfitting also becomes more likely the larger k is. The extreme overfitting case of one cluster center per data point would trivially lead to a cross-validation error of 0. This is not a desirable solution. But where to stop? This is a nontrivial question without an easy answer (see Pham et al., 2005 for a short overview). A simple approach is the so-called elbow method (Thorndike, 1953). This heuristic approach assumes that overfitting becomes the dominant effect as soon as adding more clusters has a smaller and smaller influence on the result. Consequently, the elbow method suggests selecting the largest number of clusters k that still leads to a noticeable improvement. Therefore, Fig. 2 also shows the error difference τ_CV(k) − τ_CV(k − 1) of two consecutive values of k. Based on this, we consider k = 7 as the best compromise between a small generalization error and avoiding overfitting. However, rare solar wind types could be missed by this approach because they are, compared with the other solar wind types, underrepresented in the data set.
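A condensed sketch of this procedure is shown below: n-fold cross-validation of k-means over a range of k, using the average squared distance of held-out points to their nearest cluster center (Eqs. 3 and 4), followed by the elbow-style difference between consecutive values of k. It uses synthetic data and scikit-learn rather than the authors' data and code; the fold count and k range are chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import KFold

# Synthetic stand-in for the training data (4 well-separated blobs in 3D).
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(loc=c, scale=0.5, size=(200, 3))
                     for c in ((0, 0, 0), (3, 0, 0), (0, 3, 0), (0, 0, 3))])

def cv_error(X, k, n_folds=10, seed=0):
    """n-fold cross-validation error tau_CV(k) for k-means."""
    errs = []
    for train_idx, test_idx in KFold(n_folds, shuffle=True,
                                     random_state=seed).split(X):
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X[train_idx])
        d = km.transform(X[test_idx]).min(axis=1)   # distance to closest center
        errs.append(np.mean(d ** 2))                # tau_i(k), Eq. (3)
    return np.mean(errs)                            # tau_CV(k), Eq. (4)

errors = {k: cv_error(X_train, k) for k in range(2, 11)}
for k in range(3, 11):
    print(f"k={k}: tau_CV={errors[k]:.3f}, "
          f"improvement over k-1 = {errors[k - 1] - errors[k]:.3f}")
# Elbow heuristic: keep the largest k that still gives a noticeable improvement.
```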


Fig. 2. Tenfold cross-validation error τ_CV as a function of the number of clusters k, shown as ×-shaped symbols. The median over 50 random initial partitions is shown, as well as the 1st and 99th percentiles. In addition, the difference between the median values for two consecutive values of k is shown with +-shaped symbols.


Next Generation Biomonitoring: Part 1

Stéphane A.P. Derocles, ... Darren M. Evans, in Advances in Ecological Research, 2018

5.2 Exploiting eDNA-Derived Information as a Source for Network Data

While machine learning could greatly speed up the process of constructing ecological networks, the abundance data used for the reconstruction are currently something of a bottleneck. Highly replicated and taxonomically resolved data sets, such as the farm-scale evaluations data used by Bohan et al. (2011a) and Tamaddoni-Nezhad et al. (2013) to reconstruct an agroecological network, are few and costly to create. One solution will be to move towards assessing the presence of species, and potentially their relative abundance and variation, using highly replicated samples of DNA taken from the environment. These eDNA samples, which in principle could be sampled (relatively) easily and cheaply, could then be used to identify the taxa present at a sampling point using NGS approaches. NGS describes a number of similar technologies for generating large numbers of nucleic acid sequences for the identification of species (OTUs) and functions. The great beauty of these methods is that the nucleic acids with which they work are common to all life forms and ubiquitous. The fact that NGS might be applied to the identification of OTUs and functions in environmental samples from any biome, habitat and environment and any source material with minimal change in protocol has driven interest in eDNA as a generic source of data (Barnes and Turner, 2015; Evans et al., 2016; Thomsen and Willerslev, 2015). Coupling machine learning and NGS data could greatly speed up the reconstruction of networks in all ecosystems.

The raw OTU information that would be produced by eDNA sampling would contain all the interactions that structure the data. Treating this complexity of information has been highly problematic and difficult to date, and thus network researchers have tended to use DNA data in which many of the interactions are filtered out. Probably the best example is gut content data, in which the sample is effectively an individual predator and the data are the OTUs contained in the predator's gut (e.g. Fayle et al., 2015). Adopting this predator-derived data effectively selects for realized links, without a process of learning, and allows trophic ecological networks to be constructed directly. While learning approaches might be applied to these data to learn something about the processes underscoring trophic interactions, such as the traits that make some species predators of particular prey via background information, these gut content DNA samples are essentially limited to describing food webs, which alone cannot explain community-level species co-occurrence and ecosystem structure (Bohan et al., 2013; Pocock et al., 2012). DNA analysis of gut content samples (or other types of DNA samples that directly reveal interactions, such as parasitized hosts or faeces) nevertheless remains essential, as it allows the predictions from machine learning to be confirmed. In biomonitoring based on NGS networks established with learning approaches (or statistical models), new interactions discovered by machine learning would need to be systematically tested (with molecular tools or with experiments) for an accurate understanding of ecosystems.

The challenge for the future is to use appropriate machine learning and background information to tackle the problem of the many interactions that in combination have created the eDNA data we observe. While the best machine learning approaches have yet to be determined (Bohan et al., 2017), it is clear that background information will play a pivotal role here. The background information 'model' proposed by Bohan et al. (2011a) and Tamaddoni-Nezhad et al. (2013) for their agroecological network was essentially a model for a hypothetical trophic interaction and consequently selected interactions that conformed to this—trophic interactions. Subsequent work showed that a logic-based machine learning approach, called metainterpretative learning (MIL) (Muggleton et al., 2014, 2015), could discover background information directly from data. MIL was demonstrated discovering the rule used by Bohan et al. (2011a) and Tamaddoni-Nezhad et al. (2013) that 'big things eat small things' directly from a simulated, synthetic food web (Tamaddoni-Nezhad et al., 2015). This holds out the exciting possibility that reconstruction of an ever greater number of ecological networks, from eDNA, will drive a rapid improvement in our understanding of ecosystem structure and function because it will require the discovery of the background information models—the generic rules—that describe/determine the ecological interactions that structure all ecosystems.


Machine Learning for Flare Forecasting

Anna M. Massone, ... FLARECAST Consortium, in Machine Learning Techniques for Space Weather, 2018

2 Standard Machine Learning Methods

Most machine learning methods applied so far to the problem of flare prediction belong to the family of supervised methods. In particular, the great majority of these techniques ranges from standard multilayer perceptrons (Borda et al., 2002; Qu et al., 2003; Wang et al., 2008; Colak and Qahwaji, 2009; Mubiru, 2011; Zavvari et al., 2015) to neural networks characterized by an optimization of the computational efficiency, mainly in the learning phase (Qahwaji and Colak, 2007; Qahwaji et al., 2008; Bian et al., 2013). A few papers in more recent literature discuss the performance of support vector machines (SVMs) for flare prediction, describing rather standard implementations of this supervised approach (Qu et al., 2003; Qahwaji and Colak, 2007; Qahwaji et al., 2008; Bobra and Couvidat, 2015; Boucheron et al., 2015; Zavvari et al., 2015). Further, unsupervised clustering is used as a classification tool in combination with supervised regression algorithms (Li et al., 2007). Finally, as far as feature selection is concerned, results are obtained utilizing logistic regression, also in a hybrid combination with SVMs (Song et al., 2009; Yuan et al., 2010; Bian et al., 2013).

The current release of the FLARECAST service contains both unsupervised and supervised algorithms. Unsupervised prediction is realized by clustering methods that organize a set of unlabeled samples into meaningful clusters based on data similarity. Data partition is obtained through the minimization of a cost function involving distances between data and cluster prototypes. Optimal partitions are obtained through iterative optimization: starting from a random initial partition, samples are moved from one cluster to another until no further improvement in the cost function optimization is achieved. In a classical approach, each sample may belong to a unique cluster, while in a fuzzy clustering formulation, a different degree of membership is assigned to each sample with respect to each cluster. In particular, the standard algorithms considered in FLARECAST are the well-known Hard C-Means (or K-means) algorithm (Anderberg, 1973) and its fuzzy extension, the Fuzzy C-Means algorithm (Bezdek, 1981).

In the supervised context, the multilayer perceptron is by far the most common neural network used in machine learning. The usual training algorithm is the error-back-propagation approach, which relies on a gradient descent algorithm, uses a forward and a backward pass through the feed-forward neural network, and performs weight updates using the derivatives of the network's error function with respect to the neural weights. FLARECAST also currently includes two recurrent neural networks that allow feedback loops in the feed-forward architecture. This effect is realized, specifically, by means of an Elman neural network (Elman, 1990), which permits any number of context nodes, and of a Jordan network, which constrains the number of context nodes to coincide with the number of output nodes (Jordan, 1986).
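To make the error-back-propagation description concrete, the sketch below performs one forward pass, one backward pass, and one gradient-descent weight update for a single-hidden-layer perceptron with a squared-error loss. It is an illustrative toy, not FLARECAST's implementation; the layer sizes, learning rate, and data are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))          # one input sample (4 features, assumed)
t = np.array([1.0])                # target value
W1, b1 = rng.normal(size=(8, 4)) * 0.1, np.zeros(8)   # hidden layer weights
W2, b2 = rng.normal(size=(1, 8)) * 0.1, np.zeros(1)   # output layer weights
lr = 0.1                           # learning rate

# Forward pass through the feed-forward network.
h = np.tanh(W1 @ x + b1)
y = W2 @ h + b2

# Backward pass: derivatives of the error E = 0.5 * (y - t)^2 w.r.t. the weights.
err = y - t                                   # dE/dy
grad_W2 = np.outer(err, h)
grad_b2 = err
grad_h = W2.T @ err
grad_pre = grad_h * (1 - h ** 2)              # tanh'(z) = 1 - tanh(z)^2
grad_W1 = np.outer(grad_pre, x)
grad_b1 = grad_pre

# Gradient-descent weight update.
W2 -= lr * grad_W2; b2 -= lr * grad_b2
W1 -= lr * grad_W1; b1 -= lr * grad_b1
print("error before update:", float(0.5 * err @ err))
```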

The set of currently implemented FLARECAST standard supervised methods is completed by an SVM and a LASSO algorithm that realize data regression by means of a quadratic loss function with two different penalty terms.


Ecohydrology and Restoration

R. Ben-Hamadou, ... E. Wolanski, in Treatise on Estuarine and Coastal Science, 2011

10.13.3.5.2 The model

Two ML methods were used to construct models: a regression tree model was induced to describe the phytoplankton dynamics over a longer period, explain the ecosystem behavior, and identify the main triggers of change in phytoplankton concentration in the NA, while a rule-based regression model was induced to predict the phytoplankton concentration for some period in the future given the present values of the measured data.

Regression trees are one of the ML methods for numerical modeling. As is the case with other data-driven methods, to induce a regression tree we need a data set to learn from and an algorithm (see also Section 10.13.2). The data can be organized as examples in a spreadsheet, where each example (each row in the spreadsheet) is composed of attributes (also called descriptors or independent variables) and a class (also called the outcome or dependent variable), having the form (AT1, AT2, …, ATn, TARGET).

Compared to simple linear regression, which calculates one equation (one weight vector) for the entire data set, piecewise or tree-structured regression divides the data set into several subsets to which a uniform class value or a linear equation can be applied. The division into subsets is based on tests of the values of the input attributes, which are placed as nodes in a regression tree.

Thus, regression trees are hierarchical structures composed of nodes and branches, where the internal nodes contain tests on the input attributes. Each branch of an internal test corresponds to an outcome of the test, and the predictions for the values of the target variable (the class) are stored in the leaves, which are the terminal nodes of the tree. If the leaves contain a single value for the class prediction, we speak of simple regression trees, while if a linear equation is used for prediction in the leaf, we speak of (regression) model trees. Figure 16 illustrates the procedure of constructing regression and model trees, and a small illustrative code sketch follows the figure.


Figure 16. Induction of regression and model trees from a given data set (examples). The difference between regression and model trees is that regression trees predict a single value of the target variable in the leaves, whereas model trees put a linear equation in the leaves to predict the value of the target variable.
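The sketch below induces a simple regression tree from an attribute/class table and prints it as 'if then' rules. Note that it uses a CART-style tree from scikit-learn, not the M5P algorithm in WEKA discussed next, and the attribute names and data are synthetic, loosely echoing the NA example only for readability.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

# Synthetic example table: attributes (Year, Flow, SAL) and class (Phyto).
rng = np.random.default_rng(0)
n = 500
year = rng.integers(1990, 2005, n)
flow = rng.uniform(500, 4000, n)          # made-up river flow values
sal = rng.uniform(34, 39, n)              # made-up salinity values
X = np.column_stack([year, flow, sal])
# Made-up target: high phytoplankton before 1997, highest when flow is high.
phyto = np.where(year < 1997,
                 np.where(flow > 2000, 2.5e6, 1.2e6),
                 4e5) + rng.normal(0, 5e4, n)

tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=20)
tree.fit(X, phyto)
# The fitted tree reads as 'if then' rules over the attributes, with a
# predicted target value stored in each leaf.
print(export_text(tree, feature_names=["Year", "Flow", "SAL"]))
```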

Using the data set of the NA (Table 4) and the ML algorithm M5P encoded in the software package WEKA (Witten and Frank, 2000), a regression tree model for calculating the phytoplankton concentration was induced (Volf et al., 2011). The model (Figure 17) simulates the phytoplankton concentration (Phyto) based on the values of the following attributes: Year, Flow (flow rate of the Po River), SAL (salinity), Month, and NO2 (nitrite).


Figure 17. Regression tree model for simulating the phytoplankton concentration in the north Adriatic (adapted from Volf et al., 2011). The model simulates the phytoplankton concentration based on the values of five attributes: Year, Month, Flow (River Po flow rate), SAL (salinity), and NO2 (nitrite). It is read in terms of 'if then' rules, e.g., 'if' (Year > 1997) and (SAL > 37.9), 'then' Phyto = 352 705 ind l−1.

So, how do we read, interpret, and use this model? To read the model we simply start from the top node and proceed in terms of 'if then' rules. Starting with the top node (Year) a significant change in the phytoplankton dynamics can be noted in 1997. The phytoplankton concentrations on the left-hand side of the tree (Year < 1997) are much higher than the concentrations on the right-hand side (Year > 1997). In the first case (Year < 1997) the phytoplankton concentration is predicted depending on the value of the attribute Flow (flow rate of the Po River), which represents an internal test node. If Flow is higher than 2024 m3 s−1, then we observe the highest concentration of Phyto (2 572 789 ind l−1), otherwise we come to the next test node in the tree (SAL). The lowest phytoplankton concentration (Phyto = 352 705 ind l−1) can be observed when (Year > 1997) and (SAL > 37.9).

This model can be interpreted as a compilation and synthesis of previous research on the phytoplankton dynamics in the northern Adriatic. The general trend of oligotrophication of the NA (as also found by the model) is documented (Harding et al., 1999; Degobbis et al., 2000) as a consequence of the reduction of the phosphorus load in Po River water during the late 1980s (de Wit and Bendoricchio, 2001). This was mainly the result of a gradual reduction of the polyphosphate content in detergents (Provini et al., 1992; Pagnotta et al., 1995). The latest study performed on long-term data strongly indicated that the still common perception of the northern Adriatic as a very eutrophic basin is no longer appropriate, at least for its northern part and in recent years (Mozetič et al., 2009). However, episodes of algal blooms, anoxia, and mucilage events have still been noted in the last two decades (Degobbis et al., 2000; Precali et al., 2005), indicating a low stability of the ecosystem behavior.

In both cases (before and after 1997), salinity is the main signal indicating changes of the impact of freshwater inputs to the area, and also of inflow of more saline waters from the central Adriatic. A reduction of riverine nutrients input and extended saline waters intrusion contributed to lower phytoplankton concentrations after 1997, most often throughout the investigated area of the northern Adriatic (Mozetič et al., 2009).

The changes in phytoplankton in 1993 and 2000 (referred to as 1992.5 and 1999.5, respectively, in Figure 17) are difficult to understand but coincide with unusually high freshwater discharges into the NA in autumn (Supić et al., 2006). In October 1993, the Po River flow rates were markedly higher than any monthly averages since 1917, when the measurements started. Exceptionally high flows also occurred in the second part of October and in November 2000. Unusually marked stratification persisted into December, due to the presence of a thick freshened surface layer. In these conditions, an extended near-anoxia developed in the bottom layers (CMR, Rovinj database).

The descriptive regression tree model (Figure 17) gives a solid explanation of the phytoplankton dynamics in the NA and can be of assistance in the management of the ecosystem. However, from the management point of view it is even more useful if the phytoplankton is predicted some time in advance, so that measures can be undertaken. For this purpose, a rule-based regression model was induced from the same data set (Volf et al., 2011) using RuleQuest's Cubist software, in which the basic M5 algorithm (Quinlan, 1992) for the induction of regression trees was enhanced by combining model-based and instance-based learning (Quinlan, 1993). Rule-based models for numeric prediction are yet another model representation, similar to regression tree models. The models are interpreted as a set of 'if then' rules, where each rule is associated with a multivariate linear model. A rule indicates that, whenever a case satisfies all of its conditions, the linear model is appropriate for predicting the value of the target attribute.

The phytoplankton prediction rule-based model is presented in Table 5. The model is composed of 10 rules, each of which is associated with a linear equation predicting the phytoplankton concentration 14 days in advance. To predict the future phytoplankton concentration, the following data are needed for the present time: Flow (Po River flow rate), Month, Temp (temperature), SAL (salinity), Dene (density), pH, NO3, NO2, NH4, N/PO4 and N/SiO4, and Phyto (see also Table 4). The rule selection depends on the values of the variables in the rule. When a rule is selected, the corresponding equation is applied to calculate the phytoplankton concentration 14 days in advance (Phyto_pred in Table 5); a minimal sketch of applying one such rule follows the table.

Table 5. Predictive model for phytoplankton concentration: rule-based model

Rule 1: IF Phyto ≤ 804620.5
  Phyto_pred = 732300 + 0.917 Phyto - 233140 Dene - 58205 Temp + 179186 SAL - 7782 Month

Rule 2: IF (Temp > 9.6) AND (Temp ≤ 20.3) AND (Phyto > 804620.5) AND (Phyto ≤ 2807349)
  Phyto_pred = 5.24147e+006 + 0.931 Phyto - 84220 Temp - 57364 Dene - 199934 NO2 - 20263 NO3 - 7368 N/Si + 29 Flow - 368417 pH + 88810 NH4 + 19504 SAL

Rule 3: IF (Temp > 20.3) AND (Phyto > 804620.5)
  Phyto_pred = -2.76291e+006 + 0.716 Phyto + 359934 Dene + 103143 Temp + 135511 Month - 256581 SAL - 1125 N/P + 7567 N/Si

Rule 4: IF (Temp ≤ 9.6) AND (Phyto > 804620.5) AND (Phyto ≤ 2807349)
  Phyto_pred = -6.68528e+006 + 0.87 Phyto - 390486 Dene + 386746 SAL - 80868 Temp + 218220 NH4 - 852 N/P + 21132 NO3 - 12263 Month + 559168 pH

Rule 5: IF (N/P ≤ 62.3) AND (Phyto > 2807349)
  Phyto_pred = 1.16131e+007 + 0.664 Phyto - 141666 Temp - 6625 N/P - 147108 NO3 - 97863 Dene + 406748 NO2 + 17884 N/Si - 660243 pH + 87424 NH4

Rule 6: IF (Month > 4) AND (N/P > 62.3) AND (Phyto > 2807349) AND (Phyto ≤ 1.13562e+007)
  Phyto_pred = 7.14367e+007 - 1.18988e+006 Temp - 3.28797e+006 Dene + 2.00528e+006 SAL + 0.831 Phyto - 4.86272e+006 NO2 - 278807 Month + 2.30991e+006 NH - 258224 NO3 - 55637 N/Si - 4.16418e+006 pH + 2917 N/P - 57 Flow

Rule 7: IF (Temp ≤ 12.2) AND (NH4 ≤ 0.31) AND (N/P > 62.3) AND (Phyto > 2807349)
  Phyto_pred = 3.97828e+007 + 8.67119e+006 Dene + 1.92162e+006 Temp - 5.92294e+006 SAL - 8.71454e+006 NH4 + 797894 NO3 + 1566 Flow + 0.8 Phyto - 1.05622e+007 pH - 10620 N/P + 2.05592e+006 NO2

Rule 8: IF (Temp ≤ 12.2) AND (NH4 > 0.31) AND (N/P > 62.35) AND (Phyto > 2807349)
  Phyto_pred = 1.26724e+008 - 3.25732e+007 Dene - 7.30295e+006 Temp + 2.64069e+007 SAL - 7.87193e+006 NO2 + 0.78 Phyto + 610748 NO3 - 1.16366e+007 pH - 2.89067e+006 NH4

Rule 9: IF (Month ≤ 4) AND (Temp > 12.2) AND (N/P > 62.35) AND (Phyto > 2807349)
  Phyto_pred = 1.3374e+008 + 4.30887e+006 Dene - 4.10462e+006 SAL + 1.756 Phyto - 815924 NO3 - 4.89689e+006 NO2 - 1.19144e+007 pH - 4543 Temp

Rule 10: IF (Month > 4) AND (N/P > 62.35) AND (Phyto > 1.13562e+007)
  Phyto_pred = -4.05262e+008 - 4.50557e+006 Temp + 19140 Flow + 5.66287e+007 pH - 1.05564e+007 NH4 + 36997 N/Si
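The sketch below shows how a rule-based model of this kind is applied: each rule pairs a condition with a linear equation, and the equation of the matching rule is used to predict Phyto 14 days ahead. Only Rule 1 is encoded (its coefficients are taken from Table 5); the example observation values are made up, and the remaining rules would be encoded the same way.

```python
# Illustrative sketch of applying one rule from Table 5 (Rule 1 only).

def rule1_applies(obs):
    # Condition of Rule 1 from Table 5.
    return obs["Phyto"] <= 804620.5

def rule1_predict(obs):
    # Linear equation associated with Rule 1 (coefficients from Table 5).
    return (732300 + 0.917 * obs["Phyto"] - 233140 * obs["Dene"]
            - 58205 * obs["Temp"] + 179186 * obs["SAL"] - 7782 * obs["Month"])

def predict_phyto_14d(obs):
    if rule1_applies(obs):
        return rule1_predict(obs)
    raise NotImplementedError("Rules 2-10 of Table 5 would be encoded the same way")

# Made-up present-day measurements for one station.
obs = {"Phyto": 6.0e5, "Dene": 27.5, "Temp": 15.0, "SAL": 37.0, "Month": 6}
print("Predicted Phyto in 14 days:", predict_phyto_14d(obs))
```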


The model performs with high accuracy when simulated on validation data, including a good prediction of the peak values (Figure 18). Unlike the existing conceptual models of the NA, which are derived from theoretical modeling knowledge, this model lacks interpretability. However, the aim of this model is prediction, and given the accuracy of its performance on unseen data and its computational efficiency, it can be a highly useful water management tool.


Figure 18. Measured vs. simulated data for station SJ107 (R2= 0.88). The performance of the model was similarly accurate for the rest of the measuring stations (Volf et al., 2011).


Estimation of Biophysical Variables from Satellite Observations

Fred Baret, in Land Surface Remote Sensing in Agriculture and Forest, 2016

2.3.2.2 Machine learning approaches

Alternatively, machine learning approaches have been proposed since the 1990s. Among these methods, neural networks (Figure 2.3) have been used intensively [GON 99]. Baret et al. [BAR 95] and Verger et al. [VER 11a] have shown that neural networks used with individual spectral reflectances were more efficient than approaches based on vegetation indices. Fang and Liang [FAN 05] found that neural networks are as efficient as multiple regression with projection pursuit. Neural network approaches were applied to MERIS [LAC 06] and VEGETATION [BAR 07b] data at kilometric spatial resolution. Neural networks have also been applied to decametric-resolution data from the airborne POLDER instrument [WEI 02] and from the Landsat [FAN 03b], CHRIS [VER 11a], and FORMOSAT [CLA 13] satellite sensors. Other machine learning methods have also been assessed, such as Support Vector Machines (SVM) and Gaussian Process Regression (GPR) [VER 12]. Methods like GPR appear interesting when learning is performed on experimental data; however, when applied to large learning datasets, they appear limited by computing capacity [MAC 03]. One advantage of this approach applied to experimental data is that it provides an estimate of the associated uncertainties. In the case of a learning database simulated from the RTM, the uncertainties associated with the simulated reflectances must be specified, which is not always easy.

Sign in to download full-size image

Figure 2.3. An example of a neural network used to estimate the GAI from canopy reflectances measured by the MSI Sentinel-2 sensor. Norm is the standardization of inputs and outputs. S and L represent the tangent sigmoid and linear transfer functions associated with each neuron. ϕ, θs, and θo represent, respectively, the relative azimuth between the viewing and illumination directions, the solar zenith angle, and the view zenith angle. R560–R2190 represent the canopy reflectances in the bands of the MSI Sentinel-2 sensor.

From [BAR 09]

The learning database is a major component in the success of machine learning approaches. It should represent the distributions and co-distributions of the variables under actual conditions. It is at this level that 'a priori' knowledge of these distributions is introduced into machine learning approaches; they can therefore be considered part of the family of Bayesian approaches. The density of the possible cases decreases rapidly with the number of required vegetation variables. Experimental designs can then be exploited to ensure that the space of canopy realizations is roughly evenly populated [BAC 02b]. Machine learning approaches can also be used as smoothing methods: they interpolate between the different cases of the learning database. Consequently, estimates made outside the definition domain (corresponding to the convex hull of the learning database reflectances) should be considered with caution. To simplify the learning database and make the method more robust, it is possible to eliminate cases that have been simulated but never observed [BAR 07a]. However, this procedure requires a large database representative of all possibly encountered situations.
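As a hedged sketch of this inversion scheme, the code below trains a small neural network on a simulated learning database (reflectance to variable) and flags new spectra that fall outside the range of the training reflectances, a crude per-band stand-in for the convex-hull check mentioned above. The "radiative transfer model" here is a toy analytic function, and the band count and values are invented.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy "RTM": two synthetic reflectance bands as functions of GAI, plus noise.
rng = np.random.default_rng(0)
gai = rng.uniform(0, 6, 2000)                       # simulated GAI values
refl = np.column_stack([np.exp(-0.5 * gai),
                        0.5 * (1 - np.exp(-0.6 * gai))])
refl += 0.01 * rng.normal(size=refl.shape)          # simulated reflectance noise

# Train the inversion network on the simulated learning database.
net = make_pipeline(StandardScaler(),
                    MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000,
                                 random_state=0))
net.fit(refl, gai)

# Crude definition-domain check: is each band inside the training range?
lo, hi = refl.min(axis=0), refl.max(axis=0)

def estimate(new_refl):
    new_refl = np.atleast_2d(new_refl)
    in_domain = np.all((new_refl >= lo) & (new_refl <= hi), axis=1)
    return net.predict(new_refl), in_domain  # treat out-of-domain estimates with caution

est, ok = estimate([[0.4, 0.3], [0.9, 0.9]])
print(est, ok)
```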


Ecosystem Services: From Biodiversity to Society, Part 2

Corinne

4.3.2 Meta-Interpretive Learning

In the machine-learning settings described in the previous section, the search for trophic links was constrained by additional information on the species (e.g. body size, trophic behaviour), which was provided as part of the background knowledge. The logical rule stated that X may eat Y if X is a predator bigger than Y. However, for most communities and ecosystems, including microbial communities, this kind of background knowledge may not be available or may be incomplete. MIL (Muggleton et al., 2014) is a new machine-learning approach capable of predicate invention and recursive rule learning. This new approach can be used to learn both the interactions between species (or OTUs) and the 'rules of interaction' directly from species occurrence or abundance data. In this case, the background knowledge does not include any specific knowledge about the species but includes higher-order meta-rules, M ⊆ B, which are activated during the proving of examples in order to generate hypotheses, H. A recent study showed that MIL can be used to reconstruct a simplified food web and learn interaction rules directly from data (Tamaddoni-Nezhad et al., 2015). We believe that this new learning setting will be useful for learning ecological networks from NGS data whenever the interaction rules are not known beforehand.
