training

13 MONTE CARLO UNCERTAINTY ANALYSIS

13.1 Empirical Cumulative Distribution Fitting
13.2 Probability Distributions
13.3 Simulation of Correlated Features
13.4 Parameter Specification for Manual Input
13.5 Correlation Matrices

14 MONTE CARLO COST ANALYSIS

14.1 Effect of correlation
14.2 Cost risk analysis
14.3 Project: Construction of new 3-story building
14.3.1    Design costs & schedule
14.3.2    Earthworks costs & schedule
14.3.3    Foundation costs & schedule
14.3.4    Structure (floors, pillars, roofing) costs & schedule
14.3.5    Envelope (walls, windows, external doors) costs & schedule
14.3.6    Services (plumbing, electrical, cabling) & Finishes (partitions, decorations) costs & schedule
14.3.7    Site cleanup & Landscaping costs & schedule

15 DATASETS USED

Numerous empirical (real-world) datasets will be employed during training. These datasets are commonly used in applied statistics, machine learning, text mining, and related fields.

15.1 Retinol
15.2 SRBCT gene expression
15.3 500 Pubmed abstracts
15.4 Yale face image library
15.5 Dow-30 log-returns (2 years)
15.6 Wine

7 DEPENDENCY

7.1 Linear Regression: Single Predictor (and Diagnostics)
7.2 Multiple Linear Regression (and Diagnostics)
7.3 Residuals [standardized, Studentized, Jackknife (deletion), leverages, Cook's D, DFFITS, DFBETAS]
7.4 Multicollinearity (Variance Inflation Factors)
7.5 Multivariate Linear Regression
7.6 Binary Logistic Regression (Diagnostics)
7.7 Polytomous Logistic Regression
7.8 Additive and Multiplicative Poisson Regression (Diagnostics)
7.9 Non-Linear Poisson Regression

8 SURVIVAL ANALYSIS

8.1 Kaplan-Meier Analysis and Logrank Test
8.2 Cox Proportional Hazards Regression

9 TEXT MINING

9.1 Text Mining via Stemming
9.2 Text Mining: N-gram Analysis
9.3 Sentiment Analysis

10 CLASS DISCOVERY

10.1 Crisp K-means Cluster Analysis (CKM)
10.2 Distance Metrics
10.3 Cluster Validity
10.4 Fuzzy K-means Cluster Analysis (FKM)
10.5 Self-Organizing Map (SOM)
10.6 Unsupervised Neural Gas (UNG)
10.7 Gaussian Mixture Models (GMM)
10.8 Unsupervised Random Forests (URF)
10.9 Non-Linear Manifold Learning (NLML)
10.10 Principal Components Analysis (PCA)
10.11 Component Subtraction
10.12 Covariance Matrix Shrinkage
10.13 Kernel PCA (KPCA)
10.14 Diffusion Maps (DM)
10.15 Locally Linear Embedding (LLE)
10.16 Laplacian Eigenmaps (LEM)
10.17 Locality Preserving Projections (LPP)
10.18 t-Distributed Stochastic Neighbor Embedding (t-SNE)
10.19 Unsupervised artificial neural networks (UANN)
10.20 Sammon Mapping (Sammon)
10.21 Classic Multidimensional Scaling (CMDS)
10.22 Non-Metric Multidimensional Scaling (NMMDS)
10.23 Hierarchical Cluster Analysis (HCA)

11 FEATURE SELECTION

11.1 Introduction and Requirements
11.2 Cross-validation and Repartitioning
11.3 Class Comparisons During CV
11.4 Filtering Methods
11.5 Generation of Non-Redundant Feature List

12 CLASS PREDICTION

12.1 Linear Regression (LREG)
12.2 Decision Tree Classification (DTC)
12.3 k-Nearest Neighbor (kNN)
12.4 Naïve Bayes Classifier (NBC)
12.5 Linear Discriminant Analysis (LDA)
12.6 Quadratic Discriminant Analysis (QDA)
12.7 Learning Vector Quantization (LVQ1)
12.8 Random Forests (SRF)
12.9 Polytomous Logistic Regression (PLOG)
12.10 Artificial Neural Networks (ANNs)
12.11 Particle Swarm Optimization (PSO)
12.12 Kernel Regression (RBF) and RBF Networks (RBFN)
12.13 Support Vector Machines (SVM)
12.14 Supervised Neural Gas (SNG)
12.15 Mixture of Experts (MOE)

1 INTRODUCTION

1.1 Scales: Nominal, Ordinal, Dichotomous (Binary), Discrete, Continuous
1.2 Feature types (Continuous, Nominal, Binary, Text)
1.3 General Default Preferences
1.4 Color and Graphics Defaults
1.5 Changing Feature Types

2 SUMMARY STATISTICS

2.1 Data Arrangement
2.2 Normal Distribution
2.3 Central Tendency and Dispersion
2.4 Histograms, X-Y Scatter plots, Matrix plots
2.5 Skewness, Kurtosis
2.6 Normality Tests
2.7 Heteroscedasticity Tests

3 LABELS, TRANSFORMS, AND FILTERING

3.1 Editing labels, mathematical transformation
3.2 Filtering records

4 TRANSFORMATIONS

4.1 Z-Scores from Standard Normal Distribution
4.2 Log
4.3 Quantile
4.4 Rank
4.5 Percentile
4.6 van der Waerden Scores
4.7 Nominal-to-Binary
4.8 Fuzzification
4.9 Fast Wavelet Transform (FWT)
4.10 Root MUSIC
4.11 Text to Categorical

5 INDEPENDENCE

5.1 2 Unrelated Samples
5.2 k Unrelated Samples
5.3 Equality Tests for 2 Independent Proportions
5.4 Chi-Squared Contingency Tables
5.5 Nominal (Categorical) Measurement Scale
5.6 Two-way Contingency Tables
5.7 Fisher's Exact Test
5.8 Multiway Contingency Tables
5.9 Exact Tests for Multiway Tables
5.10 Related Samples

6 ASSOCIATION

6.1 Covariance
6.2 Parametric Correlation: Pearson Product Moment
6.3 Force plots
6.4 Non-parametric Correlation: Spearman Rank
6.5 Multivariate Forms of Covariance and Correlation
6.6 Euclidean Distance
6.7 Matrix Formulation of Covariance and Correlation
6.8 Association Rules (Market Basket Analysis)