Preface | p. xi |
Authors | p. xiii |
Symbol Description | p. xv |
Data of High Dimensionality and Challenges | p. 1 |
Dimensionality Reduction Techniques | p. 3 |
Feature Selection for Data Mining | p. 8 |
A General Formulation for Feature Selection | p. 8 |
Feature Selection in a Learning Process | p. 9 |
Categories of Feature Selection Algorithms | p. 10 |
Degrees of Supervision | p. 10 |
Relevance Evaluation Strategies | p. 11 |
Output Formats | p. 12 |
Number of Data Sources | p. 12 |
Computation Schemes | p. 13 |
Challenges in Feature Selection Research | p. 13 |
Redundant Features | p. 14 |
Large-Scale Data | p. 14 |
Structured Data | p. 14 |
Data of Small Sample Size | p. 15 |
Spectral Feature Selection | p. 15 |
Organization of the Book | p. 17 |
Univariate Formulations for Spectral Feature Selection | p. 21 |
Modeling Target Concept via Similarity Matrix | p. 21 |
The Laplacian Matrix of a Graph | p. 23 |
Evaluating Features on the Graph | p. 29 |
An Extension for Feature Ranking Functions | p. 36 |
Spectral Feature Selection via Ranking | p. 40 |
SPEC for Unsupervised Learning | p. 41 |
SPEC for Supervised Learning | p. 42 |
SPEC for Semi-Supervised Learning | p. 42 |
Time Complexity of SPEC | p. 44 |
Robustness Analysis for SPEC | p. 45 |
Discussions | p. 54 |
Multivariate Formulations | p. 55 |
The Similarity Preserving Nature of SPEC | p. 56 |
A Sparse Multi-Output Regression Formulation | p. 61 |
Solving the L2,1-Regularized Regression Problem | p. 66 |
The Coordinate Gradient Descent Method (CGD) | p. 69 |
The Accelerated Gradient Descent Method (AGD) | p. 70 |
Efficient Multivariate Spectral Feature Selection | p. 71 |
A Formulation Based on Matrix Comparison | p. 80 |
Feature Selection with Proposed Formulations | p. 82 |
Connections to Existing Algorithms | p. 83 |
Connections to Existing Feature Selection Algorithms | p. 83 |
Laplacian Score | p. 84 |
Fisher Score | p. 85 |
Relief and ReliefF | p. 86 |
Trace Ratio Criterion | p. 87 |
Hilbert-Schmidt Independence Criterion (HSIC) | p. 89 |
A Summary of the Equivalence Relationships | p. 89 |
Connections to Other Learning Models | p. 91 |
Linear Discriminant Analysis | p. 91 |
Least Squares Support Vector Machine | p. 95 |
Principal Component Analysis | p. 97 |
Simultaneous Feature Selection and Extraction | p. 99 |
An Experimental Study of the Algorithms | p. 99 |
A Study of the Supervised Case | p. 101 |
Accuracy | p. 101 |
Redundancy Rate | p. 101 |
A Study of the Unsupervised Case | p. 104 |
Residue Scale and Jaccard Score | p. 104 |
Redundancy Rate | p. 105 |
Discussions | p. 106 |
Large-Scale Spectral Feature Selection | p. 109 |
Data Partitioning for Parallel Processing | p. 111 |
MPI for Distributed Parallel Computing | p. 113 |
MPI_BCAST | p. 114 |
MPI_SCATTER | p. 115 |
MPI_REDUCE | p. 117 |
Parallel Spectral Feature Selection | p. 118 |
Computation Steps of Univariate Formulations | p. 119 |
Computation Steps of Multivariate Formulations | p. 120 |
Computing the Similarity Matrix in Parallel | p. 121 |
Computing the Sample Similarity | p. 121 |
Inducing Sparsity | p. 122 |
Enforcing Symmetry | p. 122 |
Parallelization of the Univariate Formulations | p. 124 |
Parallel MRSF | p. 128 |
Initializing the Active Set | p. 130 |
Computing the Tentative Solution | p. 131 |
Computing the Walking Direction | p. 131 |
Calculating the Step Size | p. 132 |
Constructing the Tentative Solution | p. 133 |
Time Complexity for Computing a Tentative Solution | p. 134 |
Computing the Optimal Solution | p. 134 |
Checking the Global Optimality | p. 137 |
Summary | p. 137 |
Parallel MCSF | p. 139 |
Discussions | p. 141 |
Multi-Source Spectral Feature Selection | p. 143 |
Categorization of Different Types of Knowledge | p. 145 |
A Framework Based on Combining Similarity Matrices | p. 148 |
Knowledge Conversion | p. 150 |
K_SIM^FEA → K_SIM^SAM | p. 151 |
K_FUN^FEA, K_INT^FEA → K_SIM^SAM | p. 152 |
MSFS: The Framework | p. 153 |
A Framework Based on Rank Aggregation | p. 153 |
Handling Knowledge in KOFS | p. 155 |
Internal Knowledge | p. 155 |
Knowledge Conversion | p. 156 |
Ranking Using Internal Knowledge | p. 157 |
Relevance Propagation with K_REL^int,FEA | p. 157 |
Relevance Voting with K_FUN^int,FEA | p. 157 |
Aggregating Feature Ranking Lists | p. 158 |
An EM Algorithm for Computing | p. 159 |
Experimental Results | p. 160 |
Data and Knowledge Sources | p. 160 |
Pediatric ALL Data | p. 160 |
Knowledge Sources | p. 160 |
Experiment Setup | p. 161 |
Performance Evaluation | p. 162 |
Empirical Findings | p. 164 |
Discussion of Biological Relevance | p. 166 |
Discussions | p. 167 |
References | p. 171 |
Index | p. 191 |