FDA +009
Linear Separability
- The data is linearly separable if it can be separated
  - by a point, when the data points lie on a one-dimensional line
  - by a line, in a two-dimensional representation of the data points
  - by a plane (a two-dimensional surface), in a three-dimensional representation of the data points
- If the data is not linearly separable, look at other options for classification.
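The one-dimensional case above can be sketched in a few lines of Python. This is a minimal illustration, not a general test: the function name `separating_point_1d` and the 0/1 class labels are assumptions made for the example, and both classes are assumed to be present.

```python
def separating_point_1d(values, labels):
    """Return a point that separates class 0 from class 1 on a line,
    or None if no single point can separate them."""
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    if not class0 or not class1:
        raise ValueError("both classes must be present")
    # Class 0 lies entirely to the left of class 1.
    if max(class0) < min(class1):
        return (max(class0) + min(class1)) / 2
    # Class 1 lies entirely to the left of class 0.
    if max(class1) < min(class0):
        return (max(class1) + min(class0)) / 2
    # The classes interleave: not linearly separable in one dimension.
    return None


separable = separating_point_1d([1, 2, 3, 7, 8], [0, 0, 0, 1, 1])
interleaved = separating_point_1d([1, 5, 2, 6], [0, 0, 1, 1])
```

In the first call the midpoint between the two groups is returned; in the second the classes interleave, so the function returns `None`.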
Hyperplane
- The conceptual divide between classes of data.
- Weight vector: defines the hyperplane; it is represented and generated in weight space.
- Choosing the hyperplane
  - Minimum distance between samples
  - Least-squares method
  - Gradient descent
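As a rough sketch of how two of these options combine, the following Python fits a separating line to a toy two-dimensional data set by running gradient descent on the least-squares error. The data points, the -1/+1 targets, the learning rate, and the function names are all assumptions chosen for illustration. A least-squares fit is not guaranteed to separate every linearly separable data set; it does so for this well-separated toy case.

```python
# Toy 2-D data (made up for illustration): two well-separated clusters.
points = [(0, 0), (0, 1), (1, 0), (3, 3), (3, 4), (4, 3)]
targets = [-1, -1, -1, 1, 1, 1]


def fit_hyperplane(points, targets, learning_rate=0.01, epochs=2000):
    """Fit w1*x1 + w2*x2 + b to the targets by batch gradient descent
    on the mean squared error."""
    w1, w2, b = 0.0, 0.0, 0.0
    n = len(points)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x1, x2), t in zip(points, targets):
            err = (w1 * x1 + w2 * x2 + b) - t
            # Partial derivatives of the mean squared error.
            g1 += 2 * err * x1 / n
            g2 += 2 * err * x2 / n
            gb += 2 * err / n
        # Step against the gradient.
        w1 -= learning_rate * g1
        w2 -= learning_rate * g2
        b -= learning_rate * gb
    return w1, w2, b


def classify(point, w1, w2, b):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if w1 * point[0] + w2 * point[1] + b > 0 else -1


w1, w2, b = fit_hyperplane(points, targets)
predictions = [classify(p, w1, w2, b) for p in points]
```

Here `(w1, w2)` is the weight vector and `b` the offset; the line `w1*x1 + w2*x2 + b = 0` is the hyperplane in this two-dimensional case.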
Artificial Neural Networks, ANN
- Strengths: suited to high-dimensional problems and to complex relations between variables
- Weaknesses: theoretically complex, computationally intensive, needs large data sets, complicated to implement
- Kinds of ANN
  - Perceptrons, Multilayer Perceptrons
  - Deep learning neural networks
  - Kohonen networks
  - Convolutional neural networks
  - Radial Basis Functions
  - Recurrent neural networks
  - Support Vector Machines
  - Competitive learning
  - Boltzmann machines
Multilayer Perceptrons, MLP
- Challenges
  - Decide on the network topology.
    - how many hidden layers are needed
    - how many neurons are needed in each hidden layer
  - Find values for the weights that make the network produce the correct output values for the given input values.
  - Neural networks only accept numeric data.
    - Categorical attributes need to be converted to numeric ones.
      - One-hot encoding, thermometer encoding.
    - High values may need to be scaled so that all inputs fall into a similar range.
      - A log transform may be needed to pull the values into a target range: [-1, +1] or [0, 1].
  - The number of input neurons should be as small as possible.
    - adding neurons -> more parameters and weights -> amplifies any bias (overtrains the network)
    - one categorical attribute may have many attribute values, each adding a parameter -> adds to the risk of overtraining
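The input-preparation points above can be sketched in plain Python. This is an illustrative sketch under several assumptions: the helper names are hypothetical, the thermometer encoding shown is one common convention (all positions up to and including the level are set to 1), and the parameter count assumes a fully connected network with one bias per neuron.

```python
import math


def one_hot(value, categories):
    """One position per category; only the matching position is 1."""
    return [1 if value == c else 0 for c in categories]


def thermometer(value, ordered_levels):
    """For ordered categories: set every position up to the level to 1."""
    idx = ordered_levels.index(value)
    return [1 if i <= idx else 0 for i in range(len(ordered_levels))]


def min_max_scale(values, low=0.0, high=1.0):
    """Linearly rescale values into the range [low, high]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale constant values")
    return [low + (v - lo) * (high - low) / (hi - lo) for v in values]


def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Values spanning several orders of magnitude: log transform, then scale.
incomes = [1_000, 10_000, 100_000]
scaled_incomes = min_max_scale([math.log10(v) for v in incomes], -1.0, 1.0)

# 4 numeric inputs, 3 hidden neurons, 1 output.
small_net = count_parameters([4, 3, 1])
# Replace one input with a one-hot encoded attribute of 10 values:
# 3 remaining inputs + 10 one-hot inputs = 13 input neurons.
wide_net = count_parameters([13, 3, 1])
```

The last two lines illustrate the overtraining concern: one-hot encoding a single ten-valued attribute raises the parameter count from `small_net` to `wide_net` without adding any hidden neurons.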
Resilient Propagation, RProp
- directly adjusts the weight step based on local gradient information
- introduces an update value for each weight
  - the update value is adapted based on the sign of the partial derivative of the error with respect to the weight
- update value
  - if the sign changes (i.e. the last step jumped over a local minimum) -> slightly decreased
  - if the sign remains the same -> slightly increased
- weight
  - if the derivative is positive (i.e. error increasing) -> decreased by its update value
  - if the derivative is negative (i.e. error decreasing) -> increased by its update value
  - if the derivative changes sign, the last weight update is reverted (backtracking)
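The update rules above can be sketched for a single weight as follows. This is a minimal sketch, not the KNIME or Weka implementation: the function name, the quadratic toy error, and the constants (increase factor 1.2, decrease factor 0.5, and the bounds on the update value) are assumptions, although these factors are commonly quoted defaults for RProp.

```python
def sign(x):
    """Return -1, 0, or +1."""
    return (x > 0) - (x < 0)


def rprop_minimize(grad, w, steps=200, delta=0.1,
                   eta_plus=1.2, eta_minus=0.5,
                   delta_min=1e-6, delta_max=50.0):
    """Minimise a one-weight error function given its derivative `grad`."""
    prev_grad = 0.0
    last_step = 0.0
    for _ in range(steps):
        g = grad(w)
        if prev_grad * g > 0:
            # Same sign as before: slightly increase the update value.
            delta = min(delta * eta_plus, delta_max)
            last_step = -sign(g) * delta
            w += last_step
            prev_grad = g
        elif prev_grad * g < 0:
            # Sign changed: a minimum was jumped over. Decrease the
            # update value and revert the last weight update.
            delta = max(delta * eta_minus, delta_min)
            w -= last_step
            prev_grad = 0.0  # skip adaptation on the next step
        else:
            # First step, or the step right after a revert.
            last_step = -sign(g) * delta
            w += last_step
            prev_grad = g
    return w


# Toy error E(w) = (w - 3)^2 with derivative 2 * (w - 3).
w_final = rprop_minimize(lambda w: 2 * (w - 3), w=0.0)
```

Only the sign of the derivative is used to move the weight; its magnitude never enters the step, which is what distinguishes this scheme from plain gradient descent.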
KNIME
- RProp MLP Learner + MultiLayerPerceptron Predictor
- MultilayerPerceptron + Weka Predictor (back propagation with momentum)