Thesis Proposal

Improvement of classification neural

networks through auto-generation

of supplemental training inputs

David Windmueller

Christopher Newport University

 

Abstract

Neural nets have proven effective for classifying many-variable systems where conventional formula-based systems have failed. Unfortunately, the back propagation neural net gives poor results when trained on a data set that does not adequately cover the required input space. This study proposes techniques for the auto-generation of additional training inputs to improve the abilities of the final neural net. The additional inputs would be created from an evaluation of inadequacies in the initially trained net. These inadequacies would be determined not through an evaluation of the independent test data set, but through an autonomous evaluation of the trained net's output characteristics. The additional inputs would be presented to a human expert for classification and then added to the initial training set. Successful implementation of this method would let neural net programmers spend less effort creating training data sets, because inadequacies would be extracted automatically and presented for clarification.

I. Introduction

The back propagation neural network is a powerful software tool used to control or classify systems that are not easily described by mathematical formulae. It trains itself from a set of input/output pairs and then generalizes to inputs that it has never seen before. This makes it an effective approach for systems whose relationships are difficult to describe accurately with a set of rules. A properly trained neural network can also classify imprecise inputs that deviate from the original training set. As with most systems, the quality of the output is determined by the quality of the input: the classification abilities of the final neural net are dictated by the quality of the training set.

A. Rationale

The normal process for training a back propagation neural net involves a number of steps. The programmer needs to take the initial set of input/output data and separate it into two groups. The first set is used as a training set that shapes the internal weights of the net to produce the desired outputs. The second set is processed by the trained net as an independent check of its accuracy. The output of the trained net is compared to the desired outputs of the test set as a measurement of error.
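The split-and-measure procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 25% test fraction, and the use of a simple misclassification rate as the error measurement are all hypothetical choices, not details taken from this proposal.

```python
import random

def split_data(pairs, test_fraction=0.25, seed=0):
    """Shuffle input/output pairs and split them into (training, test) sets."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def error_rate(classify, test_pairs):
    """Fraction of test pairs whose predicted class differs from the desired one."""
    wrong = sum(1 for x, target in test_pairs if classify(x) != target)
    return wrong / len(test_pairs)
```

Here `classify` stands for any trained net wrapped as a function from an input pattern to a class label.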

B. Problem Statement

After finding that the network is unable to properly classify the test set, the programmer can continue to train the network on the initial data in hopes of lowering the test set data error. Another option is to find the specific areas where the net is failing and add more examples to the training set. These examples would help to clarify the desired classifications in the problem areas.

Currently, this iterative process is time-consuming, and it limits the ability of the neural net to the user's skill in creating and improving the initial training set. Data sets can also be too complicated for the programmer to visualize which inputs are needed to improve classification.

 

C. Purpose

This study will investigate the auto-generation of input patterns for improving the overall quality of the final training set. The input patterns will not be generated from deficiencies discovered through the use of the test set. Instead, they will be found from analysis of the network itself and from combinations of the original input data. The goal of this study is to compare the quality of various methods for auto-generating input data for a classification neural network. The improvements found might lead to a change in the requirements for creating a successful training data set: with the ability to find weak areas after training, the neural net system could tolerate an initial data set of lesser quality.

D. Research questions

To achieve the goal of this study, a number of questions must be answered. The first is a simple test to find if the addition of new data pairs will actually improve the total abilities of the network. The next question deals with the effectiveness of various methods for selecting examples from a large generated set. The final question compares the abilities of a number of auto-generation techniques to find which offers the greatest improvements to the overall performance of the network.

1. Will the addition of generated input data improve the classification abilities of the original neural net?

The answer to this question leads to two courses of research. If the response of a fully trained network is not improved by the introduction of new data, comparing the various auto-generation methods would be difficult. In that case, the net would be trained significantly less on the original data before the auto-generation techniques are applied. This approach would investigate whether the total number of training iterations needed to shape the internal weights could be lowered, compared with fully training the network on the initial training set.

2. Which method of sorting auto-generated data will provide better results?

The auto-generation methods will provide more input patterns than the human expert would want to evaluate. To minimize the burden on the human, various methods will be used to select the patterns that are expected to have the greatest positive effect in reducing classification error. The selections made by each sorting routine will be used for retraining, and the resulting output characteristics will be compared.

3. Which method of creating the auto-generated data will provide the best inputs for the improvement of the net?

This is the fundamental question of the research. The evaluation of the auto-generation methods will lead to a greater understanding of data set construction techniques for neural applications. It will also indicate how practical the various methods are. The research will demonstrate the accuracy gained with each method against the extra effort required from both the human and the computer.

II. Related literature

A. Back Propagation Neural Networks

The Artificial Neural Network is a software configuration used to emulate a function that learns. The foundation of the net is the neuron unit. These neurons are linked together to create a structure that accepts inputs and gives an appropriate output after a minimal amount of calculation. The neuron itself is similar to its biological counterpart in a number of ways. The first is its input and output structure: both neurons have a variable number of inputs and a single output. The premise of its operation is that all of the inputs are collected and evaluated, and the evaluation determines what the individual neuron sends as its output. The evaluation processes of the software and biological neurons share the concept of thresholds. The neuron fires its output if the sum of the inputs is greater than a set threshold. The amount that each input adds to the sum depends on a weighted multiplier.3
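The threshold behavior described above can be illustrated with a minimal Python sketch. The function names and example values are illustrative assumptions and are not part of the proposed software. Because back propagation requires a differentiable evaluation function, a smooth function such as the sigmoid commonly takes the place of the hard threshold during training, so a smooth variant is shown alongside it.

```python
import math

def neuron_output(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs exceeds the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0

def sigmoid_output(inputs, weights, threshold):
    """Smooth variant: the output rises gradually as the sum passes the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-(total - threshold)))
```

For example, with weights `[0.5, 0.9, 0.4]` and a threshold of 0.8, the input `[1, 0, 1]` produces a weighted sum of 0.9 and the neuron fires, while `[1, 0, 0]` produces 0.5 and it does not.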

To train a neural network, the multipliers are varied using a gradient descent technique, which corrects most strongly the values that cause the greatest part of the detected error. This process is used only during the training stage; once the weights have been properly set, the error is no longer computed.

Different research projects have sought improvements to the back propagation technique. These projects have focused on many elements of the network, including the evaluation function itself, variations of network construction, and training parameters such as momentum. Regarding the construction of the input data set, some work has studied the selective introduction of input pairs during training in hopes of maintaining the network's generalization abilities.4 Research has also been performed on when the gradient descent calculations should be applied to adjust the weights. These calculations can be performed after each input pair or after the whole input set has been processed.1
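The difference between updating after each input pair and updating after the whole set can be shown with a small Python sketch. For brevity it assumes a single linear unit with a squared-error measure rather than a full multi-layer net; the function names and learning rate are hypothetical.

```python
def gradient(weights, x, target):
    """Gradient of 0.5 * (y - target)**2 for a linear unit y = sum(w_i * x_i)."""
    y = sum(w * xi for w, xi in zip(weights, x))
    return [(y - target) * xi for xi in x]

def train_online(weights, pairs, rate):
    """Adjust the weights immediately after every input/output pair."""
    w = list(weights)
    for x, target in pairs:
        g = gradient(w, x, target)
        w = [wi - rate * gi for wi, gi in zip(w, g)]
    return w

def train_batch(weights, pairs, rate):
    """Accumulate the gradient over the whole set, then adjust the weights once."""
    total = [0.0] * len(weights)
    for x, target in pairs:
        g = gradient(weights, x, target)
        total = [ti + gi for ti, gi in zip(total, g)]
    return [wi - rate * ti for wi, ti in zip(weights, total)]
```

One pass over the same data can give different weights under the two schemes, because the per-pair version computes each gradient from weights that earlier pairs have already changed.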

  1. Generation Techniques

No input generation techniques designed for the specific purpose of this research have been located, but there are known concepts for manipulating a trained network to extract specific inputs. One possible method finds input values that cause the network to give a specific desired output.2 Another technique generates variations of data through simple bit changes to an original data set, based on Hamming distances.1
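As an illustration of the bit-change technique, the following Python sketch lists every pattern at a Hamming distance of exactly one from a binary input. It assumes inputs are encoded as tuples of 0s and 1s; the function name is hypothetical.

```python
def hamming_neighbors(bits):
    """Return every pattern that differs from `bits` in exactly one position."""
    neighbors = []
    for i in range(len(bits)):
        flipped = list(bits)
        flipped[i] = 1 - flipped[i]  # flip a single bit
        neighbors.append(tuple(flipped))
    return neighbors
```

A pattern of length m has m such neighbors, so the pattern `(1, 0, 1)` yields `(0, 0, 1)`, `(1, 1, 1)`, and `(1, 0, 0)`.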

III. Methodology

The study will involve a number of steps to answer the research questions. The first step is to pick a set of data for classification and evaluation. The data must be of a kind that allows the variations created by the auto-generation methods to be displayed to the user for classification. The researcher will then create an artificial neural net and train the initial net to a fixed level. The training progress will be monitored by continually evaluating the error on a test data set.

After this, the auto-generation software will create new entries for evaluation. These entries will go through a sorter and be presented for human classification. Various methods will be used for the data generation and sorting routines in hopes of finding the best solution.

One generation method that might be used will fix the outputs to a designed pattern and add an additional layer to the input side of the trained neural net. All of the inputs to the new layer will be fixed at a value of one. The net will be 'retrained' using back propagation while the original weights are kept constant, so that only the weights of the additional layer are adjusted until the desired output is produced. These weights will then represent an input that, when processed by the trained net, gives the desired output. The output pattern will be designed by the programmer to represent an ambiguous classification. Because these inputs are generated with no relationship to the original data that trained the net, there is a risk of creating data that will not help to improve the network. The sorting routine that might be used with this method will therefore compare the generated inputs and rank them according to how close they are to real data inputs.
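The added-layer method can be sketched in Python as gradient descent on the input vector while the trained weights stay frozen. The sketch assumes the units of the added layer are linear, so that with all of their inputs fixed at one their weights are simply the values presented to the original net. It also assumes a net with one hidden layer of sigmoid units and a single sigmoid output; the function names, learning rate, and step count are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """One hidden layer of sigmoid units feeding a single sigmoid output."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y

def invert(target, x0, W1, b1, W2, b2, rate=0.5, steps=2000):
    """Adjust only the input so the frozen net's output approaches `target`."""
    x = list(x0)
    for _ in range(steps):
        h, y = forward(x, W1, b1, W2, b2)
        # Back-propagate the error 0.5 * (y - target)**2 down to the inputs.
        delta_out = (y - target) * y * (1.0 - y)
        delta_hid = [delta_out * w * hj * (1.0 - hj) for w, hj in zip(W2, h)]
        grad_x = [sum(d * row[i] for d, row in zip(delta_hid, W1))
                  for i in range(len(x))]
        x = [xi - rate * gi for xi, gi in zip(x, grad_x)]
    return x
```

With a single output, a target of 0.5 is one way to express an ambiguous classification. The starting point `x0` influences which ambiguous input is found, so several starting points would probably be needed in practice.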

Another method that might be used will take a previously trained input pair and methodically change it while monitoring the output of the net. This method would find areas that were very close to misclassification. The importance of inputs would be ranked by how close they are to real data inputs.
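A minimal Python sketch of this perturbation method follows. It assumes the trained net is available as a function returning a single score between 0 and 1, with 0.5 treated as fully ambiguous, and it changes one input component at a time in fixed steps. The function name, step size, and margin are hypothetical parameters.

```python
def near_boundary_variants(score, x, step=0.1, max_steps=20, margin=0.1):
    """Perturb one component at a time; keep variants with a near-ambiguous score."""
    found = []
    for i in range(len(x)):
        for direction in (-1.0, 1.0):
            for k in range(1, max_steps + 1):
                variant = list(x)
                variant[i] = x[i] + direction * step * k
                if abs(score(variant) - 0.5) <= margin:
                    found.append(variant)
                    break  # keep only the nearest ambiguous variant this way
    return found
```

Stopping at the first ambiguous variant in each direction keeps the generated inputs as close as possible to the real input they were derived from.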

A simple method for data generation would be to use averages constructed from the original inputs. If the net was originally trained on n input pairs, averaging each possible combination of two input patterns yields n(n-1)/2 new patterns. These averages would be put through the network and sorted by the ambiguity of the classifications they produce.
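The averaging method and its ambiguity sort can be sketched in Python as follows. The sketch assumes numeric input patterns of equal length and a net that returns a single score between 0 and 1, where a score near 0.5 is taken to mean an ambiguous classification. The function names are hypothetical.

```python
from itertools import combinations

def pairwise_averages(inputs):
    """Average every unordered pair of input patterns: n * (n - 1) / 2 results."""
    return [
        [(a + b) / 2.0 for a, b in zip(p, q)]
        for p, q in combinations(inputs, 2)
    ]

def rank_by_ambiguity(candidates, score, keep):
    """Return the `keep` candidates whose score lies closest to 0.5."""
    return sorted(candidates, key=lambda c: abs(score(c) - 0.5))[:keep]
```

For four training inputs this produces 4 * 3 / 2 = 6 averaged patterns. The count grows quadratically with n, which is why a selection step is needed before the patterns reach the human expert.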

All of the chosen inputs will be sent to the human expert for classification, and the resulting input/output pairs will be added to the original data set. The net will then be retrained. Error measurements will be performed to compare network performance with and without the new data.
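The selection and augmentation step might look like the following Python sketch. It uses the closeness-to-real-data ranking mentioned earlier, with Euclidean distance as an assumed measure of closeness, and it represents the human expert as a callable named `ask_expert`. All names here are hypothetical.

```python
import math

def distance_to_nearest(candidate, real_inputs):
    """Euclidean distance from a candidate to its closest real input pattern."""
    return min(math.dist(candidate, r) for r in real_inputs)

def augment_training_set(training_pairs, candidates, ask_expert, keep):
    """Label the `keep` candidates nearest to real data and append them."""
    real_inputs = [x for x, _ in training_pairs]
    chosen = sorted(
        candidates, key=lambda c: distance_to_nearest(c, real_inputs)
    )[:keep]
    return list(training_pairs) + [(c, ask_expert(c)) for c in chosen]
```

The function returns a new list and leaves the original training set untouched, so the net trained on the original data and the net trained on the augmented data can be compared directly.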

To keep the models consistent, the neural net weights will always start from the same randomized set, with a constant number of neurons. The only variation between trials will be the different additions to the original training set from the different auto-generation techniques.
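One simple way to obtain the same randomized starting weights for every trial is a seeded random number generator, as in the Python sketch below. The seed value, the weight range, and the function name are arbitrary assumptions.

```python
import random

def initial_weights(n_weights, seed=1999, scale=0.5):
    """Return the same pseudo-random starting weights on every call."""
    rng = random.Random(seed)  # private generator, unaffected by other code
    return [rng.uniform(-scale, scale) for _ in range(n_weights)]
```

Using a private `random.Random` instance rather than the module-level functions keeps the starting weights reproducible even if other parts of the software also draw random numbers.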

C. Schedule

A proposed schedule for activities in this study, with projected milestones and completion dates, is shown in the table below.

Milestone                     Date of Completion

Thesis committee selected     04/01/99
Thesis proposal accepted      06/04/99
Training data collected       06/10/99
Software developed            07/30/99
Training completed            08/30/99
Research completed            09/30/99
Draft thesis approved         10/30/99
Final thesis submitted        11/14/99
Thesis defense                11/28/99

IV. Key references

 

1 John Hertz, Anders Krogh, and Richard G. Palmer, Introduction to the Theory of Neural Computation (Addison-Wesley Publishing Company, 1991)

2 Timothy Masters, Introduction to the Theory of Neural Computation (Addison-Wesley Publishing Company, 1991)

3 David M. Skapura, Building Neural Networks (ACM Press, 1996)

4 Fabio Tamburini and Renzo Davoli, An algorithmic method to build good training sets for neural-network classifiers (University of Bologna, Technical Report UBLCS-94-18, 1994)

 

 

 

 

 

 

 

Improvement of classification neural

networks through auto-generation

of supplemental training inputs

 

 

By

David K Windmueller

 

Thesis proposal submitted to the Graduate Faculty of

Christopher Newport University

for the degree of

Master of Science in Applied Physics and Computer Science

with a concentration in Computer Science

to be awarded August 1999

 

 

Submitted

June 4, 1999

 

 

 

Approved:

Dr. David Hibler, Thesis Advisor ______________________________

Dr. John Hardie ______________________________

Dr. Lynn Lambert ______________________________