Reference: Ohno-Machado, L. Identification of Low Frequency Patterns in Backpropagation Neural Networks. Knowledge Systems Laboratory, Medical Computer Science, November, 1994.
Abstract: Although neural networks have been widely applied to medical problems in recent years, their applicability has been limited for a variety of reasons. One of these barriers has been the inability to discriminate rare classes of solutions (i.e., the identification of categories that are infrequent). In this article, I demonstrate that a system of hierarchical neural networks (HNN) can overcome the problem of recognizing low frequency patterns, and therefore can improve the prediction power of neural-network systems. HNN are designed according to a divide-and-conquer approach: Triage networks are able to discriminate supersets that contain the infrequent pattern, and these supersets are then used by Specialized networks, which discriminate the infrequent pattern from the other ones in the superset. The supersets that are discriminated by the Triage networks are based on pattern similarity. The application of multilayered neural networks in more than one step allows the prior probability of a given pattern to increase at each step, provided that the predictive power of the network at the previous level is high. The method has been applied to one artificial set and one real set of data. In the artificial set, the distribution of the patterns was known and no noise was present. In this experiment, the HNN provided better discrimination than a standard neural network for all classes. In a real data set of nine thousand patients who were suspected of having thyroid disorders, the HNN also provided higher sensitivity than its corresponding standard neural network (without a corresponding decay in specificity) given the same time constraints. I discuss the reasons why the sensitivity achieved by systems of divide-and-conquer hierarchical neural networks is superior to that of non-hierarchical neural network models, the conditions in which the algorithm should be applied, potential improvements, and current limitations.
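Illustration (not from the paper): the abstract's claim that the prior probability of a rare pattern rises at each step, provided the previous network predicts well, can be sketched with Bayes' rule. The function name, its parameters, and the numbers in the usage note are hypothetical. The sketch also assumes that the Triage network passes rare-pattern cases at the same rate as the other members of the superset, which the abstract does not state.

```python
def prior_after_triage(rare_prior, superset_prior, sensitivity, specificity):
    """Prevalence of the rare pattern among cases the triage step passes on.

    rare_prior      -- prevalence of the rare pattern in the full data set
    superset_prior  -- prevalence of the superset that contains the rare pattern
    sensitivity     -- fraction of superset cases the triage step passes on
    specificity     -- fraction of non-superset cases the triage step rejects

    Assumption (hypothetical): rare-pattern cases are passed on at the same
    rate as any other superset member.
    """
    if not 0.0 < rare_prior <= superset_prior <= 1.0:
        raise ValueError("need 0 < rare_prior <= superset_prior <= 1")
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity <= 1.0):
        raise ValueError("sensitivity and specificity must lie in [0, 1]")

    # Cases forwarded to the specialized step: correctly passed superset
    # members plus non-members that were let through by mistake.
    true_pass = superset_prior * sensitivity
    false_pass = (1.0 - superset_prior) * (1.0 - specificity)
    passed = true_pass + false_pass
    if passed == 0.0:
        raise ValueError("the triage step passes no cases")

    # Rare-pattern cases among the forwarded ones.
    rare_pass = rare_prior * sensitivity
    return rare_pass / passed
```

With a rare pattern at 1% prevalence inside a superset at 10% prevalence, a triage step with sensitivity and specificity of 0.95 raises the prior seen by the next step to about 6.8%. A triage step no better than chance (both 0.5) leaves it at 1%, which is why the predictive power of the previous level matters.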
Notes: Updated November 1994.
Full paper available as PostScript (ps).