An automated rule refinement system

Author: Robert Andrews


Keywords: Machine learning, Domain knowledge, Black box, Knowledge acquisition, Computer science, Network architecture, Domain theory, Artificial neural network, Artificial intelligence, Knowledge base, Problem domain

Abstract: Artificial neural networks (ANNs) are essentially a 'black box' technology. The lack of an explanation component prevents the full and complete exploitation of this form of machine learning. During the mid 1990s the field of 'rule extraction' emerged. Rule extraction techniques attempt to derive a human comprehensible explanation structure from a trained ANN. Andrews et al. (1995) proposed the following reasons for extending the ANN paradigm to include a rule extraction facility:
* provision of a user explanation capability
* extension of the ANN paradigm to 'safety critical' problem domains
* software verification and debugging of ANN components in software systems
* improving the generalization of ANN solutions
* data exploration and induction of scientific theories
* knowledge acquisition for symbolic AI systems
An allied area of research is that of 'rule refinement'. In rule refinement an initial rule base (i.e. what may be termed 'prior knowledge') is inserted into an ANN by prestructuring some or all of the network architecture, weights, activation functions, learning rates, etc. The rule refinement process then proceeds in the same way as normal rule extraction, viz. (1) train the network on the available data set(s); and (2) extract the 'refined' rules. Very few ANN techniques have the capability to act as a true rule refinement system. Existing techniques, such as KBANN (Towell & Shavlik, 1993), are limited in that the rule base used to initialize the network must be nearly complete, and the refinement process is limited to modifying antecedents. The limitations of existing techniques severely limit their applicability to real world problem domains. Ideally, a rule refinement technique should be able to deal with incomplete initial rule bases, modify antecedents, remove inaccurate rules, and add new knowledge by generating new rules. The motivation for this research project was to develop such a rule refinement system and to investigate its efficacy when applied to both nearly complete and incomplete problem domains. The premise behind rule refinement is that the refined rules better represent the actual domain theory than the initial domain theory used to initialize the network. The hypotheses tested in this research include that the utilization of prior domain knowledge will:
* speed up network training,
* produce smaller trained networks,
* produce more accurate trained networks, and
* bias the learning phase towards a solution that 'makes sense' in the problem domain.
Geva, Malmstrom, & Sitte (1998) described the Local Cluster (LC) Neural Net. Geva et al. showed that the LC network is able to learn / approximate complex functions to a high degree of accuracy. The hidden layer of the LC network is comprised of basis functions (the local cluster units) that are composed of sigmoid based 'ridge' functions. In the General form of the LC network the ridge functions can be oriented in any direction.
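We describe RULEX, a technique designed to provide an explanation component for its underlying Local Cluster ANN through the extraction of symbolic rules from the weights of the local cluster units of the trained ANN. RULEX exploits a feature, i.e. the axis parallel nature of the Restricted Local Cluster (Geva, 2002), that allows hyper-rectangular rules of the form IF $\forall\, 1 \le i \le n : x_i \in [x_i^{\mathrm{lower}}, x_i^{\mathrm{upper}}]$ THEN the pattern belongs to the target class, to be easily extracted from the local functions that comprise the hidden layer of the LC network. RULEX is tested on 14 applications available in the public domain. RULEX results are compared with a leading machine learning technique, See5, with RULEX generally performing as well as See5 and in some cases outperforming See5 in predictive accuracy. We describe RULEIN, a technique that allows symbolic rules to be converted into the parameters that define local cluster functions. RULEIN allows existing domain knowledge to be captured in the architecture of an LC ANN, thus facilitating the first phase of the rule refinement paradigm. RULEIN is tested on a variety of artificial and real world problems. Experimental results indicate that RULEIN is able to satisfy the first requirement of a rule refinement technique by correctly translating a set of symbolic rules into an LC ANN that has the same predictive behaviour as the set of rules from which it was constructed. Experimental results also show that in cases where a strong domain theory exists, initializing an LC network using RULEIN generally speeds up network training and produces smaller, more accurate trained networks, with the trained network properly representing the underlying domain theory. In cases where a weak domain theory exists the same results are not always apparent. Experiments with the RULEIN / LC / RULEX rule refinement method show that the method is able to remove inaccurate rules from the initial knowledge base, modify rules in the initial knowledge base that are only partially correct, and learn new rules not present in the initial knowledge base. The combination of RULEIN / LC / RULEX is thus shown to be an effective rule refinement technique for use with a Restricted Local Cluster network.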
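The hyper-rectangular rule form named in the abstract (every input x_i must lie inside an interval [lower_i, upper_i] for the pattern to belong to the target class) can be illustrated with a short sketch. This is not the RULEX implementation. The names `HyperRectRule`, `covers`, and `classify`, the first-match policy, and the `default` label are all hypothetical and chosen only for illustration. The sketch shows how such a rule is evaluated once its bounds are known, not how the bounds are extracted from network weights.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class HyperRectRule:
    """Hypothetical container for one rule:
    IF every x_i lies in [lower_i, upper_i] THEN pattern is in target_class."""
    lower: Sequence[float]
    upper: Sequence[float]
    target_class: str

    def covers(self, x: Sequence[float]) -> bool:
        # A pattern is covered only if all of its components fall inside
        # the corresponding closed interval (axis-parallel hyper-rectangle).
        if len(x) != len(self.lower) or len(x) != len(self.upper):
            raise ValueError("pattern and rule bounds must have the same length")
        return all(lo <= xi <= hi
                   for lo, xi, hi in zip(self.lower, x, self.upper))


def classify(rules: Sequence[HyperRectRule], x: Sequence[float],
             default: str = "other") -> str:
    # Assumption: the first covering rule wins; patterns covered by no rule
    # receive the (hypothetical) default label.
    for rule in rules:
        if rule.covers(x):
            return rule.target_class
    return default


# Usage: a two-input rule, x_1 in [0.0, 0.5] and x_2 in [1.0, 2.0].
rule = HyperRectRule(lower=[0.0, 1.0], upper=[0.5, 2.0], target_class="positive")
label_inside = classify([rule], [0.25, 1.5])
label_outside = classify([rule], [0.9, 0.0])
```

Because each antecedent constrains a single input, a rule of this shape reads directly as a conjunction of range tests. This is what makes axis-parallel units convenient for extraction.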

References (20)
LiMin Fu, Rule learning by searching on adapted nets. National Conference on Artificial Intelligence, pp. 590-595 (1991)
Sebastian B. Thrun, Extracting Provably Correct Rules from Artificial Neural Networks. University of Bonn (1993)
Peter Auer, Robert C. Holte, Wolfgang Maass, Theory and Applications of Agnostic PAC-Learning with Small Decision Trees. Machine Learning Proceedings 1995, pp. 21-29 (1995), doi:10.1016/B978-1-55860-377-6.50012-8
G. Towell, J. Shavlik, The Extraction of Refined Rules from Knowledge Based Neural Networks. Machine Learning, vol. 13, pp. 202-221 (1993)
Mark W. Craven, Jude W. Shavlik, Using Sampling and Queries to Extract Rules from Trained Neural Networks. Machine Learning Proceedings 1994, pp. 37-45 (1994), doi:10.1016/B978-1-55860-335-6.50013-1
Saito, Nakano, Medical diagnostic expert system based on PDP model. IEEE 1988 International Conference on Neural Networks, pp. 255-262 (1988), doi:10.1109/ICNN.1988.23855
G. David Garson, Interpreting neural-network connection weights. AI Expert, vol. 6, pp. 46-51 (1991), doi:10.5555/129449.129452
J. R. Quinlan, Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research, vol. 4, pp. 77-90 (1996), doi:10.1613/JAIR.279