Author: Robert Andrews
DOI:
Keywords: Machine learning, Domain knowledge, Black box, Knowledge acquisition, Computer science, Network architecture, Domain theory, Artificial neural network, Artificial intelligence, Knowledge base, Problem domain
Abstract: Artificial neural networks (ANNs) are essentially a 'black box' technology. The lack of an explanation component prevents the full and complete exploitation of this form of machine learning. During the mid 1990's the field of 'rule extraction' emerged. Rule extraction techniques attempt to derive a human comprehensible structure from a trained ANN. Andrews et al. (1995) proposed the following reasons for extending the ANN paradigm to include a rule extraction facility:

* provision of a user explanation capability;
* extension to 'safety critical' problem domains;
* software verification and debugging of ANN components in software systems;
* improving the generalization of ANN solutions;
* data exploration and the induction of scientific theories;
* knowledge acquisition for symbolic AI.

An allied area of research is that of 'rule refinement'. In rule refinement an initial rule base (i.e. what may be termed 'prior knowledge') is inserted into an ANN by prestructuring some or all of the network architecture, weights, activation functions, learning rates, etc. The rule refinement process then proceeds in the same way as normal rule extraction, viz: (1) train the network on the available data set(s); (2) extract the 'refined' rules. Very few techniques are able to act as a true rule refinement system. Existing techniques, such as KBANN (Towell & Shavlik, 1993), are limited in that the rule base used to initialize the network must be nearly complete and in that they can only modify rule antecedents. These limitations severely restrict the applicability of existing techniques to real world problem domains. Ideally, a rule refinement technique should be able to deal with incomplete initial rule bases, modify antecedents, remove inaccurate rules, and add new rules.

The motivation for this project was to develop such a rule refinement system and to investigate its efficacy when applied to both real world and artificial problems. The premise behind rule refinement is that the refined rules represent the actual domain theory better than the initial rule base used to structure the network. The hypotheses tested include: that the utilization of prior knowledge will speed up training, produce smaller and more accurate networks, and bias the learning phase towards a solution that 'makes sense' in the problem domain.

In 1998 Geva, Malmstrom, and Sitte (1998) described the Local Cluster (LC) Neural Net. Geva showed that the LC net can learn and approximate complex functions to a high degree of accuracy. The hidden layer of the LC net is comprised of basis functions (the local cluster units), which are composed of sigmoid based 'ridge' functions. General ridge functions can be oriented in any direction. We describe RULEX, a technique designed to provide an explanation of the underlying function through the weights of the local cluster units. RULEX exploits the axis parallel feature of the Restricted LC net (Geva, 2002), which allows hyper-rectangular rules of the form

IF ∀ i, 1 ≤ i ≤ n : x_i ∈ [lower_i, upper_i] THEN pattern belongs to the target class

to be easily extracted. Results comprise 14 applications to public domain data sets and are compared with a leading machine learning technique, See5, with RULEX generally performing as well as See5 and in some cases outperforming it in predictive accuracy.

We also describe RULEIN, a technique that allows rules of the above form to be converted into the parameters that define local cluster units. RULEIN allows prior knowledge to be captured in the LC network architecture, thus facilitating the first phase of the rule refinement paradigm. RULEIN is tested on a variety of artificial problems. Experimental results indicate that RULEIN satisfies the requirement of correctly translating a rule set into a network that has the same behaviour as the rule set from which it was constructed. Experiments also show that where strong prior knowledge exists, initializing the network with RULEIN speeds up training and produces smaller networks that properly represent the domain theory, while where only weak prior knowledge exists these advantages are not always apparent. Experiments in which the prior knowledge is only partially correct show that the method is able to refine the rules present in the initial rule base. The combination of RULEIN and RULEX is shown to be an effective means of using prior knowledge in a rule refinement system.
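The hyper-rectangular rule format quoted in the abstract is concrete enough to sketch in code. The following Python fragment is a minimal illustration, not the thesis implementation: the class name, the covers check, and the centre/half-width conversion are all assumptions, included only to show how an axis-parallel rule of the stated form can be tested against a pattern and re-expressed as the box parameters a RULEIN-style insertion step would need.

```python
# Illustrative sketch (hypothetical names, not the thesis code): a
# hyper-rectangular rule "IF for all i: x_i in [lower_i, upper_i] THEN
# target class" and its centre/half-width parametrisation.
from dataclasses import dataclass
from typing import Sequence, Tuple, List


@dataclass
class HyperRectRule:
    lower: Sequence[float]     # lower_i for each input attribute
    upper: Sequence[float]     # upper_i for each input attribute
    target_class: str

    def covers(self, x: Sequence[float]) -> bool:
        """True if every attribute of the pattern lies inside its interval."""
        return all(lo <= xi <= hi for xi, lo, hi in zip(x, self.lower, self.upper))

    def to_box_parameters(self) -> Tuple[List[float], List[float]]:
        """Centre and half-width per attribute: the kind of axis-parallel
        'box' parameters a RULEIN-style insertion step could write into a
        restricted local cluster unit (illustrative only)."""
        centres = [(lo + hi) / 2.0 for lo, hi in zip(self.lower, self.upper)]
        half_widths = [(hi - lo) / 2.0 for lo, hi in zip(self.lower, self.upper)]
        return centres, half_widths


# Usage example
rule = HyperRectRule(lower=[0.0, 0.2], upper=[0.5, 0.8], target_class="positive")
print(rule.covers([0.3, 0.5]))    # True: pattern inside the hyper-rectangle
print(rule.covers([0.7, 0.5]))    # False: first attribute outside [0.0, 0.5]
print(rule.to_box_parameters())   # ([0.25, 0.5], [0.25, 0.3])
```

The centre/half-width view is only one plausible way to relate such a rule to an axis-parallel local cluster unit; the thesis defines the actual mapping used by RULEIN and RULEX.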