Author: Sandro Saitta
Keywords: Feature selection, Set (abstract data type), Artificial intelligence, System identification, Machine learning, Computer science, Data set, Data stream mining, Cluster analysis, Data collection, Data mining, Decision support system
Abstract: Data alone are worth almost nothing. While data collection is increasing exponentially worldwide, a clear distinction between retrieving data and obtaining knowledge has to be made. Data are retrieved while measuring phenomena or gathering facts. Knowledge refers to data patterns and trends that are useful for decision making. Data interpretation creates a challenge that is particularly present in system identification, where thousands of models may explain a given set of measurements. Manually interpreting such data is not reliable. One solution is to use data mining. This thesis thus proposes an integration of techniques from data mining, a field of research where the aim is to find knowledge from data, into an existing multiple-model system identification methodology. It is shown that, within a framework for decision support, data mining techniques constitute a valuable tool for engineers performing system identification. For example, clustering techniques group similar models together in order to guide subsequent decisions, since they might indicate possible states of a structure. A main issue concerns the number of clusters, which, usually, is unknown. For determining the correct number of clusters in data and estimating the quality of a clustering algorithm, a score function is proposed. The score function is a reliable index for estimating the number of clusters in a given data set, thus increasing understanding of results. Furthermore, useful information for engineers who perform system identification is achieved through the use of feature selection techniques. They allow selection of relevant parameters that explain candidate models. The core algorithm is a feature selection strategy based on global search. In addition to providing information about the candidate model space, data mining is found to be a valuable tool for supporting decisions related to subsequent sensor placement. When integrated into a methodology for iterative sensor placement, clustering is found to provide useful support by giving a rational basis for decisions related to subsequent sensor placement on existing structures. Greedy and global search strategies should be selected according to the context. Experiments show that whereas global search is more efficient for initial sensor placement, a greedy strategy is more suitable for iterative sensor placement.