Authors: Mohammad Mahdi Hassan, Martin Blom, Gufran Ahmad Ansari
DOI:
Keywords:
Abstract: It is common to collect data from practitioners in the software engineering field using surveys and questionnaires. These data are usually analyzed with descriptive statistics that treat the entire population as an undivided group, sometimes complemented by sampling methods to capture variation within the sample. In many cases, the survey population is partitioned into smaller groups using available background information about the participants. These techniques are valid, but they reveal opinion diversity only when it correlates with the background variables, and they fail to identify sub-groups that cut across multiple background variables. Existing approaches can thus capture general trends but may miss the opinions of minority sub-groups. The problem becomes more complex in longitudinal studies, where minority opinions might fade or resolve over time. Data from longitudinal studies may contain patterns that can be extracted using a clustering process. These patterns may unveil supplementary information and draw attention to viewpoints that differ from those exhibited by the sample population as a whole. This approach may reveal the range of opinion variation between groups over time and make it possible to identify minorities. In our research, we have investigated the suitability of clustering techniques for analyzing categorical data from longitudinal studies.