Author: Ben Veal
DOI:
Keywords:
Abstract: Say we have a set of data which can be represented by a set of distinct binary vectors A ⊆ {0, 1}^n (e.g. medical data: each vector could correspond to a patient, and each entry to the presence or absence of a particular symptom), and each vector has a corresponding label of either 0 or 1 (for example, this could represent whether the patient has a disease or not). We form some classification rule h : {0, 1}^n → {0, 1} that correctly labels all vectors in A. Now we are presented with a new vector x ∈ {0, 1}^n \ A (with no assumptions about what distribution it may have come from), and would ideally want x to be correctly classified by h. We might expect that if x is 'similar' to A then it is more likely to be correctly classified by h (since h is correct, or 'consistent', on A), but how do we measure similarity to A? Here we explore one such measure, investigating its usefulness and its combinatorial and extremal properties.