We propose programs to determine from the cash register receipts which baskets were purchased by the same customer. The putative customers can then be given identifiers. Programs can infer more facts about customer characteristics and behavior with facts about purchases of an identified customer over time than could be inferred from mere statistics about the baskets themselves.
This example of phenomenal data mining is straightforward in that it is reasonably clear what a successful result would be and how it might be used. We hope to make it plausible that enough information is present in the data to usefully distinguish customers. However, experiment is needed to verify that feasible algorithms exist.
Demographic information about customers is known to be useful, e.g. their ages, occupations, sexes and incomes. When this information is supplied, e.g. in mail order situations where credit is granted, it is extensively used. However, in our supermarket chain example, that information is not in the database of transactions. Let us consider inferring it; it might then be used in any of the presently conventional ways.
There are several approaches to associating baskets purchased by the same customer.