Database Scoring with Object-Based Segmentation


Segmentation studies in practical market research generally fall into two categories: those that are related to company databases and those that aren’t. The latter are often based on survey data alone and tend to primarily use attitudinal information for identifying segments in the market. Segments so formed are quite rich, but once the segments have been formed, identifying segment members in the broader market is hard to do. Some form of broad targeting has to be used, perhaps based on demographic and media usage variables. On the other hand, segments formed on the basis of database variables don’t face the same problem since it is quite straightforward to score the entire database. But often the use of database variables alone does not provide sufficiently rich segment descriptions. This poses a dilemma for companies that want rich segmentation schemes and the ability to score their database using the segments so developed. Should they go with an attitudinal segmentation scheme or a demographic/transactional segmentation scheme?

Traditionally the answer has been far less than ideal. One option is to create rich segments using attitudinal variables and then try to predict those segments using whatever information is available in the database, with the hope of developing acceptable prediction equations that can then be used to score the entire database. The problem with this approach is that demographic, transactional and behavioral variables usually do not predict attitudes well, thus causing substantial misclassification. Another approach is to mix attitudinal variables with variables from the database in creating the segmentation scheme in the hopes of creating more “friendly” prediction equations. How well this works often depends on which type of variable was more influential in forming the segments, and based on that the quality of the prediction equations would vary. A third approach is to use attitudinal variables only in the creation of segments and prediction equations. Customers in the database are then scored based on the answers they give to the set of attitudinal questions. Of course, this works only in cases where a company has the means to query a large number of its customers on the key questions.

Object-Based Segmentation

A more recently proposed solution to this problem is object-based segmentation. In this approach, a large number of segments are initially formed using demographic or transactional variables already available in the database. These segments are called the objects. Next, these objects are used as the “respondents” in a second segmentation analysis, where the basis variables are the attitudinal variables that are likely to yield rich segments. Once this analysis has been completed, classifying the database becomes easy, since the objects were formed using variables already available in the database.

For example, suppose age, income, education and gender are available for all customers in the database. The first segmentation analysis would create many segments (say 50-200), or objects, based on these demographic variables. Next, assume that there are 10 attitudinal variables that need to be used as basis variables. The mean score on each of those 10 attitudinal variables is calculated for each object, and those means are used as the basis variables in the second segmentation analysis. When the final solution is obtained, the segments can be interpreted based on the attitudinal variables. For classification, however, the demographically defined objects make it easy to assign every customer in the database to a segment.
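The two stages described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in any particular study: the objects are formed by simple binning of age, income and gender (a stand-in for a first-stage segmentation analysis), the second stage uses a toy k-means, and the attitude names `price_sensitive` and `brand_loyal`, the band cut-offs and the survey rows are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def object_key(row):
    """Stage 1 stand-in (assumption): an object is one cell of binned demographics."""
    age_band = min(row["age"] // 20, 3)            # 0-19, 20-39, 40-59, 60+
    income_band = min(row["income"] // 40_000, 2)  # <40k, 40k-80k, 80k+
    return (age_band, income_band, row["gender"])


def object_means(survey_rows, attitudes):
    """Average each attitudinal variable over the survey respondents in each object."""
    grouped = defaultdict(list)
    for row in survey_rows:
        grouped[object_key(row)].append([row[a] for a in attitudes])
    return {key: [mean(col) for col in zip(*rows)] for key, rows in grouped.items()}


def squared_distance(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, iterations=20):
    """Toy k-means (Lloyd's algorithm); the first k points seed the centers."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [mean(col) for col in zip(*members)]
    return labels


def object_based_segments(survey_rows, attitudes, k):
    """Stage 2: cluster the objects (not the respondents) on their mean attitudes."""
    means = object_means(survey_rows, attitudes)
    keys = sorted(means)
    return dict(zip(keys, kmeans([means[key] for key in keys], k)))


# Made-up survey rows: demographics plus two attitude ratings on a 1-5 scale.
survey = [
    {"age": 25, "income": 30_000, "gender": "F", "price_sensitive": 5, "brand_loyal": 1},
    {"age": 31, "income": 35_000, "gender": "M", "price_sensitive": 4, "brand_loyal": 2},
    {"age": 28, "income": 90_000, "gender": "F", "price_sensitive": 4, "brand_loyal": 1},
    {"age": 65, "income": 30_000, "gender": "M", "price_sensitive": 1, "brand_loyal": 5},
    {"age": 70, "income": 95_000, "gender": "F", "price_sensitive": 2, "brand_loyal": 5},
    {"age": 62, "income": 85_000, "gender": "F", "price_sensitive": 1, "brand_loyal": 4},
]
segments = object_based_segments(survey, ["price_sensitive", "brand_loyal"], k=2)
```

The result maps each object to a segment number. In practice the toy k-means would likely be replaced by whatever clustering routine the analyst normally uses, and the number of objects would be far larger than in this example.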

As can be seen, this approach has considerable advantages when there is a need to classify a database. As long as a set of variables (demographic or otherwise) is available in the database for all customers, this approach can be used. The method of classifying the database is straightforward and does not even require any form of discriminant or logistic regression modeling. Generally speaking, however, the attitudinal richness of the segments will depend to a large degree on the strength of the relationship between the demographic variables and the attitudinal variables. The stronger the relationship, the richer the segments will be.
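To make the "no further modeling" point concrete, the following sketch scores a database with nothing more than a lookup. Everything in it is hypothetical: the object definitions (two age bands crossed with two income bands), the segment names, the field names and the customer records are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical result of the second-stage analysis: object -> attitudinal segment.
OBJECT_TO_SEGMENT = {
    ("under_40", "lower_income"): "Segment A",
    ("under_40", "higher_income"): "Segment A",
    ("40_plus", "lower_income"): "Segment B",
    ("40_plus", "higher_income"): "Segment C",
}


def object_key(record):
    """Derive the object from variables every database record carries (assumed rule)."""
    age_band = "under_40" if record["age"] < 40 else "40_plus"
    income_band = "lower_income" if record["income"] < 50_000 else "higher_income"
    return (age_band, income_band)


def score_database(records, object_to_segment):
    """Attach a segment to every record by looking up its object; None if unknown."""
    return [{**record, "segment": object_to_segment.get(object_key(record))}
            for record in records]


# Made-up customer records.
database = [
    {"customer_id": 1, "age": 27, "income": 32_000},
    {"customer_id": 2, "age": 55, "income": 41_000},
    {"customer_id": 3, "age": 63, "income": 88_000},
    {"customer_id": 4, "age": 35, "income": 72_000},
]
scored = score_database(database, OBJECT_TO_SEGMENT)
segment_sizes = Counter(record["segment"] for record in scored)
```

Because every customer belongs to exactly one object, and every object to exactly one segment, scoring reduces to computing the object key and reading off the segment. Objects that never appeared in the survey have no attitudinal means and would need a fallback rule, which this sketch leaves as `None`.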

An Example

A healthcare company wanted to create attitudinal segments based on survey results that could then be used to score its database of customers. A set of 18 attitudinal attributes was identified as likely candidates for the segmentation. In order to use object-based segmentation, we identified five demographic variables from the database. The first stage of the analysis was to create a large set of objects based on these demographic variables. After a bit of experimentation we settled on 108 objects of varying sizes. In the second stage, we calculated the means of the 18 attitudinal statements for each of the 108 objects and used this 18 by 108 matrix as the input to the segmentation analysis. The final solution had four segments with substantial differences in attitudes that made clear sense to the client, both in terms of their composition and their affinity for products.

For comparison, we also ran a traditional segmentation analysis, using the raw respondent-level data on the same 18 attributes. Again, we were able to find four segments, and the content of the segments was substantially similar to what we had obtained in the object-based segmentation. This gave us confidence that the objects were not distorting the data and leading us down the wrong path. The advantage of using the object-based segments was that it was now possible to directly classify the members in the client’s database without resorting to any further modeling. Hence, the client was able not only to get an attitudinal segmentation but also to classify its database easily.
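One way to put a number on how similar two segmentation solutions are, offered here as an illustration rather than as the comparison actually performed in this study, is to cross-tabulate respondents' segment assignments under both methods and compute the Rand index, the share of respondent pairs that the two solutions treat the same way (together in both, or apart in both). The label lists below are invented for the example.

```python
from collections import Counter
from itertools import combinations


def cross_tab(labels_a, labels_b):
    """Count respondents in each (segment in solution A, segment in solution B) cell."""
    return Counter(zip(labels_a, labels_b))


def rand_index(labels_a, labels_b):
    """Share of respondent pairs grouped consistently by both solutions.

    Segment numbers are arbitrary, so only "same segment or not" is compared.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Made-up assignments for eight respondents under the two approaches.
traditional = [0, 0, 1, 1, 2, 2, 3, 3]
object_based = [2, 2, 0, 0, 1, 1, 3, 0]

table = cross_tab(traditional, object_based)
agreement = rand_index(traditional, object_based)
```

A value of 1.0 means the two solutions group respondents identically, whatever the segment numbering. Membership agreement is only one lens, and comparing the attitudinal profiles of the segments, as described above, remains the more substantive check.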

The primary difference between the two solutions lay in where their richness showed up: in the attitudes or in the demographics. The traditional segmentation solution was richer in that the segments showed more differences on the attitudes. In comparison, the object-based segmentation was not as rich on attitudinal differences, but it had much clearer demographic differences. This is an expected result given the nature of each analysis. The traditional segmentation uses the attitudinal variables directly, thus providing more discrimination, while the object-based segmentation uses the demographic variables as a vehicle to segment the attitudes. What matters in the end is that object-based segmentation not only produces a substantively satisfying solution but also makes classifying the customers in the database a trivial problem. It should be noted that the actual segmentation technique used is no more or less important in object-based segmentation than in traditional segmentation.

When an attitudinal segmentation is needed and an existing database must also be classified, object-based segmentation seems worthy of consideration.