|
Nearly all methods used before conjoint came along involved asking
consumers direct questions about the features they wanted in a product,
often using importance-rating scales. Direct rating methods had
several serious drawbacks. If given no constraints, consumers will
tend to rate nearly everything as important. After all, it costs
nothing to give a proposed feature a "highly important"
rating when participating in a survey. In a product-feature study
using a list of 20 product attributes, for instance, 15 or 17 features
(if not all 20) could emerge as "crucial" for the product.
Another problem with rating scales lies in the way people tend
to give what they perceive to be socially desirable answers. But
what people think they should do and what they actually do often
differ.
Finally, even sophisticated shoppers often have trouble stating
what motivates them to choose one product over another. The choice
process is something like the process of riding a bicycle in that
it is largely based on types of judgment that most people find difficult
to precisely describe.
Direct-question research typically resulted in overpriced products
loaded with features that had little appeal to real-world consumers,
and stories of failed products and services are legion.
Enter Conjoint Analysis
Conjoint analysis made product and service testing more realistic.
Its most common form, full-profile conjoint analysis, presents the
study participant with a series of product descriptions (or other
representations). The participant is asked to look at the descriptions
and make a series of choices similar to those they would make in
the real world. The results of the study allow the value of various
product features to be derived mathematically. Figure 1 shows a
typical conjoint profile, or card.
 |
Conjoint analysis has several basic analytical requirements or
ground rules. First, it requires that the products or services tested
be treated as sets of distinct attributes (or features). It also
requires a limited set of variations (or levels) for each feature.
In a test of hotel features, for example, one feature might be the
type of hotel lobby. Suppose this attribute had three levels: plain
and simple, small and opulent, and large and ostentatious. Four to
ten other hotel attributes might be varied in similar ways to develop
the product profiles.
In most cases, full-profile conjoint analysis uses a specific type
of experimental design that selects a specific subset of the many
possible combinations of attributes that could be tested. For instance,
suppose that in our hotel example, we decided to test seven attributes,
three of which had three levels and four of which had two levels.
This would lead to 432 possible combinations of attribute levels (3
x 3 x 3 x 2 x 2 x 2 x 2).
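The arithmetic is easy to confirm. The short Python sketch below counts the full-factorial combinations for the hotel example, first by multiplication and then by brute-force enumeration. The variable names are illustrative only and do not come from any conjoint package.

```python
from itertools import product
from math import prod

# Levels per attribute in the hotel example:
# three attributes with three levels, four attributes with two levels.
levels_per_attribute = [3, 3, 3, 2, 2, 2, 2]

# Count the combinations by multiplying the level counts together.
n_combinations = prod(levels_per_attribute)

# Cross-check by listing every possible full-factorial profile.
all_profiles = list(product(*(range(n) for n in levels_per_attribute)))

print(n_combinations, len(all_profiles))
```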
Full-profile conjoint analysis could estimate the worth of all
these possible combinations using just 16 product profiles. This
is good, because the average respondent could not be counted upon
to rate 432 different product profiles without heavy use of certain
illegal stimulants. Use of experimental design methods clearly can
extract a great deal more information than you could get otherwise.
Conjoint analysis predicts consumer choices better than rating
scales because participants in a conjoint analysis-based study look
at relatively realistic product profiles and make trade-offs among
various product features, selecting some combinations of attributes
as better than others. Hence conjoint analysis is called a multi-attribute
trade-off technique.
Why Discrete Choice Modeling?
For all its strengths, conjoint analysis also shows some weaknesses,
particularly in testing branded products. In conjoint analysis,
brand often gets treated as a product feature, with various brands
as the levels. Problems arise when brand appears as an attribute
to be tested along with other attributes. Since each level of each
attribute must appear with each other attribute level, impossible
combinations of brands and features can appear. Impossible brand-price
combinations are especially likely to appear. Also, when the brand
name itself signals a degree of product quality, it cannot be traded
accurately against other attributes.
Problems also can arise in asking respondents to rate or rank product
profiles. If the procedure is to work at all, study participants
must rank or rate all the conjoint profiles. Sometimes rankings
prove difficult. Looking at a series of 16 product profiles, most
of us probably could select the one we like best and the one we
like the least. We probably also would have little trouble identifying
our second favorite and our next-to-least favored choices. It's harder
to decide which should be ranked fifth or sixth. Forcing respondents
to carefully rank or rate alternatives they would never choose can
make the task imposed by conjoint somewhat different from real-world
product selection behavior. And unfortunately, accurate rankings
or ratings of all the product profiles are needed for conjoint analysis
to provide meaningful results.
Discrete choice modeling avoids impossible combination and forced
choice problems, while preserving - and even extending - the estimating
power of conjoint. With DCM, respondents see products or services
alongside competitive products in a series of market scenarios.
They are asked to look at each scenario, and answer a simple question:
If these were all the choices available, which would you choose,
if any?
Once the respondent has done this, he or she simply goes on to
the next scenario and makes the same simple decision. Figure 2 shows
a sample DCM scenario.
| Sample DCM Scenario |
| Which one, if any, would you choose? |
| Chain | Holiday Inn | Marriott | Hyatt Regency | Westin | None of these |
| Price | $80.00 | $80.00 | $100.00 | $110.00 | (I would choose to stay somewhere else.) |
| Location | Near airport, away from downtown | Downtown | Near airport, away from downtown | Near airport, away from downtown | |
| Clientele | Primarily business travelers | Heavy convention and meeting business | Primarily business travelers | Business and family travelers | |
| Room | Basic room, not cramped but little workspace | Large and spacious with desk and table | Large and spacious with desk and table | Suite with bedroom and sitting room | |
| Lobby and Public Areas | Basic ordinary lobby | Large, elegant and impressive | Large, elegant and impressive | Large, elegant and impressive | |
| Health club facilities | No health club associated with hotel | On premises | On premises or nearby | On premises or nearby | |
| Check-Out Speed | Standard check-out | Express check-out, never over 3 minutes | Express check-out, never over 3 minutes | Express check-out, never over 3 minutes | |
| Parking | Ample free parking | Ample free parking | Ample free parking | Parking typical for airport area | |
| Restaurant in Hotel | Moderate, not fancy | Nice but not exceptional | Nice but not exceptional | One of the best in town | |
| No smoking rooms | No special rooms or floors | No special rooms or floors | No special rooms or floors | Special floors | |
DCM's approach has several benefits, aside from posing a more realistic
and natural task. Perhaps most importantly, the question asked is
the question researchers care about most. The respondent makes a choice,
decision or purchase, rather than just stating a preference. In addition
to having a great deal of theoretical support, the DCM approach has
high face validity with nontechnical (managerial) audiences.
Further, representations of brands can be customized to match marketplace
reality. Each brand can have its own attributes and attribute levels.
Attributes can vary for different brands. Several products with
the same brand name can appear side by side in the scenarios, allowing
for direct measurement of product line effects.
DCM can even be set up to have one or several products missing
from some scenarios, which allows for the measurement of marketplace
effects of product introduction or withdrawal.
DCM also handles certain experimental designs more easily than
conjoint. With DCM, many products can share one large experimental
design, or each product can have its own conjoint-style design.
Products can share experimental designs (for instance, two experimental
designs can be split among six products). Very large designs (requiring,
say, 32 or 64 product scenarios) can be fractionalized (split apart)
very easily; the downside is that fractionalization requires larger
samples. However, it is still much easier than the complex gyrations
required when splitting conjoint designs, which tend to become messy
and produce less than ideal results.
With large enough samples, you can even use random designs with
DCM. With DCM, random designs do not have the exact formulations
of experimental designs. Instead, they put levels of attributes
together in a different random configuration for each respondent.
Sawtooth Software's CBC product does this.
You can also use DCM to analyze data collected with no explicit
design, an approach called "revealed preference analysis"
that has long been used in econometrics. However, any design departing
from strict experimental design principles must be tested carefully.
Key Characteristics
DCM analysis necessarily involves choices among alternative products
or services, typically shown side-by-side in scenarios. In addition,
DCM typically uses estimation by logistic regression procedures.
Multinomial (or polytomous) logistic regression is used for choices
among more than two alternatives. When more than two alternatives
are tested, DCM also must make an important mathematical assumption,
the independence of irrelevant alternatives (IIA). IIA is a key
property of DCM.
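For readers who want to see the mechanics, the sketch below applies the standard multinomial logit formula, in which each alternative's choice probability is its exponentiated utility divided by the sum of exponentiated utilities across all alternatives in the scenario. The utilities are made-up numbers for illustration, not estimates from any study, and the function name is hypothetical.

```python
import math

def mnl_probabilities(utilities):
    """Return multinomial logit choice probabilities for one scenario.

    `utilities` maps each alternative's name to its total utility.
    """
    # Subtracting the largest utility first avoids overflow in exp()
    # and does not change the resulting probabilities.
    largest = max(utilities.values())
    exp_u = {alt: math.exp(u - largest) for alt, u in utilities.items()}
    total = sum(exp_u.values())
    return {alt: value / total for alt, value in exp_u.items()}

# Hypothetical utilities for one hotel scenario (illustration only).
scenario = {
    "Holiday Inn": 0.2,
    "Marriott": 0.9,
    "Hyatt Regency": 0.6,
    "Westin": 0.4,
    "None of these": 0.0,
}
shares = mnl_probabilities(scenario)
for alternative, share in shares.items():
    print(f"{alternative:15s} {share:.3f}")
```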
Conditional variables
DCM differs from most other forms of analysis in that it often uses
conditional variables. Conditional variables exist only for one
or a few of the choices available. They also can have different
levels for the various choices. Again considering hotels, suppose
Marriott was the only chain that offered an automated robot bar
in each room. The automated bar could be a small refrigerator that
electronically recorded drinks you took from it, or a smaller wall-mounted
unit. With DCM, "automated bar" could appear as a conditional
variable only in connection with the Marriott. Since this feature
would not be offered by the other hotels in the real world it would
not appear as a feature tested in connection with them.
Aggregate level analysis
DCM also has one salient limitation: Analysis can be done at the
aggregate level only. This is a consequence of logistic regression,
which works in terms of likelihoods or odds. Odds, of course, only
can be estimated at the group or aggregate level. Some experts say
you need, at a minimum, the moral equivalent of six people to do
any estimation with logistic regression.
Aggregate-level analysis makes it relatively easy to fractionalize
large DCM tasks. Fractionalization is done most efficiently by splitting
the scenarios randomly among respondents. For a large design that
has many attributes and levels, and that requires 32 scenarios,
you can simply give each respondent 16 scenarios at random. This
would require a larger sample, of course; in this case, double the
initial number. Assuming respondents could handle a lengthy task,
you also could try a 50 percent boost in sample size with each rating
24 scenarios.
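A minimal sketch of this kind of random split follows, assuming scenarios are simply numbered 1 through 32. The function names, the seed and the respondent count are hypothetical choices made for the example.

```python
import random

def assign_scenarios(n_respondents, n_scenarios=32, per_respondent=16, seed=1994):
    """Give each respondent a random subset of the full scenario list."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    scenario_ids = range(1, n_scenarios + 1)
    return [sorted(rng.sample(scenario_ids, per_respondent))
            for _ in range(n_respondents)]

def sample_multiplier(n_scenarios, per_respondent):
    """Sample-size factor needed to keep the same number of answers per scenario."""
    return n_scenarios / per_respondent

assignments = assign_scenarios(n_respondents=400)
print(assignments[0])
print(sample_multiplier(32, 16))
```

By the same proportional arithmetic, 24 scenarios per respondent works out to a sample roughly one-third larger, so the 50 percent boost suggested above leaves some margin.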
The ease of splitting tasks between respondents leads back to the
idea of using the moral equivalent of a certain number of people.
But sampling error is a factor in DCM just as it is in other survey
research. Given this, you will want to use samples adequate for
accurate estimation. There is nothing moral about a sample of six.
Greater complexity
Analytical complexity is, unfortunately, another key aspect of DCM.
A few software packages have addressed the issue, building in analytical
procedures that, to varying degrees, make DCM somewhat simpler.
Still, DCM remains far more difficult than conjoint when it comes
to analysis and model specification. You may need special designs
to capture interactions (e.g., so-called response surface-type designs,
which standard conjoint design programs cannot generate, or other
more esoteric designs).
DCM models also characteristically include testing for relations
beyond the raw data. For example, you might look at effects based
upon squared or cubed variables, especially in investigating the
effects of price. You also usually would test for interactions between
key variables.
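As a rough illustration of what "relations beyond the raw data" can mean in practice, the sketch below builds squared, cubed and interaction terms from two base variables before they go into the model. The variable names are hypothetical, and price is expressed in hundreds of dollars only to keep the powers small.

```python
def expand_terms(price, express_checkout):
    """Build polynomial and interaction terms from two base variables.

    price: price in hundreds of dollars (an assumption for this example)
    express_checkout: 1 if the hotel offers express check-out, else 0
    """
    return {
        "price": price,
        "price_squared": price ** 2,
        "price_cubed": price ** 3,
        "express_checkout": express_checkout,
        # Interaction: lets the effect of price differ when express check-out is offered.
        "price_x_express": price * express_checkout,
    }

row = expand_terms(price=1.1, express_checkout=1)
print(row)
```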
Iterative analytical procedures and models that must "converge"
Multinomial logit itself works differently from many multivariate
procedures. It runs iteratively until it converges upon a solution,
making it at least roughly akin to K-Means clustering, which runs
and reruns solutions until two consecutive runs fall within some
acceptable tolerance of each other. Unfortunately, while clustering
models usually converge, or behave, DCM models may not.
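To make "runs iteratively until it converges" concrete, here is a minimal sketch of a Newton-Raphson fit for the simplest case, a binary logit with one predictor and an intercept. It is a teaching example under stated assumptions, not the routine used by any commercial DCM package: the data are invented, the tolerance and iteration cap are arbitrary, and a real multinomial model has many more parameters.

```python
import math

def fit_binary_logit(xs, ys, tol=1e-8, max_iter=50):
    """Fit P(y=1) = 1 / (1 + exp(-(a + b*x))) by Newton-Raphson.

    Returns (a, b, iterations_used, converged).
    """
    a, b = 0.0, 0.0
    for iteration in range(1, max_iter + 1):
        g_a = g_b = 0.0            # gradient of the log-likelihood
        h_aa = h_ab = h_bb = 0.0   # information matrix (negative Hessian)
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g_a += y - p
            g_b += (y - p) * x
            h_aa += w
            h_ab += w * x
            h_bb += w * x * x
        det = h_aa * h_bb - h_ab * h_ab
        if det == 0.0:
            return a, b, iteration, False  # singular matrix: cannot take a step
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a += step_a
        b += step_b
        # Stop when two consecutive solutions are within the tolerance.
        if max(abs(step_a), abs(step_b)) < tol:
            return a, b, iteration, True
    return a, b, max_iter, False

# Invented data: centered price (x) and whether the product was chosen (y).
xs = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]
ys = [1, 1, 1, 1, 0, 1, 0, 0]
a, b, iterations, converged = fit_binary_logit(xs, ys)
print(converged, iterations, round(a, 3), round(b, 3))
```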
Several factors can cause nonconvergence with DCM:
- multicollinear variables (variables highly correlated with each
other or some combination of other variables);
- the presence, among the choices in the scenarios, of one or more
alternatives selected very infrequently; and
- the presence in the design of any highly infrequent variable
or variables. Infrequent variables can arise because DCM allows
variables to be conditional, to apply to only one of the choices
being tested.
Unfortunately, it is unclear how infrequent a variable or an alternative
can be without causing problems. If the model does not converge,
though, infrequent alternatives within the scenarios and infrequent
variables should be among the first suspects discarded from the
model.
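A quick screen for rarely chosen alternatives can be run before any modeling. The sketch below tallies how often each alternative was chosen and flags those under a cutoff. The 5 percent cutoff is an arbitrary assumption made for the example (as noted above, no firm threshold is known), and the choice data are invented.

```python
from collections import Counter

def infrequent_alternatives(choices, alternatives, min_share=0.05):
    """Return the alternatives chosen in fewer than `min_share` of all choices.

    `alternatives` lists every option shown, so options never chosen are caught too.
    """
    counts = Counter(choices)
    total = len(choices)
    return [alt for alt in alternatives if counts[alt] / total < min_share]

# Invented choice data: one entry per scenario answered.
choices = (["Marriott"] * 520 + ["Hyatt Regency"] * 310
           + ["Holiday Inn"] * 150 + ["Westin"] * 20)
alternatives = ["Holiday Inn", "Marriott", "Hyatt Regency", "Westin", "None of these"]

suspects = infrequent_alternatives(choices, alternatives)
print(suspects)
```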
DCM models may not work on the first half-dozen tries. Even when
a model works, it may be far from the best in terms of what you
need to find. So DCM usually requires more exploration of alternatives
than other methods, such as conjoint analysis.
Simulations closely related to scenarios
Simulations must be closely tied to the scenarios presented to respondents.
If five products appear in all the DCM scenarios, you cannot do
accurate estimations of what might happen if there were only four,
or if another product entered the market as a sixth competitor.
Such contingencies must be considered as part of the initial DCM
design. Conjoint provides more flexibility, allowing you to make
unplanned, after-the-fact estimations of effects.
Reasons favoring the logistic model
Given that the logistic model can be harder to get right than the
models used in many other procedures, you may well ask why it is
worth the bother. The simple answer is that it provides more analytical
precision and power. Logistic regression models handle problems
with discrete (not continuous) dependent variables that ordinary
linear regression cannot. Linear regression definitely does not
work correctly when you are trying to predict a variable that can
take only two values, such as choose vs. don't choose, yes vs. no,
or 0 vs. 1. Linear regression is not bounded by the values that
you are trying to predict in this case. Predictions from linear
regression can take any value, so instead of just the 0 or 1 you
are hoping to predict, it might produce a prediction of 0.1 or 2,
or even a negative number. The correct answer always would be either
0 or 1 (or yes or no) when considering a single product and a single
choice between two alternatives, so linear regression definitely
is not the best method for analyzing that situation.
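The point is easy to demonstrate with a toy example. The sketch below fits an ordinary least-squares line to invented choose/don't-choose data and then predicts at prices just outside the observed range. The data and function names are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented data: room price and whether the hotel was chosen (1) or not (0).
prices = [60, 70, 80, 90, 100, 110, 120, 130]
chosen = [1, 1, 1, 1, 0, 1, 0, 0]

intercept, slope = fit_line(prices, chosen)

def predict(price):
    """Linear-regression 'prediction' of a 0/1 choice."""
    return intercept + slope * price

# Neither value below is a possible outcome for a 0/1 choice.
print(round(predict(40), 2))   # above 1
print(round(predict(160), 2))  # below 0
```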
The theory of linear regression also states that it should not
be used when the dependent variable can take only a few discrete
values. So linear regression often will not work well even when
the consumer can make several choices at once (for instance, choosing
which types of soft drink and how many of each to buy on a shopping
trip).
Discriminant analysis, not linear regression, is the linear model
technique that should be used to predict which choices consumers
make. Multinomial logit (MNL) can be thought of as similar to multiple
discriminant analysis in several ways. Predictions of group memberships
(or which thing gets chosen) from both discriminant and MNL often
will be highly similar - assuming that discriminant analysis can handle
the problem in question. Some statisticians even use MNL as a sort
of superset of discriminant analysis, capable of handling more complex
analytical questions. However, MNL has different theoretical underpinnings
than discriminant analysis. MNL works in terms of likelihoods and
odds. As such, its approach and the output it produces are more
closely related to the question of choice than the assumptions and
output of discriminant analysis.
Logistic models: Beyond a simple straight-line approach
The logistic model is not linear and additive. It does not assume
adding a few utilities will accurately model how people respond.
Rather, it assumes an S-shaped (sigmoidal) response curve. Figure
3 shows a typical S-curve. You can think of this curve as representing
a view or theory of how consumers will respond to a product.
When utilities are near zero, utility must increase by a large
amount to get consumers to a middling position. In other words,
it takes a lot for a product or service to move consumers from indifference
to some interest.
In the middle range of utilities, small improvements (or gains
in utility) can lead to sharp increases in consumers' likelihood
to respond. It only takes a little extra utility (or perceived value)
to generate strong interest in a product among people who already
have some interest in it. The end of the logistic curve gets flat
like the beginning. In other words, it takes a lot of extra utility
to move people from interest to action.
Not all writers on the subject see the logistic curve in quite
this way, but it certainly goes with the shape of the logistic function.
It is a helpful way to think of how logistic models work.
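The shape described here can be checked numerically. The sketch below evaluates the standard logistic function and measures how much the response rises for one extra unit of utility at several starting points. It assumes a curve centered so that its midpoint sits at utility 0, which means the flat "indifference" end described above corresponds to large negative values in this example.

```python
import math

def logistic(utility):
    """Standard logistic (S-shaped) response curve, bounded between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-utility))

def gain_from_one_unit(utility):
    """Increase in response from adding one unit of utility at this starting point."""
    return logistic(utility + 1.0) - logistic(utility)

# Flat at the low end, steep in the middle, flat again at the high end.
for start in (-6.0, -3.0, -0.5, 2.0, 5.0):
    print(f"start {start:5.1f}  response {logistic(start):.3f}  "
          f"gain from +1 utility {gain_from_one_unit(start):.3f}")
```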
The nonlinear nature of the logistic response curve has some practical
implications as well. First, you cannot directly estimate the value
of an alternative by summing the utilities that it includes. With
conjoint analysis, you simply add up the utilities of all the product/attribute
variations of interest to you, and the total is what the product
is worth. Not with DCM. Instead, you typically must run simulations
through the MNL estimation program.
DCM and the independence of irrelevant alternatives
Strictly, the independence of irrelevant alternatives is a mathematical
property of error terms that the MNL model assumes to exist. It
is something like the assumption in linear regression that errors
in estimations have a common variance and are independent (homoskedasticity).
Practically, though, assuming IIA works as a property of DCM means
that you are assuming the odds of selecting one alternative over
another are not influenced by the presence of other alternatives
that you are not actively considering. As a result, some critics
have called the MNL model unrealistic as a representation of people's
behavior. They argue that consumers always consider all alternatives
in making any choice, even if many of the choices are things they
would never select. Defenders of MNL say that if your decision comes
down to a choice between A and B, the presence of any number of
other alternatives makes no difference.
Arguments like these can go on all night with each involved party
more certain of the correctness of their position at the end. Recall
though, that IIA is simply a consequence of reasonable assumptions
about error terms. Fortunately, in practice, violations of IIA do
not appear too often and when they do, they usually prove remediable.
However, the remedy involves experimenting with the DCM model until
it no longer violates the IIA assumption.
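The practical meaning of IIA can be shown in a few lines. In the sketch below, which uses invented utilities and the standard multinomial logit formula, adding a third alternative lowers the shares of A and B but leaves the odds of A over B exactly where they were.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for one set of alternatives."""
    exp_u = {alt: math.exp(u) for alt, u in utilities.items()}
    total = sum(exp_u.values())
    return {alt: value / total for alt, value in exp_u.items()}

# Invented utilities for illustration only.
two_way = mnl_probabilities({"A": 1.0, "B": 0.5})
three_way = mnl_probabilities({"A": 1.0, "B": 0.5, "C": 0.8})

odds_two_way = two_way["A"] / two_way["B"]
odds_three_way = three_way["A"] / three_way["B"]

# Shares of A and B both drop when C appears, but their ratio does not move.
print(round(two_way["A"], 3), round(three_way["A"], 3))
print(round(odds_two_way, 4), round(odds_three_way, 4))
```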
The special properties and requirements of conditional variables
Conditional variables exist for one (or a few) of the alternatives
being tested, but not for all. Using conditional variables allows
each product to have its own unique attributes and levels. Conditional
variables require special coding of variables. Normally, nominal-level
variables are handled in analytical procedures by the use of dummy variables.
These variables take on values of 1 or 0, corresponding to the attribute
being present or absent.
A nominal-level variable that has three levels would get translated
into two dummy variables. For instance, in the hotel lobby example,
we would create two dummy variables to describe the three types
of lobby:
| Type of lobby | Var. 1 | Var. 2 |
| 1. plain and simple | 1 | 0 |
| 2. small and opulent | 0 | 1 |
| 3. large and ostentatious | 0 | 0 |
Conditional variables may require a different scheme, since a code
of 0 is best used to stand for an attribute level that is absent
in an alternative. Rather than using the usual 0 vs. 1 form, codes
of -1 and 1 are used, eliminating the chance of confusion between
a code for an attribute level and the "level absent" code.
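The two coding schemes can be sketched as follows. The dummy coding reproduces the lobby table above. The conditional coding is shown only for a two-level attribute (the automated bar, present or not), which is an assumption made to keep the example short. The function names are hypothetical.

```python
LOBBY_LEVELS = ["plain and simple", "small and opulent", "large and ostentatious"]

def dummy_code(level):
    """Ordinary 0/1 dummy coding; the last level is the all-zero reference."""
    return tuple(1 if level == candidate else 0 for candidate in LOBBY_LEVELS[:-1])

def conditional_code(applies, present):
    """Coding for a conditional two-level attribute.

    0 means the attribute does not exist for this alternative;
    1 and -1 distinguish the two levels where it does exist.
    """
    if not applies:
        return 0
    return 1 if present else -1

for level in LOBBY_LEVELS:
    print(level, dummy_code(level))

# The automated bar applies only to the Marriott in the example above.
print(conditional_code(applies=True, present=True))    # Marriott with the bar
print(conditional_code(applies=True, present=False))   # Marriott without the bar
print(conditional_code(applies=False, present=False))  # any other chain
```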
DCM: No individual-level utilities
Aggregate-level analysis means no individual-level utilities of
the type obtained from conjoint. And the absence of individual-level
utilities means segmentation is not available from DCM results.
Segmentation based on conjoint utilities often provides tremendously
useful results. The groups that emerge usually have sharply different
wants and needs related to the product in question.
It's likely that it will be possible to infer individual-level
utilities from DCM before long. On one side, the academics are rushing
to the rescue, which means a batch of not terribly readable articles
about something called latent class models are available. A few
papers also have been presented by proponents at conferences. And
some academics even claim to have used such models.
Latent class models may become interesting, if not useful, at about
the time Intel introduces the 80986 chip. The models now take a week
or two to run on very large machines (faster and bigger than a fully
loaded Pentium). At a recent conference, one academic stated that
he simply had his department buy a Sun Computers work station (a
highly powerful machine, something on the order of the most powerful
computer in the world as of 1980). It was then a simple matter to
run this machine for two or three weeks straight every time he needed
to solve one of these problems. The academic failed to see why some
listeners found this less than practical.
Other practitioners are investigating alternative approaches, but
so far nothing promising has emerged.
In the meantime, nothing prevents you from clustering on other
criteria, then running the DCM among the groups or segments developed
in the clustering.
DCM: Evaluating the results
Once you have completed a DCM analysis, there are several criteria
you can examine to see how well the model has performed. The first
thing that needs to happen is simple: The model must run, or converge.
Sometimes getting a model to converge takes some doing. You may
need to drop some terms or add some others. Highly correlated variables
can spoil a DCM model -- something like the problem of multicollinearity
in regular linear regression. You may need to transform some of
the basic variables. (Squaring, cubing and taking the square roots
of numerical variables are some of the more common transformations.)
You may need to eliminate infrequent choices from consideration.
Once the model runs, one of the most straightforward ways to diagnose
its goodness is to check the variables that emerge as significant.
As in regular linear regression models, the variables entered into
DCM models may or may not prove to contribute significantly. With
DCM models, you also need to be particularly careful about variables
that emerge as significant but have signs that seem backwards (a
variable that you expect to have a positive effect emerges as apparently
having a negative effect). This can happen when the model has too
many terms (again, due to problems similar to multicollinearity
in linear regression), or too few terms (you left out something
important).
DCM also has a few standard measurements that help you assess the
goodness of a model. One of these, the Rho-squared, is like the
R-squared (explained variance) in linear regression. There is one
caution here: the more terms or variables you have in the final
model, the more likely you are to get a "good" Rho-squared.
This again is like regular linear regression. A regression equation
with 100 or so variables is likely to explain more variance than
a model with just a few terms in it. As a result, big, complicated
DCM models often look pretty good, based on the Rho-squared statistic.
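One common way to compute a Rho-squared is to compare the model's log-likelihood with that of a "know-nothing" model that gives every alternative an equal chance. The sketch below assumes that definition, which may differ in detail from what a given package reports, and uses invented probabilities.

```python
import math

def rho_squared(chosen_probabilities, n_alternatives):
    """Rho-squared as 1 - LL(model) / LL(equal-chance model).

    chosen_probabilities: for each observed choice, the probability the
    model assigned to the alternative that was actually chosen.
    """
    ll_model = sum(math.log(p) for p in chosen_probabilities)
    ll_null = len(chosen_probabilities) * math.log(1.0 / n_alternatives)
    return 1.0 - ll_model / ll_null

# Invented example: four alternatives per scenario, ten observed choices.
no_better_than_chance = rho_squared([0.25] * 10, n_alternatives=4)
fairly_good = rho_squared([0.5] * 10, n_alternatives=4)
print(round(no_better_than_chance, 3), round(fairly_good, 3))
```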
Below are listed some guidelines for reading the Rho-squared values
that emerge from a DCM analysis. The figures from Williams and Louviere
seem more fitted to large models. If you keep just a few of the
most important variables - which after all, is what many clients
want and need - some alternative interpretations become available.
| Rho Squared Value | Interpretation: Williams/Louviere (Large Models) | Interpretation: Struhl (small models) |
| 0.0 - 0.2 | Poor | Poor |
| 0.2 - 0.3 | Poor - Reasonable | Fair |
| 0.3 - 0.5 | Reasonable - Good | Good |
| 0.5 - 0.8 | Good - Excellent | Excellent |
| 0.8 - 1.0 | Excellent | Wow! |
Another key measure of a model's goodness is a correct classification
table. You read these much as you read the correct classification
table in a discriminant analysis. The correct classification table
shows how often each choice was selected, and how often it would
be predicted as the choice, based on the best DCM model that could
be generated. The following table shows a simple DCM example that
did not come out very well.
The table shows that consumers had four choices in each scenario
(three products and "none of these"). Let's review what
the data in the table show for the first alternative, or product.
The first product was actually chosen 740 times. That number appears
at the end of the first row. Of those 740 actual choices, we would
predict 495 choices based on the best DCM model. This leads to the
"% correct" classification of 66.89% for this alternative
(shown at the bottom of the first column).
Overall, we would predict that the first alternative would be chosen
some 1,650 times, with 260 of the choices coming from those who
actually chose the second alternative, 490 from those who actually
chose the third and 405 from those who actually chose the fourth
(none of these). The figures can all be found in the first column
of numbers, which corresponds to the first alternative.
Although the model performs well predicting choices of the first
alternative, it does not do well elsewhere. It predicts that nobody
would choose the second or fourth alternatives, with correct classification
rates of 0 percent for each. Overall correct classification is 29
percent, not appreciably better than the level of 25 percent that
would be expected based on chance.
| CLASSIFICATION TABLE |
| Observed Altern. | Predicted 1 | Predicted 2 | Predicted 3 | Predicted 4 | TOTAL |
| 1 | 495 | 0 | 245 | 0 | 740 |
| 2 | 260 | 0 | 120 | 0 | 380 |
| 3 | 490 | 0 | 210 | 0 | 700 |
| 4 | 405 | 0 | 170 | 0 | 575 |
| TOTAL | 1650 | 0 | 745 | 0 | 2395 |
| % Correct | 66.8919 | 0.0000 | 30.0000 | 0.0000 | 29.4363 |
In short, the table shows that the model did not do a good job
of predicting choices, unless our only interest was in discovering
why people selected the first alternative. Incidentally, this fictional
data led to an overall model fit (Rho-squared) of 0.21 - just marginally
acceptable even by the most lenient standards. If we could do no
better than this with a DCM model in real life, we might find the
results somewhat disappointing. Fortunately, all the real-world
models I have seen have performed significantly better than this
one.
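The figures in the classification table can be reproduced directly from its cell counts. In the sketch below, rows are the observed choices and columns the predicted ones, with the counts copied from the table. The variable names are illustrative.

```python
# Rows: observed alternative 1-4. Columns: predicted alternative 1-4.
table = [
    [495, 0, 245, 0],
    [260, 0, 120, 0],
    [490, 0, 210, 0],
    [405, 0, 170, 0],
]

row_totals = [sum(row) for row in table]             # times each was actually chosen
column_totals = [sum(col) for col in zip(*table)]    # times each was predicted
grand_total = sum(row_totals)

# Percent correct per alternative: diagonal cell over the observed (row) total.
pct_correct = [100.0 * table[i][i] / row_totals[i] for i in range(len(table))]

# Overall percent correct: all diagonal cells over all choices.
overall_pct = 100.0 * sum(table[i][i] for i in range(len(table))) / grand_total

# What pure chance would give with four equally likely alternatives.
chance_pct = 100.0 / len(table)

print(row_totals, column_totals, grand_total)
print([round(p, 2) for p in pct_correct], round(overall_pct, 2), chance_pct)
```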
DCM to do it yourself: The analytical alternatives
Two new programs, CBC from Sawtooth Software, and Ntelogit from
Intelligent Marketing Systems, make it possible for the nonspecialist
to analyze a DCM problem. For some years, the analytically more
advanced (or more brazen) could use multinomial logit programs from
Systat and SAS to perform many of the same tasks. (The May issue
of QMRR contains a review of the two new programs.)
DCM - A Two-Minute Summary
DCM is a powerful technique that can solve many problems better
than full-profile conjoint analysis. It is not an overstatement
to say that DCM provides the ultimate in product and service testing.
DCM uses the most realistic methods available for measuring consumers'
choices. Consumers respond to products and services as they would
in the marketplace. All the products shown can have the attributes
that they would in the marketplace. They do not need to share the
same attributes and the attributes do not have to vary in the same
ways, as in conjoint. Several products from a given manufacturer
can be included in a single DCM task, allowing for the explicit
modeling of product-line effects. Products can appear in some scenarios
and not others, which leads to realistic testing of the elimination
or introduction of new products.
Often, DCM gets applied to complex problems with many attributes.
In fact, more attributes and levels actually can make it easier
to fit a model closely to the data.
Even at its most accessible (for the moment, CBC from Sawtooth
Software), DCM is more difficult to do than standard conjoint. It
requires understanding of model specification and strategies for
trouble-shooting.
Highly complex problems probably require an expert. However, not
all experts necessarily will give the same answers. Even the authorities
in the field are still debating the best approaches to intricate
DCM problems. Unless you have some analytical sophistication, you
may not want to try DCM by yourself - at least the first time. Regardless
of how you try the technique, though, it definitely merits serious
investigation, given its great analytical power and the tremendous
amount of useful information it can provide.
Reprinted from Quirk's Marketing Research Review, June/July
1994. All rights reserved.
|