Combining concepts from choice theory (economics) with parameter-estimation methods (statistics), conjoint analysis is a quantitative market research technique that helps identify which attributes of a given product or service users value most.
Employed in product development, marketing (product positioning), and operations, knowledge of user choices helps improve the product offering (by modeling and testing a focused set of product options) and allocate resources where they matter most.
A conjoint study typically starts with choosing an approach and designing the research, followed by estimating model parameters. This involves breaking the product or service into constituent parts (attributes) to build profiles, then gathering preference data through surveys.
Designing a conjoint study typically involves the steps below:
- Recognize the approach (the type of analysis the situation requires)
- Identify attributes and assign levels
- Define utilities and design the experiment (based on choice situations)
- Estimate parameters and synthesize results
- Develop implications
For the purposes of this post, I present a hypothetical situation where I want to develop the next billion-dollar cola product based on the attributes cola drinkers like most.
In this survey, I had 10 respondents rate 8 profiles of cola products. The profiles were created by varying the levels of a cola drink's taste profile. The distinguishing attributes used are "kola" (the kola-nut flavor, ranging from bitter to a vanilla-sweet finish), "fizz" (effervescence), "sweetness", and "price".
Based on the profiles presented, the respondents were asked to rate how much they liked each product profile.
```r
install.packages("conjoint")
library(conjoint)
setwd("C:/R")

# Read in the files
profiles <- read.csv("profiles.csv", header = TRUE, na.strings = c(""))
preferences <- read.csv("preference.csv", header = TRUE, na.strings = c(""))

# Add the levels
levels <- c("low", "high", "low", "high", "low", "high", "1.5", "2")
levels.df <- data.frame(levels)
```
For simplicity, each of the four attributes identified has been configured with two levels, spanning the flavor profile and price.
Once the data is loaded, we call the conjoint function with three inputs:
- (dataset: preferences) Survey responses, with each participant's ratings across the flavor-plus-price profiles created
- (dataset: profiles) The profiles created from the varying levels of flavor and price
- (dataframe: levels.df) The levels across the four attributes
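Internally, conjoint estimation of this kind boils down to ordinary least squares on a dummy-coded design matrix. Below is a minimal numpy sketch of that step, with hypothetical profiles and ratings standing in for the actual survey data (the design and the numbers are illustrative only, not the study's data):

```python
import numpy as np

# Hypothetical 8-profile design: columns are the four two-level attributes
# (kola, fizz, sweetness, price), coded low = 0, high (or $2.0) = 1
X = np.array([
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
])
# Hypothetical mean ratings (1-10) across respondents for each profile
y = np.array([4.0, 3.8, 6.5, 6.4, 6.0, 5.9, 8.2, 8.0])

# Add an intercept column and fit ordinary least squares;
# the slope on each dummy is that attribute's part-worth utility
A = np.column_stack([np.ones(len(X)), X])
coefs, _, _, _ = np.linalg.lstsq(A, y, rcond=None)
partworths = coefs[1:]
print(dict(zip(["kola", "fizz", "sweetness", "price"], partworths.round(2))))
```

Each part-worth is the estimated change in rating from moving that attribute from its low to its high level, holding the others fixed.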
Plot: Utilities of individual attributes
Plot: Average importance of factors
Based on the survey results, the most appreciated attribute is fizz, followed by kola. So we might want to create a cola with more fizz and a stronger kola flavor (like Thums Up). Sweetness is relatively less important, but its positive utility suggests some respondents did like a sweeter cola. Of all the attributes, the least important is price. This can in fact be observed by eyeballing the paired profiles that share the same taste profile but differ in price (for example, profiles 1 and 2 are such a pair). It means people are willing to pay a $0.5 premium for the cola that most appeals to them.
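The "average importance" numbers behind a chart like this are conventionally computed as each attribute's utility range divided by the sum of all ranges. A small sketch of that calculation, using hypothetical part-worths rather than the actual survey estimates:

```python
# Hypothetical part-worth utilities (high level minus low level) for each
# attribute, e.g. taken from an OLS fit of the ratings
partworths = {"kola": 2.0, "fizz": 2.2, "sweetness": 0.5, "price": -0.2}

# For two-level attributes the utility range is the absolute part-worth;
# relative importance is each range divided by the sum of all ranges
ranges = {attr: abs(u) for attr, u in partworths.items()}
total = sum(ranges.values())
importance = {attr: round(100 * r / total, 1) for attr, r in ranges.items()}
print(importance)  # percentages summing to ~100, fizz and kola dominating
```

Note that importance is relative to the levels tested: widening the price range in the design, for instance, would mechanically raise price's importance.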
Setting up a conjoint study is often complex and multi-faceted. In a more realistic setting, products or services have both more attributes and more levels, which leads to a huge number of candidate profiles to evaluate. And when respondents evaluate such a large set of profiles, the results are subject to various types of response bias.
In this post, I use the SuperStore data to explore some of the data wrangling functions from two packages: dplyr and tidyr.
The data is an Excel workbook with three sheets – Orders, Returns, and Region. I load the three sheets into separate datasets in R, join them, and perform the necessary aggregations.
```r
setwd("C:/R")

# For accessing and dumping Excel files
install.packages("openxlsx")
library(openxlsx)

# Used for data wrangling
install.packages("dplyr")
library(dplyr)
install.packages("tidyr")
library(tidyr)

# Load the three individual sheets into separate datasets
# (workbook file name assumed for illustration)
superstore.wb <- "SuperStore.xlsx"
data.storeOrders <- read.xlsx(superstore.wb, sheet = "Orders")
data.returns <- read.xlsx(superstore.wb, sheet = "Returns")
data.region <- read.xlsx(superstore.wb, sheet = "Region")
```
```r
# Sum of Sales by Product Category
data.storeOrders %>%
  group_by(Product.Category) %>%
  summarise(Total.Sales = sum(Sales)) %>%
  arrange(Total.Sales)
```
```r
# Join data sets and aggregate as per requirement
# Inner join the Orders and Region data sets by Region
# and look at Total Sales by Region
data.final <- data.storeOrders %>%
  inner_join(data.region, by = "Region") %>%
  group_by(Region) %>%
  summarise(Total.Sales = sum(Sales))
data.final
```
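The same join-then-aggregate pipeline translates directly to pandas. Here is a toy sketch with hypothetical data (the column names and values are assumptions for illustration, not the actual SuperStore schema):

```python
import pandas as pd

# Toy stand-ins for the Orders and Region sheets (hypothetical data)
orders = pd.DataFrame({
    "Order.ID": [1, 2, 3, 4],
    "Region": ["East", "West", "East", "South"],
    "Sales": [100.0, 250.0, 50.0, 75.0],
})
region = pd.DataFrame({
    "Region": ["East", "West"],
    "Manager": ["Ann", "Bob"],
})

# Inner join on Region, then total Sales by Region -- the pandas
# analogue of inner_join() + group_by() + summarise()
data_final = (orders.merge(region, on="Region", how="inner")
                    .groupby("Region", as_index=False)["Sales"].sum())
print(data_final)
```

As with dplyr's inner_join, rows with no match on the join key (here, the "South" order) drop out of the result.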
Here is a code snippet for removing markings across visualizations and data tables within an analysis. The script does not reset the filters; it operates only on the markings. To illustrate this, I have constrained the date range on a couple of date filters in the filter panel.
I will now make some selections on the bar chart.
To remove the markings without resetting the applied filters, we can execute the script through the Reset Markings button.
Code snippet for reference:
```python
# Import required libraries
from Spotfire.Dxp.Data import *
from Spotfire.Dxp.Application.Filters import *

def resetMarking():
    # Loop through each data table
    for dataTable in Document.Data.Tables:
        # Navigate through each marking in a given data table
        for marking in Document.Data.Markings:
            # Unmark the selection by setting an empty row selection
            rows = RowSelection(IndexSet(dataTable.RowCount, False))
            marking.SetSelection(rows, dataTable)

# Call the function
resetMarking()
```
Below is one more use of the snippet, applied to multiple visualizations that use cascading markings.
When the analysis contains multiple markings, visualizations using them have to be reset individually. Below, I remove the marking for a single data table.
Though we unmark the marking for a single visualization, markings applied through cascading are not reset in the other visualizations. In the bottom visualization, the markings remain because it inherits them from the center visualization.
In such cases, our script does the job.
Cohort analysis is a technique used to analyze the characteristics of a cohort (a group of customers sharing a common characteristic) over time. It is essentially another type of customer segmentation, one that extends the analysis over a defined period.
One frequently applied use case in the sales function is to segment the customer base on a set of characteristics. The criteria could categorize customers into groups who are likely to keep buying, who are likely to defect, or who have already defected (gone inactive).
Once these groups are formed, some of the common applications for analysis would be to:
- Study customer retention – use the results to learn about conversion rates of certain groups and focus marketing initiatives accordingly (for example, on customers who could still be retained)
- Forecast transactions for cohorts/individual customers and predict purchase volume
- Bring more business – Identify groups for upselling and cross-selling
- Estimate marketing costs by calculating lifetime value of a customer by cohort
- Improve customer experience based on individual customer needs across websites and stores
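As a sketch of how such cohorts can be built, the snippet below groups hypothetical transactions by first-purchase month and computes a retention matrix in pandas (the data and column names are illustrative only):

```python
import pandas as pd

# Hypothetical transactions: customer id and purchase month (YYYY-MM)
tx = pd.DataFrame({
    "customer": ["a", "a", "b", "c", "a", "b", "c"],
    "month":    ["2023-01", "2023-02", "2023-01", "2023-02",
                 "2023-03", "2023-03", "2023-03"],
})

# Each customer's cohort is the month of their first purchase
tx["cohort"] = tx.groupby("customer")["month"].transform("min")

# Count distinct active customers per (cohort, month)
counts = tx.groupby(["cohort", "month"])["customer"].nunique().unstack(fill_value=0)

# Retention rate: active customers divided by the cohort's total size
sizes = tx.groupby("cohort")["customer"].nunique()
retention = counts.div(sizes, axis=0)
print(retention)
```

Each row of the resulting matrix tracks one cohort over time; reading along the row shows what fraction of that cohort remained active in each subsequent month.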
Marketing is hugely important for a business to succeed. Clearly defining marketing objectives and prioritizing marketing spend accordingly is one of the major challenges marketers face. To tune their approach, marketers need key metrics from various business functions to determine marketing effectiveness. Below is an attempt to categorize some of the commonly applied analytic techniques for measuring marketing performance.
Our first step in this analysis is to identify relevant data sources and develop automation to streamline data into well-defined repositories. Next, we can use a combination of descriptive and predictive analytic techniques to gain insights. Further, we can integrate the different models and automate their execution to perform prescriptive analytics for continuous monitoring and feedback.
Marketing drives sales and sales in turn should help improve marketing strategy. Let’s look at some techniques to identify sales patterns and then work on improving our mix of marketing activities.
| Applications | Applicable Tools/Techniques | Required Measures/Expected Results |
|---|---|---|
| Sales Performance (Descriptive) | Visualizing data using time series analysis and other metrics via standard/ad hoc reporting and operational dashboards that cater to different audiences; ARIMA models for time series data | Use accumulated data over time to learn about correlations and identify patterns |
| Sales Performance (Predictive) | Simple and multiple linear regression techniques for forecasting and simulation | Determine future possibilities and predict events to make more informed decisions |
| Applications | Applicable Tools/Techniques | Required Measures/Expected Results |
|---|---|---|
| Customer Acquisition and Retention | Logistic regression (churn analysis) | Use historical data to identify the ingress and egress of customers |
| Customer Segmentation | Cluster analysis | Identify potential markets and improve promotion, product, pricing, and distribution decisions |
| Product and Brand Feedback | Text analytics using the Natural Language Toolkit from Python; sentiment analysis using Stanford NLP | Analyze unstructured data from social media platforms such as Facebook, Twitter, Yelp, etc. |
| Customer Loyalty | Logistic regression; multivariate analysis using factor analysis, principal component analysis, or canonical correlation analysis | Understand customer behavior and improve decisions around targeted promotions |
| E-Marketing | Clickstream analysis (traffic- and e-commerce-based); Google Analytics for website statistics | Improve conversion and sales; drive email marketing campaigns; search engine optimization (SEO) |
Note: The techniques mentioned above can be used across a range of problems, depending on their applicability.
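As one concrete instance from the tables above, here is a minimal churn-style logistic regression fitted by plain gradient descent on hypothetical data (a numpy illustration of the technique, not a production model or any specific library's API):

```python
import numpy as np

# Hypothetical churn data: features = [months since last order, order count]
X = np.array([[1, 9], [2, 7], [1, 5], [6, 2],
              [8, 1], [7, 3], [2, 8], [9, 1]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1, 0, 1], dtype=float)  # 1 = churned

# Standardize the features and add an intercept column
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
A = np.column_stack([np.ones(len(Xs)), Xs])

# Fit by plain gradient descent on the log-loss
w = np.zeros(A.shape[1])
for _ in range(5000):
    p = 1 / (1 + np.exp(-A @ w))      # predicted churn probabilities
    w -= 0.1 * A.T @ (p - y) / len(y)  # gradient step

probs = 1 / (1 + np.exp(-A @ w))
print(probs.round(2))  # higher for the churned-looking customers
```

In practice one would use an established package and validate on held-out data; the point here is only the shape of the technique: historical behavior in, churn probability out.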
After analyzing the results from our analytical models, we can take measures to improve crucial marketing activities such as lead generation, demand creation, and product promotion. The analysis can further be used to design and implement marketing strategies covering product and brand promotion, pricing strategy, distribution, and customer service. The findings can also be employed to improve questionnaires and other mechanisms for collecting marketing data and customer feedback, to learn about product performance and brand value.
With these new analytics capabilities, we can make predictions much more accurately and provide our marketing teams with new ideas to drive promotions and boost sales.
In general, the adoption and effective application of these analytic techniques is challenging. Building the right analytics should be informed by industry knowledge and by the business function in context. It is a process that requires constructive iteration over the long term, but in most cases it should optimize marketing performance and deliver tremendous value to the organization.
Below is a code snippet to pull data from a data table into the script context. The data from particular columns can then be used to perform validations or be compared against values from other data tables.
```python
from Spotfire.Dxp.Data import *

tableName = 'SuperStore_Sample'
columnToFetch = 'Order Date'
activeTable = Document.Data.Tables[tableName]
rowCount = activeTable.RowCount
rowsToInclude = IndexSet(rowCount, True)
cursor1 = DataValueCursor.CreateFormatted(activeTable.Columns[columnToFetch])

# Walk the rows, reading the cursor's formatted value for each one
ctr1 = 0
for row in activeTable.GetRows(rowsToInclude, cursor1):
    rowIndex = row.Index
    val1 = cursor1.CurrentValue
    ctr1 = ctr1 + 1
    if ctr1 == 5:
        break
```
Further, we could push the data into an array for temporary storage and use it as needed.
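Outside Spotfire, that pattern, walking rows with a cursor and collecting values for later comparison, reduces to appending to a list. A plain-Python sketch with dictionaries standing in for the data tables (names and values are hypothetical):

```python
# Plain-Python stand-in for a data table's column
# (in Spotfire the values would come from cursor1.CurrentValue in the loop)
table = {"Order Date": ["2023-01-05", "2023-01-07", "2023-02-01"]}

# Collect the column's values into a list for temporary storage
values = []
for val in table["Order Date"]:
    values.append(val)

# The stored values can now be validated or compared against
# the same column in another table
other = {"Order Date": ["2023-01-05", "2023-02-01"]}
missing = [v for v in other["Order Date"] if v not in values]
print(missing)  # [] -- every date in `other` also appears in `table`
```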