Using Spotfire for Predictive Analytics (Regression Modeling)

We are building a Linear Regression model to forecast sales.

Spotfire_Regression_Modeling_1

`Sales` ~ `Order Quantity` + `Discount` + `Shipping Cost` + `Profit` + `Unit Price` + `Product Base Margin`

In this formula, “Sales” is the response variable, and every column after the “~” is a predictor variable.

Click “OK” to examine the results for the model.

Spotfire_Regression_Modeling_2

In the Model Summary pane, we can check the summary metrics:

Residual standard error: 1421 on 8329 degrees of freedom
 (63 observations deleted due to missingness)
 Multiple R-squared: 0.8421, Adjusted R-squared: 0.842
 F-statistic: 7406 on 6 and 8329 DF, p-value: 0

Below is what each of these summary metrics tells us:

Residual Standard Error: A lower value indicates that the model is a better fit for our data.

Adjusted R-Squared: This is a commonly used measure of fit of a regression equation. It penalizes the addition of too many variables while rewarding a good fit of the regression equation. A higher Adjusted R-Squared value indicates a better fit.
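As a quick sanity check, the Adjusted R-squared in the summary can be recomputed from the Multiple R-squared using the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1). The sketch below is plain Python rather than anything Spotfire runs, the function name is only illustrative, and the number of observations is an assumption inferred from the output (8329 residual degrees of freedom plus 6 predictors plus the intercept).

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjusted R-squared for a model that includes an intercept."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Values read from the model summary; n_obs is inferred, not reported directly.
n_predictors = 6
residual_df = 8329
n_obs = residual_df + n_predictors + 1  # rows actually used in the fit

adj = adjusted_r_squared(0.8421, n_obs, n_predictors)
print(round(adj, 3))
```

With this many rows and only six predictors the penalty is tiny, which is why the two R-squared values in the summary are almost identical.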

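p-value: This is the p-value of the F-statistic for the model as a whole. A value close to zero indicates that the predictors, taken together, explain significantly more variation than a model with no predictors. The p-values of the individual predictors appear in the Table of Coefficients.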

Other factors that influence our model are collinearity and multicollinearity. The Variance Inflation Factor (VIF) and the AIC and BIC values can help assess the model.

Collinearity occurs when one independent variable is a linear function of another. In multicollinearity, a variable is a linear function of two or more other variables. These issues can increase the likelihood of drawing false conclusions from our estimates.

A high VIF means that multicollinearity significantly affects the equation. For AIC and BIC, lower values are better.

Spotfire_Regression_Modeling_3
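To make the VIF idea concrete, here is a small sketch in plain Python. The VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors; with exactly two predictors this R² is simply their squared correlation. The column values below are made up for illustration and are not the data set used in this post.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(xs, ys):
    """VIF = 1 / (1 - R^2); with only two predictors, R^2 equals r squared."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r * r)


# Hypothetical columns, for illustration only.
unit_price = [10.0, 20.0, 30.0, 40.0, 50.0]
shipping_cost = [1.0, 2.1, 2.9, 4.2, 5.0]  # almost proportional to unit_price
discount = [0.05, 0.01, 0.04, 0.02, 0.03]  # only weakly related to unit_price

vif_high = vif_two_predictors(unit_price, shipping_cost)
vif_low = vif_two_predictors(unit_price, discount)
print(vif_high, vif_low)
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a warning sign; the nearly proportional pair lands far above that, while the weakly related pair stays close to 1.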

The Table of Coefficients lists a p-value for each predictor (also called a regressor). A lower p-value indicates that the predictor is more significant in the model.

Spotfire_Regression_Modeling_4

If there are patterns in the “Residuals vs. Fitted” plot, then the current model could be improved.
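Here is a tiny illustration of what such a pattern looks like, as a sketch in plain Python with made-up data: a straight line is fitted to values that actually follow a curve, and the residuals come out in a clear U shape instead of scattering randomly around zero.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data: y grows quadratically, but we fit a straight line to it.
xs = [0, 1, 2, 3, 4]
ys = [x * x for x in xs]

intercept, slope = fit_line(xs, ys)
fitted = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

Positive residuals at both ends and negative ones in the middle suggest that a term is missing from the model (here, a squared term).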

Spotfire_Regression_Modeling_5

A simple horizontal bar chart shows the relative importance of each predictor used in the model. Discount is the least important predictor.

Spotfire_Regression_Modeling_6

If the points in the normal QQ plot lie close to the line y=x, the residuals are approximately normally distributed, which supports the assumptions behind the model.
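For readers curious about what a normal QQ plot actually draws, the sketch below (plain Python, made-up residuals) pairs each sorted, standardized residual with the quantile a standard normal distribution would give at the same position. The plotting positions (i + 0.5) / n are one common convention and an assumption here; statistical tools differ slightly in the one they use.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Return (theoretical quantile, standardized sample quantile) pairs."""
    n = len(residuals)
    centre = mean(residuals)
    spread = stdev(residuals)
    sample = sorted((r - centre) / spread for r in residuals)
    standard_normal = NormalDist()
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sample))


# Hypothetical residuals, for illustration only.
residuals = [-1.2, 0.4, 0.1, -0.3, 1.0]
points = qq_points(residuals)
for theoretical, sample in points:
    print(round(theoretical, 3), round(sample, 3))
```

If the residuals are close to normal, the two numbers in each pair are close to each other, which is the same as the plotted points hugging the line y=x.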

Spotfire_Regression_Modeling_7 Spotfire_Regression_Modeling_8

In the above plot, larger values mark data points that are more influential and should be investigated further.
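One widely used influence measure is Cook's distance; the post does not say which measure the plot uses, so treat the following as an assumption. The sketch computes it in plain Python for a one-predictor regression on made-up data, where the last point sits far from the others on the x axis and off the trend.

```python
def cooks_distances(xs, ys):
    """Cook's distance for each point of a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    n_params = 2  # intercept and slope
    mse = sum(e * e for e in residuals) / (n - n_params)
    # Leverage of each point in a one-predictor model
    leverages = [1.0 / n + (x - mean_x) ** 2 / sxx for x in xs]
    return [
        e * e * h / (n_params * mse * (1.0 - h) ** 2)
        for e, h in zip(residuals, leverages)
    ]


# Hypothetical data: the last point is far from the rest and off the trend.
xs = [1, 2, 3, 4, 5, 10]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 4.0]

distances = cooks_distances(xs, ys)
most_influential = max(range(len(xs)), key=lambda i: distances[i])
print(most_influential, [round(d, 2) for d in distances])
```

The last point gets by far the largest distance, which is the kind of observation worth a closer look before deciding whether to keep it.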

Depending on these factors, the model goes through a series of investigative steps until a satisfactory level of fit is reached.

In addition to knowledge of statistics, domain-specific understanding is crucial in assessing the inputs and the results. For example, when analyzing sales, we examine specific types of sales broken into tiers by criteria such as quarter of the year, geographic factors, economic indicators and seasonal influences.

We can exclude the outliers that would skew our results. Further, appropriate weights can be assigned to each input parameter to identify whether a specific type of sale is profitable to our business.

Visualizing data using Box Plots in R

R provides a quick way of performing exploratory analysis of your data using boxplots.

# Simulate 100 normally distributed values for each of 12 columns
data <- data.frame(
  Stat11 = rnorm(100, mean = 3, sd = 2),
  Stat21 = rnorm(100, mean = 4, sd = 1),
  Stat31 = rnorm(100, mean = 6, sd = 0.5),
  Stat41 = rnorm(100, mean = 10, sd = 0.5),
  Stat12 = rnorm(100, mean = 4, sd = 2),
  Stat22 = rnorm(100, mean = 4.5, sd = 2),
  Stat32 = rnorm(100, mean = 7, sd = 0.5),
  Stat42 = rnorm(100, mean = 8, sd = 3),
  Stat13 = rnorm(100, mean = 6, sd = 0.5),
  Stat23 = rnorm(100, mean = 5, sd = 3),
  Stat33 = rnorm(100, mean = 8, sd = 0.2),
  Stat43 = rnorm(100, mean = 4, sd = 4)
)
df = data.frame(data)

Box_Plots_1

# One box per column; las = 2 draws the axis labels perpendicular to the axis
boxplot(data, las = 2, names = rep(paste("Station", 1:4), times = 3))

Box_Plots_2

# Same plot with axis titles added
boxplot(data, ylab = "APR (%)", xlab = "Time", las = 2, names = rep(paste("Station", 1:4), times = 3))

Box_Plots_3
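If you want to see the numbers behind each box, the sketch below computes them in plain Python for a made-up sample: the quartiles, the whisker ends and the points flagged as outliers with the usual 1.5 × IQR rule. Note that R's boxplot works from hinges, so its quartiles can differ slightly from the "inclusive" method assumed here, and the function name is only illustrative.

```python
from statistics import quantiles


def box_stats(values):
    """Quartiles, whisker ends and outliers using the 1.5 * IQR rule."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < lower_fence or v > upper_fence),
    }


# Hypothetical readings, for illustration only.
readings = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
stats = box_stats(readings)
print(stats)
```

The value 30 falls beyond the upper fence, so it would be drawn as a separate point above the whisker rather than stretching the box.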

Spotfire: Map Charts

Tibco has made several smart acquisitions and integrated capabilities from the acquired technology companies into its existing suite of products. One such acquisition is Maporama Solutions, which helped Tibco greatly improve Spotfire’s geospatial analytic capabilities. As a result, the Spotfire 6.0 release introduced many features that help develop rich map visualizations for analysis, in particular the ability to integrate GIS data and the OpenSpirit connect.

Map_Charts_1

The map chart allows adding four types of layers which can be superimposed upon each other to increase the detail of analyses performed.

Adding layers is easy, and knowing how to combine these layers appropriately with your data (their interoperability) supports many important use cases across industries.

Below we add multiple layers to the visualization to increase the level of detail. The Marker layer holds the data we want to view, and the Map, Feature and WMS layers provide the context for it.

Map_Charts_2

With these layers added, zoom visibility can be configured for each layer, so you see only the relevant details at each zoom level. For example, you would not want to see a country's roads when zoomed all the way out to a continent-level view of the world. This is a very good feature in Spotfire that greatly improves map navigation and readability across zoom levels.

Map_Charts_3
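The idea behind per-layer zoom visibility can be sketched in a few lines of plain Python. This is only a conceptual model and not the Spotfire API: the `Layer` class, the layer names and the zoom ranges are all hypothetical, and the sketch assumes the common web-map convention that a lower zoom number means a more zoomed-out view.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    min_zoom: int  # most zoomed-out level at which the layer is shown
    max_zoom: int  # most zoomed-in level at which the layer is shown


def visible_layers(layers, zoom):
    """Names of the layers whose zoom range contains the current zoom level."""
    return [layer.name for layer in layers if layer.min_zoom <= zoom <= layer.max_zoom]


# Hypothetical configuration, for illustration only.
layers = [
    Layer("Country outlines", 0, 20),
    Layer("Store markers", 5, 20),
    Layer("Roads", 8, 20),
]

print(visible_layers(layers, 3))   # ['Country outlines']
print(visible_layers(layers, 10))  # ['Country outlines', 'Store markers', 'Roads']
```

At the continent-level view only the outlines are drawn, and the roads appear once you zoom in far enough for them to be useful.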

Depending on the business requirement, we can use data to add different marker layers.

As in the visualization below, we can locate the stores a company has in a particular area (using the data on the marker layer). However, at any given point we can use data from only a single marker layer, selected by checking the relevant marker layer in the chart.

Map_Charts_4

Below I have marked some specific stores to see their performance. The Details-on-Demand feature shows the complete data from the marker layer for each selected point.

Map_Charts_5

We can control which map, marker or feature layers are shown at any given point.

Now I am adding a WMS layer to pull data via the WMS URL from the National Atlas service.

Map_Charts_6

We can superimpose WMS sub layers in a single chart to combine our view of the background layers, or create separate WMS layers (each with a single sub layer) to view them separately.

I have added the “People – Density 2000” sub layer from this WMS service to see the population density in my market area.

Map_Charts_7

You can combine multiple WMS sub layers in a single chart along with your Marker layer (which holds the data you want to analyze) to build detailed visualizations.

Map_Charts_8
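Behind the scenes, a WMS client asks the service for a rendered image with a GetMap request, and several sub layers are requested together by listing them in the LAYERS parameter. The sketch below builds such a URL in plain Python using standard WMS 1.1.1 parameters. The base URL and layer names are placeholders and not the real National Atlas endpoint, and this is not a description of how Spotfire issues its requests internally.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width, height):
    """Build a WMS 1.1.1 GetMap URL; bbox is (min_lon, min_lat, max_lon, max_lat)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),  # sub layers are drawn in the order listed
        "STYLES": "",                # empty value asks for the default styles
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": "image/png",
        "TRANSPARENT": "TRUE",
    }
    return base_url + "?" + urlencode(params)


# Placeholder service URL and layer names, for illustration only.
url = wms_getmap_url(
    "https://example.com/wms",
    ["population_density", "state_boundaries"],
    (-125.0, 24.0, -66.0, 50.0),
    800,
    400,
)
print(url)
```

Because the service returns a finished picture, the result can only sit underneath your data as a background, which is why a WMS layer adds detail but no interactivity.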

 

Note that only Marker and Feature layers can be used for interactivity within the chart. A WMS layer only adds detail to the chart from the web service; it is just the underlying layer against which you view the data from the Marker layer.

References:

Gartner Report – Business Intelligence and Analytics Platforms, 2014

http://learn.spotfire.tibco.com/pluginfile.php/17355/mod_resource/content/9/Spotfire6_NewFeatures.pdf

 

Spotfire IronPython: Reset all filters

IronPython Code:

import Spotfire.Dxp.Application.Filters as filters

# Reset every filter in every filtering scheme of the document
for scheme in Document.FilteringSchemes:
    scheme.ResetAllFilters()

Spotfire IronPython: Reset all visible filters

IronPython Code to reset all applied visible filters:

import Spotfire.Dxp.Application.Filters as filters
from Spotfire.Dxp.Application.Filters import *
# Navigate through each page in the analysis
for eachPage in Document.Pages:
    # Get the active data table in the current page
    activeTable = eachPage.ActiveDataTableReference
    # Get the set of all table groups in the current page
    tableGroup = eachPage.FilterPanel.TableGroups
    # Navigate through all the table groups in the filter panel of current page
    for t in tableGroup:
        # To filter on the active data table used in the filter panel
        if (t.Name == activeTable.Name):
            # In table group (data table), navigate through the filter groups
            for filterSubGroup in t.SubGroups:
                # Navigate through the filters in each filter group
                for filterHandle in filterSubGroup.FilterHandles:
                    filterReference = filterHandle.FilterReference
                    # Reset if the filter is visible
                    if filterHandle.Visible==True:
                        filterReference.Reset()
                        print "Reset done for: " + eachPage.Title + '.' + t.Name + '.' + filterSubGroup.Name + '.' + filterReference.Name
                    if filterHandle.Visible==False:
                        print "Reset not done for: " + eachPage.Title + '.' + t.Name + '.' + filterSubGroup.Name + '.' + filterReference.Name

Please note that this script does not reset hidden filters. It resets only the visible filters that are part of filter groups for the active data table, across all the pages in the analysis.

Snapshot for indentation reference:

Spotfire_IronPython_Reset_All_Visible_Filters

Spotfire: Custom Gauge for Dashboards

Implementing a Custom Gauge for dashboards

Custom_Gauge_1

Taking the concept further, we can add a slider to the dashboard so that the Gauge reflects the changing value, as in the screenshot below:

Custom_Gauge_2

HTML:

<DIV style="WIDTH: 400px; HEIGHT: 320px" id="gauge">
<H4>Change Value</H4>
<SpotfireControl id="891ea35a30ad4cc9a92c68b6134f4604" /></DIV>

 

JS:

//define your javascript libraries
resource=[ "//cdn.jsdelivr.net/raphael/2.1.0/raphael-min.js",
"//cdn.jsdelivr.net/justgage/1.0.1/justgage.min.js"]

//add scripts to head: load Raphael first, then JustGage, then call init
$.getScript(resource[0],function(){
  $.getScript(resource[1],init)
})

//Make sure you use the Slider Spotfire Control in the "value" field

init=function(){
  var g = new JustGage({
    id: "gauge",
    //read the first two characters of the slider text as a base-10 integer
    value: parseInt($("#891ea35a30ad4cc9a92c68b6134f4604").text().substring(0,2), 10),
    min: 0,
    max: 100,
    title: "Visitors"
  });
}

 

Reference