### Using Spotfire for Predictive Analytics (Regression Modeling)

We are building a Linear Regression model to forecast sales:

`Sales` ~ `Order Quantity` + `Discount` + `Shipping Cost` + `Profit` + `Unit Price` + `Product Base Margin`

In this model, “Sales” is the response variable and all the columns after the “~” are the predictor variables.

Click “OK” to examine the results for the model.

In the Model Summary pane, we can check the summary metrics:

    Residual standard error: 1421 on 8329 degrees of freedom
      (63 observations deleted due to missingness)
    Multiple R-squared: 0.8421, Adjusted R-squared: 0.842
    F-statistic: 7406 on 6 and 8329 DF, p-value: 0

These metrics can be interpreted as follows:

Residual Standard Error: A lower value indicates that the model is a better fit for our data.

Adjusted R-Squared: This is a commonly used measure of fit of a regression equation. It penalizes the addition of too many variables while rewarding a good fit of the regression equation. A higher Adjusted R-Squared value indicates a better fit.

p-value: A value close to zero indicates a statistically significant result. The p-value in the summary belongs to the overall F-test of the model; each predictor also has its own p-value in the Table of Coefficients.
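As a quick check on how the penalty works, the adjusted value can be recomputed from the figures in the model summary using the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1). This is a minimal sketch, assuming the model includes an intercept, so that the number of observations used is the residual degrees of freedom plus the six predictors plus one; the function name is only illustrative.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Values read from the model summary above.
n_predictors = 6
residual_df = 8329
# Assumption: the model has an intercept, so n = residual df + p + 1.
n_obs = residual_df + n_predictors + 1

adj = adjusted_r_squared(0.8421, n_obs, n_predictors)
print(round(adj, 4))  # 0.842, in line with the reported Adjusted R-squared
```

With more than eight thousand observations and only six predictors the penalty is tiny, which is why the two R-squared values in the summary are almost identical.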

Collinearity and multicollinearity among the predictors also influence the model. The Variance Inflation Factor (VIF) helps detect them, and the AIC and BIC values help compare candidate models.

Collinearity occurs when one independent variable is a linear function of another. In multicollinearity, a variable is a linear function of two or more other variables. These issues increase the likelihood of drawing false conclusions from our estimates.

A high VIF means that multicollinearity significantly affects the equation. For AIC and BIC, lower values are better.
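To make the VIF concrete: for predictor j it is defined as 1 / (1 - R_j²), where R_j² comes from regressing that predictor on all the others. The sketch below covers only the two-predictor case, where R_j² reduces to the squared Pearson correlation between the two columns. The data are hypothetical and are not taken from the sales table.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF of either predictor in a two-predictor model: 1 / (1 - r^2)."""
    r_squared = pearson_r(x1, x2) ** 2
    return 1.0 / (1.0 - r_squared)

# Hypothetical columns: x2 is almost a multiple of x1, x3 is nearly unrelated.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
x3 = [5, 1, 4, 2, 6, 3]

high_vif = vif_two_predictors(x1, x2)  # nearly collinear pair
low_vif = vif_two_predictors(x1, x3)   # weakly correlated pair
print(high_vif, low_vif)
```

A VIF close to 1 means the predictor shares almost no variance with the other one, while a very large value signals that the two columns carry nearly the same information.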

The Table of Coefficients lists a p-value for each predictor (also called a regressor). A lower p-value indicates that the predictor is more significant in the model.

If there are patterns in the “Residuals vs. Fitted” plot, then the current model could be improved.

A simple horizontal bar chart shows the relative importance of each predictor used in the model. Discount is the least important predictor.

If the points in the normal Q-Q plot lie close to the line y=x, the residuals are approximately normally distributed, which supports the fit of the model.

In the above plot, larger values mark data points that are more influential and should be investigated further.
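A small illustration of such a pattern, using hypothetical data rather than the sales table: fitting a straight line to values that actually follow a curve leaves residuals that are positive at both ends and negative in the middle. The helper `fit_line` is an illustrative name for ordinary least squares with one predictor.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data with a quadratic relationship.
xs = list(range(7))
ys = [x * x for x in xs]

intercept, slope = fit_line(xs, ys)
fitted = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

# Pairs that a "Residuals vs. Fitted" plot would display.
for f, r in zip(fitted, residuals):
    print(f, r)
```

The residuals trace a U shape against the fitted values, which suggests that a squared term (or another transformation) is missing from the model.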

Depending on these various factors, the model has to go through a series of investigative steps till a satisfactory level of fit is reached.

In addition to knowledge of statistics, domain-specific understanding is crucial in assessing the inputs and the results. For example, when analyzing sales, we examine specific types of sales broken into tiers according to criteria such as quarter of the year, geographic factors, economic indicators and seasonal influences.

We can exclude outliers that would otherwise skew our results. Further, appropriate weights could be assigned to each input parameter to identify whether a specific type of sale is profitable to our business.

### Visualizing data using Box Plots in R

R provides a quick way of performing exploratory analysis of your data using boxplots:

    # Simulated readings: 100 samples per column, 4 stations over 3 periods
    data <- data.frame(
      Stat11 = rnorm(100, mean = 3, sd = 2),
      Stat21 = rnorm(100, mean = 4, sd = 1),
      Stat31 = rnorm(100, mean = 6, sd = 0.5),
      Stat41 = rnorm(100, mean = 10, sd = 0.5),
      Stat12 = rnorm(100, mean = 4, sd = 2),
      Stat22 = rnorm(100, mean = 4.5, sd = 2),
      Stat32 = rnorm(100, mean = 7, sd = 0.5),
      Stat42 = rnorm(100, mean = 8, sd = 3),
      Stat13 = rnorm(100, mean = 6, sd = 0.5),
      Stat23 = rnorm(100, mean = 5, sd = 3),
      Stat33 = rnorm(100, mean = 8, sd = 0.2),
      Stat43 = rnorm(100, mean = 4, sd = 4)
    )
    df <- data.frame(data)

    # "Station 1" to "Station 4", repeated three times
    stations <- rep(paste("Station", 1:4), times = 3)

    # Basic box plot with vertical axis labels (las = 2)
    boxplot(data, las = 2, names = stations)

    # Same plot with axis titles
    boxplot(data, ylab = "APR (%)", xlab = "Time", las = 2, names = stations)

### Spotfire: Map Charts

Tibco has made several smart acquisitions and integrated capabilities from the acquired technology companies into its existing suite of products. One such acquisition is that of Maporama Solutions, which helped Tibco tremendously improve Spotfire’s geospatial analytic capabilities. As a result, the Spotfire 6.0 release introduced many features that help develop rich map visualizations for analysis, in particular the ability to integrate GIS data and the OpenSpirit connect.

The map chart allows adding four types of layers which can be superimposed upon each other to increase the detail of analyses performed.

Adding layers is easy, and knowing how to combine them appropriately with your data (their interoperability) supports many important use cases across industries.

Below we are adding multiple layers to the visualization to increase the level of detail. The Marker layer holds the data we want to view, and the Map, Feature and WMS layers provide the context for it.

With these new layers, zoom visibility can be configured for each layer, so that only the relevant details are shown at each zoom level. For example, you would not want to see a country's roads when zoomed out to a continent-level view of the world. This feature greatly improves map navigation and readability across zoom levels.

Depending on the business requirement, we can use data to add different marker layers.

As in the visualization below, we can locate the various stores a company has in a particular area (using the data on the marker layer). However, at any given point we can use data from only a single marker layer, selected by checking the relevant marker layer in the chart.

Below, I have marked some specific stores to see their performance. The Details-on-Demand feature shows the complete data from the marker layer for each selected point.

We can control which map, marker or feature layer is visible at a given point.

Now I am adding a WMS layer to pull data via the WMS URL from the National Atlas service.

We can superimpose WMS sub layers in a single chart to combine our view of the background layers, or create separate WMS layers (each with a single sub layer) to view them separately.

I have added the “People – Density 2000” sub layer from this WMS service to see the population density in my market area.

You can combine multiple WMS sub layers in a single chart along with your Marker layer (which holds the data you want to analyze) and build detailed visualizations.

Note that only the Marker and Feature layers support interactivity within the chart. A WMS layer only adds detail from the web service; it is just the underlying layer against which you view the data from the Marker layer.


### Spotfire IronPython: Reset all filters

IronPython Code:

    import Spotfire.Dxp.Application.Filters as filters

    for scheme in Document.FilteringSchemes:
        scheme.ResetAllFilters()

### Spotfire IronPython: Reset all visible filters

IronPython Code to reset all applied visible filters:

    import Spotfire.Dxp.Application.Filters as filters
    from Spotfire.Dxp.Application.Filters import *

    # Navigate through each page in the analysis
    for eachPage in Document.Pages:
        # Get the active data table in the current page
        activeTable = eachPage.ActiveDataTableReference
        # Get the set of all table groups in the current page
        tableGroup = eachPage.FilterPanel.TableGroups
        # Navigate through all the table groups in the filter panel of the current page
        for t in tableGroup:
            # Restrict to the active data table used in the filter panel
            if t.Name == activeTable.Name:
                # In the table group (data table), navigate through the filter groups
                for filterSubGroup in t.SubGroups:
                    # Navigate through the filters in each filter group
                    for filterHandle in filterSubGroup.FilterHandles:
                        filterReference = filterHandle.FilterReference
                        # Reset only if the filter is visible
                        if filterHandle.Visible:
                            filterReference.Reset()
                            print "Reset done for: " + eachPage.Title + '.' + t.Name + '.' + filterSubGroup.Name + '.' + filterReference.Name
                        else:
                            print "Reset not done for: " + eachPage.Title + '.' + t.Name + '.' + filterSubGroup.Name + '.' + filterReference.Name

Note that this script leaves hidden filters untouched. It resets only the visible filters that are part of filter groups for each page's active data table, across all the pages in the analysis.


### Spotfire: Custom Gauge for Dashboards

Implementing a Custom Gauge for dashboards

Taking the concept further, we can add a slider to the dashboard so that the Gauge reflects the changing value, as in the screenshot below:

HTML:

    <DIV style="WIDTH: 400px; HEIGHT: 320px" id="gauge">
    <H4>Change Value</H4>
    <SpotfireControl id="891ea35a30ad4cc9a92c68b6134f4604" /></DIV>

JS:

    // define your javascript libraries
    var resource = [
        "//cdn.jsdelivr.net/raphael/2.1.0/raphael-min.js",
        "//cdn.jsdelivr.net/justgage/1.0.1/justgage.min.js"
    ];

    // add scripts to head: load Raphael first, then JustGage, then draw the gauge
    $.getScript(resource[0], function () {
        $.getScript(resource[1], init);
    });

    // Make sure you use the Slider Spotfire Control in the "value" field
    var init = function () {
        var g = new JustGage({
            id: "gauge",
            // reads the first two characters of the slider control's text
            value: parseInt($("#891ea35a30ad4cc9a92c68b6134f4604").text().substring(0, 2), 10),
            min: 0,
            max: 100,
            title: "Visitors"
        });
    };