
Visualize Clustering with SOM in Anatella / R

Note: The data is extracted from Marketing Engineering, with the kind permission of Dr A. De Bruyn. Very complete R code for SOM can be found in Shane Lynn's excellent post: http://www.shanelynn.ie/self-organising-maps-for-customer-segmentation-using-r/ Clustering is a tricky business. While the hardest part of it lies in the business process, there is also a

Classification problems: lift curve or classification table?

The common idea of classifying is to look at “small groups” of records and evaluate whether we should assign them a 1 or a 0 for a particular target. For example, if I am interested in figuring out who will get cancer, I can “build” the following logic, without requiring any predictive

Lift, ROC, AUC and Gini

One good way to compare different predictive modeling platforms is to compare the models produced by these platforms. Comparing models across platforms is not an easy task. Models can be compared using various criteria: 1. Simple predictive model quality (i.e. height of the lift curve / AUC) 2. Generalization ability of the model (Difference

Recoding variables with R and Anatella

Variable recoding can be a real pain in the neck. Although there are functionalities to do this in R and Anatella, doing it right is not always easy. For example, when working with Latent Class Analysis, we want to group all our variables into a few categories, and quantiles do not really do the job.