Predictive Lead Scoring App in Under 15 Minutes

Qualifying leads is an important activity that sales teams focus on to improve their productivity and win customers. We will show you how easy it is to build a lead scoring app using machine learning algorithms. For this post we will use a machine learning platform, and in the next few days we will show how easy it is to build the same app using Apache Spark, Azure ML, etc.

The following are the ingredients you need to cook a delicious predictive lead scoring model.

  • Lead dataset. For this example we will use the SFDC lead object.
  • Build the model. We will be using a machine learning platform.
  • Train and validate models.
  • Review results
  • Visualization – Our favorite Salesforce Wave!

1. Prepare the dataset

This is the first step and, in many cases, a time-consuming activity in data analytics. We have to prepare the data based on the models we are planning to build. We generated a sample lead dataset using Mockaroo with the following fields:

  1. Full Name
  2. Company
  3. Email
  4. State
  5. City
  6. Title
  7. Lead source
  8. Company Size
  9. isCustomer – This is the objective field that we will be predicting.

Sample records.

first_name | Company | Email | State | City | Title | LeadSource | isCustomer | CompanySize
Jose Wood | Yozio | | KS | Kansas City | Account Coordinator | Dreamforce | TRUE | Under 1M
Paul Kelley | Zoonoodle | | SC | Columbia | Environmental Specialist | Onsite | FALSE | Under 1M
Louis Stewart | Podcat | | NY | New York City | Account Executive | Webinar | TRUE | Between 1M and 5M
Johnny Carr | Trudoo | | FL | Pensacola | Assistant Media Planner | Event | TRUE | Under 1M
Theresa Reed | Flipstorm | | MD | Silver Spring | Computer Systems Analyst IV | Dreamforce | FALSE | Under 1M
Louis Evans | Gabvine | | MT | Bozeman | Assistant Manager | Demo | FALSE | Between 1M and 5M
Judy Anderson | Skajo | | MO | Jefferson City | Biostatistician IV | Demo | TRUE | Under 1M
Mark Vasquez | Buzzster | | NY | Rochester | Geologist IV | Onsite | FALSE | Under 1M
Judith Freeman | Linklinks | | CA | Stockton | Biostatistician IV | Dreamforce | TRUE | Between 5M and 10M
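
If you want to sanity-check the file before importing it anywhere, a few lines of Python are enough. The sketch below assumes the dataset was exported as CSV with the column names shown in the sample; the function name load_leads is ours, and the Email values are left empty because the sample does not show them.

```python
import csv
import io

# Two rows copied from the sample records; Email is empty because the
# sample does not show it. A CSV export with this header is an assumption.
SAMPLE_CSV = """first_name,Company,Email,State,City,Title,LeadSource,isCustomer,CompanySize
Jose Wood,Yozio,,KS,Kansas City,Account Coordinator,Dreamforce,TRUE,Under 1M
Paul Kelley,Zoonoodle,,SC,Columbia,Environmental Specialist,Onsite,FALSE,Under 1M
"""


def load_leads(text):
    """Parse CSV text into dicts and turn isCustomer into a real boolean."""
    leads = []
    for row in csv.DictReader(io.StringIO(text)):
        row["isCustomer"] = row["isCustomer"].strip().upper() == "TRUE"
        leads.append(row)
    return leads


leads = load_leads(SAMPLE_CSV)
for lead in leads:
    print(lead["first_name"], lead["LeadSource"], lead["isCustomer"])
```

Converting the objective field to a boolean up front makes it harder to treat it as free text by accident later on.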

2. Build the model

Here are the step-by-step instructions to build a model in the platform:

  • Download the latest edition of the platform and run the instance. It's straightforward and should be up and running in a few minutes.
  • Import the files.




Parse the file.

The key thing to note here is to choose the right attributes for the fields.



Create training and validation datasets.




Select the training dataset (75% split) to build the model.
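
For readers who prefer to see the split spelled out, here is a minimal sketch of a 75/25 random split in plain Python. It is not the platform's splitter; the function name split_rows and the fixed seed are our own choices so the result is repeatable.

```python
import random


def split_rows(rows, train_fraction=0.75, seed=42):
    """Shuffle a copy of rows and cut it into (train, validation) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))  # stand-in for 100 lead records
train, valid = split_rows(rows)
print(len(train), len(valid))  # 75 25
```

Shuffling before cutting matters when the file is sorted, for example by lead source or creation date.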





Choose the algorithm; for this example I chose GBM (gradient boosting machine). Select the training and validation datasets and the objective field we want to predict.
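
To give a feel for what GBM does under the hood, here is a toy gradient boosting sketch in plain Python. It is not the platform's implementation: it uses one-split trees (stumps) on one-hot flags, only the LeadSource and CompanySize fields from the nine sample records, and the rounds and learning_rate values are arbitrary assumptions. Each round fits a stump to the residuals (actual minus predicted probability) and nudges the scores in that direction.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def one_hot(rows, columns):
    """Encode each row as 0/1 flags, one flag per (field, value) pair."""
    return [[1 if row[f] == v else 0 for f, v in columns] for row in rows]


def fit_stump(X, residuals):
    """Pick the 0/1 feature whose split leaves the smallest squared error."""
    best = None
    for j in range(len(X[0])):
        on = [r for x, r in zip(X, residuals) if x[j]]
        off = [r for x, r in zip(X, residuals) if not x[j]]
        if not on or not off:
            continue  # feature does not split the rows
        m_on, m_off = sum(on) / len(on), sum(off) / len(off)
        sse = sum((r - m_on) ** 2 for r in on) + sum((r - m_off) ** 2 for r in off)
        if best is None or sse < best[0]:
            best = (sse, j, m_on, m_off)
    return None if best is None else best[1:]


def fit_gbm(X, y, rounds=20, learning_rate=0.5):
    """Boost stumps on y - p, the negative gradient of the log loss.

    Assumes y contains both classes, otherwise the starting log-odds is undefined.
    """
    prior = sum(y) / len(y)
    base = math.log(prior / (1.0 - prior))  # start from the base-rate log-odds
    scores = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, m_on, m_off = stump
        stumps.append((j, learning_rate * m_on, learning_rate * m_off))
        scores = [s + learning_rate * (m_on if x[j] else m_off)
                  for s, x in zip(scores, X)]
    return base, stumps


def predict_proba(model, x):
    """Probability that a lead with feature flags x becomes a customer."""
    base, stumps = model
    return sigmoid(base + sum(on if x[j] else off for j, on, off in stumps))


# (LeadSource, CompanySize, isCustomer) taken from the nine sample records.
SAMPLE = [
    ("Dreamforce", "Under 1M", True), ("Onsite", "Under 1M", False),
    ("Webinar", "Between 1M and 5M", True), ("Event", "Under 1M", True),
    ("Dreamforce", "Under 1M", False), ("Demo", "Between 1M and 5M", False),
    ("Demo", "Under 1M", True), ("Onsite", "Under 1M", False),
    ("Dreamforce", "Between 5M and 10M", True),
]
rows = [{"LeadSource": s, "CompanySize": c} for s, c, _ in SAMPLE]
y = [1 if label else 0 for _, _, label in SAMPLE]
columns = sorted({(f, row[f]) for row in rows for f in row})
X = one_hot(rows, columns)
model = fit_gbm(X, y)
for row, x in zip(rows, X):
    print(row["LeadSource"], row["CompanySize"], round(predict_proba(model, x), 2))
```

Nine rows are far too few to learn anything real; the point is only to show the loop of "fit residuals, update scores" that a production GBM runs with deeper trees and many more safeguards.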





The model completes and the results are available to review!


Review results


Now use the predict function on a test dataset to score leads!
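
Once you have any predict function, scoring comes down to "apply it to every lead and sort". The sketch below shows that shape. The names score_leads and toy_predict are ours, and toy_predict is a made-up stand-in rule with invented numbers, not output from a trained model.

```python
def score_leads(leads, predict):
    """Attach a score to each lead and return the leads best-first."""
    scored = [dict(lead, score=predict(lead)) for lead in leads]
    return sorted(scored, key=lambda lead: lead["score"], reverse=True)


def toy_predict(lead):
    # Stand-in for a trained model's predict call; the numbers are made up.
    return 0.8 if lead["LeadSource"] == "Webinar" else 0.3


test_leads = [
    {"first_name": "Mark Vasquez", "LeadSource": "Onsite"},
    {"first_name": "Louis Stewart", "LeadSource": "Webinar"},
]
ranked = score_leads(test_leads, toy_predict)
for lead in ranked:
    print(lead["first_name"], lead["score"])
```

Passing the predict function in as an argument keeps the ranking code the same no matter which platform produced the model.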

The possibilities are endless

  1. Fetch the SFDC lead dataset and combine it with data from providers like Acxiom, D&B, etc.
  2. Associate any other marketing data.
  3. Use the platform to train and validate models.
  4. Load the results into visualization tools like Salesforce Wave so business users can access the predicted status.
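
The enrich-and-export steps above can be sketched in a few lines of Python. Everything here is an assumption for illustration: joining on the Company name, the provider field industry, its placeholder values, and the score numbers are all hypothetical, and a real integration would use each provider's own keys and the visualization tool's own loader.

```python
import csv
import io


def enrich(leads, provider_by_company):
    """Left-join provider attributes onto leads by company name."""
    return [dict(lead, **provider_by_company.get(lead["Company"], {}))
            for lead in leads]


def to_csv(rows, fieldnames):
    """Write rows as CSV text; missing fields become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical scored leads and hypothetical provider data.
scored = [
    {"Company": "Yozio", "LeadSource": "Dreamforce", "score": 0.8},
    {"Company": "Podcat", "LeadSource": "Webinar", "score": 0.3},
]
provider = {"Yozio": {"industry": "example-industry"}}

output = to_csv(enrich(scored, provider),
                ["Company", "LeadSource", "industry", "score"])
print(output)
```

A flat CSV like this is a common lowest-common-denominator format for handing predicted results to a dashboard tool.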