Journey of Analytics

Interactive map charts using R and leaflet.

GRAPHICAL ANALYSIS

  • R program: create scatter plots, histograms, bar plots, box plots, pie charts and density plots.

  • Heatmaps & correlograms: see the Kaggle San Francisco crime project below.

  • Advanced graphics - Bubble charts, 3D surface plots and mathematical function graphs.

Marketing Analytics


1. R programs

  • mktg_fns.R => computes RFM (recency, frequency and monetary value) and basic customer segmentation.

  • cust_revenue.R => calculates revenue per customer and revenue per segment, plus customer scoring and revenue prediction.
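
For readers new to RFM, the three values can be sketched in a few lines. This is a minimal Python illustration and not the contents of mktg_fns.R; the `compute_rfm` helper, the transaction tuples and the reference date are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def compute_rfm(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, purchase_date, amount).
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, purchased_on, amount in transactions:
        # Recency only needs the most recent purchase date per customer.
        if customer not in last_purchase or purchased_on > last_purchase[customer]:
            last_purchase[customer] = purchased_on
        frequency[customer] += 1       # number of purchases
        monetary[customer] += amount   # total spend
    return {
        c: ((today - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }


# Made-up sample data, for illustration only.
sample = [
    ("A", date(2016, 1, 5), 40.0),
    ("A", date(2016, 3, 1), 60.0),
    ("B", date(2016, 2, 10), 25.0),
]
rfm = compute_rfm(sample, today=date(2016, 3, 31))
print(rfm)
```

A simple segmentation could then rank customers on each of the three values, for example by splitting each into quantiles.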

API Programming:

 

1. Twitter API:

  • R programs to get the list of Twitter follower IDs and their profiles (location, screen name, follower count, etc.).

  • Python program to track changes in Twitter follower count.

New projects are added during the first week of every month. To stay updated, please subscribe to my blog page or sign up here for email notifications.

Pattern analysis and "cyber-security strength" analysis for a password list dataset. *** NEW ***

Text analytics:

News headline analysis: *** NEW ***

  • text pre-processing

  • sorts and aggregates by publisher names

  • creates word clouds and word association plots
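
The first two steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the project's own code: the stop-word list, the sample rows and the helper names `tokenize` and `word_counts_by_publisher` are hypothetical. The resulting per-publisher word counts are the kind of input a word cloud is drawn from.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "in", "to", "for", "and", "on"}


def tokenize(headline):
    """Lower-case the text, keep alphabetic runs only, drop stop words."""
    words = re.findall(r"[a-z]+", headline.lower())
    return [w for w in words if w not in STOP_WORDS]


def word_counts_by_publisher(rows):
    """rows: iterable of (publisher, headline) -> {publisher: Counter}."""
    counts = defaultdict(Counter)
    for publisher, headline in rows:
        counts[publisher].update(tokenize(headline))
    return counts


# Made-up headlines, for illustration only.
rows = [
    ("Daily Post", "The Markets Rally on Strong Earnings"),
    ("Daily Post", "Markets Slip in Late Trading"),
    ("Tech Wire", "New Phone Launch Set for Fall"),
]
counts = word_counts_by_publisher(rows)
print(counts["Daily Post"].most_common(1))
```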

KAGGLE PROJECTS / MACHINE LEARNING


4. US Education Scorecard

  • GitHub link here.

  • Interactive US state map showing college names, admission rate, average faculty salary, etc.

3. SAN FRANCISCO CRIME CLASSIFICATION 

  • Kaggle score = 2.60, using a multinomial regression algorithm.
  • Heatmap to view the worst-affected regions.
  • Programs to test relationships using chi-square tests, with visualizations using correlograms and ggplot.
  • GitHub link here.


2. TITANIC SURVIVOR PREDICTIONS 

  • Official Score = 0.789 (~79% accurate predictions)
  • Competition link here.
  • GitHub code link here. Please review the Readme.md file for program descriptions and submission files.
  • Predictions made using the following algorithms: Naive Bayes, neural network, random forest and decision tree.

1. Airbnb New User Bookings

  • Official Score = 0.832 (~83% accurate predictions)
  • Competition link here.
  • GitHub code link here. Please review the Readme.md file for program descriptions and submission files.

Statistical & Data Analysis


1. R programs

The programs below are all available under the same GitHub repository "R_projects".

  • Descriptive statistics and basic associations between variables.

  • ANOVA test 

  • Chi-square test of independence

  • Pearson Correlation

  • Linear & Multiple regression

  • Machine learning with decision trees.

  • Random forest algorithm.
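
As a quick illustration of the chi-square test of independence listed above, the statistic compares each observed count with the count expected if the two variables were unrelated. The sketch below is in Python rather than R, is not taken from the "R_projects" repository, and uses a hypothetical helper name and made-up counts. It computes only the statistic and degrees of freedom, because the Python standard library has no chi-square distribution for the p-value.

```python
def chi_square_statistic(table):
    """Return (chi2, degrees_of_freedom) for a contingency table.

    table: list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Made-up 2x2 table of counts, for illustration only.
observed = [[10, 20], [30, 40]]
chi2, dof = chi_square_statistic(observed)
print(round(chi2, 4), dof)
```

Note that R's `chisq.test` applies Yates' continuity correction to 2x2 tables by default, so it would report a different statistic from this uncorrected value unless called with `correct = FALSE`.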


2. SAS programs 

The programs below are all available under the same GitHub repository "Statistics_with_SAS".

  • ANOVA

  • Chi-square test of independence

  • Pearson Correlation

  • Moderator variables, with chi-square test.