Posts

Starbucks Project

High-level overview: This blog post investigates a dataset provided by Starbucks. The dataset includes a lot of information about customers and the offers (promotions) sent to them. By merging three separate tables, this project illustrates some main points about the data, identifies some problems that can be fixed, and discusses how to predict customer behaviour using the existing features.

Problem domain: This project explores whether Starbucks users' behaviour in response to the offers they are given can be predicted.

Project origin: This data set contains simulated data that mimics customer behavior on the Starbucks rewards mobile app.

Description of Input Data

Data sets:
portfolio.json - offer ids and metadata about each offer
profile.json - demographic data for each customer
transcript.json - records for transactions, offers received, offers viewed, and offers completed

More details about each data set:

portfolio.json
id (string) - offer id
offer_type (string) - type of offer, i.e. BOGO,
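
The merge of the three tables mentioned in the overview might look roughly like the sketch below. It is not the post's actual code: it uses tiny in-memory stand-ins rather than the real JSON files, and apart from `id` and `offer_type` in the portfolio table, every column name (`person`, `event`, `offer_id`, `age`) and the helper `merge_tables` are assumptions made for illustration.

```python
import pandas as pd

# Tiny stand-ins for the three files. In practice each table would be read
# from its JSON file (for example with pandas.read_json).
# Only `id` and `offer_type` come from the data description; all other
# column names below are assumptions.
portfolio = pd.DataFrame({
    "id": ["o1", "o2"],
    "offer_type": ["bogo", "discount"],
})
profile = pd.DataFrame({
    "id": ["c1", "c2"],
    "age": [34, 58],
})
transcript = pd.DataFrame({
    "person": ["c1", "c1", "c2"],
    "event": ["offer received", "offer viewed", "offer received"],
    "offer_id": ["o1", "o1", "o2"],
})


def merge_tables(transcript, profile, portfolio):
    """Attach customer and offer attributes to every transcript record."""
    # Left joins keep every transcript row, even if a customer or an offer
    # has no match in the lookup tables.
    merged = transcript.merge(
        profile.rename(columns={"id": "person"}), on="person", how="left"
    )
    merged = merged.merge(
        portfolio.rename(columns={"id": "offer_id"}), on="offer_id", how="left"
    )
    return merged


merged = merge_tables(transcript, profile, portfolio)
print(merged)
```

Left joins are used here so that no event is silently dropped; rows with missing customer or offer attributes then show up as NaN, which is one way the "problems which can be fixed" become visible.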

Three facts about Stack Overflow survey data

Introduction

In this short blog post, I would like to draw your attention to the Stack Overflow survey. You might wonder whether parents' education inspires their children, especially in the field of IT. Also, is race a feature that affects an individual's income? Finally, can a model with fewer features than the original still be significant enough to explain which features have an impact on developers' salaries? Let's dig into this article to find out.

Part 1: The relationship between race and salary

First, a large share of respondents fall into the "White or of European descent" category. In the next step, I will refer to this category simply as "White" and look at some charts comparing the salaries of "White" respondents and the rest. According to the bar plot, "White" developers have a higher average salary than the others. However, to see the distribution within the two categories, let's have a further look at the histogr