Feature selection is an important step when working on machine learning and predictive modelling problems. In practical model building we often get a dataset with too many variables, not all of which are relevant to the problem, and we don’t know in advance which ones are. Also, there are some disadvantages of using all given[…]
On Twitter, reviewers are mostly talking about black money, the currency ban, Modi fighting corruption, etc. On Facebook, reviewers are mostly talking about PM Modi’s master stroke, banks, money, currency, etc.
Descriptive statistics is the term given to the analysis of data that shows meaningful insights and patterns present in the data. However, it doesn’t allow us to draw any conclusions beyond the given data points. Let us take an example: suppose the higher management of a company asks for revenue data. Then directly giving him[…]
What is it? I came across this technique while working with text. I was trying to analyse tweets from Twitter and posts from a Facebook page after the Reliance Jio launch. The analysis involved: data collection, data cleaning, word cloud creation, and sentiment analysis. After this I wanted to try something else, and while searching the net I found this new[…]
This is purely for educational purposes; this analysis may or may not match other analyses.
This post lists the steps to connect RStudio with SQL Server, so that we can directly access tables and analyse data stored in SQL Server. System-related settings: 1. Go to the Control Panel of your system. 2. Click on Administrative Tools. 3. Select User DSN -> click on “Add” -> “Sql[…]
Facebook is a very popular, commonly used social media platform today. We can get a huge amount of data from it and do text mining and sentiment analysis. I thought of doing this too. 🙂 To create a connection between RStudio and Facebook, the user should: 1. have a Facebook account 2. be registered with the Facebook API (click on the “register now” option[…]
Suppose we are analyzing a dataset using pivot tables, and the column selected as the filter has many values (here we want month-wise detail, so it has 12 values). Selecting a value from the drop-down each time is a little tedious. A slicer can be used to simplify this scenario, as it[…]
Imputing missing or junk values with the mean/median/mode is a very basic part of data cleaning [ Read This ], and these methods give accuracy only up to a certain level. Also, mean/median/mode are applicable when our data is in some traditional format, but in most practical scenarios it is not.[…]
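As a rough illustration of the mean/median/mode imputation mentioned in the excerpt above, here is a minimal Python sketch that uses only the standard library. The function name `impute` and the sample values are hypothetical, and `None` is assumed to mark a missing or junk value; the full post may well use a different tool or approach.

```python
from statistics import mean, median, mode

def impute(values, strategy="mean"):
    """Replace None entries with the mean, median, or mode of the observed values."""
    # Only the non-missing values are used to compute the fill value.
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    # Pick the statistic by name (assumed names: "mean", "median", "mode").
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]

# Hypothetical sample column with one missing value.
scores = [10, None, 10, 30, 50, 100]
for strategy in ("mean", "median", "mode"):
    print(strategy, impute(scores, strategy))
```

Which statistic suits a column depends on the data: the median is less sensitive to outliers than the mean, and the mode is the usual choice for categorical values.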
Data cleaning is the most important part of data analysis, and if we have missing values in our dataset, the task becomes more tedious. If the share of observations with missing values is <=5%, we can simply delete those observations, but what if there are a lot of them? Then we can’t[…]
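The 5% rule of thumb from the excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `drop_if_few_missing` is hypothetical, each observation is a list of values, and `None` marks a missing value.

```python
def drop_if_few_missing(rows, threshold=0.05):
    """Drop rows containing None if they make up at most `threshold` of all rows.

    Returns (rows, dropped), where `dropped` says whether deletion was applied.
    """
    if not rows:
        return rows, False
    # An observation counts as incomplete if any of its values is missing.
    incomplete = [r for r in rows if any(v is None for v in r)]
    share = len(incomplete) / len(rows)
    if share <= threshold:
        # Few enough incomplete observations: simply delete them.
        complete = [r for r in rows if all(v is not None for v in r)]
        return complete, True
    # Too many incomplete observations: leave the data for another method.
    return rows, False

# Hypothetical data: 20 observations, one of them with a missing value.
data = [[i, i * 2] for i in range(19)] + [[19, None]]
cleaned, dropped = drop_if_few_missing(data)
print(len(cleaned), dropped)
```

When the share of incomplete observations is above the threshold, the data is returned unchanged so that some other technique can be applied instead of deletion.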