# Text Analytics Part I – Web Crawling using R

Most of us use India’s most popular shopping site, Flipkart, to view the specifications of electronic goods, especially cell phones. Before buying a phone, people generally visit this site and read reviews of the products they plan to buy. That’s why I have chosen this site for my analysis and[…]

# How hierarchical clustering works

Hierarchical cluster analysis (HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. This can be done using two approaches:

- Agglomerative: a “bottom up” approach in which each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
- Divisive: This is[…]
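The agglomerative ("bottom up") idea can be seen on a tiny example. The sketch below uses base R's `hclust` on five made-up one-dimensional points; the values, the names `a` to `e`, and the choice of average linkage are assumptions for illustration only.

```r
# Five points on a line: two tight pairs and one far-away point (made-up values).
x <- c(a = 1, b = 2, c = 6, d = 7.5, e = 15)

# Agglomerative clustering on the pairwise Euclidean distances.
h <- hclust(dist(x), method = "average")

# Each row of h$merge is one merge step. Negative numbers are single
# observations; positive numbers refer to clusters formed in earlier rows.
print(h$merge)

# Cut the finished hierarchy into 3 clusters.
groups <- cutree(h, k = 3)
print(groups)
```

Reading `h$merge` from top to bottom shows the hierarchy being built: the closest observations are joined first, and the resulting clusters are merged later at larger heights.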

# Code for Clustering in R using Iris dataset

    > View(iris)

To find the optimal number of clusters using hierarchical clustering:

    > d=dist(scale(iris[,-5]))
    > h=hclust(d,method='ward.D')
    > plot(h,hang=-1)
    > k=kmeans(iris[,-5],3)
    > rect.hclust(h,h=35,border="blue")
    > k

The following dendrogram appeared:

Selecting 3 as the optimal number, we apply k-means to get the centers of these 3 clusters:

    > k=kmeans(iris[,-5],3,nstart=20)

By giving nstart=20, we are fixing the starting point[…]
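As a rough check on the choice of 3 clusters, the dendrogram can be cut into 3 groups with `cutree` and compared with the k-means result. This is only a sketch: the seed value and the variable names `hc_groups` and `km` are assumptions, not part of the original session. In base R's `kmeans`, `nstart` is the number of random starting configurations tried, with the best result kept, so setting a seed is what makes a run repeatable.

```r
set.seed(42)  # arbitrary seed (assumption), only to make the k-means run repeatable

# Same inputs as above: scaled features for the dendrogram, raw features for k-means.
d <- dist(scale(iris[, -5]))
h <- hclust(d, method = "ward.D")

hc_groups <- cutree(h, k = 3)                 # cut the dendrogram into 3 clusters
km <- kmeans(iris[, -5], 3, nstart = 20)      # best of 20 random starts

# Cross-tabulate the two labelings, then k-means against the known species.
print(table(hierarchical = hc_groups, kmeans = km$cluster))
print(table(species = iris$Species, kmeans = km$cluster))
print(km$centers)
```

Cluster numbers are arbitrary labels, so the tables should be read by which cells are large, not by whether the numbers on the diagonal match.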

# How a Decision Tree Classifier works

A decision tree builds classification or regression models in the form of a tree structure. It breaks a dataset down into smaller and smaller subsets while an associated decision tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes. The topmost decision node in a tree[…]
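A small sketch of this structure, assuming the `rpart` package (a recommended package shipped with standard R installations) and the built-in `iris` data; neither is named in the excerpt above, so treat both as illustrative choices.

```r
library(rpart)  # assumed to be installed; it ships with standard R distributions

# Fit a classification tree predicting Species from the four measurements.
fit <- rpart(Species ~ ., data = iris, method = "class")

# Text view of the tree: the root node comes first, and each further line is
# either a decision node (a split on one variable) or a leaf, marked with '*'.
print(fit)

# Predict a class for every flower and compare with the true species.
pred <- predict(fit, iris, type = "class")
print(table(actual = iris$Species, predicted = pred))
```

Because the predictions here are made on the same rows used for fitting, the table shows how well the tree describes the training data, not how it would perform on new data.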

# How k-means clustering works

K-means is one of the simplest unsupervised learning algorithms for solving the well-known clustering problem.

Algorithm steps:

- Step 1: Decide the number of clusters (suppose we want to create k clusters).
- Step 2: Randomly assign centres to these k clusters.
- Step 3: Calculate the distance of the remaining data points from these k clusters[…]
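The steps can be sketched in a few lines of R. The function name `simple_kmeans` and its arguments are hypothetical, and because the step list above is cut off after Step 3, the remaining logic (assign each point to its nearest centre, move each centre to the mean of its points, repeat until assignments stop changing) is the standard k-means iteration filled in as an assumption. For real work, base R's `kmeans` is the better choice.

```r
# Hypothetical helper for illustration; not part of any package.
simple_kmeans <- function(x, k, max_iter = 100) {
  x <- as.matrix(x)
  # Step 2: pick k rows at random as the initial centres.
  centres <- x[sample(nrow(x), k), , drop = FALSE]
  cluster <- rep(0L, nrow(x))

  for (iter in seq_len(max_iter)) {
    # Step 3: squared Euclidean distance from every point to every centre
    # (one column of d per centre).
    d <- vapply(seq_len(k),
                function(j) rowSums(sweep(x, 2, centres[j, ])^2),
                numeric(nrow(x)))

    # Assumed continuation: assign each point to its nearest centre.
    new_cluster <- max.col(-d, ties.method = "first")
    if (all(new_cluster == cluster)) break   # assignments unchanged: stop
    cluster <- new_cluster

    # Assumed continuation: move each centre to the mean of its points.
    for (j in seq_len(k)) {
      members <- x[cluster == j, , drop = FALSE]
      if (nrow(members) > 0) centres[j, ] <- colMeans(members)
    }
  }
  list(cluster = cluster, centres = centres)
}

set.seed(1)  # arbitrary seed so the random initial centres are repeatable
res <- simple_kmeans(iris[, -5], k = 3)
print(table(res$cluster))
print(res$centres)
```

A single random start can settle on a poor solution, which is why `kmeans` offers `nstart` to try several starts and keep the best one.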