Way2IT4U – Machine Learning with R Online Training
Way2IT4U – Data Science Online Training
Why Way2IT4U Machine Learning with R Online Training?
Way2IT4U – Machine Learning with R Online Training and Data Science Online Training: in this course, you'll begin by exploring core machine learning concepts before moving on to supervised and unsupervised learning.
Machine learning is closely related to data mining and predictive modeling: both involve searching through data for patterns and adjusting the program's actions accordingly. These techniques are widely used in areas such as online shopping recommendations, fraud detection, spam filtering, and predictive maintenance.
Throughout the Machine Learning online training, you will gain a clear understanding of how each and every algorithm works.
DATA SCIENCE:
- INTRODUCTION TO DATA SCIENCE
- NEED FOR DATA SCIENCE
- Who is a data scientist
- Who can become a data scientist
- Technologies used in data science
- Workflow of data science
- INTRODUCTION TO MACHINE LEARNING
- What is machine learning
- Types of learning
- Supervised learning
- Unsupervised learning
- Machine learning algorithms
- KNN
- Naïve Bayes
- Decision trees
- Classification rules
- Regression
- Linear regression
- Logistic regression
- K-means clustering
- Association rules
- Neural networks
- SVMs
- MACHINE LEARNING LANGUAGE
- R
- Python
- Mahout
- Spark MLlib
- Amazon machine learning
- Lisp
- Java etc.
- INTRODUCTION TO R
- History of S/S-PLUS
- Development with R
- Installing R and RStudio
- CRAN
- Setting up R environment
- Installing & loading of R packages
- Basic data types in R
- Structures in R
- Vector
- List
- Matrix
- Data frame
- Array
- R as calculator
- Performing calculations on different structures
- SUBSETTING THE DATA
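As a quick taste of the structures listed above, here is a base-R sketch (no packages required) showing each structure and R used as a calculator:

```r
# Basic R data structures
v  <- c(2, 4, 6)                                 # vector
l  <- list(name = "a", n = 1)                    # list (mixed types)
m  <- matrix(1:6, nrow = 2)                      # 2x3 matrix
df <- data.frame(x = 1:3, y = c("a", "b", "c"))  # data frame

# R as a calculator: arithmetic is vectorized
v * 10        # 20 40 60
mean(v)       # 4
m %*% t(m)    # matrix multiplication on a structure
```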
- Extracting required data from R objects
- Subset
- Using the dplyr and tidyr packages
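A minimal sketch of the subsetting tools above, using the built-in `mtcars` dataset and assuming the dplyr package is installed:

```r
library(dplyr)  # assumes dplyr is installed

# Base R subsetting: bracket indexing and subset()
mtcars[mtcars$mpg > 25, c("mpg", "cyl")]
subset(mtcars, mpg > 25, select = c(mpg, cyl))

# The same extraction with dplyr verbs
mtcars %>%
  filter(mpg > 25) %>%
  select(mpg, cyl)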
- LOADING DATA INTO R OBJECTS
- Data extraction from URL
- Data extraction from CSV files
- Data extraction from clipboard
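The three loading paths above can be sketched in base R as follows (the file name and URL are placeholders, not real resources):

```r
# From a CSV file (hypothetical file name)
df <- read.csv("data.csv")

# From a URL: read.csv() accepts URLs directly
# df <- read.csv("https://example.com/data.csv")

# From the clipboard (Windows; on macOS use pipe("pbpaste"))
# df <- read.table("clipboard", header = TRUE)
```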
- VIEWING / EXPLORING THE DATA USING R
- Statistical observation
- Mean
- Mode
- Median
- Quantile
- Box plots/histograms/plots for observing the data
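The statistical observations above in base R; note that R has no built-in function for the statistical mode, so a small helper (our own, not part of base R) is defined:

```r
x <- c(1, 2, 2, 3, 5, 8, 8, 8)

mean(x); median(x); quantile(x)   # central tendency and quartiles

# Statistical mode: a common one-liner helper
stat_mode <- function(v) { u <- unique(v); u[which.max(tabulate(match(v, u)))] }
stat_mode(x)   # 8

summary(x)     # min, quartiles, mean, max at a glance
boxplot(x); hist(x); plot(x)   # graphical observation of the data
```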
- GRAPHICS WITH R
- Managing graphics
- High level plotting commands
- Plot() function
- Plotting multiple curves on the same graph
- Pie charts
- Histograms
- Box plot
- Scatter plots
- QQ (quantile-quantile) plots
- Using copy/paste to copy plots
- 3-dimensional plots
- Using the legend() function
- Low-level plotting commands
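A short base-graphics sketch tying together the high-level `plot()` call, multiple curves on one graph, and `legend()`:

```r
# High-level plotting command plus low-level additions
x <- seq(0, 2 * pi, length.out = 100)
plot(x, sin(x), type = "l", col = "blue", main = "Two curves")
lines(x, cos(x), col = "red")              # second curve on the same graph
legend("topright", legend = c("sin", "cos"),
       col = c("blue", "red"), lty = 1)

# Other high-level commands covered above: pie(), hist(), boxplot(), qqnorm()
```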
- ANOVA
- ANOVA (analysis of variance)
- Data files
- Inputting data
- One-way ANOVA
- Two-way ANOVA
- Graphical summary of ANOVA
- Extracting means and statistics
- Table commands
- More complex ANOVA models
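As an illustration of the ANOVA topics above, a one-way analysis on R's built-in `PlantGrowth` dataset:

```r
# One-way ANOVA: does plant weight differ across treatment groups?
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)                  # F statistic and p-value
model.tables(fit, "means")    # extracting group means

# A two-way ANOVA simply adds a second factor: aov(y ~ a * b, data = d)
```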
- LAZY LEARNING – CLASSIFICATION USING NEAREST NEIGHBORS
- Understanding classification using nearest neighbors
- The KNN algorithm
- Calculating distance
- Choosing an appropriate k
- Preparing data for use with KNN
- Why is the KNN algorithm lazy?
- Diagnosing cancer with the KNN algorithm
- Collecting data
- Exploring and preparing the data
- Transformation – normalizing numeric data
- Data preparation – creating training and test datasets
- Training a model on the data
- Evaluating model performance
- Improving model performance
- Transformation – z-score standardization
- Testing alternative values of k
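A compact sketch of the workflow above (standardize, split, classify, evaluate), using the built-in `iris` data instead of the cancer dataset and the `class` package, which ships with R:

```r
library(class)  # kNN implementation that ships with base R

norm <- scale(iris[, 1:4])          # z-score standardization of numeric features
set.seed(1)
idx  <- sample(nrow(iris), 100)     # 100 training rows, rest held out

pred <- knn(train = norm[idx, ], test = norm[-idx, ],
            cl = iris$Species[idx], k = 5)

mean(pred == iris$Species[-idx])    # accuracy; try alternative values of k
```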
- PROBABILISTIC LEARNING – CLASSIFICATION USING NAÏVE BAYES
- Understanding naïve Bayes
- Basic concepts of Bayesian methods
- Probability
- Joint probability
- Conditional probability with Bayes’ theorem
- The naïve Bayes algorithm
- The naïve Bayes classification
- The Laplace estimator
- Using numeric features with naïve Bayes
- Filtering mobile phone spam with the naïve Bayes algorithm
- Collecting data
- Exploring and preparing the data
- Data preparation – processing text data for analysis
- Data preparation –creating training and test datasets
- Visualizing text data – word clouds
- Data preparation – creating indicator features for frequent words
- Training a model on the data
- Evaluating model performance
- Improving model performance
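A minimal sketch of training and evaluating a naive Bayes classifier, assuming the e1071 package is installed (the spam example in the course uses text features prepared as above; here `iris` stands in for brevity):

```r
library(e1071)  # assumes e1071 is installed

# Laplace estimator is enabled via the 'laplace' argument
model <- naiveBayes(Species ~ ., data = iris, laplace = 1)
pred  <- predict(model, iris)

table(pred, iris$Species)   # confusion matrix for evaluating performance
```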
- DIVIDE AND CONQUER – CLASSIFICATION USING DECISION TREES AND RULES
- Understanding decision trees
- Divide and conquer
- The C5.0 decision tree algorithm
- Choosing the best split
- Pruning the decision tree
- Identifying risky bank loans using C5.0 decision trees
- Collecting data
- Exploring and preparing the data
- Data preparation – creating random training and test datasets
- Training a model on the data
- Evaluating model performance
- Improving model performance
- Boosting the accuracy of decision trees
- Making some mistakes more costly than others
- Understanding classification rules
- Separate and conquer
- The 1R (One Rule) algorithm
- The RIPPER algorithm
- Rules from decision trees
- Identifying poisonous mushrooms with rule learners
- Collecting data
- Exploring and preparing data
- Training a model on the data
- Evaluating model performance
- Improving model performance
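A sketch of the C5.0 algorithm named above, assuming the C50 package is installed (the bank-loan data is swapped for `iris` here):

```r
library(C50)  # assumes the C50 package is installed

model <- C5.0(Species ~ ., data = iris)
summary(model)    # tree structure and training error

# trials = 10 turns on boosting, one of the improvement techniques above
boosted <- C5.0(Species ~ ., data = iris, trials = 10)
```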
- FORECASTING NUMERIC DATA – REGRESSION METHODS
- Understanding regression
- Simple linear regression
- Ordinary least squares estimation
- Correlations
- Multiple linear regression
- Predicting medical expenses using linear regression
- Collecting data
- Exploring and preparing data
- Exploring relationships among features – the correlation matrix
- Visualizing relationships among features – the scatterplot matrix
- Training a model on the data
- Evaluating model performance
- Improving model performance
- Model specification – adding non-linear relationships
- Transformation – converting a numeric variable to a binary indicator
- Model specification – adding interaction effects
- Putting it all together – an improved regression model
- Understanding regression trees and model trees
- Adding regression to trees
- Estimating the quality of wines with regression trees and model trees
- Collecting data
- Exploring and preparing the data
- Training a model on the data
- Visualizing decision trees
- Evaluating model performance
- Measuring performance with mean absolute error
- Improving model performance
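The regression steps above can be sketched entirely in base R with `lm()` (using `mtcars` in place of the medical-expense data):

```r
# Simple linear regression: mpg predicted from weight
fit  <- lm(mpg ~ wt, data = mtcars)

# Multiple regression with an interaction effect (wt:hp)
fit2 <- lm(mpg ~ wt + hp + wt:hp, data = mtcars)

summary(fit2)$r.squared      # proportion of variance explained
mean(abs(residuals(fit)))    # mean absolute error of the simple model
```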
- FINDING PATTERNS – MARKET BASKET ANALYSIS USING ASSOCIATION RULES
- Understanding association rules
- The Apriori algorithm for association rule learning
- Measuring rule interest – support and confidence
- Building a set of rules with the Apriori algorithm
- Identifying frequently purchased groceries with association rules
- Collecting data
- Exploring and preparing the data
- Data preparation – creating a sparse matrix for transaction data
- Visualizing item support – item frequency plots
- Visualizing transaction data – plotting the sparse matrix
- Training a model on the data
- Evaluating model performance
- Improving model performance
- Sorting the set of association rules
- Taking subsets of association rules
- Saving association rules to a file or data frame
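A sketch of the market basket workflow above, assuming the arules package is installed; `Groceries` is a sparse transaction matrix that ships with arules:

```r
library(arules)  # assumes the arules package is installed

data(Groceries)                         # transaction data as a sparse matrix

rules <- apriori(Groceries,
                 parameter = list(support = 0.006, confidence = 0.25))

inspect(sort(rules, by = "lift")[1:5])  # sorting and subsetting the rule set
itemFrequencyPlot(Groceries, topN = 10) # visualizing item support
```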
- FINDING GROUPS OF DATA – CLUSTERING WITH K-MEANS
- Understanding clustering
- Clustering as a machine learning task
- The K-means algorithm for clustering
- Using distance to assign and update clusters
- Choosing the appropriate number of clusters
- Finding teen market segments using K-means clustering
- Collecting data
- Exploring and preparing the data
- Data preparation – dummy coding missing values
- Data preparation – imputing missing values
- Training a model on the data
- Evaluating model performance
- Improving model performance
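K-means needs no extra packages; a base-R sketch using `iris` in place of the teen-market data:

```r
# K-means on the standardized numeric columns
set.seed(1)
km <- kmeans(scale(iris[, 1:4]), centers = 3)

table(km$cluster, iris$Species)  # compare clusters against true species
km$centers                       # cluster centroids
```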
- EVALUATING MODEL PERFORMANCE
- Measuring performance for classification
- Working with classification prediction data in R
- A closer look at confusion matrices
- Using confusion matrices to measure performance
- Beyond accuracy – other measures of performance
- The kappa statistic
- Sensitivity and specificity
- Precision and recall
- The F-measure
- Visualizing performance tradeoffs
- ROC curves
- Estimating future performance
- The holdout method
- Cross-validation
- Bootstrap sampling
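A sketch of holdout evaluation with a full confusion matrix, assuming the caret package is installed; `confusionMatrix()` reports accuracy, kappa, sensitivity, and specificity together:

```r
library(caret)  # assumes caret is installed

set.seed(1)
idx  <- createDataPartition(iris$Species, p = 0.7, list = FALSE)  # holdout split
fit  <- train(Species ~ ., data = iris[idx, ], method = "knn")
pred <- predict(fit, iris[-idx, ])

confusionMatrix(pred, iris[-idx, ]$Species)  # accuracy, kappa, sensitivity...
```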
- IMPROVING MODEL PERFORMANCE
- Tuning stock models for better performance
- Using caret for automated parameter tuning
- Creating a simple tuned model
- Customizing the tuning process
- Improving model performance with meta-learning
- Understanding ensembles
- Bagging
- Boosting
- Random forests
- Training random forests
- Evaluating random forest performance
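Finally, a sketch of training and evaluating a random forest, assuming the randomForest package is installed:

```r
library(randomForest)  # assumes randomForest is installed

set.seed(1)
rf <- randomForest(Species ~ ., data = iris, ntree = 500)

rf$confusion     # out-of-bag confusion matrix for evaluation
importance(rf)   # variable importance across the ensemble
```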