Way2IT4U – Machine Learning with R Online Training
Way2IT4U – Data Science Online Training
Why Way2IT4U Machine Learning with R Online Training?
Way2IT4U – Machine Learning with R Online Training and Data Science Online Training: in this course, you'll begin by exploring core machine learning concepts before moving on to supervised and unsupervised learning.
Machine learning is closely related to data mining and predictive modeling: both involve searching through data for patterns and adjusting the program's actions accordingly. These techniques are widely used in areas such as online shopping recommendations, fraud detection, spam filtering, and predictive maintenance.
Throughout the Machine Learning online training, you will gain a clear understanding of how each and every algorithm works.
DATA SCIENCE:
- INTRODUCTION TO DATA SCIENCE
- NEED FOR DATA SCIENCE
- Who is a data scientist
- Who can become a data scientist
- Technologies used in data science
- Workflow of data science
- INTRODUCTION TO MACHINE LEARNING
- What is machine learning
- Types of learning
- Supervised learning
- Unsupervised learning
- Machine learning algorithms
- KNN
- Naïve Bayes
- Decision trees
- Classification rules
- Regression
- Linear regression
- Logistic regression
- K-means clustering
- Association rules
- Neural networks
- SVMs
- MACHINE LEARNING LANGUAGE
- R
- Python
- Mahout
- Spark MLlib
- Amazon machine learning
- Lisp
- Java etc.
- INTRODUCTION TO R
- History of S/S-PLUS
- Development with R
- Installing R and RStudio
- CRAN
- Setting up R environment
- Installing & loading of R packages
- Basic data types in R
- Structures in R
- Vector
- List
- Matrix
- Data frame
- Array
- R as calculator
- Performing calculations on different structures
- SUBSETTING THE DATA
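As a quick taste of the structures listed above, here is a base-R sketch (no packages required) showing each structure and R used as a calculator:

```r
# Basic R data structures
v  <- c(2, 4, 6)                                 # vector
l  <- list(name = "a", n = 1)                    # list (mixed types)
m  <- matrix(1:6, nrow = 2)                      # 2x3 matrix
df <- data.frame(x = 1:3, y = c("a", "b", "c"))  # data frame

# R as a calculator: arithmetic is vectorized
v * 10        # 20 40 60
mean(v)       # 4
m %*% t(m)    # matrix multiplication on a structure
```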
- Extracting required data from R objects
- Subset
- Using the dplyr and tidyr packages
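A minimal sketch of the subsetting tools above, using the built-in `mtcars` dataset and assuming the dplyr package is installed:

```r
library(dplyr)  # assumes dplyr is installed

# Base R subsetting: bracket indexing and subset()
mtcars[mtcars$mpg > 25, c("mpg", "cyl")]
subset(mtcars, mpg > 25, select = c(mpg, cyl))

# The same extraction with dplyr verbs
mtcars %>%
  filter(mpg > 25) %>%
  select(mpg, cyl)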
- LOADING DATA INTO R OBJECTS
- Data extraction from URL
- Data extraction from CSV files
- Data extraction from clipboard
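The three loading paths above can be sketched in base R as follows (the file name and URL are placeholders, not real resources):

```r
# From a CSV file (hypothetical file name)
df <- read.csv("data.csv")

# From a URL: read.csv() accepts URLs directly
# df <- read.csv("https://example.com/data.csv")

# From the clipboard (Windows; on macOS use pipe("pbpaste"))
# df <- read.table("clipboard", header = TRUE)
```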
- VIEWING / EXPLORING THE DATA USING R
- Statistical observation
- Mean
- Mode
- Median
- Quantile
- Box plots/histograms/plots for observing the data
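The statistical observations above in base R; note that R has no built-in function for the statistical mode, so a small helper (our own, not part of base R) is defined:

```r
x <- c(1, 2, 2, 3, 5, 8, 8, 8)

mean(x); median(x); quantile(x)   # central tendency and quartiles

# Statistical mode: a common one-liner helper
stat_mode <- function(v) { u <- unique(v); u[which.max(tabulate(match(v, u)))] }
stat_mode(x)   # 8

summary(x)     # min, quartiles, mean, max at a glance
boxplot(x); hist(x); plot(x)   # graphical observation of the data
```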
- GRAPHICS WITH R
- Managing graphics
- High level plotting commands
- Plot() function
- Plotting multiple curves on the same graph
- Pie charts
- Histograms
- Box plot
- Scatter plots
- QQ (quantile-quantile) plots
- Using copy/paste to copy plots
- 3-dimensional plots
- Using the legend() function
- Low-level plotting commands
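A short base-graphics sketch tying together the high-level `plot()` call, multiple curves on one graph, and `legend()`:

```r
# High-level plotting command plus low-level additions
x <- seq(0, 2 * pi, length.out = 100)
plot(x, sin(x), type = "l", col = "blue", main = "Two curves")
lines(x, cos(x), col = "red")              # second curve on the same graph
legend("topright", legend = c("sin", "cos"),
       col = c("blue", "red"), lty = 1)

# Other high-level commands covered above: pie(), hist(), boxplot(), qqnorm()
```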
- ANOVA
- ANOVA (analysis of variance)
- Data files
- Inputting data
- One-way ANOVA
- Two-way ANOVA
- Graphical summary of ANOVA
- Extracting means and statistics
- Table commands
- More complex ANOVA models
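As an illustration of the ANOVA topics above, a one-way analysis on R's built-in `PlantGrowth` dataset:

```r
# One-way ANOVA: does plant weight differ across treatment groups?
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)                  # F statistic and p-value
model.tables(fit, "means")    # extracting group means

# A two-way ANOVA simply adds a second factor: aov(y ~ a * b, data = d)
```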
- LAZY LEARNING – CLASSIFICATION USING NEAREST NEIGHBORS
- Understanding classification using nearest neighbors
- The KNN algorithm
- Calculating distance
- Choosing an appropriate k
- Preparing data for use with KNN
- Why is the KNN algorithm lazy?
- Diagnosing cancer with the KNN algorithm
- Collecting data
- Exploring and preparing the data
- Transformation – normalizing numeric data
- Data preparation – creating training and test datasets
- Training a model on the data
- Evaluating model performance
- Improving model performance
- Transformation – z-score standardization
- Testing alternative values of k
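A compact sketch of the workflow above (standardize, split, classify, evaluate), using the built-in `iris` data instead of the cancer dataset and the `class` package, which ships with R:

```r
library(class)  # kNN implementation that ships with base R

norm <- scale(iris[, 1:4])          # z-score standardization of numeric features
set.seed(1)
idx  <- sample(nrow(iris), 100)     # 100 training rows, rest held out

pred <- knn(train = norm[idx, ], test = norm[-idx, ],
            cl = iris$Species[idx], k = 5)

mean(pred == iris$Species[-idx])    # accuracy; try alternative values of k
```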
- PROBABILISTIC LEARNING – CLASSIFICATION USING NAÏVE BAYES
- Understanding naïve Bayes
- Basic concepts of Bayesian methods
- Probability
- Joint probability
- Conditional probability with Bayes’ theorem
- The naïve Bayes algorithm
- The naïve Bayes classification
- The Laplace estimator
- Using numeric features with naïve Bayes
- Filtering mobile phone spam with the naïve Bayes algorithm
- Collecting data
- Exploring and preparing the data
- Data preparation – processing text data for analysis
- Data preparation –creating training and test datasets
- Visualizing text data – word clouds
- Data preparation – creating indicator features for frequent words
- Training a model on the data
- Evaluating model performance
- Improving model performance
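A minimal sketch of training and evaluating a naive Bayes classifier, assuming the e1071 package is installed (the spam example in the course uses text features prepared as above; here `iris` stands in for brevity):

```r
library(e1071)  # assumes e1071 is installed

# Laplace estimator is enabled via the 'laplace' argument
model <- naiveBayes(Species ~ ., data = iris, laplace = 1)
pred  <- predict(model, iris)

table(pred, iris$Species)   # confusion matrix for evaluating performance
```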
- DIVIDE AND CONQUER – CLASSIFICATION USING DECISION TREES AND RULES
- Understanding decision trees
- Divide and conquer
- The C5.0 decision tree algorithm
- Choosing the best split
- Pruning the decision tree
- Identifying risky bank loans using C5.0 decision trees
- Collecting data
- Exploring and preparing the data
- Data preparation – creating random training and test datasets
- Training a model on the data
- Evaluating model performance
- Improving model performance
- Boosting the accuracy of decision trees
- Making some mistakes more costly than others
- Understanding classification rules
- Separate and conquer
- The 1R (One Rule) algorithm
- The RIPPER algorithm
- Rules from decision trees
- Identifying poisonous mushrooms with rule learners
- Collecting data
- Exploring and preparing data
- Training a model on the data
- Evaluating model performance
- Improving model performance
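A sketch of the C5.0 algorithm named above, assuming the C50 package is installed (the bank-loan data is swapped for `iris` here):

```r
library(C50)  # assumes the C50 package is installed

model <- C5.0(Species ~ ., data = iris)
summary(model)    # tree structure and training error

# trials = 10 turns on boosting, one of the improvement techniques above
boosted <- C5.0(Species ~ ., data = iris, trials = 10)
```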
- FORECASTING NUMERIC DATA – REGRESSION METHODS
- Understanding regression
- Simple linear regression
- Ordinary least squares estimation
- Correlations
- Multiple linear regression
- Predicting medical expenses using linear regression
- Collecting data
- Exploring and preparing data
- Exploring relationships among features – the correlation matrix
- Visualizing relationships among features – the scatterplot matrix
- Training a model on the data
- Evaluating model performance
- Improving model performance
- Model specification – adding non-linear relationships
- Transformation – converting a numeric variable to a binary indicator
- Model specification – adding interaction effects
- Putting it all together – an improved regression model
- Understanding regression trees and model trees
- Adding regression to trees
- Estimating the quality of wines with regression trees and model trees
- Collecting data
- Exploring and preparing the data
- Training a model on the data
- Visualizing decision trees
- Evaluating model performance
- Measuring performance with mean absolute error
- Improving model performance
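The regression steps above can be sketched entirely in base R with `lm()` (using `mtcars` in place of the medical-expense data):

```r
# Simple linear regression: mpg predicted from weight
fit  <- lm(mpg ~ wt, data = mtcars)

# Multiple regression with an interaction effect (wt:hp)
fit2 <- lm(mpg ~ wt + hp + wt:hp, data = mtcars)

summary(fit2)$r.squared      # proportion of variance explained
mean(abs(residuals(fit)))    # mean absolute error of the simple model
```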
- FINDING PATTERNS – MARKET BASKET ANALYSIS USING ASSOCIATION RULES
- Understanding association rules
- The Apriori algorithm for association rule learning
- Measuring rule interest – support and confidence
- Building a set of rules with the Apriori algorithm
- Identifying frequently purchased groceries with association rules
- Collecting data
- Exploring and preparing the data
- Data preparation – creating a sparse matrix for transaction data
- Visualizing item support – item frequency plots
- Visualizing transaction data – plotting the sparse matrix
- Training a model on the data
- Evaluating model performance
- Improving model performance
- Sorting the set of association rules
- Taking subsets of association rules
- Saving association rules to a file or data frame
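A sketch of the market basket workflow above, assuming the arules package is installed; `Groceries` is a sparse transaction matrix that ships with arules:

```r
library(arules)  # assumes the arules package is installed

data(Groceries)                         # transaction data as a sparse matrix

rules <- apriori(Groceries,
                 parameter = list(support = 0.006, confidence = 0.25))

inspect(sort(rules, by = "lift")[1:5])  # sorting and subsetting the rule set
itemFrequencyPlot(Groceries, topN = 10) # visualizing item support
```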
- FINDING GROUPS OF DATA – CLUSTERING WITH K-MEANS
- Understanding clustering
- Clustering as a machine learning task
- The K-means algorithm for clustering
- Using distance to assign and update clusters
- Choosing the appropriate number of clusters
- Finding teen market segments using K-means clustering
- Collecting data
- Exploring and preparing the data
- Data preparation – dummy coding missing values
- Data preparation – imputing missing values
- Training a model on the data
- Evaluating model performance
- Improving model performance
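K-means needs no extra packages; a base-R sketch using `iris` in place of the teen-market data:

```r
# K-means on the standardized numeric columns
set.seed(1)
km <- kmeans(scale(iris[, 1:4]), centers = 3)

table(km$cluster, iris$Species)  # compare clusters against true species
km$centers                       # cluster centroids
```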
- EVALUATING MODEL PERFORMANCE
- Measuring performance for classification
- Working with classification prediction data in R
- A closer look at confusion matrices
- Using confusion matrices to measure performance
- Beyond accuracy – other measures of performance
- The kappa statistic
- Sensitivity and specificity
- Precision and recall
- The F-measure
- Visualizing performance tradeoffs
- ROC curves
- Estimating future performance
- The holdout method
- Cross-validation
- Bootstrap sampling
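A sketch of holdout evaluation with a full confusion matrix, assuming the caret package is installed; `confusionMatrix()` reports accuracy, kappa, sensitivity, and specificity together:

```r
library(caret)  # assumes caret is installed

set.seed(1)
idx  <- createDataPartition(iris$Species, p = 0.7, list = FALSE)  # holdout split
fit  <- train(Species ~ ., data = iris[idx, ], method = "knn")
pred <- predict(fit, iris[-idx, ])

confusionMatrix(pred, iris[-idx, ]$Species)  # accuracy, kappa, sensitivity...
```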
- IMPROVING MODEL PERFORMANCE
- Tuning stock models for better performance
- Using caret for automated parameter tuning
- Creating a simple tuned model
- Customizing the tuning process
- Improving model performance with meta-learning
- Understanding ensembles
- Bagging
- Boosting
- Random forests
- Training random forests
- Evaluating random forest performance
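Finally, a sketch of training and evaluating a random forest, assuming the randomForest package is installed:

```r
library(randomForest)  # assumes randomForest is installed

set.seed(1)
rf <- randomForest(Species ~ ., data = iris, ntree = 500)

rf$confusion     # out-of-bag confusion matrix for evaluation
importance(rf)   # variable importance across the ensemble
```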