Machine Learning Project Development Step 2: Choosing a Supervised Learning Algorithm

 


Step 2 of machine learning project development: choose an algorithm, assuming the project falls under supervised learning.

There are many supervised learning algorithms. Supervised learning is divided into regression and classification. If the output is discrete (for example, binary), choose a classification algorithm; if it is continuous, choose a regression algorithm. So step 2 is:

Identify whether the problem is a regression problem or a classification problem, and choose the algorithm accordingly.
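As a rough illustration of this step, the sketch below guesses the problem type from the target values alone. The function name guess_problem_type and the max_classes cut-off are assumptions made for this example, not a standard rule; always confirm the guess against what the output actually means.

```python
def guess_problem_type(targets, max_classes=10):
    """Guess 'classification' or 'regression' from a list of target values.

    Heuristic only: max_classes is a hypothetical cut-off chosen for this sketch.
    """
    unique_values = set(targets)
    # Text labels are discrete, so treat them as classes.
    if any(isinstance(v, str) for v in unique_values):
        return "classification"
    # Whole numbers (or booleans) with only a few distinct values look like class labels.
    all_whole = all(float(v).is_integer() for v in unique_values)
    if all_whole and len(unique_values) <= max_classes:
        return "classification"
    # Everything else is treated as a continuous output.
    return "regression"


print(guess_problem_type([0, 1, 1, 0, 1]))          # binary labels
print(guess_problem_type(["spam", "ham", "spam"]))  # text labels
print(guess_problem_type([152.3, 98.7, 210.05]))    # continuous values
```

A target such as "number of rooms" is made of whole numbers but may still be better treated as regression, which is why the result is only a starting point.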

Different algorithms are used for regression and classification problems. A few of them are listed below.

Algorithms for regression analysis:

1.       Linear Regression

2.       Decision tree regression

3.       Random Forest

4.       Gradient boosted trees

5.       Neural Network

Algorithms for classification:

1.       Logistic Regression

2.       Naïve Bayes

3.       Stochastic Gradient Descent

4.       K-Nearest Neighbors

5.       Decision Tree

6.       Random Forest

7.       Support Vector Machine
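One way to compare the classifiers listed above is to cross-validate each of them on the same data. The sketch below assumes scikit-learn is installed and uses a synthetic data set from make_classification as a stand-in for real project data; the settings are library defaults or arbitrary choices for the example, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (a placeholder for a real data set).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Stochastic Gradient Descent": SGDClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Support Vector Machine": SVC(),
}

# Mean accuracy over 5 cross-validation folds for each model.
scores = {}
for name, model in models.items():
    scores[name] = cross_val_score(model, X, y, cv=5).mean()

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The ranking depends on the data, so the same loop should be rerun on the project's own data set. The regression algorithms can be compared in the same way with their regressor counterparts.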

Criteria to choose the algorithm:

1.       Amount of data: i) number of samples, ii) number of features per sample

2.       Non-linearity present in the data
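Both criteria can be measured before a model is chosen. The sketch below uses only the Python standard library: it reports the number of samples and features, and uses the squared Pearson correlation between one feature and the output as a rough linearity check. The helper names and the toy data are assumptions for illustration. A low value hints at non-linearity (or simply at noise) and is not proof of it.

```python
def dataset_size(rows):
    """Return (number of samples, number of features per sample)."""
    n_samples = len(rows)
    n_features = len(rows[0]) if rows else 0
    return n_samples, n_features


def linear_r_squared(x, y):
    """Squared Pearson correlation between one feature x and the output y.

    Close to 1: a straight line explains y well. Close to 0: it does not.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return (cov * cov) / (var_x * var_y)


# Toy data (made up for this sketch): one feature on a symmetric range.
x = [i / 10 for i in range(-20, 21)]
y_linear = [3 * v + 1 for v in x]  # straight-line relationship
y_curved = [v * v for v in x]      # quadratic relationship

print(dataset_size([[v] for v in x]))
print(round(linear_r_squared(x, y_linear), 3))
print(round(linear_r_squared(x, y_curved), 3))
```

The quadratic output is fully determined by the feature, yet its linear score is near zero. That is the kind of relationship a purely linear model would miss.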

 

Case 1: If the amount of data is small (few samples and few features) and there is little non-linearity, a traditional algorithm such as linear regression, logistic regression, or a decision tree will work better.

In the majority of cases, an artificial neural network (ANN) of a suitable size will give good results.

As the data size keeps increasing, the performance of a fixed-size model falls behind, and a bigger network is needed to improve the results.

Case 2: A data set is an essential requirement for any machine learning algorithm. Companies, government offices, and the medical, education, and business sectors produce a huge amount of data every day, and machine learning models are built with the help of this data. All of it is generated in raw form. Raw data is of no direct use; data analysis tools must be applied to it first. Data analysis is the process of collecting, cleaning, and transforming data into useful information.

Various data analysis tools are available, such as the languages R and Python and visualization tools such as Tableau. By applying statistics and analytics to the data, we can find the non-linearities in it, which is useful when choosing the algorithm.

As the data size keeps increasing, the performance of a fixed-size network falls behind. Computing speed is no longer a major constraint nowadays, so an ANN is usually the preferred choice. With an ANN, the network size can be increased as the data size increases, and the performance of the model keeps improving as well. For very large data sets, deep learning is the most preferred choice.


(Ref.: Diagram taken from a Coursera course on ANN)
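The rule of thumb from Case 1 and Case 2 can be summed up in a small helper. This is a minimal sketch: the function name, the use of samples times features as a measure of data volume, and both thresholds are hypothetical, since the text gives no numbers.

```python
def suggest_algorithm_family(n_samples, n_features, is_nonlinear,
                             small_limit=10_000, large_limit=1_000_000):
    """Map the rule of thumb from the text to an algorithm family.

    small_limit and large_limit are hypothetical thresholds for this sketch;
    tune them for your own project.
    """
    # Samples times features is used here as a crude measure of data volume.
    volume = n_samples * n_features
    # Case 1: little data and little non-linearity.
    if volume <= small_limit and not is_nonlinear:
        return "traditional algorithm"
    # Most remaining cases: a neural network sized to the data.
    if volume <= large_limit:
        return "artificial neural network"
    # Very large data sets.
    return "deep learning"


print(suggest_algorithm_family(500, 10, is_nonlinear=False))
print(suggest_algorithm_family(500, 10, is_nonlinear=True))
print(suggest_algorithm_family(100_000, 50, is_nonlinear=True))
```

The helper only narrows down the family; the specific algorithm within it still has to be picked by comparing candidates on the actual data.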

 

 

 
