What are the big data algorithms?

Big data is a very broad concept, and there is no single "big data algorithm" as such. What you are probably asking about is the algorithms used in big data mining. The most common classification algorithms are:

1. Naive Bayes

Naive Bayes (NB) is very simple: training amounts to little more than counting. If the conditional independence assumption holds, NB converges faster than discriminative models such as logistic regression, so it needs only a small amount of training data. Even when the conditional independence assumption does not hold, NB often performs surprisingly well in practice.
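To make the "counting" remark concrete, here is a minimal sketch of a word-count Naive Bayes classifier using only the Python standard library. The toy documents, the labels "spam" and "ham", and the function names train_nb and predict_nb are made up for illustration, and the add-one (Laplace) smoothing is one common choice rather than the only one.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Train on (words, label) pairs. Training is nothing but counting."""
    class_counts = Counter()            # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for words, label in docs:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def predict_nb(model, words):
    """Return the class with the highest log posterior."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for w in words:
            # Conditional independence: per-word log likelihoods simply add up.
            # Add-one smoothing keeps unseen words from zeroing the product.
            count = word_counts[label][w]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data, for illustration only.
toy_docs = [
    (["cheap", "pills", "buy"], "spam"),
    (["buy", "cheap", "now"], "spam"),
    (["meeting", "tomorrow", "agenda"], "ham"),
    (["project", "meeting", "notes"], "ham"),
]
model = train_nb(toy_docs)
print(predict_nb(model, ["cheap", "buy"]))
print(predict_nb(model, ["meeting", "notes"]))
```

Because the model is just a set of counts, adding more training data only means incrementing those counts.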

2. Logistic regression

Logistic regression (LR) offers many ways to regularize the model. Unlike NB, which relies on the conditional independence assumption, LR does not require you to worry about whether the features are correlated. Unlike decision trees and support vector machines, LR has a natural probabilistic interpretation, and the model is easy to update with new training data (using online gradient descent).
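As a rough sketch of what updating the model with online gradient descent can look like, the code below applies one L2-regularized gradient step per incoming example. The learning rate, the regularization strength, the toy data stream, and the names sgd_update and predict_proba are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Probability that x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def sgd_update(weights, bias, x, y, lr=0.5, l2=0.001):
    """One online gradient step on the log loss for a single example.

    The l2 term shrinks the weights a little on every step (L2 regularization).
    """
    error = predict_proba(weights, bias, x) - y
    new_weights = [w - lr * (error * xi + l2 * w) for w, xi in zip(weights, x)]
    new_bias = bias - lr * error
    return new_weights, new_bias

# Hypothetical stream of (features, label) pairs, for illustration only.
stream = [([0.0, 0.2], 0), ([0.1, 0.0], 0), ([0.9, 1.0], 1), ([1.0, 0.8], 1)]

weights, bias = [0.0, 0.0], 0.0
for _ in range(200):            # replay the stream a number of times
    for x, y in stream:
        weights, bias = sgd_update(weights, bias, x, y)

p_high = predict_proba(weights, bias, [0.95, 0.9])
p_low = predict_proba(weights, bias, [0.05, 0.1])
print(p_high, p_low)
```

The output of predict_proba is a probability rather than just a class label, and a new example can be absorbed by calling sgd_update once more without retraining from scratch.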

3. Decision tree

Decision trees (DT) are easy to understand and explain. They are nonparametric, so you do not have to worry about outliers or about whether the data is linearly separable. In addition, random forests (RF), which are ensembles of decision trees, are often the best performer on many classification problems. They are fast and scalable, and they do not need as much parameter tuning as an SVM, which is why RF has become a very popular algorithm recently.
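One way to see why outliers matter little to a tree is to look at a single split. The sketch below picks the threshold on one numeric feature that minimizes weighted Gini impurity. This is only a one-level "stump", not a full tree or a random forest, and the feature values and the function names gini and best_split are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of class 1
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single split."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_impurity = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for v, y in pairs if v <= threshold]
        right = [y for v, y in pairs if v > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity

# Hypothetical data: customer age and whether they bought (1) or not (0).
ages = [22, 25, 30, 41, 45, 52]
bought = [0, 0, 0, 1, 1, 1]
print(best_split(ages, bought))               # (35.5, 0.0)

# Replace the last age with an extreme outlier: the chosen split is unchanged,
# because only the ordering of the values matters, not their magnitude.
ages_with_outlier = [22, 25, 30, 41, 45, 520]
print(best_split(ages_with_outlier, bought))  # (35.5, 0.0)
```

A full tree repeats this search recursively on each side of the split, and a random forest averages many such trees built on random subsets of the rows and features.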

4. Support vector machine

SVMs offer high classification accuracy and good theoretical guarantees against overfitting. With an appropriate kernel function, they can also perform well when the data is not linearly separable in the original feature space. SVMs are especially popular for high-dimensional text classification.
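To illustrate what a kernel buys you on data that is not linearly separable, here is a small sketch using a kernel perceptron on the XOR problem. Note that this is deliberately not an SVM: it has no margin maximization and is used here only because it fits in a few lines while showing the same kernel idea. The RBF kernel with gamma=1.0 and all function names are assumptions for this example.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel: similarity decays with squared distance."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def decision(alphas, X, y, x, kernel):
    """Kernelized decision value: a weighted vote of the training points."""
    return sum(a * yi * kernel(xi, x) for a, xi, yi in zip(alphas, X, y))

def train_kernel_perceptron(X, y, kernel, epochs=20):
    """Bump alpha[i] whenever training point i is misclassified."""
    alphas = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * decision(alphas, X, y, xi, kernel) <= 0:
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:
            break  # every training point is classified correctly
    return alphas

# XOR: no straight line separates the two classes in the original space.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, 1, 1, -1]

alphas = train_kernel_perceptron(X, y, rbf_kernel)
predictions = [1 if decision(alphas, X, y, x, rbf_kernel) > 0 else -1 for x in X]
print(predictions)  # [-1, 1, 1, -1]
```

An SVM uses the same kernelized decision function but chooses the coefficients by maximizing the margin, which is where its theoretical guarantees against overfitting come from.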
