With the development of computer, network, communication and internet technology, enterprises generate large amounts of business data through e-commerce. How to mine valuable information from this rich customer data and give enterprise managers effective decision support is a real concern for enterprises. Customer classification is one of the important functions of analytical customer relationship management. By classifying customers, distinguishing how important each customer is, and designing dedicated marketing plans and customer relationship management strategies for customers at different importance levels, enterprises can reduce marketing costs and improve profits and competitiveness. Customers, in turn, get a more suitable trading experience from the dedicated marketing plans and customer relationship management strategies that the enterprise formulates. Data mining is a necessary means for analytical CRM to realize its "analysis" function, and it is also an effective tool for customer classification.
1 Customer Relationship Management (CRM)
CRM (Customer Relationship Management) is a new management mechanism aimed at improving the relationship between enterprises and customers. It is applied in the marketing, sales, service and technical support functions of an enterprise. Its goals are to attract and retain customers by providing better and faster service, and to reduce enterprise costs through comprehensive management of business processes.
In an e-commerce environment, CRM enables website enterprises to better meet customer needs and provide better service in every business link. This lets them retain existing customers and develop potential customers in a new business environment that has no time or space barriers, and so improve their market competitiveness. At the same time, CRM supplies important information such as customer demand, market distribution and customer feedback, which provides a basis for the intelligent analysis of enterprises and their business activities. CRM therefore lays the foundation for enterprises to implement e-commerce successfully.
Personalized service is a powerful weapon for enhancing competitiveness, and CRM is customer-centric: it aims to give each customer the most suitable service. The internet has become an ideal channel for implementing customer relationship management. Remembering customers' names and preferences, and presenting different content to different customers, greatly increases the likelihood that customers will return. CRM can increase customer loyalty, raise the purchase ratio, lead each customer to have more and longer-lasting purchase needs, and improve customer satisfaction.
2 Data Mining Technology
Data mining is a powerful tool for analyzing these massive data sets. It can provide valuable information for business decision-making and help enterprises profit.
In an analytical CRM system, data mining is the core technology. Data mining is the process of extracting potentially valuable knowledge, models or rules from large amounts of data. For enterprises, data mining can help identify business trends, reveal known facts, predict unknown results, and analyze the key factors needed to complete tasks, thereby increasing income, reducing costs and putting the enterprise in a more favorable competitive position.
2.1 Common Algorithms for Data Mining
(1) Decision tree algorithm. A decision tree is a tree structure similar to a flowchart. Each internal node represents a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a class or class distribution. The decision tree algorithm includes tree construction and tree pruning. There are two common pruning methods: pre-pruning and post-pruning.
(2) Neural network. A neural network is a set of interconnected input and output units in which each connection has a weight. In the learning stage, the network learns by adjusting the weights so that it can predict the correct class labels of the input samples.
(3) Genetic algorithm. According to the principle of survival of the fittest, genetic algorithm forms a new population composed of the most suitable rules of the current population and the descendants of these rules. Genetic algorithm is used in classification and other optimization problems.
(4) Rough set method. The rough set method is based on establishing equivalence classes within the given training data. It treats knowledge as a partition of the data, and each set in the partition is called a concept; it uses the known knowledge base to process or characterize imprecise or uncertain knowledge. Rough sets are used for feature reduction and correlation analysis.
(5) Fuzzy set method. Rule-based classification systems have a disadvantage: they apply sharp cutoffs to continuous attributes. Introducing fuzzy logic allows "fuzzy" boundaries to be defined, which makes it possible to work at a high level of abstraction.
Others include Bayesian network, visualization technology, proximity search method and formula discovery.
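The contrast drawn in item (5) between a sharp cutoff and a fuzzy boundary can be illustrated with a short sketch. The attribute (monthly spending), the cutoff of 1000 and the ramp from 800 to 1200 are hypothetical values chosen only for illustration, and the linear ramp is just one simple choice of membership function.

```python
def crisp_high_spender(amount, cutoff=1000.0):
    """Sharp cutoff: the customer either is a high spender (1.0) or is not (0.0)."""
    return 1.0 if amount >= cutoff else 0.0


def fuzzy_high_spender(amount, low=800.0, high=1200.0):
    """Fuzzy boundary: membership rises linearly from 0 at `low` to 1 at `high`."""
    if amount <= low:
        return 0.0
    if amount >= high:
        return 1.0
    return (amount - low) / (high - low)


if __name__ == "__main__":
    # Hypothetical monthly spending amounts close to and far from the cutoff.
    for amount in (790, 999, 1001, 1100, 1300):
        print(amount, crisp_high_spender(amount), round(fuzzy_high_spender(amount), 3))
```

With the crisp rule, customers spending 999 and 1001 land in opposite classes; with the fuzzy rule their membership degrees are almost identical, which is the behavior the fuzzy set method is meant to provide.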
2.2 Analysis Methods Commonly Used in Data Mining
(1) Classification and prediction. Classification is mainly used for customer segmentation, such as identifying groups of high-value customers. Classification and prediction are two forms of data analysis that can be used to extract models describing important data classes or to predict future trends. Data classification is a two-step process. In the first step, a model is built to describe a predetermined set of data classes or concepts; the model is constructed by analyzing database tuples described by attributes. In the second step, the model is used for classification. First, the prediction accuracy of the model is evaluated. If the accuracy is acceptable, the model can be used to classify data tuples or objects whose class labels are unknown.
Prediction is mainly used to discover customers' future behavior. In customer churn analysis, for example, a neural network can learn how the behavior of various customers changes before they churn, and then predict (give early warning of) the possible churn of valuable customers. Prediction builds and uses a model to assess the class of an unlabeled sample, or to estimate the value or value range of an attribute for a given sample. Classification and prediction have a wide range of applications, such as credit approval, medical diagnosis, performance prediction and selective marketing. Commonly used algorithms for classification and prediction include decision tree induction, Bayesian classification, Bayesian networks, neural networks, k-nearest-neighbor classification, genetic algorithms, rough sets and fuzzy set techniques.
(2) Cluster analysis. Clustering divides data objects into multiple classes or clusters. Objects in the same cluster are highly similar to one another, while objects in different clusters are very dissimilar. As a branch of statistics, cluster analysis has been studied extensively for many years, and current work focuses mainly on distance-based cluster analysis. Cluster analysis tools based on k-means, k-medoids and other methods are also widely applied.
(3) Association rules. Association rule mining finds interesting relationships between items in a given dataset. Let I = {i1, i2, ..., im} be a set of items, and let the task-related data D be a set of database transactions, in which each transaction T is a set of items such that T ⊆ I. An association rule is an implication of the form A ⇒ B, where A ⊂ I, B ⊂ I and A ∩ B = ∅. Mining association rules is divided into two steps: ① Find all frequent itemsets, that is, itemsets that occur at least as often as the predefined minimum support count. ② Generate strong association rules from the frequent itemsets. These rules must meet the minimum support and minimum confidence.
(4) Sequence pattern. Sequence pattern analysis is similar to association rule analysis and also aims to mine relationships between data items, but it focuses on the order of data items along the time dimension. For example, a customer may purchase financial analysis software six months after purchasing a computer.
(5) Outlier analysis. Outliers may result from measurement errors or from inherent variability in the data. Many data mining algorithms try to minimize the influence of outliers or eliminate them. However, one person's noise may be another person's signal, so in some cases the outliers themselves are very useful. Outlier mining can be described as follows: given a set of n data points or objects and the expected number of outliers k, find the top k objects that are significantly different from, or inconsistent with, the remaining data. Outlier detection methods fall into three categories: statistical methods, distance-based methods and deviation-based methods.
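The outlier mining task in item (5), finding the top k objects that differ most from the rest, can be sketched with one simple distance-based scoring rule. This is only an illustration under stated assumptions: each object is scored by the Euclidean distance to its `n_neighbors`-th nearest neighbor, the function name `top_k_outliers` is hypothetical, and the customer data (purchases per month, average spend) is invented.

```python
import math


def top_k_outliers(points, k, n_neighbors=2):
    """Return the k points whose distance to their n-th nearest neighbor is largest.

    Assumes len(points) > n_neighbors so that every point has enough neighbors.
    """
    scored = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scored.append((dists[n_neighbors - 1], p))
    # Largest score first: isolated points sit far from their nearest neighbors.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [p for _, p in scored[:k]]


if __name__ == "__main__":
    # Hypothetical customers: (purchases per month, average spend per purchase).
    customers = [(2, 50), (3, 55), (2, 60), (3, 52), (4, 58), (30, 900)]
    print(top_k_outliers(customers, k=1))
```

This brute-force version compares every pair of points, so it suits small samples; the scoring rule and the value of `n_neighbors` would need to be chosen to fit the actual data.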
3 Application Method
3.1 Understand the Business
In the initial stage, we focus on understanding the characteristics of the business and reducing them to conditions and parameters for data analysis. In the retail industry, for example, the first step is to find out whether there is an obvious correlation between a customer's purchase frequency and the amount spent on each purchase.
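One common way to check for such a correlation is the Pearson correlation coefficient. The sketch below is a minimal illustration rather than a prescribed method; the function name `pearson` and the sample figures are assumptions made for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical data: purchases per month and average amount spent per purchase.
    frequency = [1, 2, 4, 6, 8]
    amount = [310, 250, 180, 120, 90]
    print(round(pearson(frequency, amount), 3))
```

A value near +1 or -1 suggests a strong linear relationship, while a value near 0 suggests none; it does not by itself say which variable drives the other.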
3.2 Data Analysis
The focus of this stage is to standardize the existing data. In many industries, we find that the data available for analysis does not match the analysis objectives described above. For example, consumers' monthly income may be related to many purchasing behaviors, but the data originally accumulated does not necessarily include it. The solution is to infer it from other relevant data. Suppose a sample survey finds that the monthly income of customers who buy large amounts of toilet paper at one time is concentrated in the range of 1000-3000 RMB. If this conclusion basically holds, we can infer from consumption habits what proportion of customers fall within this monthly income range. Alternatively, using the sampling survey method, the income distribution curve of the whole customer population can be derived from the results of a questionnaire survey.
3.3 Data Preparation
This stage focuses on transforming, cleaning and importing data. The data may be extracted from multiple data sources and then combined to form a data cube. One of the questions to be settled at this stage is how to handle a small amount of missing data: fill it in with the mean, ignore it, or assign values according to the distribution of the existing samples.
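The three options for missing data can be sketched as follows. This is a minimal illustration in which missing values are represented by `None`; the function names are hypothetical, and "assigning values according to the existing samples" is interpreted here as drawing replacements at random from the observed values.

```python
import random
from statistics import mean


def fill_with_mean(values):
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    avg = mean(observed)
    return [avg if v is None else v for v in values]


def drop_missing(values):
    """Ignore missing values by removing them."""
    return [v for v in values if v is not None]


def fill_from_observed(values, seed=0):
    """Replace each missing value with a random draw from the observed values."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = [v for v in values if v is not None]
    return [rng.choice(observed) if v is None else v for v in values]


if __name__ == "__main__":
    # Hypothetical monthly spend per customer, with two missing entries.
    spend = [120, None, 80, 100, None]
    print(fill_with_mean(spend))
    print(drop_missing(spend))
    print(fill_from_observed(spend))
```

Mean filling keeps the average unchanged but shrinks the spread, dropping loses records, and sampling from observed values preserves the distribution at the cost of adding randomness, so the choice depends on what the later analysis needs.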
3.4 Modeling
Many modeling methods are now available. The main task at this stage is to apply the most suitable one to the main problems we care about. For example, should regression be used for profit forecasting, and on what basis should the forecast be made? These questions need to be settled through consultation between industry experts and data analysis experts.
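If regression is chosen for profit forecasting, the simplest form is a least-squares line through one predictor. The sketch below assumes a single hypothetical predictor (monthly advertising spend) and invented figures; which predictors actually make sense is exactly the question for the experts mentioned above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new predictor value."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical history: advertising spend and profit, both in thousands.
    ad_spend = [10, 20, 30, 40]
    profit = [35, 52, 71, 88]
    model = fit_line(ad_spend, profit)
    print(model, predict(model, 50))
```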
3.5 Evaluation and Application
A good evaluation method is to have the system predict consumption that has already happened in different time periods and then compare the predictions with what actually occurred, which makes the model easy to evaluate. Once the above steps are complete, most analysis tools support saving and reusing the model that has been built. More importantly, by the end of this process the client's market analysts or decision makers should understand the methods and knowledge of data analysis. We provide not only the final result but also the method for obtaining it. "Passing on the golden needle", that is, teaching the craft rather than only delivering the product, is what distinguishes the TurboCRM consulting service from a pure software provider.
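The evaluation idea at the start of this section, predicting periods that have already happened and comparing with the actual figures, can be sketched as a simple backtest. The forecaster used here (average of the last three periods) and the consumption series are placeholders for illustration only; any model that maps past values to a forecast could be passed in instead.

```python
def backtest(history, forecast, holdout):
    """Mean absolute percentage error over the last `holdout` periods.

    Each held-out period is forecast using only the periods before it.
    Assumes the actual values are non-zero.
    """
    errors = []
    for t in range(len(history) - holdout, len(history)):
        predicted = forecast(history[:t])  # data available before period t
        actual = history[t]
        errors.append(abs(predicted - actual) / abs(actual))
    return sum(errors) / len(errors)


def last_three_average(past):
    """Placeholder forecaster: average of the most recent three periods."""
    window = past[-3:]
    return sum(window) / len(window)


if __name__ == "__main__":
    # Hypothetical monthly consumption figures.
    consumption = [100, 110, 105, 120, 115, 125, 130]
    print(round(backtest(consumption, last_three_average, holdout=3), 3))
```

A lower error on the held-out periods indicates that the model reproduces past behavior better, which is the comparison the evaluation step relies on.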
Finally, in the software architecture, the analysis database should be separated from the operational database so that analysis does not slow the real-time response of the operational database.
4 Conclusion
Data mining can divide a large number of customers into different categories. Customers within a category have similar attributes, while customers in different categories have different attributes. Enterprises can then provide differentiated services to each category of customer and improve customer satisfaction. Detailed and practical customer classification is of great benefit to an enterprise's business strategy.