According to an IDC report, the global market for big data technology and services will maintain a compound annual growth rate of 31.7% over the next few years, and the total market size is expected to reach $23.8 billion in 2016. On this estimate, the big data market will grow about seven times as fast as the information and communication technology sector as a whole over the same period. This market is rapidly absorbing technologies and services from a range of existing and new markets. IT industry leaders such as IBM, Microsoft, Oracle, Hewlett-Packard, and EMC are optimistic about this field and have committed staff and funding to establish a position in it.
According to IDC data, the amount of data generated by human activity has increased tenfold over the past five years, and it is expected to increase 29-fold over the next 10 years. However, 80% of this data is unstructured, so the question of how to mine and use it is both the source of big data's value and its central difficulty.
Gao Wen, Chairman of the Steering Committee of the China Computer Conference and a professor at Peking University, said in a recent interview with this magazine that big data is not only drawing wide attention from industry but is also a hot topic in technical research. From a technical point of view, data mining is where the value of big data lies, but data mining still faces many problems and falls far short of expectations. He noted that Alibaba has made an attempt at data mining: Ali's finance and logistics businesses grew out of the massive transaction data of its e-commerce platform. But this is value in the commercial field only, and it has not yet released its energy for social change. In the future, big data will bring more changes to society.
The value that big data brings is also widely discussed in industry and academia. In recent years, big data has steadily penetrated all walks of life, bringing revolutionary change to every field and becoming a driving force for innovation across industries. Over the same period, as Internet social technologies have continued to develop, people have become increasingly accustomed to sharing all kinds of information and data, voicing their demands, and making suggestions through social platforms such as Weibo, WeChat, blogs, and forums. The volume of data circulating on these platforms each day runs to tens or even hundreds of billions. This vast body of social data is an important part of big data, and it plays an important role in helping governments track public opinion trends, enterprises understand product reputation, and companies explore market demand.
The Internet has become a very effective channel for gathering public opinion and gauging how well governments and enterprises are performing. However, because the necessary measures for monitoring online posts are lacking, it is difficult to obtain in-depth, high-quality online public opinion information promptly and effectively once a public opinion crisis breaks out, which often leaves those handling the crisis on the defensive. Attaching importance to the response to online public opinion, and establishing a response system of "monitoring, responding, summarizing and archiving", has therefore become an important part of government affairs in the era of big data.
Against this background, the public opinion monitoring and analysis industry emerged to meet the need for public opinion monitoring and services in the era of big data. It relies mainly on technologies such as large-scale information collection, intelligent semantic analysis, natural language processing, data mining, and machine learning to continuously monitor websites, forums, blogs, Weibo, print media, WeChat, and other sources; to grasp all kinds of information and network dynamics in a timely, comprehensive, and accurate manner; to detect the early signs of events in the vast universe of big data; to summarize public opinion trends and gauge public attitudes and emotions; and to make trend predictions and recommendations by reference to similar events in the past.
The Application Value of Big Data in Public Opinion Monitoring
(a) The core of the value of big data: public opinion prediction
Traditional guidance of online public opinion starts from monitoring public opinion that has already formed. The limitation of this approach is its lag. Big data technology instead mines and analyzes data related to online public opinion, moves the target time of monitoring forward to the initial stage at which sensitive news begins to spread online, and uses an established model to simulate how actual online public opinion evolves, so that sudden outbreaks of online public opinion can be predicted.
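As a rough illustration of moving monitoring forward to the early stage of dissemination, the sketch below flags hours in which mentions of a topic jump well above their recent baseline. It is a simple stand-in, not the simulation model the text refers to: the function name `find_bursts`, the six-hour window, the threshold of 3, and the sample counts are all assumptions made for this example.

```python
from statistics import mean, pstdev

def find_bursts(hourly_counts, window=6, threshold=3.0):
    """Return indices of hours whose mention count spikes above the recent baseline.

    Hypothetical helper: each hour is compared with the mean and standard
    deviation of the preceding `window` hours.
    """
    bursts = []
    for i in range(window, len(hourly_counts)):
        baseline = hourly_counts[i - window:i]
        mu = mean(baseline)
        # Floor the spread at 1.0 so a flat baseline does not divide by zero.
        sigma = max(pstdev(baseline), 1.0)
        if (hourly_counts[i] - mu) / sigma >= threshold:
            bursts.append(i)
    return bursts

# Made-up hourly mention counts for one topic.
counts = [4, 5, 3, 6, 4, 5, 5, 40, 120, 300]
print(find_bursts(counts))  # [7, 8, 9]
```

The first flagged index marks the hour at which the topic starts to break out, which is the point at which a monitoring team would want to be notified.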
(b) The condition for big data to deliver its value: comprehensive public opinion data
The first condition for using big data technology to predict public opinion is that all kinds of relevant data be analyzed and calculated together. In the traditional data era, analysts of netizens' opinions or public opinion trends paid attention only to netizens' attitudes and emotions and ignored their psychological changes; focused on text and paid little attention to pictures, video, audio, and other content; observed changes in public opinion locally and ignored changes among other groups; and read the text that netizens posted but ignored the complex and changeable social network behind it. From the perspective of public opinion analysis, an isolated netizen is just a "lonely zombie" in the ocean of information: an ant colony can display a high level of emergent intelligence, while a single ant merely scurries about like an ant on a hot pan.
The era of big data breaks with the one-sided, single-track, static thinking of the traditional data era. Online public opinion data is now studied in a multidimensional, global, and dynamic way, and seemingly insignificant public opinion data is brought into the scope of analysis and calculation.
(c) The foundation of the value of big data: the quantification of public opinion
For big data to deliver value in forecasting public opinion, the massive body of mined information must be calculated and analyzed scientifically with mathematical models, and this presupposes that all relevant data, that is, all public opinion information, can be quantified. Quantification is not the same as simple digitization; it means making the data computable. While paying attention to what netizens say in their comments, it is necessary to count how many people hold each opinion; while interpreting the content of netizens' posts, to count the social network connections through which they interact; and to identify changes in netizens' mood through quantitative indicators.
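The three kinds of quantification described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production method: the record layout, the stance labels, the tiny word lists, and the function names are all hypothetical, and a real system would use a proper sentiment model rather than a hand-written lexicon.

```python
from collections import defaultdict

# Hypothetical records: (author, stance, replied_to_author or None, text)
comments = [
    ("u1", "support", None, "good decision, happy with it"),
    ("u2", "oppose", "u1", "bad idea, angry about this"),
    ("u3", "oppose", "u1", "terrible and bad"),
    ("u1", "support", "u2", "still good"),
]

# Toy lexicons, assumed for illustration only.
POSITIVE = {"good", "happy"}
NEGATIVE = {"bad", "angry", "terrible"}

def stance_headcount(records):
    """Count distinct authors holding each stance."""
    holders = defaultdict(set)
    for author, stance, _, _ in records:
        holders[stance].add(author)
    return {stance: len(authors) for stance, authors in holders.items()}

def interaction_counts(records):
    """Count how many distinct users each author has exchanged replies with."""
    partners = defaultdict(set)
    for author, _, target, _ in records:
        if target is not None:
            partners[author].add(target)
            partners[target].add(author)
    return {author: len(others) for author, others in partners.items()}

def mood_score(text):
    """Return a score in [-1, 1]; 0.0 when no lexicon word appears."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

print(stance_headcount(comments))    # {'support': 1, 'oppose': 2}
print(interaction_counts(comments))  # {'u2': 1, 'u1': 2, 'u3': 1}
print(mood_score(comments[1][3]))    # -1.0
```

Tracking `mood_score` per author over time would give one possible quantitative indicator of mood change.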
(d) The key to the value of big data: the relevance of public opinion
Behind the data is the network, and behind the network are people. Studying network data is really studying the social network that people form. The key to realizing the predictive value of big data is to relate items of public opinion to one another, attending not only to causal relationships in the traditional sense but also to correlations between data. In big data thinking, each piece of data is a node that can combine with other related data in the public opinion chain to produce an unbounded multiplier effect, similar to the fission-style propagation paths seen on Weibo, and the state of these fission-like connections holds countless possibilities.
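To make the node-and-propagation picture concrete, the sketch below treats reposts as edges in a graph and measures how many posts appear at each hop from the original. The edge list, post identifiers, and the function name `cascade_levels` are assumptions for illustration; real repost data would come from the monitored platform.

```python
from collections import defaultdict

# Hypothetical repost edges: (source_post, repost_of_it)
edges = [("p0", "p1"), ("p0", "p2"), ("p1", "p3"), ("p1", "p4"), ("p3", "p5")]

def cascade_levels(edges, root):
    """Breadth-first walk of a repost cascade.

    Returns the number of posts at each hop from the root,
    with the root itself counted as hop 0.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    sizes = []
    seen = {root}
    frontier = [root]
    while frontier:
        sizes.append(len(frontier))
        next_frontier = []
        for node in frontier:
            for child in children[node]:
                if child not in seen:  # guard against duplicate or cyclic edges
                    seen.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return sizes

levels = cascade_levels(edges, "p0")
print(levels)           # [1, 2, 2, 1]
print(sum(levels))      # 6 posts in total
print(len(levels) - 1)  # 3 hops deep
```

The per-hop sizes show whether a topic is spreading broadly (wide levels) or through long chains (many levels), two patterns that correlation-based analysis can distinguish.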
The Bottleneck of Public Opinion Monitoring in the Era of Big Data
At present, the main method of public opinion monitoring is still manual search. Although relatively mature search software on the market is used to assist, the search itself remains the traditional two-dimensional kind: it is organized along two coordinates, topic keywords and network platform, and public opinion staff then process the collected information into public opinion products. The results, however, are mostly first-level text information. Deeper, multi-level public opinion information, such as the comments that follow news stories and Weibo posts, netizens' social relationships, the emotional changes reflected in their comments on an event, and their inflammatory remarks, calls to action, and hints, cannot be mined in depth and still depends on manual collection, analysis, and judgment. Because public opinion workers differ in knowledge and in value judgments, valuable public opinion information is very likely to be lost, and public opinion trends cannot be predicted accurately or in time. This greatly reduces the efficiency and accuracy of public opinion monitoring, makes the discovery of valuable public opinion information more a matter of chance and guesswork, and creates a hidden risk for predicting public opinion in major emergencies.
The Implementation of Public Opinion Monitoring in the Context of Big Data
The collection and processing of big data is the foundation of public opinion monitoring. The ability to capture data, and to add value to it through processing, is an essential skill in public opinion monitoring and analysis. With the advanced collection technology developed for the Duoruike public opinion data analysis station system, users can not only monitor all kinds of text information but also configure the system to collect the latest replies on selected topics, together with details such as the number of views, the number of replies, the respondents, and the reply times. Many websites have complex structures, or use frames or JavaScript to write content dynamically, or use Ajax to refresh content automatically in real time, all of which are difficult or impossible for ordinary crawler technology to handle. The system automatically classifies the information it collects and monitors and presents it in columns such as "negative public opinion", "related to me", "my concerns", and "special topic tracking", so that users can go straight to the subject and find the information they need as quickly as possible.
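The idea of routing collected items into columns can be illustrated with simple keyword rules. This sketch does not describe how the system above actually classifies content, which the text does not specify; the rule table, the keywords, the organization name "acme", and the function name `assign_columns` are all hypothetical.

```python
# Hypothetical keyword rules; column names follow the text, keywords are invented.
COLUMN_RULES = {
    "negative public opinion": {"scandal", "complaint", "fraud"},
    "related to me": {"acme"},  # assumed name of the monitoring organization
    "special topic tracking": {"recall"},
}

def assign_columns(text, rules=COLUMN_RULES):
    """Return every column whose keyword set overlaps the words in the text."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return [column for column, keywords in rules.items() if words & keywords]

print(assign_columns("Acme faces fraud complaint."))
# ['negative public opinion', 'related to me']
print(assign_columns("Sunny weather today"))
# []
```

An item can land in more than one column, which matches the way a single post may be both negative and directly relevant to the user.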
Understanding trends is the goal of public opinion monitoring in the era of big data. People can now mine information from massive data, judge trends, and work more efficiently, but this is far from enough. In an era of information explosion, the analysis and prediction of relevant public opinion information must be strengthened continuously, and the focus of monitoring must expand from simply collecting useful data to judging public opinion in depth. The Duoruike public opinion data analysis station system tracks the negative information it detects as a special priority, and takes scheduled screenshots of key home pages and designated pages to preserve evidence. Monitoring staff can re-select and reclassify the information that the system has identified and classified automatically, and can easily export daily and weekly public opinion reports containing charts of the analysis data, which reduces the complexity of analyzing public opinion data and producing statistical charts. For sensitive information, the system can also notify users promptly by SMS and email, so that they can follow important public opinion developments remotely at any time.
The era of big data calls for collection and analysis on a large scale, reflecting what data processing and applications demand against the background of the data explosion. Traditional manual collection and manual monitoring clearly struggle to meet these demands. The Duoruike public opinion data analysis station system provides automatic real-time monitoring, automatic content analysis, automatic alerts, and related functions. It addresses the problems of manual public opinion monitoring, makes the supervision of online public opinion more efficient, helps teams organize, analyze, guide, and respond to information, improves users' ability to respond to public emergencies triggered by online public opinion, and strengthens the analysis and judgment of "big data" on the Internet.