1. Volume. The defining characteristic of big data is, first of all, its size. In the MP3 era, a few megabytes of storage were enough for many people's needs. Over time, however, typical storage units have grown from GB to TB, and now to the PB and EB levels.
With the rapid development of information technology, data volumes began to explode. Social networks (Weibo, Twitter, Facebook), mobile networks, and all kinds of smart devices and service tools have become sources of data. Taobao's nearly 400 million members generate about 20 TB of commodity trading data per day; Facebook's roughly 1 billion users generate more than 300 TB of log data per day.
Intelligent algorithms, powerful data processing platforms and new data processing technologies are urgently needed to aggregate, analyze, process and make predictions from data at this scale in real time.
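To give a feel for these magnitudes, the Taobao figures quoted above can be turned into a per-member and a per-year estimate. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes), which the text does not specify, and the helper names are made up for illustration.

```python
# Decimal storage units (an assumption; the text does not say TB vs. TiB).
KB = 10**3
TB = 10**12
PB = 10**15


def per_member_bytes(daily_bytes, members):
    """Average number of bytes generated per member per day."""
    return daily_bytes / members


def yearly_volume(daily_bytes, days=365):
    """Total bytes accumulated if the daily volume stays constant."""
    return daily_bytes * days


# Figures quoted in the text: about 20 TB per day, nearly 400 million members.
taobao_daily = 20 * TB
taobao_members = 400_000_000

print(per_member_bytes(taobao_daily, taobao_members) / KB)  # 50.0 (KB per member per day)
print(yearly_volume(taobao_daily) / PB)                     # 7.3 (PB per year)
```

Under these assumptions each member produces only about 50 KB a day, yet the platform as a whole accumulates several petabytes a year, which is why storage units at the PB level come into play.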
2. Diversity. The wide range of data sources determines the variety of forms that big data takes, and any form of data can be useful. Recommendation systems are currently among the most widespread applications, used by platforms such as Taobao, NetEase Cloud Music and Toutiao (Today's Headlines). These platforms analyze users' log data to recommend more of what each user is likely to enjoy.
Log data has an obvious structure, while other data, such as pictures, audio and video, does not. Such data has weak causal relationships and needs to be labeled manually.
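As a rough illustration of how structured log data can drive recommendations, here is a toy item co-occurrence sketch. It is not how any of the platforms named above actually work; the log format (user, item) and all function names are assumptions made for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(logs):
    """Count how often two items appear in the same user's history."""
    items_by_user = defaultdict(set)
    for user, item in logs:
        items_by_user[user].add(item)

    co = defaultdict(Counter)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            co[a][b] += 1
            co[b][a] += 1
    return items_by_user, co


def recommend(user, items_by_user, co, k=3):
    """Suggest up to k unseen items that co-occur most with the user's items."""
    seen = items_by_user.get(user, set())
    scores = Counter()
    for item in seen:
        for other, n in co[item].items():
            if other not in seen:
                scores[other] += n
    return [item for item, _ in scores.most_common(k)]


# Hypothetical log records: (user id, item the user interacted with).
logs = [
    ("u1", "song_a"), ("u1", "song_b"),
    ("u2", "song_a"), ("u2", "song_c"),
    ("u3", "song_a"), ("u3", "song_c"),
    ("u4", "song_a"),
]
items_by_user, co = build_cooccurrence(logs)
print(recommend("u4", items_by_user, co))  # ['song_c', 'song_b']
```

Here user u4 has only listened to song_a, so the sketch suggests song_c first because two other users paired it with song_a, and song_b second. Unstructured data such as audio or images would need labeling or feature extraction before it could be used this way.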
3. High speed. Big data is generated very quickly, mainly through the Internet. Almost everyone relies on the Internet in daily life, which means that individuals contribute large amounts of information to big data every day.
This data also needs to be processed promptly, because spending heavily to store historical data of little value is uneconomical. A platform may keep only the past few days or the past month of data; older data must be cleaned up in time, otherwise the cost becomes too high.
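A retention rule of this kind can be sketched in a few lines. The example below is a minimal illustration, assuming each record is a dict with a "timestamp" field and a 30-day retention window; both the record layout and the function name are hypothetical.

```python
from datetime import datetime, timedelta


def prune_old_records(records, now, retention_days=30):
    """Keep only records whose timestamp falls inside the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if r["timestamp"] >= cutoff]


# Hypothetical records; real platforms would prune storage partitions, not lists.
records = [
    {"id": 1, "timestamp": datetime(2023, 12, 1)},
    {"id": 2, "timestamp": datetime(2024, 1, 20)},
    {"id": 3, "timestamp": datetime(2024, 1, 30)},
]
kept = prune_old_records(records, now=datetime(2024, 1, 31))
print([r["id"] for r in kept])  # [2, 3]
```

In practice the same idea is usually applied to whole files or partitions by date rather than to individual records, which keeps the cleanup itself cheap.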
As a result, the requirements on big data processing speed are strict. Much of a server's resources go to processing and computing on data, and many platforms need real-time analysis. Data is generated continuously, and whoever processes it fastest has the advantage.
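One common building block for real-time analysis is a sliding-window counter, which answers questions like "how many events arrived in the last minute?" without rescanning stored history. The sketch below is a simplified single-process illustration; the class name is made up, and it assumes timestamps (in seconds) arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowCounter:
    """Count events seen within the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # timestamps, oldest on the left

    def add(self, timestamp):
        """Record one event and drop any that have left the window."""
        self.events.append(timestamp)
        self._evict(timestamp)

    def count(self, now):
        """Return the number of events still inside the window at time `now`."""
        self._evict(now)
        return len(self.events)

    def _evict(self, now):
        # An event is expired once it is at least window_seconds old.
        while self.events and self.events[0] <= now - self.window_seconds:
            self.events.popleft()


counter = SlidingWindowCounter(window_seconds=60)
for t in (0, 10, 50):
    counter.add(t)
print(counter.count(now=59))  # 3
print(counter.count(now=65))  # 2
```

Because expired events are discarded as new ones arrive, memory use depends on the event rate within the window rather than on the total amount of data ever generated.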
4. Value. This is the core characteristic of big data. Of all the data generated in the real world, only a very small proportion is valuable.
Compared with traditional small data, the greatest value of big data lies in mining, from a large volume of seemingly irrelevant data, the data that is useful for predicting and analyzing future trends and patterns, and then analyzing it in depth with machine learning, artificial intelligence or data mining methods.
The new laws and knowledge discovered this way can be applied to agriculture, finance, medical care and other fields, ultimately improving social governance, raising production efficiency and advancing scientific research.