Abroad, this group of people are called data scientists. The title was first proposed in 2008 by D.J. Patil and Jeff Hammerbacher, who later became the heads of the data science teams at LinkedIn and Facebook respectively. Today, in the United States, data scientists have also begun to create value in traditional industries such as telecommunications, retail, finance, manufacturing, logistics, medical care and education.
In China, however, the application of big data has only just sprouted and the talent market is not yet mature; as a big data talent training base, we see this clearly in today's big data industry. "It's hard to expect a generalist to complete every link in the whole chain. More companies will recruit people who can complement their existing teams, according to the resources they already have and the gaps they need to fill," Wang Yuyao, business analysis and strategy director of LinkedIn China, told CBN Weekly.
What does a data engineer do?
Each company therefore has different requirements for big data work: some emphasize database programming, some highlight applied mathematics and statistical knowledge, some require experience at consulting companies or investment banks, and some hope to find applied talent who understand products and markets. Because of this, many companies give the people who deal with big data new titles and definitions according to their business type and team division of labor: data mining engineer, big data expert, data researcher, user analysis expert and so on. These titles often appear in domestic companies; we collectively refer to them as "big data engineers".
Because big data work in China is still at an early stage of development, how much value can be extracted from the data depends entirely on the personal ability of the engineers. Experts already working in the industry have given a general framework of what talent is needed: computer coding ability plus a background in mathematics and statistics. A deeper understanding of a specific field or industry helps further, because it lets engineers quickly judge and grasp the key factors.
Although some large companies prefer candidates with a master's degree, Xue, a researcher at Alibaba Group, stressed that education is not the most important factor; experience in large-scale data processing and a curiosity for treasure hunting in the ocean of data make a candidate better suited to this job.
In addition, an excellent big data engineer should have solid logical analysis ability and be able to quickly locate the key attributes and determinants of a business problem. "He needs to know what is relevant, what is important, what kind of data is the most valuable, and how to quickly find the core requirements of each business," said Shen Zhiyong, a data scientist at the United Nations Baidu Big Data Joint Lab. Learning ability helps big data engineers adapt quickly to different projects and become data experts in a new field in a short time. Communication skills help their work go smoothly, because that work is mainly driven in one of two ways: by the marketing department or by the data analysis department. In the former case, engineers need to learn the development requirements from product managers frequently; in the latter, they need to go to the operations department to understand how the data model actually performs in practice.
You can treat these requirements as a direction for becoming a big data engineer, a field with a large talent gap. At present, big data applications in China are mostly concentrated in the Internet field, and more than 56% of enterprises are preparing to carry out big data research.
Analyzing history, predicting the future and optimizing choices are therefore the three most important tasks for big data engineers when they work with data. Through these three lines of work, they help enterprises make better business decisions.
1. Find out the characteristics of past events.
A very important job of big data engineers is to find out the characteristics of past events by analyzing data. For example, Tencent's data team is building a data warehouse: sorting out the huge and irregular data on all of the company's network platforms and distilling it into features that can be queried, in order to support the data needs of the company's various businesses, including advertising, game development and social networking.
Finding out the characteristics of past events helps enterprises understand consumers better. By analyzing a user's past behavior trajectory, we can understand that person and predict his or her behavior. "You can know what kind of person he is, his age, hobbies, whether he is a paying Internet user, what kind of games he likes to play and what he likes to do online," Zheng Lifeng, general manager of the Beijing R&D Center of Tencent Cloud Computing Co., Ltd., told CBN Weekly. At the business level, this makes it possible to recommend related services, such as mobile games, to each group of people, or to derive new business models from different characteristics and needs, such as the movie ticket business of WeChat.
2. Predict what may happen in the future.
By introducing key factors, big data engineers can predict future consumption trends. On the Alimama marketing platform, engineers are trying to help Taobao sellers do business by introducing meteorological data. "For example, if this summer is not hot, some products will very likely not sell as well as they did last year. Besides air conditioners and electric fans, vests and swimsuits may also be affected. So we establish the relationship between meteorological data and sales data, find the related categories, and warn sellers in advance so that they can turn over their inventory," Xue said.
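The idea of relating meteorological data to sales data and finding the affected categories can be sketched roughly as follows. This is only an illustration, not Alimama's actual method: it assumes a simple Pearson correlation between weekly average temperature and weekly sales, and the function names, the 0.8 threshold and all the figures are hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / (var_x * var_y) ** 0.5


def weather_sensitive_categories(temps, sales_by_category, threshold=0.8):
    """Return {category: r} for categories whose sales track temperature.

    The threshold is an arbitrary illustrative cut-off, not a recommendation.
    """
    related = {}
    for category, sales in sales_by_category.items():
        r = pearson(temps, sales)
        if abs(r) >= threshold:
            related[category] = r
    return related


# Hypothetical data: weekly average temperature (Celsius) and units sold.
weekly_avg_temp_c = [24, 27, 30, 33, 35, 31]
weekly_sales = {
    "electric fans": [110, 150, 205, 260, 300, 230],
    "swimsuits": [80, 95, 130, 160, 170, 140],
    "phone cases": [200, 190, 210, 205, 195, 200],
}

related = weather_sensitive_categories(weekly_avg_temp_c, weekly_sales)
for category, r in related.items():
    print(f"{category}: r = {r:.2f}")
```

In this made-up data set, fans and swimsuits are flagged while phone cases are not. A correlation of this kind only shows that two series move together; a real early-warning system would also need a weather forecast and a model of how much sales change.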
Taking Baidu's scenic spot forecast as an example, big data engineers need to collect all the key factors that may affect tourist flow at scenic spots over a period of time, and use them to forecast and grade the future congestion of scenic spots across the country: will a spot be smooth, crowded or only moderately crowded in the next few days?
3. Find out the result of optimization.
According to the business nature of different enterprises, big data engineers can achieve different purposes through data analysis.
In the past, decision makers could only judge by experience. Now big data engineers can help the marketing department make the final choice through large-scale real-time tests. Taking a social networking product as an example, half of the users are shown interface A and the other half interface B, and the click-through rate and conversion rate of each interface are observed and counted over a period of time.
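One common way to decide whether the difference between interface A and interface B is real or just noise is a two-proportion z-test. The sketch below is a minimal illustration under that assumption; the article does not say which statistical test these teams actually use, and the function name and the visitor and conversion counts are hypothetical.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    conv_a, conv_b: number of converting users in each group
    n_a, n_b:       number of users shown each interface
    Returns (z, p_value); a positive z means B converted better than A.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each group needs at least one user")
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical counts: 5,000 users per interface.
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The difference is unlikely to be due to chance alone.")
```

The 0.05 cut-off is a convention rather than a rule, and the sample size should be fixed before the test starts; stopping as soon as the numbers look good inflates the false-positive rate.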
As an e-commerce company, Alibaba hopes to help sellers market more effectively by using big data to target the right audience precisely. One Taobao example: a ginseng seller originally aimed its promotion at women who had just given birth, but by mining the correlations in the data, engineers found that the marketing conversion rate was higher for pregnant women.
Required abilities
1. Mathematics and statistics related background
At the three major BAT Internet companies we interviewed, the requirements for big data engineers all call for a master's or doctoral degree with a background in statistics and mathematics. Data workers who lack a theoretical background are more likely to enter a "danger zone" in their skills: given a bunch of numbers, different data models and algorithms will always produce some result, but if you do not know what that result represents, it is not really meaningful and can easily mislead you. "Only with certain theoretical knowledge can we understand models, reuse models and even innovate models to solve practical problems," Shen Zhiyong said.
2. Computer coding ability
Practical development ability and large-scale data processing ability are essential for a big data engineer.
For example, many of the records people generate on social networks are unstructured data. Extracting meaningful information from this seemingly unordered text, audio, images and even video requires big data engineers to dig it out themselves. Even in teams where big data engineers are mainly responsible for business analysis, they should still be familiar with how computers process big data.
3. Knowledge of specific application fields or industries.
The role of big data engineers matters because big data can only generate value when it is combined with applications in specific fields. Experience in one or more vertical industries gives candidates industry knowledge, which is very helpful for becoming a big data engineer later, so it is also a convincing plus when applying for this position.
Career development
1. How to become a big data engineer
Because big data talent is currently in short supply, it is difficult for companies to recruit suitable people: candidates who are both highly educated and, preferably, experienced in large-scale data processing. So many companies look for talent internally.
In August 2014, Alibaba held a big data competition: it took data from the Tmall platform, removed sensitive information, put it on a cloud computing platform and handed it to more than 7,000 teams to compete on. The competition was divided into an internal and an external contest. "This not only motivates internal employees, but also uncovers external talent, allowing big data engineers from various industries to stand out."
At present, people who have long worked in database management, data mining and programming, including traditional quantitative analysts and Hadoop engineers, can try this position. So can managers who need to make judgments and decisions through data in their work, such as operation managers in some fields. Experts in any field can also become big data engineers as long as they learn to use data.
2. Salary and benefits
As the "giant panda" of the IT industry, big data engineers can be said to earn salaries at the top of the industry. In domestic recruitment for IT, communications and related industries, 10% of positions are related to big data, and the proportion is still rising. Nicole Yan said, "The era of big data has arrived abruptly. Its development momentum in China is aggressive, but talent is very limited, and supply now falls far short of demand." In the United States, the average annual salary of big data engineers is as high as $175,000. It is understood that at the top Internet companies in China, the salary of a big data engineer may be 20% to 30% higher than that of other positions at the same level, and the role is highly valued by enterprises.
3. Career development path
Because big data talent is scarce, the data departments of most companies have a flat structure, roughly divided into three levels: data analysts, senior researchers and department directors. Large companies may split the department into different teams by application field, while in small companies one person may need to hold several roles. Some Internet companies that place special emphasis on big data strategy set up additional senior positions, such as the chief data officer at Alibaba. In addition, big data engineers understand business and products no less than employees in the business departments do, so they can also transfer to product or marketing departments, or even rise to the top management of the company.