An Introduction to Three Career Directions and Ten Jobs in the Big Data Industry

Big data has gradually moved from concept to practice. As IT professionals retrain to follow this wave, major enterprises' demand for high-end big data talent is becoming more and more urgent. This trend creates a rare career opportunity for people who want to work in big data.

Sishu Cloud Computing and Big Data Service Center, referred to as Sishu Shu Yun (affiliated with Beijing Sishu Technology Co., Ltd.), is a professional big data analysis training and consulting institution in China. The China Cloud Computing Big Data Processing Committee, together with big data technicians from the Institute of Software of the Chinese Academy of Sciences, Tsinghua University, Google, Yahoo, Tencent, Ali, the Mobile Research Institute and other organizations, established the "Newby-Sishu Cloud Service" big data service center in 2012.

From long-term practice, Shu Yun has identified three main employment directions in big data: big data system R&D, big data application development, and big data analysis. The basic positions in these three directions are generally big data system R&D engineer, big data application development engineer, and data analyst.

From an enterprise perspective, big data talent can be roughly divided into three areas: product and market analysis, security and risk analysis, and business intelligence. Product analysis refers to testing the effectiveness of new products through algorithms, which is a relatively new field. In security and risk analysis, data scientists know what data needs to be collected and how to analyze it quickly, so that the analysis can be used to curb network intrusions or catch cyber criminals.

I. ETL research and development

As the variety of data grows, enterprises need more and more data integration professionals. ETL developers work with different data sources and organizations: they extract data from those sources, transform it, and load it into data warehouses to meet the needs of the enterprise.

ETL development is mainly responsible for extracting data from scattered, heterogeneous data sources, such as relational databases and flat files, into a temporary staging layer for cleaning, transformation and integration, and finally loading it into a data warehouse or data mart, where it becomes the basis for online analytical processing and data mining.
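The extract, stage, transform and load steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the `orders` table, the `fact_sales` table, the sample CSV text and all function names are assumptions made up for the example, and in-memory SQLite databases stand in for the real source system and warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source: a CSV export with inconsistent formatting.
FLAT_FILE = """customer,amount
 alice ,10.50
BOB,3
,7.25
alice,4.50
"""


def extract_flat_file(text):
    """Extract: read rows from a flat (CSV) source."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_relational(conn):
    """Extract: read rows from a relational source (hypothetical orders table)."""
    rows = conn.execute("SELECT customer, amount FROM orders").fetchall()
    return [{"customer": c, "amount": a} for c, a in rows]


def transform(rows):
    """Transform: clean rows in a staging list (trim, lowercase, drop blanks)."""
    staged = []
    for row in rows:
        name = (row["customer"] or "").strip().lower()
        if not name:
            continue  # drop rows that have no customer name
        staged.append((name, float(row["amount"])))
    return staged


def load(warehouse, staged):
    """Load: append the cleaned rows to a warehouse fact table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales (customer TEXT, amount REAL)"
    )
    warehouse.executemany("INSERT INTO fact_sales VALUES (?, ?)", staged)
    warehouse.commit()


def run_etl():
    """Run the whole pipeline against in-memory stand-ins and query the result."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    source.executemany(
        "INSERT INTO orders VALUES (?, ?)", [("Alice", 5.0), ("carol", 2.0)]
    )
    warehouse = sqlite3.connect(":memory:")
    rows = extract_flat_file(FLAT_FILE) + extract_relational(source)
    load(warehouse, transform(rows))
    return warehouse.execute(
        "SELECT customer, SUM(amount) FROM fact_sales "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
```

Calling `run_etl()` merges the two sources into one table, so the differently formatted "alice" rows end up under a single customer. Real ETL tools add scheduling, error handling and incremental loads on top of this basic shape.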

At present, the ETL industry is relatively mature, and related positions have a relatively long working life cycle. The work is usually done by a mix of internal employees and outsourced contractors. One reason ETL talent is in demand in the era of big data is that, in the early days of enterprise big data adoption, Hadoop was regarded as little more than a poor man's ETL.

II. Hadoop development

The core of Hadoop is HDFS and MapReduce. HDFS provides storage for massive data sets, and MapReduce provides computation over them. As data sets keep growing and traditional BI data processing remains expensive, enterprise demand for Hadoop and related low-cost data processing technologies such as Hive, HBase, MapReduce and Pig will continue to grow. Today, technicians with Hadoop framework experience are the most sought-after big data talent.
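To make the MapReduce computation model concrete, here is a small single-process Python sketch of a word count. It only imitates the map, shuffle and reduce phases in memory; it does not use Hadoop or HDFS, and the function names are illustrative assumptions rather than any Hadoop API.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

For example, `word_count(["big data", "Big jobs"])` counts "big" twice. In a real cluster the map and reduce calls would run in parallel on many machines, with the framework handling the shuffle between them.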

III. Visualization (front-end presentation) tool development

Analyzing massive data is a great challenge, and newer data visualization tools such as Spotfire, QlikView and Tableau can display data intuitively and efficiently.

Visual development means building application software by manipulating interface elements on the graphical user interface of a visual development tool, which then generates the application automatically. Such tools also make it easy to connect data across multiple sources and levels. Mature, fully extensible and full-featured visual component libraries give developers a complete and easy-to-use set of components for building rich user interfaces.

Data visualization used to be part of the business intelligence developer's job, but with the rise of Hadoop it has become an independent professional skill and position.

IV. Information architecture development

Big data reignited the upsurge of master data management. Making full use of enterprise data to support decision-making requires very professional skills. Information architects must know how to define and document key elements to ensure that data is managed and utilized in the most effective way. Key skills of an information architect include master data management, business knowledge and data modeling.

V. Data warehouse research

A data warehouse is a strategic collection of all types of data that supports the decision-making process at every level of an enterprise. It is a separate data store built for analytical reporting and decision support, and it provides business intelligence that helps enterprises guide business process improvement and monitor time, cost, quality and control.

Data warehouse experts are familiar with big data appliances (all-in-one machines) such as Teradata, Netezza and Exadata. Data integration, management and performance optimization can all be carried out on these appliances.

VI. OLAP development

With the development and application of database technology, the amount of data stored in databases has grown from megabytes (MB) and gigabytes (GB) in the 1980s to terabytes (TB) and petabytes (PB) now. At the same time, users' query needs are becoming more and more complex: they no longer involve only querying or updating one or a few records in a relational table, but also analyzing and aggregating tens of millions of records across multiple tables. Online analytical processing (OLAP) systems are designed to solve this kind of massive data processing problem.

OLAP developers are responsible for extracting data from relational or non-relational data sources to build a model, and then creating a user interface for data access that provides high-performance predefined queries.
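A predefined OLAP query typically aggregates one measure along one or more dimensions. The following Python sketch shows that roll-up idea on a tiny in-memory fact table. The `FACTS` rows, the dimension names and the `roll_up` function are all hypothetical, and a real OLAP engine would precompute and index such aggregates rather than scan rows on each call.

```python
from collections import defaultdict

# Hypothetical fact table: two dimensions (region, year) and one measure (sales).
FACTS = [
    {"region": "north", "year": 2011, "sales": 100},
    {"region": "north", "year": 2012, "sales": 150},
    {"region": "south", "year": 2011, "sales": 30},
    {"region": "south", "year": 2012, "sales": 40},
]


def roll_up(facts, dimensions, measure):
    """Sum one measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)
```

For instance, `roll_up(FACTS, ["region"], "sales")` totals sales per region, while passing `["region", "year"]` drills down to one total per region and year.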

VII. Data science research

This position used to be called data architecture research. The data scientist is a brand-new type of job that turns enterprise data and technology into business value. As data science develops, more and more practical work will center on data, which will enable people to understand data and, through it, nature and behavior. Data scientists therefore need excellent communication skills above all, and must be able to explain the results of data analysis to the leaders of both IT and business departments.

Generally speaking, data scientists are a combination of analysts and artists, and need to have a variety of interdisciplinary scientific and business skills.

VIII. Data prediction (data mining) analysis

Marketing departments often use predictive analysis to predict user behavior or to target users. In some scenarios the work of predictive analysis developers resembles that of data scientists: testing thresholds and predicting future performance through hypotheses based on the enterprise's historical data.

IX. Enterprise data management

To improve data quality, enterprises must take data management seriously and establish the position of data steward. People in this position need to be able to use various technical tools to collect large amounts of data from around the enterprise, clean and standardize it, and import it into the data warehouse as a usable version. Then, through reporting and analysis techniques, the data is sliced, diced and delivered to thousands of people. Data stewards need to ensure the integrity, accuracy, uniqueness, authenticity and non-redundancy of market data.
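Two of the qualities listed above, completeness and uniqueness, are easy to check mechanically. The Python sketch below counts records with missing required fields and records that repeat a key already seen. The `quality_report` function and the field names in the usage note are assumptions for illustration only; accuracy and authenticity usually need domain-specific rules that this sketch does not cover.

```python
def quality_report(records, key, required):
    """Count incomplete records and duplicate keys in a list of dict records."""
    seen = set()
    duplicates = 0
    incomplete = 0
    for record in records:
        # A record is incomplete if any required field is missing or empty.
        if any(record.get(field) in (None, "") for field in required):
            incomplete += 1
        # A record is a duplicate if its key value has appeared before.
        value = record.get(key)
        if value in seen:
            duplicates += 1
        else:
            seen.add(value)
    return {
        "total": len(records),
        "incomplete": incomplete,
        "duplicates": duplicates,
    }
```

As a usage example, a steward might call `quality_report(customers, key="id", required=["id", "email"])` on a hypothetical `customers` list before loading it into the warehouse, and block the load if either count is non-zero.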

X. Data security research

Data security positions are mainly responsible for the management of large-scale servers, storage and data security in enterprises, as well as the planning, design and implementation of network and information security projects. Data security researchers also need to have strong management experience, operation and maintenance management knowledge and ability, and have a deep understanding of traditional business of enterprises, so as to ensure the safety and integrity of enterprise data.
