A data analyst should know at least one data analysis package, such as SPSS, Statistica, EViews, or SAS; master at least one mathematical tool, such as MATLAB or Mathematica, for building new models; and know at least one programming language. In short, an excellent data analyst keeps up with business, management, analysis, tools, and design.
2. Data architect.
The data architect is responsible for the overall data architecture of the platform: translating the business model into a data model, designing the database schema according to business functions and business models, and defining the business-oriented analysis models that drive data extraction, data mining, and data analysis on the platform, along with their application development.
To be a data architect, you need strong business understanding and business abstraction skills; the ability to design database models for high-volume Internet platforms handling both content and transactions; deep knowledge of scheduling systems and metadata systems; familiarity with common analysis, statistics, and modeling methods; familiarity with data warehouse technologies such as ETL and report development; and hands-on experience with Hadoop, Hive, and similar systems.
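The ETL work mentioned above can be sketched in miniature. This is an illustrative toy only (the field names and data are hypothetical): extract raw text records, transform them into typed rows, and load them into a target standing in for a warehouse table.

```python
# Toy ETL sketch: extract -> transform -> load.
# Field names ("day", "product", "qty") and data are made up for the demo.
raw_csv = ["2024-01-01,widget,3", "2024-01-01,gadget,5"]  # extract: raw source lines

def transform(lines):
    # Parse each line and cast the quantity to an integer.
    records = []
    for line in lines:
        day, product, qty = line.split(",")
        records.append({"day": day, "product": product, "qty": int(qty)})
    return records

warehouse_table = []                       # load target (stand-in for a real table)
warehouse_table.extend(transform(raw_csv))

total_qty = sum(r["qty"] for r in warehouse_table)
print(total_qty)  # → 8
```

In a real pipeline each stage would be a separate, schedulable job (which is why the architect also needs scheduling and metadata systems), but the three-stage shape is the same.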
3. Data Mining Engineer
Generally speaking, this refers to engineers who use algorithms to uncover knowledge hidden in large volumes of data. That knowledge can make enterprise decision-making more intelligent and automated, improve efficiency, and reduce the chance of bad decisions, helping the business stay ahead in fierce competition.
To be a data mining engineer, you need a deep theoretical foundation in statistics, mathematics, and data mining, plus relevant project experience; familiarity with at least one statistical analysis package such as R, SAS, or SPSS; participation in complete cycles of data collection, cleaning, analysis, and modeling; experience with machine learning and algorithm implementation on massive datasets; and familiarity with Hadoop, Hive, MapReduce, and related tools.
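To make the "modeling" part concrete, here is a minimal k-means clustering sketch in pure Python, one of the classic data mining algorithms. It is illustrative only (real projects would use scikit-learn, Spark MLlib, or similar), and it uses a simple deterministic initialization rather than a production-grade one.

```python
# Minimal k-means sketch (illustrative, not production code).
def kmeans(points, k, iters=20):
    centers = list(points[:k])  # naive deterministic init for the demo
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for px, py in points:
            i = min(range(k),
                    key=lambda j: (px - centers[j][0]) ** 2 + (py - centers[j][1]) ** 2)
            clusters[i].append((px, py))
        # Update step: move each center to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centers[i] = (sum(x for x, _ in cluster) / len(cluster),
                              sum(y for _, y in cluster) / len(cluster))
    return centers

data = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers = sorted(kmeans(data, 2))  # converges near (0.1, 0.1) and (5.03, 5.0)
```

The same assignment/update pattern scales to massive data when reimplemented on a distributed framework, which is why the role pairs algorithm theory with Hadoop and MapReduce experience.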
4. Data algorithm engineer
A data algorithm engineer is responsible for designing the data mining algorithms and models behind the enterprise's big data products and integrating those models with business scenarios; researching data mining models in depth and participating in their construction, maintenance, deployment, and evaluation; supporting the product R&D team with algorithm construction and model integration; and formulating and enforcing specifications for data modeling, data processing, and data security architecture.
The required knowledge: a solid foundation in data mining and proficiency with common machine learning and mathematical statistics algorithms; familiarity with the big data ecosystem, including the principles of common distributed computing frameworks such as Hadoop, MapReduce, YARN, Storm, and Spark; familiarity with the Linux operating system and shell programming, plus at least one of Scala, Java, Python, C++, or R; and an understanding of the basic principles of large-scale parallel computing, with the ability to implement parallel algorithms.
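As a small example of the "mathematical statistics algorithms" in that list, here is ordinary least squares for simple linear regression, written from the textbook formula (slope = covariance(x, y) / variance(x)); it is a sketch, not a substitute for a statistics library.

```python
# Simple linear regression by ordinary least squares (textbook formula).
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # data generated by y = 2x + 1
print(slope, intercept)  # → 2.0 1.0
```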
5. Data product manager.
A data product manager builds and maintains the data platform, analyzes client data, assists with data statistics, operates and organizes data, refines existing data reports, spots changes in the data, and conducts in-depth thematic analysis, forming conclusions and writing reports. The role is also responsible for the design, development, and implementation of the company's data products, ensuring that business objectives are met.
The required skills: practical experience in data analysis, data mining, or user behavior research; a solid theoretical foundation in analysis, proficiency with at least one statistical analysis tool such as SPSS or SAS, and skilled use of Excel and SQL; familiarity with SQL/HQL, with SQL Server or MySQL work experience preferred; skilled use of office software such as Excel and PowerPoint; and familiarity with the design and development process of customer-facing products. Familiarity with Hadoop cluster architecture, practical BI experience, and participation in stream computing projects are pluses.
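The SQL side of this skill set is the bread and butter of data reporting. Here is the kind of aggregation query involved, run against an in-memory SQLite database via Python's standard library; the table and column names are hypothetical.

```python
# A typical reporting query: revenue per user, highest first.
# Table/column names ("orders", "user_id", "amount") are invented for the demo.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 10.0), (1, 15.0), (2, 7.5)])

rows = conn.execute(
    "SELECT user_id, SUM(amount) AS revenue "
    "FROM orders GROUP BY user_id ORDER BY revenue DESC"
).fetchall()
print(rows)  # → [(1, 25.0), (2, 7.5)]
conn.close()
```

The same GROUP BY pattern carries over to HQL on a Hive warehouse, just at a much larger scale.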
6. Hadoop operation and maintenance engineer
Technical knowledge required: deployment, maintenance, and technical support of the platform's big data environment; handling and tracking application failures; statistical summary and analysis; application security; and daily backup and emergency recovery of data.
7. Hadoop development engineer
Hadoop is a software framework for processing large amounts of data in a distributed way, handling it reliably, efficiently, and scalably; its storage layer is the Hadoop Distributed File System (HDFS). Because Hadoop solves the problem of how to store and process big data, it is a required course at big data training institutions.
Technologies required of a Hadoop development engineer: building data analysis platforms on Hadoop and Hive; designing the data platform architecture and developing distributed computing services; applying big data, data mining, and analytical modeling techniques to mine massive datasets and discover the association rules hidden in them; early-stage development of products based on Hadoop, Hive, HBase, and MapReduce; solving massive data processing and analysis problems with Hadoop-related technologies; and optimizing Hadoop business scripts to continuously improve system efficiency.
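The MapReduce model that underlies this work can be sketched in a few lines of ordinary Python. This is the classic word-count pattern, with the shuffle and reduce collapsed into one in-memory step; Hadoop runs the same logic partitioned across a cluster.

```python
# Pure-Python sketch of the MapReduce word-count pattern.
from collections import defaultdict

def map_phase(lines):
    # Mapper: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield word, 1

def reduce_phase(pairs):
    # Shuffle + reduce: group pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["big data hadoop", "big data hive"]  # stand-in for HDFS input splits
counts = reduce_phase(map_phase(lines))
print(counts)  # → {'big': 2, 'data': 2, 'hadoop': 1, 'hive': 1}
```

In real Hadoop the mapper and reducer run as separate tasks on different nodes, and the framework handles partitioning, shuffling, and fault tolerance; the programmer's job is essentially to write these two functions.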
8. Big Data Visualization Engineer
As big data enters people's work and daily lives, big data visualization has also changed the way people read and understand information. From Baidu Migration to Google Flu Trends to the county-level economic visualization products launched on Alibaba Cloud, big data technology and big data visualization are at work behind the scenes.