What kind of software is a database?

I found the following information on Baidu; the original poster can read it when time allows.

I work in this field, so let me first share my own impressions, for reference only.

A database exists to make data easier to store and manipulate. If you do not need one, you can do without it. Sometimes you can simply record the data in a file (such as a .txt file), but that becomes troublesome to work with once the data grows large.

There are many kinds of databases today, large and small, and being able to use one makes things much more convenient.

Definition 1

People define this concept differently (and descriptively) depending on the angle they take. For example, some call a database a "record keeping system", a definition that emphasizes that a database is a collection of records. Others call it "a collection of related data that people store together in a certain organized way in order to accomplish specific tasks", a definition that focuses on how the data is organized. Some go further and call a database a "data warehouse". That description is vivid, but it is not rigorous.

Strictly speaking, a database is a "warehouse that organizes, stores and manages data according to a data structure". In everyday economic management work, it is often necessary to put relevant data into such a "warehouse" and process it according to management needs. For example, the personnel department of an enterprise or institution often stores the basic information of its employees (job number, name, age, gender, place of origin, salary, resume, and so on) in a table, and that table can be regarded as a database. With this "data warehouse", we can look up the basic details of any employee whenever needed, or find out how many employees earn wages within a certain range, and so on. If all of these tasks are carried out automatically on a computer, personnel management can reach a very high level. Likewise, financial management, warehouse management and production management all require many such "databases" in order to automate the management of finance, warehouses and production by computer.
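The personnel example above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name, column names and sample rows are assumptions made up for the sketch, not part of the original text.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " job_number INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER,"
    " salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1001, "Zhang", 34, 5200.0),
        (1002, "Li", 28, 4100.0),
        (1003, "Wang", 45, 7800.0),
    ],
)

# Look up the basic details of one employee by job number.
one_employee = conn.execute(
    "SELECT name, age, salary FROM employee WHERE job_number = ?", (1002,)
).fetchone()

# Count the employees whose salary falls within a given range.
(count_in_range,) = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE salary BETWEEN ? AND ?", (4000, 6000)
).fetchone()
```

Both questions from the text (one employee's details, and the number of employees in a wage range) become single queries, which is the convenience a plain text file does not offer.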

J. Martin gave a relatively complete definition: a database is a collection of related data stored together, which is structured, free of harmful or unnecessary redundancy, and serves a variety of applications; the data is stored independently of the programs that use it; and inserting new data, and modifying and retrieving existing data, can all be done in a common and controlled way. When a system contains several structurally separate databases, the system contains a "collection of databases".

Definition 2

A database is a collection of data organized according to a specific data model and stored in secondary storage. Such a collection has the following characteristics: it contains as little duplication as possible and serves the various applications of a particular organization in an optimal way; its data structure is independent of the application programs that use it; and the addition, deletion, modification and retrieval of data are managed and controlled by a single piece of software. Historically, the database is an advanced stage of data management that developed from the file management system.

The basic structure of the database

The basic structure of the database is divided into three levels, reflecting three different perspectives of observing the database.

(1) Physical data layer.

It is the innermost layer of the database: the collection of data actually stored on the physical storage devices. This is the raw data that users ultimately process, and it consists of the bit strings, characters and words handled by the instructions described in the internal schema.

(2) Conceptual data layer.

It is the middle layer of the database and represents the database's overall logical structure. It specifies the logical definition of each data item and the logical relationships among the data, and is a collection of stored records. It concerns the logical relationships among all objects in the database rather than their physical arrangement, and is the database as the database administrator conceives of it.

(3) Logical data layer.

It is the database as users see and use it: the data set used by one or more specific users, that is, a collection of logical records.

The different levels of the database are converted into one another through mappings.
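As a rough illustration of the three levels, the sketch below uses Python's sqlite3 module: the storage behind the connection stands in for the physical layer, a table definition for the conceptual layer, and a view for the layer a particular group of users sees. The table, view and column names are hypothetical, and treating a view as the user-facing layer is a simplification assumed for this example.

```python
import sqlite3

# Physical layer: where the bytes live (here, an in-memory store).
conn = sqlite3.connect(":memory:")

# Conceptual layer: the overall logical definition of the data.
conn.execute(
    "CREATE TABLE employee ("
    " job_number INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.execute("INSERT INTO employee VALUES (1001, 'Zhang', 5200.0)")

# User-facing layer: a view that exposes only what one group of users needs.
# The view definition acts as the mapping between the two logical levels.
conn.execute(
    "CREATE VIEW staff_directory AS SELECT job_number, name FROM employee"
)

cursor = conn.execute("SELECT * FROM staff_directory")
directory_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()
```

Users of `staff_directory` never see the salary column, and the underlying table could be reorganized without changing what they query, as long as the view is kept in step.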

The main functions of the database

(1) Data sharing.

Data sharing means that all users can access the data in the database at the same time, and also that users can work with the database in various ways through interfaces.

(2) Reduce data redundancy.

Compared with a file system, a database shares data, so users no longer need to create their own application files. This eliminates a large amount of duplicated data, reduces redundancy and keeps the data consistent.

(3) Data independence.

Data independence means that the logical structure of the database and the application programs are independent of each other, and also that changes to the physical structure of the data do not affect its logical structure.

(4) Centralized control of data.

Under file management, data is scattered: the files of different users, or of the same user in different processing tasks, have no connection with one another. A database allows data to be controlled and managed centrally, with a data model expressing how the various data are organized and how they relate to one another.

(5) Consistency and maintainability of data to ensure the safety and reliability of data.

This mainly includes: ① security control, which prevents data loss, erroneous updates and unauthorized use; ② integrity control, which ensures the correctness, validity and compatibility of the data; ③ concurrency control, which allows multiple accesses to the data within the same period while preventing abnormal interactions between users; ④ fault detection and recovery, in which the database management system provides a set of methods for finding and repairing faults in time, thereby preventing data from being destroyed.
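Integrity control and recovery from a failed update can be illustrated with a small sketch using Python's sqlite3 module. The `account` table, the `transfer` helper and the sample balances are assumptions invented for this example; a CHECK constraint stands in for an integrity rule, and a transaction rollback stands in for recovery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity rule (hypothetical): a balance may never go negative.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100.0), (2, 50.0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return True if the transfer was applied."""
    try:
        # The connection as a context manager commits on success and
        # rolls the whole transaction back if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the update; nothing was changed.
        return False


ok = transfer(conn, 1, 2, 30.0)         # allowed
rejected = transfer(conn, 1, 2, 500.0)  # would overdraw account 1
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

In the rejected transfer the credit to account 2 has already been executed when the debit fails, yet the rollback undoes it, so the data never ends up in a half-updated state.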

Database development stage

The database development stage can be roughly divided into the following stages:

Manual management stage;

File system stage;

Database system stage;

Advanced database stage.

Database structure and database type

Databases are usually divided into three types: hierarchical database, network database and relational database. Different databases are connected and organized according to different data structures.

1. Data structure model

(1) Data structure

A data structure is the form in which data is organized, or the relationships among the data. If D denotes the data and R denotes a set of relationships among the data objects, then DS = (D, R) is called a data structure. For example, suppose a telephone directory records the names and telephone numbers of N people. To make a number easy to find, the entries are arranged in dictionary order by name, with each telephone number following the corresponding name. To find a person's number (say the name begins with Y), you only need to look among the names that start with Y. In this example, the data set D is the names and telephone numbers, the relationship R is the dictionary ordering, and the corresponding data structure DS = (D, R) is an array.

(2) Types of data structure

Data structures are divided into logical structures and physical structures. The logical structure looks at data from a logical point of view (that is, how the data is connected and organized), regardless of where it is stored. The physical structure is the structure of the data as stored in the computer, that is, the realization of the logical structure inside the machine, so it is also called the storage structure. Only the logical structure is considered here, and a method of representing and realizing the connections among data is called a data model.

At present, there are three popular data models, namely hierarchical structure model and network structure model based on graph theory and relational structure model based on relational theory.
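The telephone directory example from the discussion of data structures above can be sketched in Python. The names and numbers are invented for illustration, and the helper functions `find_phone` and `names_starting_with` are hypothetical; the sketch relies on the standard bisect module, which performs binary search on a sorted list.

```python
import bisect

# D: (name, phone) pairs; R: dictionary order on the name.
directory = sorted([
    ("Yang", "555-0101"),
    ("Chen", "555-0102"),
    ("Yu", "555-0103"),
    ("Zhao", "555-0104"),
])
names = [name for name, _ in directory]


def find_phone(name):
    """Binary-search the sorted names; return the phone number or None."""
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return directory[i][1]
    return None


def names_starting_with(letter):
    """Return the contiguous run of names that begin with `letter`."""
    lo = bisect.bisect_left(names, letter)
    hi = bisect.bisect_left(names, chr(ord(letter) + 1))
    return names[lo:hi]
```

Because the array is kept in dictionary order, all names beginning with Y sit next to each other, so a lookup only has to examine that small slice rather than the whole directory.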

2. Hierarchical, network and relational database systems

(1) Hierarchical structure model

The hierarchical structure model is essentially a directed, ordered tree with a root node (mathematically, a "tree" is defined as a connected graph without cycles). For example, Figure 20.6.4 is the organization chart of an institution of higher learning. The chart looks like a tree: the university administration is the root (called the root node); departments, majors, teachers and students are the branches (called nodes); the connections between the root and the branches are called edges; and the ratio of root to branches is 1:N, that is, one root and N branches.

A database system built on the hierarchical model is called a hierarchical database system. IMS (Information Management System) is its typical representative.
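A minimal sketch of the hierarchical model, assuming a small made-up university hierarchy: each node has exactly one parent and any number of children, and a record is reached by following a path down from the root. The `Node` class and `path_to` helper are illustrative only and do not represent how IMS itself is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child node (one parent, N children) and return it."""
        self.children.append(child)
        return child


def path_to(node, target, trail=()):
    """Depth-first search from the root; return the access path or None."""
    trail = trail + (node.name,)
    if node.name == target:
        return list(trail)
    for child in node.children:
        found = path_to(child, target, trail)
        if found:
            return found
    return None


# Hypothetical organization chart.
university = Node("University")
cs = university.add(Node("Computer Science Department"))
maths = university.add(Node("Mathematics Department"))
se = cs.add(Node("Software Engineering major"))
se.add(Node("Student A"))

path = path_to(university, "Student A")
```

The returned path makes the 1:N parent-child structure visible: every record is located by walking from the root through its single chain of ancestors.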

(2) Network structure model

A database system built on the network data structure is called a network database system, and its typical representative is DBTG (Data Base Task Group). A network data structure can be transformed into a hierarchical data structure by mathematical methods.

(3) Relational structure model

The relational data structure reduces complex data structures to simple relations in the form of two-dimensional tables. For example, the employee records of an organization form such a relation.

A database system composed of relational data structures is called a relational database system.

In a relational database, almost all operations on data are carried out on one or more relational tables, and data is managed by classifying, merging, joining or selecting from these tables. dBASE II is a typical representative of this kind of database management system. A practical application (such as personnel management) sometimes needs several relations. In dBASE II, each relation that is created is called a database (or database file), and the set of databases corresponding to multiple relations is called a database system. Another important feature of dBASE II is that the database can be used and managed through command files, and the set of command files that accompanies a database system is called a database application system. Put simply, in dBASE II terminology one relation is one database, several databases form a database system, and a database system can give rise to various auxiliary files and an application system built on top of it.
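The table operations mentioned above (selecting, joining, classifying) can be sketched with Python's sqlite3 module rather than dBASE II, which is an assumption made so the example is runnable today. The two tables, their columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name TEXT,
    dept_id INTEGER REFERENCES department(dept_id)
);
INSERT INTO department VALUES (1, 'Finance'), (2, 'Warehouse');
INSERT INTO employee VALUES (10, 'Zhang', 1), (11, 'Li', 2), (12, 'Wang', 1);
""")

# Selecting: keep only the rows that satisfy a condition.
finance_ids = [
    row[0]
    for row in conn.execute(
        "SELECT emp_id FROM employee WHERE dept_id = 1 ORDER BY emp_id"
    )
]

# Joining: combine two relations on a shared column.
joined = conn.execute(
    "SELECT e.name, d.dept_name FROM employee e"
    " JOIN department d ON e.dept_id = d.dept_id ORDER BY e.emp_id"
).fetchall()

# Classifying: group rows and summarize each group.
headcount = dict(
    conn.execute(
        "SELECT d.dept_name, COUNT(*) FROM employee e"
        " JOIN department d ON e.dept_id = d.dept_id GROUP BY d.dept_name"
    )
)
```

Every result is itself a table, which is why operations on relations can be chained freely.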

Common databases

1. IBM DB2

As a pioneer in the field of relational databases, IBM completed the System R prototype in 1977 and began shipping an integrated database server, System/38, in 1980, followed by SQL/DS for VSE and VM, whose initial versions were closely related to the System R research prototype. DB2 for MVS V1 was launched in 1983, with the goal of delivering the simplicity, data independence and user productivity that the new approach promised. DB2 for MVS gained powerful online transaction processing (OLTP) support in 1988, and distributed database support was implemented in 1989 and 1993. The recently launched DB2 Universal Database 6.1 is a model of the universal database: the first Web-enabled multimedia relational database management system, supporting a range of platforms including Linux.

2. Oracle

Oracle, formerly known as SDL, was founded in 1977 by Larry Ellison and two other programmers. They developed their own flagship product and sold it widely. In 1979, Oracle launched the first commercial SQL relational database management system. Oracle is one of the earliest vendors of relational databases, and its products support the widest range of operating system platforms. Today Oracle's relational database products hold one of the largest market shares.

3. Informix

Informix was founded in 1980 to provide professional relational database products for Unix and other open operating systems. The company name is a combination of "Information" and "Unix". Informix's first relational database product with real SQL support was Informix SE (Standard Engine). Informix SE was the leading database product for microcomputer Unix environments at the time, and it was also the first commercial database product ported to Linux.

4. Sybase

Sybase was founded in 1984. The company name is a combination of "system" and "database". Bob Epstein, one of the founders, was the chief designer of the university version of Ingres (a relational database prototype from the same period as System R). The company's first relational database product was Sybase SQL Server 1.0, launched in May 1987. Sybase was the first to propose the client/server database architecture and the first to implement it, in Sybase SQL Server.

5. SQL Server

In 1987, Microsoft was working with IBM on OS/2, and IBM bundled OS/2 Database Manager into its OS/2 Extended Edition, while Microsoft's own product line still lacked a database. Microsoft therefore turned to Sybase and signed an agreement to use Sybase's technology to develop a relational database for the OS/2 platform. In 1989, Microsoft released SQL Server 1.0.

6. PostgreSQL

PostgreSQL is a full-featured, free-software object-relational database management system (ORDBMS), and many of its features anticipated those of today's commercial databases. PostgreSQL grew out of the Ingres project at Berkeley. Its features cover SQL-2/SQL-92 and SQL-3. First, it supports an exceptionally rich set of data types. Second, at the time of writing it was described as the only free-software database management system to support transactions, subqueries, multi-version concurrency control and data integrity checking.

7. MySQL

MySQL is a small relational database management system developed by the Swedish company MySQL AB, which was acquired by Sun Microsystems on January 16, 2008. MySQL is widely used by small and medium-sized websites on the Internet: it is small, fast, open source and has a low total cost of ownership, so many such sites choose it as their database. The official MySQL website is www.mysql.com.

Database development history

In less than half a century since its birth, database technology has acquired a solid theoretical foundation, mature commercial products and a wide range of applications, and it attracts more and more researchers. The birth and development of the database brought a great revolution in computerized information management. Over the past 30 years, thousands of databases have been developed and built at home and abroad, and they have become basic infrastructure for the daily work, production and life of enterprises, departments and even individuals. As applications have expanded and deepened, the number and scale of databases have grown, and the field of database research has broadened and deepened considerably. Three Turing Awards (C. W. Bachman, E. F. Codd and J. Gray) have gone to the database field in the past 30 years, which shows how full of vitality and innovation the field is. Let us trace the development of the database along its historical track.

A brief history of database development

1. The birth of data management

The history of the database can be traced back fifty years, to a time when data management was very simple. Large numbers of machines sorted, compared and tabulated millions of punched cards, and the results were printed on paper or punched onto new cards. Data management meant physically storing and handling all of those cards. In 1951, however, Remington Rand's Univac I computer introduced a tape drive that could read hundreds of records per second, which triggered a revolution in data management. In 1956, IBM produced the first disk drive, the 305 RAMAC. The drive had 50 platters, each 2 feet in diameter, and could store 5 MB of data. The biggest advantage of disk was that data could be accessed randomly, whereas punched cards and magnetic tape could only be accessed sequentially.

1951: The Univac system uses magnetic tape and punched cards for data storage.

The first database systems emerged in the 1960s. Computers were beginning to be widely used for data management, which placed ever higher demands on data sharing. Traditional file systems could no longer meet these needs, and database management systems (DBMS), which manage and share data in a unified way, came into being. The data model is the core and foundation of a database system, and every DBMS is built on some data model. Traditional database systems are therefore usually divided into three categories according to their data model: network databases, hierarchical databases and relational databases.

The first to appear was the network DBMS. In 1961, Charles Bachman and colleagues at General Electric in the United States successfully developed IDS (Integrated Data Store), the world's first network DBMS and the first database management system. It laid the foundation for network databases and was widely distributed and used at the time. IDS had features such as a data schema and logging, but it ran only on GE mainframes, the database consisted of a single file, and all tables in the database had to be generated by hand coding. Later, GE's customer BF Goodrich Chemical ended up rewriting the whole system, and the rewritten system was named the Integrated Data Management System (IDMS).

The network database model can naturally represent both hierarchical and non-hierarchical things. Before relational databases appeared, network DBMSs were more widely used than hierarchical DBMSs, and network databases occupy an important place in the history of database development.

Hierarchical DBMSs appeared soon after network databases. The most famous and typical hierarchical database system is IMS (Information Management System), a hierarchical database developed by IBM in 1968 for its mainframes. It was the earliest large-scale database product developed by IBM. Born in the late 1960s, it has now reached IMS V6, which supports advanced features such as clustering, N-way data sharing and shared message queues. This 30-year-old database product has taken on a new role in today's Web application connectivity and business intelligence applications.

In 1973, Cullinane (later Cullinet Software) began selling an improved version of Goodrich's IDMS, and gradually became the largest software company in the world at that time.

2. The origin of relational database

Network and hierarchical databases solved the problems of centralizing and sharing data well, but they still fell short in data independence and abstraction. To access either kind of database, users still had to know the storage structure of the data and specify the access path. The relational database, which appeared later, solved these problems well.

In 1970, Dr. E. F. Codd, a researcher at IBM, published the paper "A Relational Model of Data for Large Shared Data Banks" in Communications of the ACM, which introduced the relational model and laid its theoretical foundation. Although Childs had proposed a set-oriented model in 1968, Codd's paper is generally regarded as an epoch-making milestone in the history of database systems. Codd's aim was to give the database an elegant data model. He went on to publish many more articles, setting out the normal-form theory and the twelve rules for evaluating relational systems, and so put the relational database on a mathematical footing. The relational model has a rigorous mathematical basis and a high level of abstraction, and it is simple, clear and easy to understand and use. At the time, however, some people considered the relational model an idealized data model that could not realistically be implemented in a DBMS; they worried in particular that the performance of a relational database would be unacceptable, and some even saw it as a serious threat to the ongoing standardization of network databases. To clarify the issue, the ACM organized a seminar in 1974 at which supporters and opponents of relational databases, led by Codd and Bachman respectively, held a debate. This famous debate promoted the development of the relational database and ultimately made it the mainstream of modern database products.

1969: Edgar F. "Ted" Codd invented the relational database.

After the relational model was proposed in 1970, IBM assigned more researchers at its San Jose laboratory to a project on it: the famous System R. Its goal was to demonstrate the feasibility of a fully functional relational DBMS. The project ended in 1979, having produced the first DBMS to implement SQL. However, IBM's commitment to IMS kept System R from going into production, and it was not until 1980 that System R was officially brought to market as a product. There were three reasons for IBM's slow pace of productization: IBM valued reputation and quality and tried to minimize failures; IBM was a big company with a large bureaucracy; and IBM already had a hierarchical database product, so the people involved were unenthusiastic or even opposed.

Meanwhile, in 1973, Michael Stonebraker and Eugene Wong of the University of California, Berkeley, began developing their own relational database system, Ingres, drawing on the information published about System R. The Ingres project was eventually commercialized by vendors such as Oracle and Ingres in Silicon Valley. Later, both System R and Ingres received the 1988 Software System Award from the ACM.

In 1976, Honeywell developed the first commercial relational database system, Multics Relational Data Store. Relational database systems are based on relational algebra, and after decades of development and practical use the technology has become increasingly mature and complete. Representative products include Oracle, IBM's DB2, Microsoft's MS SQL Server, Informix and ADABAS D.

3. Structured Query Language (SQL)

In 1974, Ray Boyce and Don Chamberlin of IBM expressed the mathematical definitions of Codd's relational model in a simple keyword syntax and proposed SQL (Structured Query Language), a milestone. The functions of the SQL language include query, manipulation, definition and control. It is a comprehensive, general-purpose relational database language and also a highly non-procedural one: the user only states what is wanted, not how to obtain it. SQL covers all operations in the database life cycle. SQL provides a way to interact with relational databases and can work together with standard programming languages. Since its birth, SQL has been the touchstone for testing relational databases, and every change to the SQL standard has guided the direction of relational database products. However, it was not until the mid-1970s that relational theory was applied, through SQL, in the commercial databases Oracle and DB2.
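The "what, not how" character of SQL can be shown by answering the same question twice, once with an explicit loop and once with a query. This is a sketch using Python's sqlite3 module; the table, the sample rows and both function names are hypothetical.

```python
import sqlite3

rows = [("Zhang", 5200.0), ("Li", 4100.0), ("Wang", 7800.0)]


def high_earners_procedural(rows, threshold):
    """Procedural style: spell out how to scan, filter and sort."""
    result = []
    for name, salary in rows:
        if salary > threshold:
            result.append(name)
    result.sort()
    return result


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary REAL)")
conn.executemany("INSERT INTO employee VALUES (?, ?)", rows)


def high_earners_sql(conn, threshold):
    """Declarative style: state what is wanted; the DBMS decides how."""
    cursor = conn.execute(
        "SELECT name FROM employee WHERE salary > ? ORDER BY name",
        (threshold,),
    )
    return [name for (name,) in cursor]
```

Both functions return the same names, but only the first one commits to a particular scanning and sorting strategy; the query leaves those decisions to the database engine.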

In 1986, ANSI adopted SQL as the American standard language for relational databases and published the standard SQL text in the same year. There are currently three versions of the SQL standard. The basic SQL definition is ANSI X3.135-1989, "Database Language - SQL with Integrity Enhancement" [ANS89], commonly known as SQL-89. SQL-89 covers schema definition, data manipulation and transaction processing.

SQL-89, together with the subsequent ANSI X3.168-1989, "Database Language - Embedded SQL", constitutes the first generation of SQL standards. ANSI X3.135-1992 [ANS92] describes an enhanced SQL, now known as the SQL-92 standard. SQL-92 adds enhanced features such as schema manipulation, dynamic creation and execution of SQL statements, and support for networked environments. After completing SQL-92, ANSI and ISO began working together on the SQL3 standard. The main feature of SQL3 is support for abstract data types, which provides a standard for the new generation of object-relational databases.

In 1976, IBM researchers published a landmark paper, "System R: Relational Approach to Database Management", which introduced relational database theory and the query language SQL. Ellison, the founder of Oracle, read the paper very carefully and was struck by it: this was the first time anyone had proposed a comprehensive, consistent scheme for managing data. E. F. Codd had published the theory of relational databases some years earlier, and IBM Research had built a prototype; that project was System R, and the language for accessing its tables was SQL. After reading the paper, Ellison realized that a software system could be built on this research. At the time, most people thought relational databases had no commercial value. Ellison saw an opportunity: he and his partners decided to develop a general-purpose commercial database system, Oracle, named after a project they had worked on for the CIA. A few months later they had Oracle 1.0, but it was little more than a toy that could do nothing beyond simple relational queries, and it took them a long time to make Oracle usable. In the meantime, the company survived mainly on database management projects and consulting work. IBM, however, had no plans to develop a product. There were several reasons why Big Blue passed up a product worth tens of billions of dollars. Most IBM researchers came from academic backgrounds and were more interested in theory than in products for the market; from an academic point of view, research results should be published, and papers and talks bring recognition, so why not? Another main reason was that IBM already had a hierarchical database product, IMS, which was selling well. IBM did not release its relational database DB2 until 1985, by which time Ellison was already a multimillionaire.
Ellison once described IBM's choice of Microsoft's MS-DOS as the operating system for the IBM PC as "the most serious mistake in the history of world business, worth more than hundreds of billions of dollars." IBM's mistake in publishing the System R papers without quickly launching a relational database product may be second only to that. Oracle's market value reached $28 billion in 1996.

4. Object-oriented database

As information technology and the market developed, people found that although relational database technology was mature, its limitations were also obvious: it handles so-called "tabular data" well, but it is powerless in the face of the increasingly complex data types that technology has produced. Since the 1990s, the technical community has been researching and looking for new kinds of database systems, but for a time the industry was quite unsure which direction they should take. Influenced by the technical trends of the day, people put a great deal of effort into "object-oriented database systems", or "OO database systems" for short. It is worth mentioning that the object-relational database theory put forward by Professor Stonebraker in the United States was favored by the industry for a while, and Informix hired Stonebraker as its chief technology officer at great expense.

However, several years of development showed that object-relational database products did not do well in the market. Theoretical elegance did not bring an enthusiastic market response. The main reason for the failure was that these products were designed to replace existing database systems with an entirely new one. For many customers, especially large ones, who had used their database systems for years and accumulated a great deal of working data, the workload and cost of converting old data to the new system were unbearable. In addition, object-relational database systems made the query language extremely complex, so both database developers and application customers found the technology daunting.

5. Changes in data management

In the late 1960s, a new kind of database software appeared, the Decision Support System (DSS), whose aim was to help managers use data more effectively in decision making. In 1970, the first online analytical processing tool, Express, was born. Other decision support systems followed, many of them developed by companies' own IT departments.

In 1985, the first business intelligence system was developed by Metaphor Computer Systems for Procter & Gamble, mainly to link sales information with retail scanner data. In the same year, Pilot Software began selling the first commercial client/server executive information system, Command Center. Also that year, the Ingres project at the University of California, Berkeley, evolved into Postgres, whose goal was to develop an object-oriented database. The following year, Graphael developed the first commercial object database system, GBASE.

In 1988, IBM researchers Barry Devlin and Paul Murphy coined a new term, the information warehouse. After that, IT vendors began building experimental data warehouses. In 1991, W. H. "Bill" Inmon published the book "Building the Data Warehouse", which brought the data warehouse into real use.

1991: W. H. "Bill" Inmon published "Building the Data Warehouse".

In the 1990s, with the widespread adoption of PC-based client/server computing and enterprise software packages, the transformation of data management was largely complete. Data management was no longer just about storing and managing data; it had turned into whatever forms of data management users needed. The sudden rise of the Internet and the appearance of XML opened up new territory for the development of database systems.

The future development trend of the database

As the scope of information management keeps expanding, a variety of data models (hierarchical, network, relational, object-oriented, semi-structured and so on) and new technologies (data streams, Web data management, data mining and so on) have emerged. Every few years, a group of senior international database experts meets to discuss the state of research, existing problems and the new technical focuses that will need attention. Past reports of this kind include the 1989 "Future Directions in DBMS Research" (the Laguna Beach participants), the 1990 "Database Systems: Achievements and Opportunities", and a 1995 database report.