Information retrieval homework questions and some answers
1. What are the specific content requirements of information literacy or quality?
The term information literacy originated in the United States. Put simply, information literacy is the ability, developed through education, to acquire, process, and handle information resources and to master and use information tools in the information society. In 1998, the United States formulated nine information literacy standards for student learning: being able to obtain information effectively and efficiently; being able to evaluate information competently and critically; being able to use information accurately and creatively; being able to pursue information related to personal interests; being able to appreciate literature and other creative expressions of information; being able to strive for excellence in information seeking and knowledge innovation; being able to recognize the importance of information to a democratic society; being able to follow an ethical code of conduct regarding information and information technology; and being able to participate actively in activities that explore and create information. To sum up, complete information literacy should include three levels: cultural literacy (knowledge level), information awareness (awareness level), and information technology skills (technical level).
2. What are the concepts of information, knowledge, intelligence and documents? What are the constituent elements of a document?
Information is the broadest of these concepts: it includes knowledge, documents and intelligence, and covers everything from low-level to high-level information.
Knowledge is an intellectual product formed through human understanding and processing of all kinds of information. It is the human brain's renewed understanding, through thinking, of a large amount of information.
Intelligence is knowledge or facts that are transmitted. It is the activation of knowledge: certain media (carriers) are used to deliver specific knowledge and information to specific users across space and time, in order to solve specific problems in scientific research and production.
Intelligence should have three basic attributes: first, it is knowledge or information; second, it must be transmitted; third, it must be used by users to generate benefits. Intelligence not only depends on the intelligence source, but also on the intelligence user.
A document is a carrier on which human knowledge is recorded using text, graphics, symbols, audio, video and other technical means; it can also be understood as knowledge fixed on some material carrier. Today the term is usually understood as the sum of publications such as books and journals. Documents are the most effective means of recording, accumulating, disseminating and passing on knowledge. They are the most basic and most important source of information in human social activity and the most basic means of exchanging and disseminating information.
The constituent elements of a document are: intellectual content, the symbol system, the recording method, and the carrier. These elements are interconnected and mutually reinforcing.
3. What types of information and information resources are there?
Types of information:
According to the nature of the object that generates the information, information can be divided into natural information (momentary sound, light, heat and electricity, changes in the weather, slow crustal movement, the evolution of celestial bodies, and so on), biological information (the forms and behaviours that living things exhibit in order to survive, such as genetic information, information exchange within organisms, and information exchange within animal populations), machine information (automatic control systems), and (human) social information.
Types of information resources:
Divided according to the carrier material and production method of documentary information:
(1) Printed type
(2) Microform type
(3) Audio-visual type
(4) Electronic type (machine-readable type)
Divided according to the purpose and style of writing:
By purpose and style of writing, documentary information resources fall mainly into works, academic papers, patent specifications, scientific and technological reports, technical standards, scientific and technological archives, and product information. The first five have the highest information content, academic value and frequency of use.
Divided according to the order of generation of document information and the depth of collation and processing:
By depth of processing, documentary information resources can be divided into zero-order, primary, secondary, tertiary and higher-order document information.
Classification based on publication form and content disclosure level:
Documents can be divided into three types: white documents, gray documents, and black documents.
4. What types of documents are there? What is the basis for dividing documents into these types?
According to different classification standards, there are many ways to classify documents.
Divided according to the editing method and publishing characteristics of the document:
1. Books
2. Journals
3. Special documents
Special documents mainly include the following types:
(1) Scientific and technological reports
(2) Government publications
(3) Conference documents
(4) Dissertations
(5) Patent documents
(6) Standard documents
(7) Product samples
4. Other scattered information
Divided according to the form of document carrier:
1. Printed documents
2. Microform documents
3. Audio-visual documents
4. Machine-readable documents
Divided according to document processing levels:
1. Primary documents
2. Secondary documents
3. Tertiary documents
5. What types of databases are there? What are the characteristics of network resources?
Databases are classified by data model. The data models in mature use in database systems today are the hierarchical model, the network model and the relational model.
Other types:
Fuzzy database
A fuzzy database is a database that can handle fuzzy data. Ordinary databases are built on two-valued logic and precise data, and cannot express the many things that are vague. With the establishment of fuzzy mathematics, it became possible to describe fuzzy events quantitatively and to perform operations on them. Incompleteness, uncertainty and fuzziness can therefore be introduced into a database system to form a fuzzy database. Research on fuzzy databases has two main aspects: first, how to store fuzzy data in the database; second, how to define the operations and functions that work on fuzzy data. Fuzzy values are mainly represented as fuzzy interval numbers, fuzzy center numbers, fuzzy set numbers, and membership functions.
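The idea of a membership function can be sketched in a few lines of Python. This is only an illustration: the fuzzy set "tall", the breakpoints of 160 cm and 180 cm, and the sample records are all assumptions made up for the example, not part of any particular fuzzy database system.

```python
def tall_membership(height_cm: float) -> float:
    """Degree (0.0 to 1.0) to which a height belongs to the fuzzy set 'tall'.

    Assumed shape: 0 at or below 160 cm, 1 at or above 180 cm, linear in between.
    """
    if height_cm <= 160:
        return 0.0
    if height_cm >= 180:
        return 1.0
    return (height_cm - 160) / 20


# Hypothetical records: (name, height in cm).
people = [("Li", 158), ("Wang", 170), ("Zhao", 176), ("Chen", 185)]


def fuzzy_select(records, membership, threshold):
    """Return (name, degree) pairs whose membership degree meets the threshold."""
    result = []
    for name, value in records:
        degree = membership(value)
        if degree >= threshold:
            result.append((name, degree))
    return result


if __name__ == "__main__":
    # "Find people who are tall to a degree of at least 0.5."
    for name, degree in fuzzy_select(people, tall_membership, 0.5):
        print(name, round(degree, 2))
```

A precise query such as "height > 175" gives a yes or no answer, while the fuzzy query returns each record together with the degree to which it satisfies the condition.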
Statistical database
A database system that manages statistical data. This type of database contains a large number of data records, but its purpose is to provide users with various statistical summary information rather than providing individual record information.
Network database
A database that handles the network data model, in which record types are the nodes. The approach is to decompose the network structure into a number of two-level trees, called sets. A set type describes the relationship between two or more record types. In a set type, one record type is dominant and is called the owner record type; the others are called member record types. The relationship between the owner and its members is one-to-many. The representative network database is the DBTG system. In 1969 the CODASYL organization in the United States published the "DBTG report", and systems implemented on the basis of that report have since been called DBTG systems. Most existing network database systems adopt the DBTG approach. A DBTG system has a typical three-level structure: subschema, schema, and storage schema. The corresponding data definition languages are the subschema definition language (SSDDL), the schema definition language (SDDL), and the device media control language (DMCL). There is also a data manipulation language (DML).
Deductive database
A deductive database is a database with deductive reasoning capabilities, generally implemented with a database management system plus a rule management system. The factual data used for reasoning, stored in the database, is called the extensional database; the logical rules that define the facts to be derived are called the intensional database. The main research question is how to evaluate logical rules efficiently, in particular the optimization of recursive queries and the consistency maintenance of rules.
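The split between stored facts and derived facts can be illustrated with a small Python sketch. It uses naive bottom-up evaluation, which is the simplest way to evaluate a recursive rule and is not an optimized method; the parent facts and the ancestor rule are made-up examples.

```python
# Extensional database: stored facts, here parent(X, Y) pairs (made-up data).
parent = {("ann", "bob"), ("bob", "cat"), ("cat", "dan")}


def derive_ancestors(parent_facts):
    """Naive bottom-up evaluation of the recursive rules

        ancestor(X, Y) :- parent(X, Y).
        ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).

    The rules are applied repeatedly until no new fact appears (a fixpoint).
    """
    ancestor = set(parent_facts)          # first rule
    while True:
        new_facts = {
            (x, z)
            for (x, y) in parent_facts
            for (y2, z) in ancestor
            if y == y2                    # join on the shared variable Y
        }
        if new_facts <= ancestor:         # nothing new: fixpoint reached
            return ancestor
        ancestor |= new_facts


if __name__ == "__main__":
    for fact in sorted(derive_ancestors(parent)):
        print("ancestor%s" % (fact,))
```

Each pass re-derives facts that are already known, which is exactly the kind of waste that recursive query optimization tries to avoid.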
Characteristics of online academic information resources:
1. Extremely rich in content.
Online academic information resources cover a wide range of subject areas, and the types of information are many: formally published and informally published, provided by academic institutions and provided by individuals, all intertwined. Naturally, alongside valuable information there is also a lot of meaningless information.
2. The overall distribution is chaotic
Because there is no unified management organization for online information, and there is no unified release standard, and changes, replacements, rebirths, and demises occur from time to time, it is difficult to control.
This results in online academic resources being ordered within a certain local scope, while the overall distribution of resources is scattered, disordered, and even chaotic.
3. Dynamic changes in information
The Internet is a huge dynamic system. Information is not only scattered and disordered but also frequently replaced. New websites appear every day, while others are withdrawn, deleted or reorganized, and each website's link addresses and column settings also change frequently.
4. Network information has strong timeliness
Online publication compresses the editing, publishing and distribution stages of traditional documents, and some documents are published entirely online. Authors and editors can communicate in real time without restrictions of time and place, which greatly shortens the time needed to edit and publish information and makes it highly timely.
5. Fast retrieval
6. What are "core journals"? What are the core journals in this major?
Core journals are journals of a higher academic standard and an important part of China's academic evaluation system. Their role lies mainly in confirming academic level. For example, in a considerable number of teaching and research institutions, one of the prerequisites for applying for senior professional titles, qualifying for a doctoral thesis defense, applying for research projects, or evaluating the academic level of research institutions or universities and the workload completed by teaching staff is to have published a certain number of papers in core journals within a given period. Core journals are divided into national, provincial, municipal and other levels.
The core journals for the materials forming major include:
Metal Heat Treatment; Metal Forming Technology; Mold Industry; Northern Forum; Thermal Processing Technology; Special Casting and Non-ferrous Alloys; Engineering Plastics Applications; Forging Technology; Casting Technology; Light Alloy Processing Technology; Casting; Journal of Materials Research; Mechanical Engineering Materials; Weapon Materials Science and Engineering; Automotive Technology; Chinese Plastics; Machine Tools and Hydraulics; Forging Equipment and Manufacturing Technology; Modern Manufacturing Engineering; Forging Machinery; Micro Motors; New Technologies and New Processes; Journal of Chongqing University (Natural Science Edition); Journal of Wuhan University of Technology; Journal of Plastic Engineering.
7. What is information retrieval? What are the types of information retrieval?
Information retrieval (Information Retrieval) is the process and technology of organizing and storing information in a certain way and finding the required information according to the user's needs. In this broad sense it is also called "information storage and retrieval".
Information retrieval in the narrow sense is the second half of that process, namely finding the required information in the information collection. This is what we usually call information search (Information Search or Information Seek).
Types:
(1) Classified by search content
1. Bibliographic retrieval
2. Data retrieval
3. Fact retrieval
4. Full-text retrieval
5. Image retrieval
6. Multimedia retrieval
(2) Classified by whether search tools are used
1. Direct search
2. Indirect search
(3) Classified by retrieval method
1. Traditional information retrieval
2. Modern information retrieval
8. Briefly describe the principles of information retrieval.
The basic principle of information retrieval is to build retrieval systems by collecting, processing, organizing and storing large amounts of scattered, disordered document information, and to use certain methods and means to keep the feature identifiers used in the two processes, storage and retrieval, consistent, so that information sources can be obtained and used effectively. Storage exists for the sake of retrieval, and only what has been stored can be retrieved.
In addition, the basic principles of information retrieval can be described from three aspects: document substitution, ordering, and identifier matching.
1. Substitution of documents
2. Ordering of documents
3. Matching of document feature identifiers with search question identifiers
Put simply, the principle of information retrieval is to compare (match) the identifiers of the search question with the document feature identifiers stored in the retrieval tool, and then to extract the document information that matches.
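The matching of stored identifiers against query identifiers can be sketched as a tiny inverted index in Python. This is a simplified illustration, not a description of any real retrieval tool; the document numbers and index terms are invented.

```python
from collections import defaultdict

# Hypothetical document surrogates: id -> index terms assigned at storage time.
documents = {
    1: ["welding", "steel", "heat treatment"],
    2: ["casting", "aluminium alloy"],
    3: ["welding", "aluminium alloy"],
}


def build_index(docs):
    """Storage step: map every index term to the set of documents carrying it."""
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term].add(doc_id)
    return index


def retrieve(index, query_terms):
    """Retrieval step: return ids of documents that carry every query term."""
    matches = [index.get(term, set()) for term in query_terms]
    return sorted(set.intersection(*matches)) if matches else []


if __name__ == "__main__":
    index = build_index(documents)
    print(retrieve(index, ["welding", "aluminium alloy"]))
```

The example only works because the same terms are used when documents are stored and when the question is asked, which is the consistency requirement described above.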
9. What is a computer information retrieval system and how many parts does it consist of?
Computer information retrieval system: a computer application technology developed by utilizing the computer system's ability to effectively store and quickly search. It is concerned with the construction, analysis, organization, storage and dissemination of information. Computer information retrieval system is a collection of hardware resources, system software and retrieval software used for information retrieval. It can store large amounts of information and classify, catalog or index information items (basic information units with specific logical meanings). It can extract specific information from the stored information collection according to user requirements and provide the ability to insert, modify and delete certain information.
In terms of physical composition, a computer information retrieval system consists of three parts: hardware, software, and databases.
10. What are the computer information retrieval technologies?
Boolean logical search
Truncation search
Proximity search
Field restriction search
Fuzzy search
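Three of the techniques listed above (truncation, proximity and field restriction) can be sketched in Python as follows. The records, field names and function names are hypothetical, and real retrieval systems each have their own operator syntax; fuzzy search is not covered here.

```python
import re

# Hypothetical records with two searchable fields.
records = [
    {"title": "Computer aided welding design",
     "abstract": "A study of welding robots."},
    {"title": "Computing methods for casting",
     "abstract": "Computer simulation of mould filling."},
    {"title": "Forging equipment",
     "abstract": "Design of hydraulic presses aided by computer models."},
]


def words(text):
    """Lower-case a text and split it into words."""
    return re.findall(r"[a-z]+", text.lower())


def truncation_match(text, stem):
    """Right truncation: the stem 'comput' matches computer, computing, ..."""
    return any(w.startswith(stem) for w in words(text))


def proximity_match(text, first, second, max_gap):
    """True if `second` follows `first` with at most `max_gap` words in between."""
    tokens = words(text)
    first_pos = [i for i, w in enumerate(tokens) if w == first]
    second_pos = [i for i, w in enumerate(tokens) if w == second]
    return any(0 < j - i <= max_gap + 1 for i in first_pos for j in second_pos)


def field_search(recs, field, predicate):
    """Field restriction: apply a matching function to one field only."""
    return [r["title"] for r in recs if predicate(r[field])]


if __name__ == "__main__":
    print(field_search(records, "title", lambda t: truncation_match(t, "comput")))
    print(field_search(records, "abstract", lambda t: truncation_match(t, "comput")))
```

Restricting the same truncated term to the title field or to the abstract field returns different records, which is why field restriction is often used to narrow a search.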
11. From the perspective of external characteristics and content characteristics, what types of retrieval language are there? Give a simple search expression for each type of characteristic.
Retrieving literature information based on the characteristics of the literature is the simplest way. It has two characteristics: one is the appearance characteristics of the document, that is, "author, book title, journal title, number", etc.; the other is the content characteristics, that is, "category, theme, keywords", etc.
Retrieval languages mainly include classification languages, which are based on codes, and subject languages, which are based on the names and terms of things.
Classification language: Welding Engineer's Manual, Chen Zhunian, Machinery Industry Press
Subject language: (Tang OR Song) AND poetry
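A Boolean expression such as the one above can be evaluated with ordinary set operations. The following Python sketch assumes that each subject term already has a set of document numbers attached to it; the numbers are made up.

```python
# Hypothetical postings: subject term -> ids of documents indexed with it.
postings = {
    "Tang": {1, 2, 5},
    "Song": {3, 5, 6},
    "poetry": {2, 3, 4, 5},
    "prose": {1, 6},
}

# (Tang OR Song) AND poetry: OR is a union, AND is an intersection.
hits = (postings["Tang"] | postings["Song"]) & postings["poetry"]

# NOT is a set difference: (Tang OR Song) NOT poetry.
non_poetry = (postings["Tang"] | postings["Song"]) - postings["poetry"]

if __name__ == "__main__":
    print(sorted(hits))
    print(sorted(non_poetry))
```

The parentheses matter: without them, "Tang OR Song AND poetry" would usually be read as "Tang OR (Song AND poetry)" and would also return Tang documents that are not poetry.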
12. Why create a "retrieval language", and what types are there? Compare the advantages and disadvantages of classification languages and subject languages.
(1) Classification according to the nature and principle of identification
1. Classification language
A classification language is a type of retrieval language that uses numbers, letters, or a combination of the two as its basic characters, writes these characters directly one after another with dots (or other symbols) as separators, takes basic categories as its basic vocabulary, and expresses complex concepts through the subordination of categories.
The information processing method that uses a classification language to describe and express information content according to its knowledge attributes is called the classification method. Well-known classification schemes include the Universal Decimal Classification, the Library of Congress Classification, the International Patent Classification, and the Chinese Library Classification.
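The way a classification code expresses subordination can be sketched in Python. The class table below only imitates the style of a letter-and-number scheme: the codes and labels are assumptions for this example and have not been checked against any published schedule.

```python
# Illustrative class table (assumed codes and labels, not a real schedule).
classes = {
    "T": "Industrial technology",
    "TG": "Metalworking",
    "TG4": "Welding",
    "TG44": "Welding processes",
    "TH": "Machinery",
}

# Hypothetical shelf list: document title -> class code.
shelf = {
    "Welding Engineer's Manual": "TG4",
    "Arc welding practice": "TG44",
    "Machine design basics": "TH",
}


def broader_classes(code):
    """List the chain of broader classes, found by shortening the code."""
    return [code[:n] for n in range(1, len(code)) if code[:n] in classes]


def search_by_class(code):
    """Return documents filed under a class or any of its subclasses."""
    return sorted(title for title, c in shelf.items() if c.startswith(code))


if __name__ == "__main__":
    print(broader_classes("TG44"))
    print(search_by_class("TG"))
```

Prefix matching works here only because every code in the assumed table extends the code of its parent class. Searching on a shorter code widens the search, and searching on a longer code narrows it.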
2. Subject language
A subject language is a type of retrieval language that uses natural-language characters as its characters, nouns and terms as its basic vocabulary, and a set of such terms as search identifiers. The information processing method that uses a subject language to describe and express information content is called the subject method. Subject languages can be divided into title words (subject headings), meta-words (unit words), descriptors, and keywords.
(1) Title words
Title words are words or phrases selected from natural language and standardized to represent the concepts of things. They are the earliest type of subject language. A search identifier is formed by a fixed combination of a main title word and a subtitle word. Because only these "fixed" title words can be used for indexing and retrieval, the subject concepts of documents that can be expressed are inevitably limited. The approach no longer meets present needs and is now rarely used.
(2) Meta-words
Meta-words, also called unit words, are the smallest and most basic vocabulary units that can be used to describe the subject of information. A standardized set of meta-words that can express the subjects of information constitutes a meta-word language. The meta-word method expresses complex subject concepts by combining several unit words.
Meta-word languages are mostly used for mechanical retrieval and suit simple methods of indexing and retrieval (such as punch cards).
(3) Descriptors
Descriptors are dynamic words or phrases that are based on concepts, have been standardized and optimized, can be combined, and can display the semantic relationships between terms. Generally speaking, the selected descriptors are conceptual, descriptive, and combinable; after standardization they are also semantically related, dynamic, and intuitive. The descriptor method draws on the principles and methods of several retrieval languages. It has many advantages, suits both computer and manual retrieval systems, and is currently widely used. Well-known search tools such as CA and EI are arranged using descriptors.
(4) Keywords
Keywords are words that appear in the title, abstract, or text of a document and carry real meaning for characterizing its subject content; they are the important, key words for revealing and describing that content. The keyword method is mainly used in computer information processing to extract words and compile indexes, and such an index is called a keyword index. The "CMCC" database, frequently used for retrieving Chinese medical literature, was built with the keyword index method.
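Extracting keywords and compiling a keyword index can be sketched in Python as follows. This is a minimal illustration under stated assumptions: keywords are taken from titles only, the stop-word list is invented for the example, and the titles and document numbers are made up. It does not describe how CMCC or any other database is actually built.

```python
import re
from collections import defaultdict

# Assumed stop-word list: words with no value for describing a subject.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}

# Hypothetical titles keyed by document number.
titles = {
    101: "Heat treatment of welded steel",
    102: "Simulation of the casting of steel",
    103: "Welded joints in aluminium",
}


def extract_keywords(title):
    """Keep the significant words of a title, in order, without repeats."""
    seen = []
    for word in re.findall(r"[a-z]+", title.lower()):
        if word not in STOP_WORDS and word not in seen:
            seen.append(word)
    return seen


def build_keyword_index(docs):
    """Compile an alphabetical keyword index: keyword -> document numbers."""
    index = defaultdict(list)
    for number in sorted(docs):
        for keyword in extract_keywords(docs[number]):
            index[keyword].append(number)
    return dict(sorted(index.items()))


if __name__ == "__main__":
    for keyword, numbers in build_keyword_index(titles).items():
        print(keyword, numbers)
```

Because the words are taken as they appear, without standardization, "welded" and "welding" would be separate index entries. This is the main difference from the descriptor method.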
3. Code language
A code language represents and arranges concepts concerning some aspect of things by means of a code system, and so provides a retrieval language. For example, a code language based on the molecular formulas of compounds can be used to build a molecular formula index, allowing users to find compounds and the related literature from the molecular formula.
(2) Classification according to the characteristics of expressing documents
1. Retrieval languages that express the external characteristics of documents
These mainly use the document title, author name, publisher, report number, patent number, and so on. Documents are arranged in alphabetical order by title or author name, or in numerical order by report number or patent number, forming retrieval languages that let users search by title, author, or number.
Superstar Digital Library has developed e-book copyright solutions and carried out large-scale contract licensing work with authors and publishers. After unremitting efforts, 300,000 authors have so far agreed to license their works to Superstar Digital Library;
Huge user base, considerate services
Millions of registered users are spread across the world and come from all walks of life in every province, industry, university and research institution in the country. Technical customer service staff are online 16×7, holidays included, and can answer questions at any time through the customer service hotline, online forums, email and other channels.
The Scholar's Home Digital Library is a comprehensive digital library based on the China Information Resources Platform. It integrates books, journals, newspapers, papers, CDs and other materials, and in terms of carriers it includes printed, CD and online editions. It covers more than 500 online publishing houses, more than 7,000 periodicals, and more than 1,000 newspapers. Every year it collects 30,000 newly published Chinese books, 600,000 periodical documents, and 900,000 newspaper documents. It consists of sub-networks such as China Book Network, China Periodicals Network, China Newspaper Network, China Information Network and China CD Network. The resource content is divided into three levels: title, abstract, and full text. It provides ten database retrieval functions such as full text, titles, and subject words, as well as CN-MARC format data copying. It provides online ordering of printed books, newspapers, CD-ROM databases, and other databases. It also provides customized resource digitization services for member units. In short, the Scholar's Home Digital Library is a comprehensive digital library that integrates a database application platform, an information resource e-commerce platform, and a resource digitization service platform.
The "Chinese Journal Full-text Database (CJFD)" is currently the world's largest continuously and dynamically updated Chinese journal full-text database, with an accumulated 8 million full-text documents and more than 15 million entries, divided into nine major series and 126 thematic literature databases.
Knowledge source: The full text of 6,100 core journals and professional specialty journals published in China.
Database features:
● Highly integrated with massive data, integrating bibliography, abstracts, and full-text document information to achieve one-stop document information retrieval (One-stop Access);
● The knowledge content is organized with reference to the popular knowledge classification system at home and abroad, and the database has a knowledge classification navigation function;
● It has many search entry points, including full-text search; users can run a simple search through a single entry point, or build flexible query expressions with Boolean operators and other means for an advanced search;
● It has a citation linking function, which can be used both to build networks of related knowledge and for the measurement and evaluation of individuals, institutions, papers, journals, and so on;
● Full-text information is completely digitized; with the most advanced browser, available as a free download, the original layout and style of journal articles can be displayed and printed without distortion;
● Every paper in the database has obtained clear electronic publishing authorization;
● Diversified product forms and timely data updates meet the individual information needs of users of different types, industries, and sizes;
● Database exchange service centers across the country and overseas, coupled with year-round user training and efficient technical support.
Application of database:
In addition to regular services such as information retrieval, information consultation, and original text delivery, CJFD can also be used for the following special services:
● Citation service, generate citation search report;
● Novelty search service, generate novelty search report;
● Journal evaluation, generate journal evaluation search report;
● Scientific research capability evaluation, generate scientific research capability evaluation search report;
● Project background analysis, generate project background analysis search report;
● Topic setting service, generate CNKI news.
VIP Information's "Chinese Science and Technology Journal Database" adopts the domestic first-class search core "Shangwei Full-text Search System" to realize the search management of the database. "Shangwei full-text retrieval system" is a retrieval system that has been unanimously recognized by a domestic expert team as "domestic leading and internationally advanced". Its various indicators and comprehensive performance are far ahead of other similar products.
"Chinese Science and Technology Journal Database" is the first large-scale database product in China to adopt the OpenURL technical specification. The OpenURL (Open Uniform resource Locators) protocol is a context-sensitive open link framework that implements simultaneous access to different Heterogeneous databases or information resources perform data association to easily provide user units with secondary development and utilization of resources, such as data association with the library's OPAC system. The OpenURL protocol has become an American national standard. VIP is the first database manufacturer in China to apply the OpenURL protocol. It has been successfully used in the Chinese Academy of Sciences, the National Library, the Northern University of Aeronautics and Astronautics, and the Chinese Biomedical Literature Database, with obvious results and very popular.
Wanfang Data Knowledge Service Platform
System functions and features
Wanfang Data Knowledge Service Platform provides users with more functions and services, mainly in the following aspects:
The system provides flexible classification and organization functions. By defining associations between resources, it can break through the physical boundaries between databases and organize related database resources in a unified view. For example, through the category browsing view, users can browse resources in several databases at once, such as the thesis database and the digital journal full-text database.
Search history function
Users can view their recent search records (CQL expressions) through the "Search history" link at the search entrance, and can view the search results of each record in the corresponding database.
Integration function of cross-database retrieval systems
The system can span multiple database retrieval systems and integrate them. It currently supports searching across RMS databases and MS SQL Server databases, and it provides an extension mechanism so that support for other databases can be added according to user needs.
Complete load balancing and fault-tolerant retrieval cluster
The system provides complete management and control of the retrieval server and file server clusters, and servers in a cluster can be added, removed and modified dynamically.
File cluster
The system can support a variety of file engines and realize the integration of various file systems. Currently, the system supports local files, shared files, and ftp file services. An extension mechanism is provided to add support for other file systems according to user needs.
Supports multiple backend databases
The system uses O/R mapping technology to isolate the configuration of the underlying database. It can currently run on Oracle, MS SQL Server, Firebird and other databases.
The perfect combination of resource integration and user personalized services
SRW and OpenURL interfaces
To facilitate metadata exchange between databases and full-text acquisition, the system also provides a retrieval extension interface based on the SRW standard and an open interface to full-text resources such as journal articles based on the OpenURL standard.
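A request to such a retrieval interface can be sketched in Python. Note the substitution: SRW itself exchanges SOAP messages, so the sketch uses SRU, the URL-based companion of SRW, which carries the same searchRetrieve operation and the same CQL query language. The service address, the function names and the index names are assumptions for illustration and do not describe the actual Wanfang interface.

```python
from urllib.parse import urlencode

# Hypothetical service address; not a real Wanfang endpoint.
SRU_BASE = "https://sru.example.org/search"


def cql_all(index_terms):
    """Join index = "term" clauses with AND into one CQL query string.

    Assumes the terms contain no double quotes (no escaping is done).
    """
    clauses = [f'{index} = "{term}"' for index, term in index_terms.items()]
    return " and ".join(clauses)


def search_retrieve_url(cql_query, start=1, maximum=10):
    """Build an SRU 1.1 searchRetrieve request URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": start,        # position of the first record wanted
        "maximumRecords": maximum,   # how many records to return
    }
    return SRU_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    query = cql_all({"dc.title": "welding", "dc.creator": "Chen"})
    print(query)
    print(search_retrieve_url(query))
```

The same kind of CQL expression is what the search history function records, so a stored expression could in principle be sent again through an interface of this type.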