What are the internal characteristics of documents?

Question: Briefly describe the internal characteristics and external characteristics of documents.

The external features of a document form the basis of one kind of retrieval language. A retrieval language for external features mainly covers retrieval by document title, author name, publisher, report number, patent number, and so on. Documents are arranged alphabetically by title or author name, or numerically by report number or patent number, and this ordering forms a retrieval language that serves users' search needs.
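The arrangement described above can be sketched in a few lines of Python. This is only an illustration: the `Record` class, its field names, and the sample records are hypothetical and do not come from any real catalog.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical record holding three external features of a document.
    title: str
    author: str
    report_number: int


records = [
    Record("Indexing Basics", "Zhang", 30),
    Record("abstracting methods", "Li", 4),
    Record("Catalog Design", "Wang", 12),
]

# Alphabetical order by title or by author name (case-insensitive).
by_title = sorted(records, key=lambda r: r.title.casefold())
by_author = sorted(records, key=lambda r: r.author.casefold())

# Numerical order by report number.
by_number = sorted(records, key=lambda r: r.report_number)

title_order = [r.title for r in by_title]
author_order = [r.author for r in by_author]
number_order = [r.report_number for r in by_number]
```

Each sorted list plays the role of one access route: a title index, an author index, and a report-number index over the same set of documents.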

A document is any carrier on which knowledge is recorded. This definition may seem simple, but it is objective, comprehensive, and rigorous.

The wording of the definition shows that a document has three basic components: the carrier itself, the content the carrier holds, and the method or means by which that content is recorded.

It follows that a document, as a special kind of carrier, must satisfy three basic conditions at the same time: first, it must have some information and knowledge content; second, it must use some recording method; third, it must take some carrier form.

Information and knowledge are the main components of a document. A carrier of any form that holds no information or knowledge content can only be called material, not a document. A document of any form or type must first have some information and knowledge content.

What are the content characteristics of documents?

Quoted from the Baidu Encyclopedia article on document retrieval.

What are the content characteristics and appearance characteristics of documents? What is the difference between the two?

Information that has little or no relationship with the subject content of a document is called the appearance characteristics of the document, such as the author, the author's organization, the journal name, the patent number of a patent specification, the report number of a *** report, etc.

Information closely related to the subject content of a document is called the content characteristics of the document. The main content characteristics are the various forms of subject headings and classification numbers. Because the title of a document often reflects its subject, the title is frequently treated as a content characteristic as well. The difference between the two is that content characteristics are closely related to the subject content of the document, whereas appearance characteristics are not.

Reference: baike.baidu/view/156526?wtp=tt

Which approach retrieves documents by their internal characteristics?

The subject approach: documents are searched using subject terms that reflect their content. Because the subject approach brings together literature on every aspect of a topic, it makes it convenient for readers to carry out comprehensive, systematic research on a particular issue, thing, or object. Literature on all aspects of the same topic can be found through a subject catalog or subject index.

What are the characteristics of documents? Which kind of characteristic are the book title, author, publisher, classification number, publication date, subject headings, etc.?

The characteristics of documents are mainly divided into external characteristics and internal characteristics. The external characteristics mainly include the book title, author (responsible party), publisher (publishing house), publication date, place of publication, volume, etc., while the internal characteristics mainly include the classification number, subject headings, etc.

Of the items you listed, the book title, author, publisher, and publication date are external characteristics.

The classification number and subject headings are internal characteristics.

Which content characteristics of documents does a retrieval language mainly describe?

Retrieval language and its functions

1. The concept of retrieval language

A retrieval language is a specialized language created to meet the shared needs of processing, storing, and retrieving document information. It is a system of concept identifiers that expresses a series of concepts, and the relationships among them, summarizing both the content of document information and the content of retrieval queries. In short, a retrieval language is an artificial language used to describe the characteristics of information sources and to perform retrieval. Retrieval languages can be divided into two categories: standardized languages and non-standardized languages (natural language).

2. The role of retrieval language

Retrieval language plays an extremely important role in information retrieval. It is a bridge between the two processes of information storage and information retrieval.

During information storage, the retrieval language is used to describe the content and external characteristics of the information, producing a retrieval identifier. During retrieval, it is used to describe the search question, producing a question identifier. When the question identifier completely or partially matches the retrieval identifier, the document is a hit.
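The matching step can be illustrated with a small Python sketch. It rests on assumptions that the text does not spell out: identifiers are modeled as sets of terms, a "complete" match means every question term appears among the document's retrieval identifiers, and a "partial" match means at least one does. The function name `match` and the sample data are hypothetical.

```python
def match(question_terms, retrieval_terms):
    """Classify a match as 'complete', 'partial', or 'none' (assumed definitions)."""
    question = {t.casefold() for t in question_terms}
    retrieval = {t.casefold() for t in retrieval_terms}
    if not question:
        return "none"
    shared = question & retrieval
    if shared == question:
        return "complete"  # every question term was assigned at storage time
    if shared:
        return "partial"   # only some question terms were assigned
    return "none"


# Retrieval identifiers assigned to each document during storage (invented data).
stored = {
    "doc-1": ["library", "classification"],
    "doc-2": ["library", "cataloguing", "classification"],
    "doc-3": ["patent", "law"],
}

# Question identifier formed from the user's search question.
question = ["Library", "cataloguing"]

hits = {}
for doc_id, terms in stored.items():
    kind = match(question, terms)
    if kind != "none":
        hits[doc_id] = kind
```

Here `doc-2` is a complete match, `doc-1` is a partial match, and `doc-3` is not a hit.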

The main functions of a retrieval language are as follows:
① Index the content and appearance characteristics of document information, ensuring that different indexers represent documents consistently;
② Bring together document information with the same or related content and reveal how it is related;
③ Centralize, systematize, and organize the storage of document information so that searchers can search it in a defined order;
④ Make it easy to compare indexing terms with search terms for agreement, ensuring that different searchers express the content of the same document consistently, and that searchers and indexers do as well;
⑤ Ensure that searchers can obtain the highest possible recall and precision when searching for documents according to different needs.
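Recall and precision, mentioned in the last function above, are not defined in the text. The sketch below uses the usual definitions as an assumption: recall is the share of all relevant documents that were retrieved, and precision is the share of retrieved documents that are relevant. The function name and the sample document IDs are hypothetical.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one search, using the usual definitions."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    found = retrieved & relevant  # relevant documents that the search returned
    recall = len(found) / len(relevant) if relevant else 0.0
    precision = len(found) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Invented example: the search returns four documents, three are relevant
# in the whole collection, and two of those three were returned.
retrieved_docs = {"d1", "d2", "d3", "d4"}
relevant_docs = {"d2", "d4", "d5"}

recall, precision = recall_precision(retrieved_docs, relevant_docs)
```

In this example recall is 2/3 and precision is 2/4, which shows why the two rates have to be considered together: widening a search tends to raise recall while lowering precision.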

Types of retrieval languages

Currently, there are thousands of information retrieval languages in the world, and they fall into different types depending on the criterion used to divide them. The following describes two commonly used ways of classifying retrieval languages and the types each produces.

(1) Classification according to the nature and principle of identification

1. Classification language

A classification language is a type of retrieval language that uses numbers, letters, or a combination of letters and numbers as its basic characters, writes those characters directly joined together with dots (or other symbols) as separators, uses basic classes as its basic vocabulary, and expresses complex concepts through the subordination of classes.

The information processing method that uses a classification language to describe and express information content according to its knowledge attributes is called the classification method. Well-known classification schemes include the Universal Decimal Classification, the Library of Congress Classification, the International Patent Classification, and the Chinese Library Classification.
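The idea that subordination of classes is carried by the notation itself can be sketched as a prefix test. The class numbers below are invented for illustration and belong to no real scheme, and the sketch assumes a decimal-style notation in which each added character narrows the class and dots are only separators.

```python
def normalize(class_number):
    # Dots are treated purely as visual separators in this sketch.
    return class_number.replace(".", "")


def is_within(class_number, broader_class):
    """True if class_number equals broader_class or is subordinate to it."""
    return normalize(class_number).startswith(normalize(broader_class))


# Invented holdings: document ID -> assigned class number.
holdings = {
    "doc-1": "A1",
    "doc-2": "A12.3",
    "doc-3": "A2",
    "doc-4": "B31",
}

# Searching by a broad class also retrieves everything filed under its subclasses.
under_a1 = sorted(d for d, n in holdings.items() if is_within(n, "A1"))
under_a = sorted(d for d, n in holdings.items() if is_within(n, "A"))
```

Searching on the broader class `A` returns three documents, while the narrower class `A1` returns only the two filed at or below it.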

2. Subject language

A subject language is a type of retrieval language that uses natural-language characters as its characters, noun terms as its basic vocabulary, and a set of noun terms as its retrieval identifiers. The information processing method that uses a subject language to describe and express information content is called the subject method. Subject languages can be divided into title words, meta-words, descriptors, and keywords.

(1) Title words

Title words are words or phrases selected from natural language and standardized to represent the concepts of things. Title words are the earliest type of subject language. They form a retrieval identifier through a fixed combination of main title words and subtitle words. Because only these fixed ("stereotyped") title words can be used for indexing and retrieval, the range of subject concepts they can express is inevitably limited. The approach no longer meets current needs and is now rarely used.

(2) Meta-words

Meta-words, also known as unit words, are the smallest and most basic vocabulary units that can be used to describe the subject of a piece of information. Standardized meta-words that can express the subject of information make up a meta-word language. The meta-word method expresses complex subject concepts by combining several unit words. Meta-word languages are mostly used for mechanized retrieval and suit simple means of indexing and retrieval (such as punch cards).
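Combining unit words at search time can be sketched as a set intersection. This is an illustrative model only: the `postings` table, its terms, and the document numbers are invented, and the text does not prescribe this data structure.

```python
# Invented postings: each unit word maps to the documents indexed under it.
postings = {
    "computer": {1, 2, 4},
    "retrieval": {2, 3, 4},
    "patent": {4, 5},
}


def combine(*unit_words):
    """Documents indexed under every one of the given unit words."""
    sets = [postings.get(word, set()) for word in unit_words]
    return set.intersection(*sets) if sets else set()


two_words = combine("computer", "retrieval")
three_words = combine("computer", "retrieval", "patent")
unknown_word = combine("computer", "music")
```

The complex concept "computer retrieval" is never stored as a single term; it is expressed only when the two unit words are combined, and adding a third word narrows the result further.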

(3) Descriptors

Descriptors are dynamic words or phrases that are concept-based, standardized and optimized, can be combined with one another, and can show the semantic relationships between terms. Generally speaking, the terms selected as descriptors are conceptual, descriptive, and combinable. After standardization they are also semantically related, dynamic, and intuitive. The descriptor method combines the principles and methods of several information retrieval languages. It has many advantages and suits both computer and manual retrieval systems, and it is currently a widely used language. Well-known retrieval tools such as CA and EI are organized using descriptors.

(4) Keywords

Keywords are words that appear in the title, abstract, or body of a document and carry real significance for characterizing its subject content. They are the important, key words for revealing and describing that content. The keyword method is mainly used when computers process information, extracting words and compiling indexes, so an index produced this way is called a keyword index.
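A keyword index of the kind described above can be sketched as follows. The sketch makes simplifying assumptions: keywords are taken from titles only, and a word counts as significant if it is not in a small, hypothetical stop-word list. The titles and the function name are invented.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list: words too common to characterize a subject.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def build_keyword_index(titles):
    """Map each significant title word to the IDs of the documents containing it."""
    index = defaultdict(list)
    for doc_id, title in titles.items():
        words = set(re.findall(r"[a-z]+", title.lower()))
        for word in sorted(words):
            if word not in STOP_WORDS:
                index[word].append(doc_id)
    # Return the entries in alphabetical order, as a printed index would list them.
    return dict(sorted(index.items()))


titles = {
    1: "Indexing of Patent Documents",
    2: "Retrieval of Patent Information",
    3: "The Language of Indexing",
}

keyword_index = build_keyword_index(titles)
```

Looking up `keyword_index["patent"]` gives the documents whose titles contain that word, with no indexer having assigned the term by hand.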

What are the content features and external features of the book?

Are you a wealthy person?

What are the search methods for document content features and external features?

The search methods for document content features include: keywords, subject headings, classification numbers, etc.

The search methods for external characteristics of documents include: document title, document author, document source, citation information, etc.

What are the methods of document retrieval, and what are the methods of retrieval based on internal characteristics?

Documents are retrieved through subject terms that reflect their content. Because the subject approach brings together literature on every aspect of a topic, it makes it convenient for readers to carry out comprehensive, systematic research on a particular issue, thing, or object. Literature on all aspects of the same topic can be found through a subject catalog or subject index.