Question 1: What does weighted retrieval mean in literature retrieval? Are there any settings? Weighted retrieval is a method of computerized information retrieval in which each identifier taking part in a combined search is assigned a numerical value that reflects its importance to the search request. The "weight" is that numerical value, and "weighting" is a way of measuring retrieval quantitatively. Documents matching different combinations of identifiers can be ranked by weight, which makes weighting an effective way to control retrieval quality.
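As a rough illustration of the idea above, the following Python sketch gives each search identifier a weight, sums the weights of the identifiers a document carries, drops documents below a threshold, and ranks the rest. The weights, the threshold, and the sample records are invented for illustration; real retrieval systems define their own weighting schemes.

```python
def weighted_search(documents, term_weights, threshold):
    """Rank documents by the summed weight of the identifiers they contain."""
    hits = []
    for doc_id, identifiers in documents.items():
        score = sum(w for term, w in term_weights.items() if term in identifiers)
        if score >= threshold:
            hits.append((doc_id, score))
    # Highest total weight first; ties broken by document id for stable output.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Hypothetical index records: document id -> set of assigned identifiers.
documents = {
    "doc1": {"hypertension", "drug therapy", "aged"},
    "doc2": {"hypertension", "diet"},
    "doc3": {"diabetes", "diet"},
}
# Hypothetical weights expressing how important each identifier is to the query.
term_weights = {"hypertension": 5, "drug therapy": 3, "diet": 1}

results = weighted_search(documents, term_weights, threshold=5)
print(results)
```

Raising the threshold returns fewer, more relevant documents; lowering it returns more, which is how the weight controls retrieval quality.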
Question 2: How do you carry out a weighted search through the subject-heading route in PubMed? Related keywords: IM, EM, CA, patent index, BA, SCI, special documents, patent types, the 4 stages of machine searching, subheadings, keywords, database structure types, CBM, weighted search, PubMed, full text, key, EBM. 1. Explanation of terms: 1. Literature:
Question 3: What are the differences between search terms, subject headings, and subheadings (subtopic terms) in the MeSH database? Please help. Search terms, as the name suggests, are the words you type into the search box; for the most part this is a free-text search. Subject headings are standardized terms that give a disease known by several names a single "official" name. Subject headings can be combined with subheadings to make retrieval more precise. Note that different subject headings allow different combinations of subheadings. There is also the question of weighting, which depends on your search needs.
Question 4: There are several ways to determine weighting coefficients in design methodology. What are they and what are their characteristics?
Question 5: What does it mean?
Basic explanation
An interrogative pronoun, usually used to ask about something.
1. Unknown things.
2. Everything.
3. It has the same meaning as what.
4. Express doubts.
Detailed explanation
1. Used to ask about a person or thing, or about the nature of something. Example: What information did you get from there?
2. Used to ask about a particular thing. Example: Tell me what you are looking for.
3. Used as an indefinite reference to something uncertain. Example: I can smell some kind of floral fragrance.
4. To express denial
Who is he, that you should actually miss him?
5. To express blame
What are you laughing at?
6. Used to ask about possibilities not covered by the preceding word or series of words
Is this a reptile, an amphibian, or something else?
7. To express surprise or excitement
What, no breakfast!
8. Everything
The earth, the mother of all things, gives everything.
9. Used before "ye" (也) to indicate that there are no exceptions within the stated range
He is not afraid of anything.
10. Used before "dou" (都) to indicate that there are no exceptions within the stated range
As long as you study hard, you can learn anything.
11. Used as a pronoun meaning all or everything. For example: As long as you agree, I will give you everything. I'm not afraid of anything.
Question 6: Key points of information retrieval: questions and answers
1. Briefly describe the concepts of information, knowledge, and documents, and the relationship between them.
1. Answer: Information: the reflection of the way things exist, their state of motion, and their characteristics; it is the signals and messages that things send out.
Knowledge: the result of human thinking, analyzing, processing, refining, systematizing, and theorizing the information that reflects the phenomena and laws of nature and human society.
Documents: all carriers on which knowledge is recorded.
In terms of conceptual scope, information is broader than knowledge, and knowledge is broader than documents. Knowledge is the part of information that has been made theoretical and systematic. Documents are the part of knowledge that has been recorded.
2. How are documents classified by carrier form? Give examples.
2. Answer: By carrier form, documents are divided into:
Handwritten documents, such as oracle bone inscriptions and bronze inscriptions
Printed documents, such as books and journals
Microform documents, such as microfilm and microfiche
Audio-visual documents, such as video tapes, audio tapes, and scientific films
Electronic documents, such as database documents and network documents
3. The levels of documents and their interrelationships.
3. Answer: Documents are divided into four levels according to the degree of processing: zero-level, primary, secondary, and tertiary documents.
Zero-level documents are unpublished experimental records, original recordings (images), letters, manuscripts, oral communications, physical objects, and the like. Once a zero-level document has been worked up by its author, published, and put into public circulation, it becomes a primary document. Documents produced by organizing, processing, and condensing primary documents according to their characteristics, following set rules and methods, are secondary documents. Documents produced by using the leads in secondary documents to synthesize, analyze, process, and refine the content of a large number of primary documents are tertiary documents.
4. How can the search scope be narrowed in a computer search?
4. Answer: In a computer search, methods for narrowing the search scope include:
(1) Add search terms connected with AND, or use "secondary search"
(2) Use specific subheadings to qualify the search
(3) Limit the search to fields, such as title-word search, subject-heading search, weighted search, etc.
(4) Limit by document type, language, important journals, core clinical journals, year, etc.
(5) Enter search words within a more specific classification range
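The effect of methods (1), (3), and (4) above can be sketched in a few lines of Python. This is only an illustration under assumed data: the record fields (`title`, `abstract`, `language`, `year`) and the sample records are hypothetical, and real databases apply these limits through their own search interfaces.

```python
def narrow(records, required_terms, field="any", languages=None, year_range=None):
    """Return the ids of the records that satisfy every restriction."""
    kept = []
    for rec in records:
        # (3) Field limit: search the title only, or title plus abstract.
        text = rec["title"] if field == "title" else rec["title"] + " " + rec["abstract"]
        text = text.lower()
        # (1) AND logic: every required term must be present.
        if not all(term.lower() in text for term in required_terms):
            continue
        # (4) Limits on language and publication year.
        if languages is not None and rec["language"] not in languages:
            continue
        if year_range is not None and not (year_range[0] <= rec["year"] <= year_range[1]):
            continue
        kept.append(rec["id"])
    return kept


# Hypothetical records, invented for this example.
records = [
    {"id": 1, "title": "Aspirin and stroke prevention", "abstract": "A trial in adults.",
     "language": "eng", "year": 2015},
    {"id": 2, "title": "Stroke rehabilitation", "abstract": "Aspirin was given to all patients.",
     "language": "eng", "year": 2018},
    {"id": 3, "title": "Aspirin in stroke", "abstract": "Review.",
     "language": "fre", "year": 2019},
    {"id": 4, "title": "Aspirin and headache", "abstract": "Survey.",
     "language": "eng", "year": 2020},
]

broad = narrow(records, ["aspirin"])
with_and = narrow(records, ["aspirin", "stroke"])
title_only = narrow(records, ["aspirin", "stroke"], field="title")
fully_limited = narrow(records, ["aspirin", "stroke"], field="title",
                       languages={"eng"}, year_range=(2010, 2016))
print(broad, with_and, title_only, fully_limited)
```

Each added restriction can only shrink the result set, which is why these methods are grouped together as ways of narrowing a search.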
5. What are the ways to obtain the full text of a document?
5. Answer: Ways to obtain the full text include:
(1) Search online full-text databases
(2) Use the websites of publishers and journals
(3) Use library holdings catalogs (union catalogs)
(4) Use an "online full-text delivery service"
(5) Request it from the author
6. What are the commonly used search methods?
6. Answer: Commonly used search methods include:
Free-text search, subject-heading search, classification search, author search, institution search, citation search, limited search, etc.
7. Briefly describe the principle of information retrieval.
7. Answer: The principle of information retrieval is to compare the features of the query, which describe the information a particular user needs, with the retrieval identifiers under which information is stored, and to find the information that matches those features exactly or approximately. In essence, it is a matching process: the user's information need is compared with, and selected from, the information stored in the information collection.
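The matching process described above can be sketched with a small inverted index. This is a simplified illustration, not how any particular database works: the stored records, the query, and the `min_overlap` parameter (used here to stand in for "consistent or basically consistent") are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(stored):
    """Map each retrieval identifier to the set of documents stored under it."""
    index = defaultdict(set)
    for doc_id, identifiers in stored.items():
        for ident in identifiers:
            index[ident].add(doc_id)
    return index


def match(index, query_identifiers, min_overlap):
    """Compare query features with stored identifiers and keep close matches."""
    counts = defaultdict(int)
    for ident in query_identifiers:
        for doc_id in index.get(ident, ()):
            counts[doc_id] += 1
    matched = [(doc_id, n) for doc_id, n in counts.items() if n >= min_overlap]
    # Best matches first; ties broken by document id.
    return sorted(matched, key=lambda m: (-m[1], m[0]))


# Hypothetical stored records: document id -> retrieval identifiers.
stored = {
    "A": {"asthma", "children", "therapy"},
    "B": {"asthma", "adults"},
    "C": {"diabetes", "therapy"},
}
index = build_index(stored)
query = {"asthma", "children", "therapy"}

exact = match(index, query, min_overlap=3)
loose = match(index, query, min_overlap=1)
print(exact, loose)
```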
8. What steps does a computer search usually involve?
8. Answer: The steps of information retrieval include:
(1) Analyze the search topic and clarify the purpose and requirements
(2) Choose appropriate retrieval tools
(3) Select the search method and determine the search identifiers
(4) Search for references to the literature
(5) Browse the search results and obtain the original documents
9. Briefly describe the arrangement rules of the IM (Index Medicus) subject index.
9. Answer: The arrangement rules of the subject index are as follows:
(1) The entire index is arranged alphabetically by subject heading
(2) Under the same subject heading, subheadings are arranged alphabetically
(3) Document entries are listed under the subject heading, or subject heading/subheading combination, that matches their content; general documents are placed directly under the subject heading, and specific documents under the corresponding subheading; the same entry can appear under several subject headings
(4) Among the entries under the same subject heading or subheading, English documents come first, followed by non-English documents; the translated English titles of non-English documents are enclosed in [ ] to mark the difference
(5) English documents are arranged by abbreviated journal title
(6) Non-English documents are arranged first by language abbreviation, and within the same language by abbreviated journal title
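The six rules above amount to a multi-level sort, which the following Python sketch tries to express as a single sort key. The entry fields and sample entries are hypothetical, and the sketch assumes that an empty subheading marks a general document filed directly under the subject heading.

```python
def sort_key(entry):
    """Sort key following rules (1) to (6) of the subject index."""
    is_foreign = entry["language"] != "Eng"
    return (
        entry["heading"].lower(),                 # (1) subject heading, alphabetical
        entry["subheading"].lower(),              # (2), (3) "" sorts first: general documents
        is_foreign,                               # (4) English before non-English
        entry["language"] if is_foreign else "",  # (6) language abbreviation
        entry["journal"].lower(),                 # (5), (6) abbreviated journal title
    )


def display_title(entry):
    """(4) Translated titles of non-English documents are shown in [ ]."""
    if entry["language"] == "Eng":
        return entry["title"]
    return "[" + entry["title"] + "]"


# Hypothetical entries, invented for this example.
entries = [
    {"heading": "Asthma", "subheading": "therapy", "language": "Ger",
     "journal": "Dtsch Med Wochenschr", "title": "T1"},
    {"heading": "Asthma", "subheading": "therapy", "language": "Eng",
     "journal": "Lancet", "title": "T2"},
    {"heading": "Asthma", "subheading": "", "language": "Eng",
     "journal": "BMJ", "title": "T3"},
    {"heading": "Anemia", "subheading": "diagnosis", "language": "Eng",
     "journal": "Blood", "title": "T4"},
    {"heading": "Asthma", "subheading": "therapy", "language": "Fre",
     "journal": "Presse Med", "title": "T5"},
    {"heading": "Asthma", "subheading": "therapy", "language": "Eng",
     "journal": "Chest", "title": "T6"},
]

ordered = [display_title(e) for e in sorted(entries, key=sort_key)]
print(ordered)
```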
10. What reference systems does the alphabetic list of the "Medical Subject Headings" (MeSH) use? Give examples to illustrate their significance.
10. Answer: The first group: substitution references, used to handle equivalence relationships between terms.
In the MeSH vocabulary, when several terms are synonyms, only the most scientific and commonly used one is adopted as the standardized subject heading, and the others...
Question 7: In the PubMed database, what is the difference between searching with major subject headings only and searching without expanding subordinate subject headings? Use the Chinese version of PubMed; the differences between all of the operations are clear at a glance.
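To make the two options concrete, here is a small Python sketch that builds PubMed query strings. It assumes PubMed's search field tags as I understand them: `[mh]` for MeSH terms (which includes subordinate headings by default), `[majr]` for major topics only (the weighted search), and the `:noexp` suffix to switch off the expansion of subordinate headings. The helper function and the example heading are hypothetical; check the PubMed help pages before relying on the exact tags.

```python
def mesh_query(heading, subheading=None, major_only=False, explode=True):
    """Build a PubMed query string for one MeSH heading (illustrative helper)."""
    term = heading if subheading is None else f"{heading}/{subheading}"
    # "majr" restricts to records where the heading is a major topic.
    tag = "majr" if major_only else "mh"
    # ":noexp" searches the heading itself without its subordinate headings.
    if not explode:
        tag += ":noexp"
    return f'"{term}"[{tag}]'


default_query = mesh_query("Hypertension")
major_query = mesh_query("Hypertension", major_only=True)
no_explode_query = mesh_query("Hypertension", explode=False)
combined_query = mesh_query("Hypertension", "drug therapy",
                            major_only=True, explode=False)
print(default_query, major_query, no_explode_query, combined_query, sep="\n")
```

Restricting to major topics narrows the result to documents that are mainly about the heading, while turning off expansion narrows it to documents indexed with that exact heading rather than any more specific heading beneath it. The two restrictions are independent and can be combined.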
Question 8: Why should weighting be done before contingency table analysis in data mining? Data warehouses, databases, and other information stores hide a great deal of knowledge that can support decision-making in business, scientific research, and other activities. Classification and prediction are two forms of data analysis that can be used to extract models describing important data collections or to predict future data trends. Classification predicts the discrete category (categorical label) of a data object; prediction predicts a continuous value for a data object.
Classification is used in many fields. For example, a classification model built from customer data can be used to assess the risk of bank loans. An important feature of current marketing is its emphasis on customer segmentation, and that is where customer category analysis comes in: with classification techniques from data mining, customers can be divided into categories. When designing a call center, for instance, customers can be divided into those who call frequently, those who occasionally call in large volumes, and those who call steadily, and the model helps the call center find the characteristics that distinguish these types. Such a classification model lets users understand how customers in different behavioral categories are distributed. Other applications include automatic text classification in document retrieval and search engines, and classification-based intrusion detection in the security field. Researchers in machine learning, expert systems, statistics, and neural networks have proposed many specific classification and prediction methods. The classification process can be described briefly as follows:
Training: training set --> feature selection --> training --> classifier
Classification: new sample --> feature selection --> classification --> judgment
Most early data mining classification applications were built on these methods using memory-resident algorithms. Current data mining methods must be able to process large-scale data held in external storage, and must be scalable. The main classification methods are introduced briefly below:
(1) Decision tree
Decision tree induction is a classic classification algorithm. It builds a decision tree in a top-down, recursive, divide-and-conquer manner. The information gain measure is used to select the test attribute at each node of the tree. Rules can be extracted from the resulting decision tree.
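The attribute-selection step mentioned above can be sketched as follows. This shows only how information gain picks the test attribute for one node, not a full tree builder, and it assumes the usual entropy-based definition of information gain. The customer records are invented for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the samples on one attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Choose the test attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical customer records and class labels.
rows = [
    {"calls": "frequent", "region": "north"},
    {"calls": "frequent", "region": "south"},
    {"calls": "occasional", "region": "north"},
    {"calls": "occasional", "region": "south"},
]
labels = ["yes", "yes", "no", "no"]

gain_calls = information_gain(rows, labels, "calls")
gain_region = information_gain(rows, labels, "region")
chosen = best_attribute(rows, labels, ["calls", "region"])
print(gain_calls, gain_region, chosen)
```

A tree builder would apply `best_attribute` at the root, split the samples by the chosen attribute, and repeat on each subset, which is the top-down recursive procedure the text describes.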
(2) KNN method (K-Nearest Neighbor)
The KNN method, or k-nearest-neighbor method, was first proposed by Cover and Hart in 1967 and is theoretically quite mature. The idea is simple and intuitive: if most of the k samples most similar to a given sample (that is, closest to it in feature space) belong to a certain category, the sample belongs to that category too. In making the classification decision, the method looks only at the categories of the one or few nearest samples.
Although the KNN method rests on limit theorems in principle, each category decision involves only a very small number of neighboring samples. Therefore, the method copes relatively well with sample imbalance. In addition, because KNN relies on the limited set of surrounding samples rather than on discriminating class domains, it is better suited than other methods to sample sets whose class domains intersect or overlap heavily.
The disadvantage of the method is its computational cost: for every text to be classified, the distance to all known samples must be computed to find its K nearest neighbors.
A common remedy is to edit the known samples in advance and remove those that contribute little to classification. There is also a Reverse KNN method, which can reduce the computational complexity of the KNN algorithm and make classification more efficient.
The algorithm is better suited to automatic classification of class domains with relatively large sample sizes; class domains with small sample sizes are more prone to misclassification with it.
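A minimal brute-force sketch of the KNN idea described above follows. It assumes Euclidean distance as the similarity measure (the text only says "closest in feature space") and uses invented two-dimensional sample points. Note that it computes the distance to every known sample, which is exactly the cost the text names as the method's disadvantage.

```python
from collections import Counter
from math import dist


def knn_classify(samples, query, k):
    """Assign the majority label among the k known samples nearest to query."""
    # Brute force: measure the distance from the query to every known sample.
    neighbours = sorted(samples, key=lambda s: dist(s[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labelled samples: (feature vector, category).
samples = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]

near_a = knn_classify(samples, (1.1, 1.0), k=3)
near_b = knn_classify(samples, (4.9, 5.0), k=3)
print(near_a, near_b)
```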
(3) SVM method
The SVM method, or Support Vector Machine method, was proposed by Vapnik et al. in 1995 and performs relatively well. It is a machine learning method based on statistical learning theory. Through its learning algorithm, an SVM automatically finds the support vectors that discriminate best between classes. The resulting classifier maximizes the margin between classes, so it adapts well and achieves high classification accuracy. The method needs only the categories of the boundary samples of each class domain to determine the final classification result.
Support to...
Question 9: Literature search questions, 20 points. For the first one, a basic search is enough. The second one needs an advanced search: in the limits bar, enter the author and the time period, then run a secondary search restricted to the second author and to core journals. These are very basic; have a look for yourself.