How to revise your graduation thesis so that it passes the school's automated plagiarism check

The detection system that universities currently use for master's and doctoral theses was developed by CNKI. Until now, however, the software's specific algorithm and judgment criteria have not been made clear.

This article was obtained from an internal staff member of CNKI. It explains the algorithm behind CNKI's anti-plagiarism detection system, how the system decides whether a paper counts as plagiarized, and how to revise a paper so that it passes. It is shared here for everyone's benefit.

Quote:

1. Format requirements

CNKI dissertation detection requires the entire paper to be uploaded. The format may affect the detection results, so submit the paper for testing in its final submission format to keep that effect as small as possible. The effect is limited to a short passage of a few dozen words that may go undetected, and it will not affect passing. The system's algorithm is relatively complex. Each time you revise your paper and test it again, a short passage that was not detected the first time may be flagged as plagiarism (two years of practical experience show that such a passage will not exceed 200 words, and a paper revised a second time will generally show a much lower plagiarism rate).

2. Comparison databases

The comparison databases are: the China Academic Journal Online Publishing Database, the China Doctoral Dissertation Full-text Database / China Excellent Master's Degree Thesis Full-text Database, the China Important Conference Papers Full-text Database, the China Important Newspaper Full-text Database, the China Patent Full-text Database, the personal comparison database, and other comparison databases. Some books are not in the CNKI databases, so copying from them cannot be detected. The CNKI database is the nationally designated database for thesis testing and comparison, and the nationally designated university thesis testing system is the CNKI dissertation testing system. It is currently the most effective and most widely used official testing system. All universities use the CNKI system, which the Ministry of Education adopted for the sake of nationwide fairness in handling academic misconduct.

3. Producing results by segment or by chapter

After a paper is uploaded, the system automatically detects its chapter information. If your school's table-of-contents settings meet the CNKI system's built-in conditions for chapter detection, the system checks the paper chapter by chapter and reports the results by chapter. Otherwise, the system reports the results by segment. Whether the paper is divided by segment or by chapter mainly matters for the threshold discussed in section 4. A reminder: whichever division is used, just keep it consistent with your school's.

4. Can citations be detected?

Some students ask: "I clearly quoted someone else's paragraph or sentence, so why was it not detected?" Others ask: "My quotation cites its source, so why is it counted as plagiarism?" First of all, whether a quotation counts as plagiarism has nothing to do with whether the source is marked, and whether a quotation is detected has nothing to do with how accurate the system is. Both are determined by the system threshold.

CNKI has set a sensitivity threshold for this detection system. The threshold is 3%, calculated from the number of words in a segment (or chapter). Copied or cited text from a single document that amounts to less than 3% of the segment cannot be detected. This is common for short sentences or small concepts inside long passages. For example, if detection segment 1 (Chapter 1) has 10,000 words, then up to 300 words cited from document A (10,000 × 3% = 300) will not be detected. If the text taken from document B exceeds 300 words, then document B's text scattered through the first chapter will be marked in red. It does not matter where in the chapter it appears: even if it is broken up into separate sentences, any run longer than 20 words will be marked.

① This also points students to a modification method: never draw on a single article when borrowing a paragraph. Draw on as many documents as possible and take only a few sentences from each, and the text will not be detected.

② As for why citations are counted as plagiarism, this comes down to the CNKI threshold. Anything above 3% is uniformly treated as plagiarism, which means the dividing line between citation and plagiarism is 3%. Once you exceed it, marking the reference does not help, so students should pay attention to this.

An example: the first chapter of a paper has 5,000 words, so it can quote fewer than 150 words from document A; otherwise the system will treat it as plagiarism. Chapter 2 has 4,000 words, so it can quote fewer than 120 words from document A. Chapter 3 has 8,000 words and Chapter 4 has 7,000 words, so the limits are fewer than 240 and fewer than 210 words respectively, and so on. In short, excessive citation is calculated per chapter, in the same way as plagiarism.
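The per-chapter arithmetic above can be written out as a short sketch. It only reproduces the 3% figure quoted in this article; the function name `citation_limit` and the chapter labels are illustrative assumptions, not part of any CNKI interface.

```python
THRESHOLD_PERCENT = 3  # per-source threshold as stated in this article


def citation_limit(chapter_words: int, percent: int = THRESHOLD_PERCENT) -> int:
    """Word count at which text from one source reaches the threshold in a chapter."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return chapter_words * percent // 100


# Chapter lengths taken from the example in the text.
chapters = {"Chapter 1": 5000, "Chapter 2": 4000, "Chapter 3": 8000, "Chapter 4": 7000}
for name, words in chapters.items():
    print(f"{name}: {words} words -> {citation_limit(words)} words per source")
```

For the four chapters in the example, this gives 150, 120, 240 and 210 words, matching the figures in the text.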

5. How does the system decide that a sentence is plagiarized?

How is plagiarism in a paper detected? CNKI marks in red any similar or copied passage longer than 20 words, but only if the prerequisite in section 4 is met: within a given detection segment (chapter), the total text you quoted or copied from document A must reach 3% of that segment.
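The two conditions can be combined into a small model of the rule as this article describes it. This is a sketch under stated assumptions, not CNKI's actual algorithm: the `Match` type, the function name `flagged_matches`, and the handling of the exact 3% boundary (the article says both "exceeds" and "reach") are all illustrative choices.

```python
from dataclasses import dataclass

THRESHOLD_PERCENT = 3  # per-source share of a chapter, as stated in this article
MIN_RUN_WORDS = 20     # a run must be longer than this to be marked, per this article


@dataclass
class Match:
    source: str  # hypothetical identifier of the matched document
    words: int   # length of one matched run, in words


def flagged_matches(chapter_words, matches):
    """Return the runs that would be marked under the rule as described here."""
    # Total matched words per source within this chapter.
    totals = {}
    for m in matches:
        totals[m.source] = totals.get(m.source, 0) + m.words

    flagged = []
    for m in matches:
        # Assumption: "reach 3%" is read as >=; the article is ambiguous at the boundary.
        reaches_threshold = totals[m.source] * 100 >= chapter_words * THRESHOLD_PERCENT
        if reaches_threshold and m.words > MIN_RUN_WORDS:
            flagged.append(m)
    return flagged


# A 10,000-word chapter: source A totals 55 words, source B totals 365 words.
runs = [Match("A", 25), Match("A", 30), Match("B", 200), Match("B", 150), Match("B", 15)]
for m in flagged_matches(10_000, runs):
    print(m.source, m.words)
```

In this example only the 200-word and 150-word runs from source B are returned: source A stays under 3% of the chapter, and the 15-word run from B is not longer than 20 words.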

6. Modification methods for plagiarism

In addition to the modification method mentioned in section 4, the methods for revising text marked in red include changing words, substituting sentences, changing the mode of expression (turning the original sentence into an inverted, passive, or active sentence, and so on), rearranging the order of paragraphs, and deleting key words or key sentences. Practice has shown that combining these methods effectively reduces the copy ratio and ensures that the paper passes smoothly.

In general, the revised sentence should differ from the original wording as much as possible while still reading smoothly.

Example 1: Consider the following sentence:

There is a difference between the overheating in an overheating fault and the heating that occurs during normal operation of the transformer. The heat source during normal operation is the windings and iron core, namely copper loss and iron loss, whereas a transformer overheating fault arises from accelerated insulation degradation caused by effective thermal stress and has a medium level of energy density.

It is almost entirely marked in red, indicating overlap and high similarity with related documents. After combining the methods above, the sentence can be changed to:

The overheating that occurs in overheating faults is easily associated with the heating of the transformer under normal conditions. The latter results from copper loss and iron loss in its windings and iron core, which is the heat generated during normal operation, whereas a transformer overheating fault results from accelerated insulation deterioration caused by effective thermal stress and has a medium level of energy density.

This modification can cut the plagiarism rate almost in half.

Note on the 300-word figure in section 4: The 300 words mentioned there is an approximate value, not a critical value. The fewer words cited, the harder they are to detect.

Note on the 3% threshold in section 4: The updated CNKI academic misconduct detection system has lowered this threshold from 5% to 3%, which means the system is now stricter about citations. However, with the modification methods described in section 6, passing is still not difficult.

Example 2: Consider the following passage:

3.7.1.2 Put a small amount of fiber into clear water in a transparent cup and stir it. You can see directly that the fiber disperses in three dimensions in suspension and changes little after standing for a long time, indicating that the quality of the synthetic fiber is good; fibers of poor quality may disperse after being stirred, but they will float up into a floc layer after a short time. Poor quality fibers are often difficult to disperse evenly during the actual preparation process of concrete.

This paragraph is completely marked in red. There is only one way to modify it, which is to disrupt the order and reorganize it.

3.7.1.2 Put a small amount of fiber into a transparent container filled with clean water, and observe the changes in the fiber while stirring. If the quality of the synthetic fiber is good, you can see that the fiber is dispersed in a three-dimensional suspension, and its position will not change significantly over time; if the quality of the synthetic fiber is poor, the fibers may disperse during stirring but easily float up to form a floc layer. Poor quality fibers are often difficult to disperse evenly during the actual preparation process of concrete.

Example 3: Consider the next sentence:

Design change requirements proposed by the construction unit or owner must be considered as a whole to determine their necessity. At the same time, a comprehensive analysis must be made of the impact of the design change on the construction period and cost, and the construction plan adjusted if necessary to minimize the adverse impact on the project.

Modify to:

Once the construction unit or owner proposes a design change request, overall consideration must be given and the necessity of the change examined. At the same time, a comprehensive and scientific analysis must be made of the possible impact of the design change on the construction period, cost, and so on, and the construction plan adjusted when a change is unavoidable, so as to minimize the adverse impact on the project as far as possible.