I paid for a duplicate check, and everything it flagged turned out to be citations. How do I get around this?

I found an article online, "Paper Duplicate-Checking Algorithm and Revision Guide". Read it carefully and it will definitely help.

At present, the detection system used by colleges and universities for master's and doctoral dissertations is developed by HowNet (CNKI). Until now, however, the software's specific algorithm and judgment criteria have not been clear. This article was obtained from HowNet internal staff. It reveals the algorithm of HowNet's anti-plagiarism detection system, how the system judges a paper to be plagiarized, and tips for revising a paper so that it passes. I am posting it for everyone's benefit.

1. Requirements for the format

HowNet checks the whole paper as uploaded, and the format may affect the results. To minimize that effect, submit the paper for checking in its final submission format. A few short passages of a few dozen words may go undetected, but this will not affect passing. The system's algorithm is fairly complicated, and after each revision a check may turn up a small plagiarized passage that the first check missed. (Two years of practical experience show that such a passage will not exceed 2 words, and that the plagiarism rate drops sharply after the second revision.)

2. Comparison databases

The comparison databases are: the China Academic Journal Online Publishing Database, the China Doctoral Dissertation Full-text Database, the China Excellent Master Dissertation Full-text Database, the full-text database of important conference papers in China, the full-text database of important newspapers in China, the full-text database of patents in China, the personal comparison database, and other comparison databases. Some books are not in the HowNet database, so plagiarism from them cannot be detected. The HowNet database is the nationally designated database for paper detection and comparison, and the HowNet dissertation detection system is the nationally designated detection system for university papers. It is currently the most effective and most widely used official detection system, and all universities use it. The Ministry of Education adopted it so that academic misconduct is judged by the same standard across the country.

3. Results by sections and chapters

After the paper is uploaded, the system automatically detects its chapter information. If your school's table-of-contents settings meet the HowNet system's built-in conditions for recognizing chapters, the system reports results by chapter; otherwise it reports them by segment. Whether results are reported by chapter or by segment matters mainly for the threshold described in section 4. Honesty Paper notes that, in either case, the results can be kept consistent with the school's.

4. Can cited passages be detected?

Some students ask, "I clearly quoted other people's paragraphs or sentences, so why weren't they detected?" Others ask, "My quotation is marked with its source, so why does it still count as plagiarism?"

First of all, whether a citation counts as plagiarism has nothing to do with marking the source, and whether a citation is detected has nothing to do with the system's accuracy. Both depend on the system's threshold. HowNet has set the sensitivity threshold of this detection system at 3%, calculated against the word count of the paragraph (or chapter). Plagiarism or citation of a single document that stays below 3% cannot be detected, which is typical of clauses or small concepts inside long paragraphs. For example, if detection paragraph 1 (Chapter 1) has 10,000 words, up to 300 words cited from document A (10,000 × 3% = 300) will not be detected. If more than 300 words are cited from document B, every passage from document B in Chapter 1 will be marked in red wherever it appears in the chapter; even if it is broken up into separate sentences, each piece will be marked as long as it exceeds 2 words. This also points to a revision method: do not copy whole paragraphs from a single article. Draw on as many documents as possible and take only a few words from each, so that nothing is detected.

Second, some students ask why citations count as plagiarism. This is again a matter of HowNet's threshold. Anything above 3% is treated as plagiarism; in other words, the boundary between citation and plagiarism is 3%. Once you exceed it, marking the quotation will not help, so students, please take note. Here is an example. Chapter 1 of a paper has 5,000 words, so in Chapter 1 we can quote fewer than 150 words from document A; otherwise the system will treat it as plagiarism. Chapter 2 has 4,000 words, so we can quote fewer than 120 words from document A. Chapter 3 has 8,000 words and Chapter 4 has 7,000 words, so the limits are fewer than 240 words and fewer than 210 words respectively, and so on. To sum up, whether citations exceed the limit is calculated chapter by chapter, in the same way as plagiarism.
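The per-chapter arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 3% rule as this post describes it, not HowNet's actual implementation. The function name `per_source_limit` is hypothetical, and the chapter word counts (5,000, 4,000, 8,000 and 7,000) are assumptions chosen for the example.

```python
def per_source_limit(chapter_words: int, threshold_percent: int = 3) -> int:
    """Word count from a single source at which the threshold is reached.

    Assumption: the threshold is a flat percentage of the chapter's word
    count, as the post describes. Integer arithmetic avoids float rounding.
    """
    return chapter_words * threshold_percent // 100


# Assumed chapter lengths, for illustration only.
chapters = {
    "Chapter 1": 5000,
    "Chapter 2": 4000,
    "Chapter 3": 8000,
    "Chapter 4": 7000,
}

limits = {name: per_source_limit(words) for name, words in chapters.items()}

for name, limit in limits.items():
    print(f"{name}: threshold reached at {limit} words from one source")
```

Because the limit scales with the chapter's length, the same quotation can fall under the threshold in a long chapter and over it in a short one.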

5. How does the system flag a sentence as plagiarized?

How is plagiarism in a paper detected? In HowNet paper detection, any run of more than 2 words that is similar to or copied from a source will be marked in red, but only if the precondition in section 4 is met: the total you quoted or plagiarized from document A must reach 3% of each detected paragraph (chapter).
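The two conditions described above (a per-source total that reaches 3% of the chapter, plus a minimum length for each matching run) can be modeled roughly as follows. This is a simplified sketch of the rule as this post states it, not HowNet's real algorithm. The names `Match` and `flagged_matches` are hypothetical, and `min_run` is left as a required parameter because the run length given in the post is unclear; the value 50 in the usage lines is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Match:
    source: str  # identifier of the matched document (hypothetical field)
    length: int  # length of the matching run, in words


def flagged_matches(chapter_words, matches, min_run, threshold_percent=3):
    """Return the matches that would be marked under the two-step rule."""
    # Step 1: total up the matched words per source within this chapter.
    totals = {}
    for m in matches:
        totals[m.source] = totals.get(m.source, 0) + m.length

    # Step 2: keep a run only if its source reaches the threshold
    # and the run itself is longer than min_run.
    return [
        m for m in matches
        if totals[m.source] * 100 >= chapter_words * threshold_percent
        and m.length > min_run
    ]


# Illustrative data: a 10,000-word chapter with matches from two sources.
matches = [
    Match("A", 120), Match("A", 100),                  # A totals 220, under 300
    Match("B", 200), Match("B", 150), Match("B", 10),  # B totals 360, over 300
]
flagged = flagged_matches(10000, matches, min_run=50)
print([(m.source, m.length) for m in flagged])
```

In this model nothing from source A is marked, because its total stays under the threshold, while the two longer runs from source B are marked and the 10-word run is not.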

6. Methods for revising plagiarized passages

Apart from the method mentioned in section 4, there are other ways to revise text marked in red: changing words, changing sentences, changing how something is described (turning the original sentence into an inverted, passive, or active sentence, and so on), rearranging the order of paragraphs, and deleting key words and sentences. Practice has shown that combining these methods effectively reduces the copy ratio and lets the paper pass smoothly. Generally speaking, the revised sentence must differ from the original while still reading smoothly.

Example 1: There is a difference between overheating in an overheating fault and heating in the normal operation of a transformer. During normal operation, the heat comes from the winding and the iron core, that is, from copper loss and iron loss, while the overheating fault of a transformer is accelerated deterioration of insulation caused by effective thermal stress, which has a medium level of energy density.

This sentence is marked almost entirely in red, indicating overlap and high similarity with other documents. Combining the methods above, it can be changed to: Overheating in an overheating fault is easily confused with the heating that occurs during normal transformer operation. Heating during normal operation is caused by copper loss and iron loss in the winding and core, while the overheating fault of a transformer is accelerated insulation deterioration caused by effective thermal stress, which has a medium level of energy density.

This modification can cut the plagiarism rate almost in half.

① The 300 words referred to in section 4 are an approximate value, not a critical value. The fewer words cited, the less likely they are to be detected.

② The updated CNKI academic misconduct detection system has lowered this threshold to 3% from 5% in the past, which means the system is now stricter about citation. It is still not difficult to pass using the revision methods in section 6.

Example 2: Consider the following passage:

3.7.1.2 When a small amount of fiber is put into the clear water of a transparent glass and stirred, it can be intuitively seen that the fiber is dispersed in a three-dimensional suspended state and does not change much over a long time, indicating that the quality of the synthetic fiber is good; poor-quality fibers may disperse after being stirred, but they will soon float up into a flocculent layer. Poor-quality fibers are not easy to disperse evenly in the actual preparation process of concrete.

This paragraph is marked entirely in red, and there is only one way to revise it: break up the order and reorganize it.

3.7.1.2 Put a small amount of fiber into a transparent container filled with clean water, and observe how the fibers change while stirring. If the synthetic fibers are of good quality, you can intuitively see that they are dispersed in a three-dimensional suspended state and that their position does not change noticeably over time; if the synthetic fibers are of poor quality, they may disperse during stirring but easily float up to form a flocculent layer. Poor-quality fibers are not easy to disperse evenly in the actual preparation process of concrete.

Example 3: The next sentence: The design change requirements put forward by the construction unit or the owner should be considered as a whole, and their necessity should be determined. At the same time, the influence of the design change on the construction period and cost should be comprehensively analyzed. If the change is necessary, the construction plan should be adjusted to minimize the adverse impact on the project.

It is revised as: Once the construction unit or owner puts forward requirements for design changes, it is necessary to consider them as a whole and examine whether the changes are needed. At the same time, it is necessary to make a comprehensive and scientific analysis of the possible impact of the design changes on the construction period and expenses, and to adjust the construction plan when a change is necessary, so as to minimize the adverse impact on the project.