If a paper contains too much plagiarism and the rate is found to be more than 3%, the consequences are serious. In mild cases graduation is postponed; in severe cases the degree is revoked. How painful it would be to study hard at university only to have your degree taken away.
However, the software is ultimately a mechanism designed by people, with a detection algorithm built into it. Once we understand how the mechanism works, simple modifications are enough to pass the check.
This article is compiled from information collected on the internet. I have sorted out the most important parts for your reference.
Paper plagiarism detection algorithm:
1. Paragraph and format of the paper
Paper detection basically means uploading the whole article. After the upload, the detection software first splits the paper into parts, so the format of the final manuscript has a great influence on the plagiarism rate. Depending on how the paragraphs are divided, short paragraphs of a few dozen words may go undetected. The plagiarism rate can therefore be reduced by dividing the text into more small paragraphs.
2. Database
Detection mostly matches the paper against published graduation theses, journal articles and conference papers; some databases also include articles from the web. It is worth revealing that many books are not included in the detection database. A friend of mine once took a lot of text from a research monograph and it was not detected, so this method is still effective.
3. Chapter transformation
Many students change the order of chapters, or stitch together passages taken from different articles, but this has almost no effect on the detection result. The detection experts' advice is therefore not to assume that copying from a few articles, or even a few dozen, will get you through.
4. Labeling references
How does the detection software distinguish between citing other people's articles and copying them? It is actually very simple. In our papers we add reference marks, but the detection software applies a threshold, which is generally set at 1%. For example, an article has 5, words, and 1% of the article is 5 words. If you copy more than 5 words, it will be judged as plagiarism even if you add references.
5. Word matching
The plagiarism detection system is relatively strict. A match of more than 2 words is treated as plagiarism, provided that the condition in point 4, the reference threshold, is also met.
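The threshold described in point 4 comes down to simple arithmetic: the matched text is compared with a fixed share of the whole article. Below is a minimal sketch of that check in Python. The function name `exceeds_citation_threshold` and the word counts in the usage note are hypothetical, and the 1% default only mirrors the figure quoted above; it is not a confirmed setting of any particular software.

```python
def exceeds_citation_threshold(matched_words: int, total_words: int,
                               threshold: float = 0.01) -> bool:
    """Return True when the matched text is a larger share of the article
    than the threshold allows.

    Assumption: the threshold is a fraction of the article's total word
    count (0.01 = 1%), as described in point 4 above.
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return matched_words / total_words > threshold
```

With illustrative numbers, a 2,000-word article allows a 1% share of 20 words, so `exceeds_citation_threshold(30, 2000)` is true and `exceeds_citation_threshold(10, 2000)` is false.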
Methods for copying and rewriting papers:
First, change the words. Technical terms in the article can be kept, while other words are replaced with synonyms as far as possible;
Second, change how the text is phrased, for example by inverting sentences or switching between passive and active voice. Shuffle the order of paragraphs: split the original text into pieces when copying, then reorganize them.
With the methods above, the plagiarism rate can be effectively reduced.
Here are some examples for your reference:
Example A:
In this paper, the problem of constructing HFS is studied by taking the maximization of equipment utilization as the objective function and adopting the genetic algorithm combining integer coding with real coding. The chromosome coding method and the corresponding genetic operation method proposed in this paper can realize the global random optimization of the research object. The study of car series standard examples shows that the method proposed in this paper has high computational repeatability and efficiency.
Revision A:
In this paper, the construction of HFS problem is studied, which is solved by genetic algorithm combined with integer and real number coding, and the objective function is to maximize equipment utilization. The chromosome coding method and the corresponding genetic algorithm operation in this paper can effectively improve the global search ability of the algorithm. Through the study of some column benchmark examples, the effectiveness of the algorithm in this paper is verified, and it has high computational repeatability and high operational efficiency.
Example B:
Due to the strong regionality of real estate commodities, real estate development enterprises usually need to establish project companies when investing in different regions, and then they will be faced with the choice of establishing branches or subsidiaries. A subsidiary is an independent legal person, while a branch is not. They have differences in tax benefits. The subsidiary is an independent legal person, and is regarded as a taxpayer in the established area, and usually bears the same comprehensive tax payment obligations as other companies in the area; A branch is not an independent legal entity, and it is not regarded as a taxpayer in the area where it is established, and it only bears limited tax obligations. The profits and losses incurred by the branch should be calculated with the head office.
Revision B:
When real estate development enterprises invest in different regions, they need to establish project companies because of the strong regionality of such commodities. At this point, enterprises need to choose whether to establish branches or subsidiaries. The main difference is that subsidiaries are independent legal persons, while branches are not. Secondly, in terms of tax benefits, because the branch is not an independent legal entity, it is not regarded as a taxpayer in the area where the branch is established, and it only bears the tax obligation. The head office needs to calculate the profits and losses of the branch together; the subsidiary is an independent legal person, which is regarded as a legal entity in the region where it is located and needs to bear the same comprehensive tax obligations as other companies in the region.
These are essentially all the ways to revise copied text. Students are advised to read the reference papers until they are familiar with them, then close the documents and write in their own words, so that they are not influenced too much by the references.
Some students have raised a question here: the detection system used by their school is HowNet's academic misconduct detection system, not the Wanfang Data check that can be bought on Taobao for a few yuan.
In fact, the algorithms of the detection systems do not differ much; what differs is mainly the databases. If you have not copied too much, there is no need to fear any system. If you did copy, revise your article first, using the test report as a guide.
After copying, reduce the similarity by rewording: you can keep the middle of a passage this way while changing the wording around it.
First, the principle of duplicate checking
1. HowNet checks a dissertation as a whole upload, and the format may affect the detection result. Submit the paper in its final submission format to minimize this effect; passages of only a few dozen characters may go undetected. For papers of more than 3, characters, this can be ignored.
The comparison databases are: China Academic Journals Online Publishing Database, China Doctoral Dissertation Full-text Database/China Excellent Master Dissertation Full-text Database, National Important Conference Papers Full-text Database, China Important Newspapers Full-text Database, China Patent Full-text Database, Personal Comparison Database and other comparison databases. Some books are not in HowNet Library and cannot be detected.
2. After the paper is uploaded, the system automatically detects its chapter information. If the paper has an automatically generated table of contents, the system checks it chapter by chapter; otherwise it automatically splits the paper into sections and checks those.
3. Some students report that they clearly quoted or copied paragraphs or sentences from other documents, yet these were not detected. This is normal. HowNet (China Knowledge Network) has set a sensitivity threshold for the detection system, which is 5% and is measured per paragraph: plagiarism or quotation below 5% of a paragraph cannot be detected, which is common for clauses or small concepts inside large paragraphs. For example, if there are 1, words in paragraph 1, fewer than 5 words taken from a single document will not be detected. This also points to a way of revising: never quote a whole paragraph from a single article; draw on as many documents as possible and take only a few words from each, so that nothing is detected.
4. How is plagiarism in a paper detected? HowNet marks text in red when 13 consecutive words are similar or copied, but the precondition in point 3 must be met: the total you quoted or copied from document A must reach 5% of the detection paragraph.
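Points 3 and 4 together describe a two-stage rule: a source must first account for at least 5% of a paragraph, and only then is a run of 13 consecutive matching units marked. The sketch below illustrates that logic with Python's standard `difflib` module. It is not HowNet's actual algorithm: the function name `paragraph_flagged` is hypothetical, and counting characters rather than words is an assumption made for illustration.

```python
from difflib import SequenceMatcher


def paragraph_flagged(paragraph: str, source: str,
                      min_run: int = 13, min_share: float = 0.05) -> bool:
    """Return True when both conditions from points 3 and 4 hold.

    Assumptions (not confirmed by the source): units are characters, and
    the overlap is measured with difflib's matching blocks.
    """
    if not paragraph:
        return False
    # autojunk=False disables difflib's heuristic for long, repetitive input.
    matcher = SequenceMatcher(None, paragraph, source, autojunk=False)
    blocks = [b for b in matcher.get_matching_blocks() if b.size > 0]
    matched = sum(b.size for b in blocks)               # characters shared with the source
    longest = max((b.size for b in blocks), default=0)  # longest consecutive run
    share = matched / len(paragraph)
    # Stage 1: the source must reach the per-paragraph share threshold.
    # Stage 2: at least one consecutive run must reach the minimum length.
    return share >= min_share and longest >= min_run
```

Under these assumptions, a 20-character copied run inside a 100-character paragraph is flagged, while the same run inside a 1,000-character paragraph is not, because it stays below the 5% share.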
Second, seven ways to quickly pass the duplicate check
Method 1: translate foreign-language literature
Consult foreign literature in your research field, especially from high-level journals such as Science, Nature and Water Res., translate the theoretical explanations into Chinese and put them in your own paper.
Advantages: 1. Everyone's language habits differ, so translations into Chinese are bound to differ. Even if different people translate the same passage, it will not register as plagiarism. 2. Reading foreign literature improves your English and broadens your professional horizons.
Disadvantages: Hard to carry out for students with poor English, especially poor technical English.
Method 2: change the wording
Rewrite the text of other people's papers according to its meaning: change the sentence structure, switch between active and passive voice, change keywords, or add and remove words. Of course, a classic, well-known sentence should be quoted in the proper way.
Advantages: 1. After the text is modified, according to the HowNet program and algorithm, it will not be marked red as long as no 13 consecutive words and keywords are repeated. 2. You will know every word and sentence of the paper like the back of your hand, and will be like a duck to water at the thesis defense.
Disadvantages: Word-by-word revision is time-consuming and laborious.
Method 3: cut off the head and tail, and change the word order in the middle
Remove the beginning and end of a passage from someone else's paper and keep the middle, then turn the remaining part into passive sentences so that the sentence pattern and structure change. After fixing the resulting language defects yourself, you can get past the duplicate check.
Advantages: It is convenient and quick, and can be applied section by section.
Disadvantages: If your Chinese is not good, it will be very hard work and can take you half a day.
Method 4: convert text to pictures
Capture the text from other people's papers as images and put them in your own paper. At present the HowNet duplicate checking system can only check text, not pictures and tables, so this avoids the check.
Advantages: It is more convenient and faster than changing sentence order.
Disadvantages: If you use it too freely, the pages fill up with pictures, which affects the word count of the whole paper.
Method 5: insert documents
Insert some of the quoted text into the paper as embedded Word documents.
Advantages: This method is even better than method 4, because the inserted document can be edited again later, whereas converted images are inconvenient to modify.
Disadvantages: None found so far.
Method 6: insert spaces
Insert a space between all the words in the article, then reduce the width of the spaces to the minimum. Because duplicate checking works on words, the spaces break up the words and the text naturally slips past the system.
Advantages: It is based on the principle of the duplicate checking system, so it is highly reliable.
Disadvantages: The workload is huge. The process can be done with macros, but writing macros has to be learned.
Method 7: write it yourself
Write your own paper: either do not copy and paste original text when writing, or add quotations correctly.
Advantages: Basically, you will never have to worry about failing the duplicate check, even if the threshold of the system is lowered.
Disadvantages: If there is any disadvantage, it is that more brain cells may die after writing a graduation thesis. Ha ha...
Detailed description of the HowNet system's calculation standard:
1. After reading the introduction of this system, I have a doubt: the system is good at identifying copied text, but what about other things, such as data and charts? If it cannot detect those, isn't it still useless?
Among all kinds of academic misconduct, copying text is the most common and the most serious. The detection system has already reached quite a high level in this respect, and detection of plagiarism and tampering of charts, formulas and data is under development and has made great progress. You are welcome to keep following the progress of this detection system and to offer more critical and constructive comments and suggestions.
2. In this system, results below 39% are displayed in yellow. Does that mean they are within the tolerable limit? Recently I saw the news that a National Social Science Fund project of a teacher at Shanghai University was cancelled because two of his published papers were found to be plagiarized, at 25% and 3% respectively. Where exactly is the warning line?
The percentage only describes the proportion of overlapping text in the checked document; it does not say that the document is plagiarized. It can only be said that the greater the percentage, the more overlapping text there is and the greater the possibility of plagiarism. Whether it is plagiarism, and how severe, must be decided by experts after review.
3. How to prevent the academic misconduct detection system for dissertations from becoming a platform for personal revenge?
This is something we are seriously considering. At present, this detection system is only used by users at the institutional level. We have established a strict management process. At the same time, technically, we have also taken various measures to prevent malicious acts as much as possible, including a series of strict identity authentication, logging and so on.
4. The minimum detection unit is a sentence, so can detection be avoided by changing one or two words in each sentence?
We handle sentences accordingly and have a sentence similarity algorithm. Sentences do not have to be identical to be judged the same. There is a sentence-level similarity algorithm for sentences and a paragraph-level one for paragraphs, and whether a document or paragraph is similar to other documents is calculated on this basis.
5. Suppose the original text is taken from a book, but the same text also appears in a document in the database, because that earlier article quoted the same passage from the book. If my paper marks the text as coming from the book, is this academic plagiarism?
The detection system cannot conclude whether something is plagiarism; in the end there is a manual review. In the situation you describe, experts will make the judgment. Our system only provides clues and evidence so that people can quickly grasp the information about the checked document.
6. How authoritative is the HowNet detection system?
The academic misconduct detection system does not draw conclusions; that is, it does not pass judgment on the checked document, but only shows the similarities between the checked document and other published documents and lists objective facts. Whether the document constitutes academic misconduct must be finally examined and confirmed by experts.
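The answers to questions 2 and 4 describe two ideas: sentences are compared by similarity rather than exact equality, and the reported percentage is the share of overlapping text. The sketch below illustrates both in Python. It is not the HowNet algorithm: the function names, the 0.8 similarity threshold and the use of `difflib.SequenceMatcher` on word lists are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def split_sentences(text: str) -> list[str]:
    """Split on common sentence-ending punctuation (a simplification)."""
    return [s.strip() for s in re.split(r"[.!?;]+", text) if s.strip()]


def sentence_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] computed over lowercase word sequences."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split(),
                           autojunk=False).ratio()


def overlap_percentage(document: str, source: str,
                       threshold: float = 0.8) -> float:
    """Percentage of the document's words that sit in sentences judged
    similar to some source sentence. The threshold is hypothetical."""
    doc_sentences = split_sentences(document)
    src_sentences = split_sentences(source)
    total_words = sum(len(s.split()) for s in doc_sentences)
    if total_words == 0:
        return 0.0
    flagged_words = 0
    for sentence in doc_sentences:
        best = max((sentence_similarity(sentence, s) for s in src_sentences),
                   default=0.0)
        if best >= threshold:
            flagged_words += len(sentence.split())
    return 100.0 * flagged_words / total_words
```

In this sketch a sentence with one word changed out of six still scores about 0.83 against the original, so it is counted as overlapping. As the answers above stress, the resulting percentage is only a measure of overlap; it is not a verdict of plagiarism.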
Rules for revising a paper for the duplicate check:
1. If it is a quotation, do not casually put a full stop right after the citation mark. With a full stop there, the text is counted as plagiarism (although I consider it a quotation), so try to use a semicolon before the quotation ends. Some people put the citation mark after the period, which is wrong; it should go before the period.
2. You can convert the text into a table and hide the border of the table.
3. If you read a lot of foreign-language literature and translate what you quote yourself, then in my personal view you do not need endnotes and can treat the text as your own, because the duplicate database only matches characters and cannot match Chinese against English.
4. Duplicate checking is a matching process that works sentence by sentence. If a sentence is repeated, it is easily judged as duplicated. Therefore, if it is indeed a classic sentence, cite it with a superscript endnote, or frame the quoted content with the original author's name and quotation marks.