What three stages has Chinese character input gone through in China?

Development History

Chinese character input methods have a history of nearly 30 years, from the 1980s to today. Wubi and Pinyin have developed especially rapidly. Since the start of the 21st century, Pinyin input methods with a degree of intelligence have combined Pinyin's ease of learning with large vocabularies and thoughtful user-oriented design. They are loved by the majority of users and have contributed greatly to the popularization of the Internet. The following article, published in 2001, explains the history of input methods in more detail.

Is the input method a hurdle that Chinese people can't overcome?

Once upon a time, when Chinese people talked about computers, they could not do without the word "Chinese". Count on your fingers: which of the most powerful players in China's IT field today did not start out with "Chinese"? Lenovo and Giant relied on the Chinese card (Hanka), Sitong on Chinese typewriters, Founder on Chinese printing and typesetting, Chinese Star and Sina on Chinese platforms, and so on. In the field of Chinese computing, the most basic problem is the input and output of Chinese characters. Whoever solved it well would take the lead in the business war and have a chance to win.

Is it really so simple that solving input and output makes a lot of money? Such a naive conclusion might have held 10 years ago. Wang Xuan, known as the "contemporary Bi Sheng", let our printing say goodbye to lead and fire and brought a qualitative leap in the output of Chinese characters, but that did not make the input of Chinese characters any easier. While Chinese people's money flows into the pockets of Microsoft and Intel, Chinese character input in China earns almost no profit. Apart from Wubi Zixing collecting some licensing fees, Natural Code collecting enough registration fees to support four or five people, and Smart Kuangpin gaining some fresh popularity through hype, who can say how much money anyone has earned?
The Chinese input method, this "lady past her prime" who has tormented every Chinese computer user for years, no longer attracts public attention. It seems she has no choice but to put on makeup and powder, disguise herself, hide her name, or marry into another family. She was born a lady of a famous house, yet ended up in misery. Once there were thousands of horses galloping and bustling crowds; then, for a while, ten thousand horses fell silent and only a few people remained. Even so, some well-meaning people are still quietly doing their best: "I refuse to believe it. With my looks and background, can't I raise a beautiful and charming daughter?" There are also those with no worries about food and clothing who think: "I am idle anyway, so I will raise a kitten or puppy of my own as a daughter and take it out for a walk when I am in the mood. What's it to you?" Having nothing to do today, I want to see which of these mature ladies and beauties deserve a look into their current situation and life story.

1. Keyboard input

About 10 years ago, a friend's Chinese card already provided an input method that produced Chinese characters from continuously typed pinyin strings. However, this friend sold only a few cards and then disappeared from view. This may be the earliest "sentence" input method in actual use. Later, around 1993, version 1.3 of Chinese Star provided the New Pinyin input method, which people still consider one of the most convenient input methods. It displays words in real time: Chinese characters appear as the pinyin is typed, so a typing mistake is visible immediately. If a word is not in the lexicon, the system remembers it after you select it once.
It also had some clever key assignments, such as confirming with the space bar, using the comma and period keys to select among duplicate-code candidates, and fuzzy-sound error tolerance, all of which have become must-have functions of pinyin input methods today. Products of this period also included Smart ABC. In early 1993, an editor at Peking University Press told the author that a very useful input method had been developed, with functions similar to New Pinyin. It could also input some symbols quickly, let the user type English after pressing v without switching modes, and input unfamiliar Chinese characters by strokes. This was the early Smart ABC. It later cooperated well with Microsoft, and almost all Chinese versions of Windows bundled it as OEM software. However, the software had basically taken shape by then, and in almost 10 years it has gained few new features or changes. In fact, both Smart ABC and New Pinyin may technically originate from the PJS project led by Zhang Pu, Li Huiqin and others in the late 1980s, and there may have been cooperation among them.

In 1994, one event played an important role in the development of China's input methods.

On October 18, in a small white building next to the Beijing Language Institute, Longguang Weir and Bondel jointly established the Autoway Chinese platform project team. The company planned to develop Chinese platforms and word processing software for the DOS and Windows environments. Its appetite was huge, and it brought together a strong development team. Because of funding problems, the goal was adjusted in 1995 to focus on developing the Autoway input method. In the second half of that year it launched a plug-in system for the Windows environment for continuous input of Chinese character sequences. The software carried inscriptions from celebrities in the language field such as Zhou Youguang, was promoted on radio and in newspapers, and caused quite a stir. Its biggest feature was that the user only needed to type pinyin continuously: the system automatically displayed the preceding Chinese characters every few pinyin syllables, and once the text reached a certain length, or the user pressed Enter, the Chinese characters were sent to the application. No manual word segmentation was required. However, because of low accuracy and problems with the interface and ease of use, it was never widely adopted.

In 1996, a "dark horse" appeared among input methods, and its maker happened to be called Beijing Dark Horse Company. Its "Dark Horse Input Method" could only be used under DOS. I still have the genuine copy I bought at the time, on a few floppy disks. Under Windows, the input method provided a DOS interface: the user typed the pinyin string of a sentence and pressed Enter, the string was converted into Chinese characters and stored in a text file, and the text was then copied into other applications.
Looking back, this software was very difficult to use, but with the manufacturer's experience, data and accumulated funds from Chinese proofreading, it developed step by step, and in 2001 it was still being improved and upgraded. Both "Zitong" and "Dark Horse" claimed to have pioneered whole-sentence input of Chinese characters (also known as sentence input). In fact, besides the Chinese card mentioned earlier, the idea can be traced back to Harbin Institute of Technology in the late 1980s. Wang Xiaolong, then a doctoral student there, researched Chinese word segmentation, applied for an 863 project, and wrote the paper "Minimum Segmentation Problem and Its Solution". He later developed the InSun input method, an input system based on whole sentences. In the early 1990s it was used only for demonstrations and achievement exhibitions. Reportedly it was occasionally sold to some Japanese companies for use in special-purpose typewriters for certain industries, but for many years nothing more was heard of it. In the mid-1990s it was sold to Microsoft for US$100,000, which was of course quite a good price. From then on, starting with the Chinese version of Windows 95, there was the "Microsoft Pinyin Input Method" that everyone has seen.

Despite many critics, Microsoft took a similar approach to obtain Smart ABC as well, and gave it to Chinese users "for free". But this "free" is in form only. In fact the price is included in the Windows operating system and is ultimately charged to the user. As a result, input method developers and manufacturers suffer. Moreover, the Pinyin input method provided by Microsoft is not necessarily easy to use. Someone once joked that this input method is like wiping your nose when you have a cold.
By rights, when your nose runs a little you should wipe it at once, rather than wait until it has run down to your mouth. Microsoft Pinyin does not work that way. It lets you type for a long time and then go back to make corrections. Because its level of intelligence is not high, its mistakes are baffling. When copying from a manuscript this is tolerable, but when composing as you type, by the time you go looking for mistakes you have forgotten which words you meant to choose. The Chinese characters cannot be displayed in step with the typed pinyin (Microsoft Pinyin lags by one character, Zitong lags by several characters, and Dark Horse requires final confirmation before any Chinese characters appear), the conversion error rate is high, and selecting Chinese characters while correcting pinyin is inconvenient. These became the Achilles' heel of early sentence input methods and greatly limited their use. People continued to use New Pinyin or Smart ABC, but these had the disadvantages of not supporting GBK Chinese characters and of going a long time without functional upgrades. Together with the immaturity of sentence input, this brought Chinese input methods to an unprecedented low.

The silence was broken in 1998, thanks first to the emergence of shareware. As the Internet became more popular and its influence grew, individual developers began to emerge as a new force. New input methods such as Pinyin Star, Universal Code (Universal Wubi) and Smart Wubi appeared.

Pinyin Star was invented by Tan Yajun. It is an input system for single characters, words, phrases and sentences that includes Quanpin (full pinyin), Shuangpin (double pinyin) and Tanma (Tan code). Perhaps because the author recognized the strengths and weaknesses of both traditional word input and sentence input, he designed a fully "real-time display" method. However much pinyin is entered, the Chinese characters are displayed as each letter is pressed, so the user notices a pinyin error immediately. Because it supports automatic word segmentation and whole-sentence input, the user does not need to worry about whether a word or a sentence is being entered; the system handles both. If a word is missing, the system learns and saves it automatically. It thus appears to combine the convenience of word input with the intelligence of whole-sentence input. It is also worth mentioning that words or whole sentences can be entered using Shuangpin, or Tanma with radicals or strokes, which further speeds up typing and is probably not possible with other input methods. The input method fits on a single floppy disk. The program is small and stable, with few runtime errors, and its whole-sentence intelligence has reached a practical level. After the software was put on the Internet in 1997-1998, the response was strong, and some of its functions were imitated by later input methods, such as "real-time display", entering symbols in the same way as pinyin, intelligent recognition of digits, punctuation and symbols, and quick selection among multiple pinyin encodings. At the time it was one of the few input methods that could be used under NT as well as Win9x, and it had a large user base. The Chinese Star website recommended it for download for a long time, and Kingsoft's WPS2000 bundled it across the board.

"Flying Bird" commented at the end of 1999 that "Pinyin Star 2000 is significantly better than Microsoft Pinyin Input Method (version 2.0) in terms of functionality, and it is definitely a dazzling star." There may be some flattery in this, but it does show that combining the convenience of word input with the intelligence of whole-sentence input in a pinyin method is one direction for input methods. Pinyin Star uses plug-in technology, similar to Chinese platforms such as Chinese Star or Richwin, so it can be used under both Chinese and Western-language Windows. This was a good idea in principle, but it also brought problems. Under Chinese Windows, because a plug-in is not in the standard IME format of Windows itself, garbled characters may appear if it is installed incorrectly. This hurt plug-in input methods such as Pinyin Star, and the problem was only truly solved in the latest Pinyin Star 2002 build 1.3. Earlier versions of Pinyin Star also lacked a mode that displays the pinyin and the Chinese characters on two lines at the same time. To correct a pinyin error, the user could press the square bracket keys [ or ] to turn Chinese characters back into pinyin and move the cursor at the same time, without using the left and right arrow keys and with relatively little finger movement. This was a good design, but it differed from the traditional way of working, so users did not know about it and felt that correcting pinyin was inconvenient. Therefore, from Millennium Edition 2.0 onward, Pinyin Star's interface went completely retro: like Chinese Star's New Pinyin, it provides two lines showing all typed pinyin letters and the automatically converted Chinese character string. (Note: this information is out of date.
The current Pinyin Star is designed entirely as a standard IME and uses the same mechanism as Microsoft Pinyin and Google Pinyin.) Designing an input method with plug-in technology also has unique benefits: it overcomes defects of standard IMEs (such as Smart ABC), which may drop punctuation, cannot be used under Western-language Windows, and cannot follow the cursor in Western application software.

Another direction for input methods is diversification of functions. The representative here is "Universal Code", now called "Universal Wubi". Universal Code is a word input method that combines Pinyin, Wubi, English and strokes, and it can use all of them without switching. For example, to input the word for "apple", you can type its pinyin "pingguo", its Wubi code, or the English word "apple". For Pinyin or Wubi users accustomed to traditional input methods, Universal Code is therefore easy to pick up.
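The idea of reaching one word through several encodings without switching modes can be illustrated with a small lookup table. This is only a sketch under stated assumptions, not how Universal Code is actually implemented: the names `build_index`, `lookup` and `SAMPLE_ENTRIES` are hypothetical, and the sample codes (in particular the Wubi code "agjs") are illustrative and should be checked against a real code table.

```python
from collections import defaultdict

# Hypothetical sample lexicon: each word lists every code that should produce it.
# The codes are illustrative only; verify them against a real code table.
SAMPLE_ENTRIES = {
    "苹果": ["pingguo", "agjs", "apple"],   # pinyin, assumed Wubi code, English
    "电脑": ["diannao", "computer"],        # pinyin, English
}

def build_index(entries):
    """Invert the lexicon so that every code maps to the words it can produce."""
    index = defaultdict(list)
    for word, codes in entries.items():
        for code in codes:
            if word not in index[code]:
                index[code].append(word)
    return index

def lookup(index, typed):
    """Return candidate words for the typed code, whatever scheme it belongs to."""
    return list(index.get(typed.lower(), []))

index = build_index(SAMPLE_ENTRIES)
print(lookup(index, "pingguo"))
print(lookup(index, "apple"))
```

Because all schemes share one index, the user never has to tell the program which scheme is in use; the cost is that codes from different schemes can collide, which adds to the duplicate-code problem.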

Wubi input method

In its early versions, Universal Code was designed mainly around Pinyin, so like New Pinyin it could create words in real time. However, its Pinyin function was weak, far inferior to Pinyin Star and New Pinyin. Someone therefore suggested that Deng Shiqiang, the author of Universal Code, focus on Wubi and promote "Universal Wubi" while still supporting multiple input methods. The system has developed well since and once won a "Top Ten Shareware" title.

The biggest shortcoming of this input method is that it has too many menu options and a messy menu design, which leaves users at a loss. Garbled characters under Chinese Windows and a small stock of Pinyin characters and words also limit its wider adoption.

Smart Wubi is another example of absorbing the essence of Wubi and carrying it forward. Wangma Company probably never dreamed that so many people would be working out ideas on its behalf. Smart Wubi has put a great deal of effort into Wubi: it gives Wubi coding hints, indicates whether a word exists in the lexicon, and lets a multi-character string that has been entered before be input quickly with a short string of Wubi abbreviations. Its vocabulary is also large (with Wubi coding, the larger the vocabulary, the more duplicate codes), which is why many users like it. However, the software itself has design quality problems. The interface is unattractive, the menus are messy, and the operation keys are assigned arbitrarily, which fully reflects the limitations of one-person shareware.

By 1999, several other Pinyin input methods had appeared: Pinyin Jiajia, Free Pinyin and Koala. Pinyin Jiajia is the comeback work of Liao Hengyi, who had taken part in the design of Chinese Star's New Pinyin. It is compact, stable, and has a sensible key design. It also has some new functions, such as typing Western text without switching, inputting unfamiliar Chinese characters by strokes as Smart ABC does, and using Jianpin (abbreviated pinyin) to input various symbols quickly. These made it popular among word input users. Together with Pinyin Star, Smart Wubi and Universal Wubi, it was bundled as OEM software in the Great Wall Chinese Hurricane sale. However, Pinyin Jiajia has an obvious shortcoming: its lexicon is too small.
If you input more than two characters in a row, you have to keep selecting words and pressing the space bar to confirm. The biggest feature of Free Pinyin is that its source code is public (its operation and functions offer little that is new), so many input method enthusiasts have used it as a reference for writing their own input methods.

When the Koala input method was first released, it was promoted on the Tsinghua BBS. Its operation imitated Chinese Star's New Pinyin almost completely, but it fixed New Pinyin's problem of very small fonts on some systems and was well received by netizens. From the beginning, Koala's authors stated in the software description that the software was for sale. It was later sold to Ziguang Company and improved into the Ziguang Pinyin Input Method in 2000. The biggest feature of this input method is that it stays completely faithful to New Pinyin's way of working while providing a large lexicon. Later versions such as 2.2 and 2.3 added intelligent word combination: the user can type a pinyin string of up to 9 characters continuously and the system converts it into Chinese characters automatically, whether or not the word exists in the lexicon, by predicting the combination of words from word frequency. This makes operation more fluent. It is also worth mentioning that Ziguang Pinyin is good at absorbing the strengths of other input methods, such as Pinyin Star's real-time display, intelligent symbol recognition and custom strings, and Pinyin Jiajia's ability to type Western text directly without switching. It eventually became a favorite of users. However, Ziguang Pinyin has some obvious shortcomings. Because of flaws in its program design, it is not as stable as Pinyin Star and Pinyin Jiajia.
In many versions, input method engine errors occur frequently, and when the user lexicon grows large it can become corrupted and unusable. Version 2.3 improved matters, but the screen still flickers when switching applications, the input bar appears and disappears in some Western software such as Dreamweaver, and garbled characters appear in some applications, all of which interfere with normal use.

In 2000, the well-established Xintiandi was split up and Chinese Star Company was founded, mainly promoting a whole-sentence input method called Smart Kuangpin. In essence it resembles the sentence input of Microsoft Pinyin, Dark Horse and Pinyin Star, but the company is very good at publicity. As soon as Smart Kuangpin I was released, the company launched an overwhelming advertising campaign and announced that it was the first to offer whole-sentence input. In 2001 it released the upgraded Smart Kuangpin II. Smart Kuangpin gave the input method field a shot in the arm. Although Chinese Star still has not made much money, China has begun to pay attention to it again as the former software overlord of China's IT industry. In 2000, China, like the rest of the world, was in the midst of an Internet craze. Chinese Star's former competitor, Sitong Lifang, had already completed its early rounds of financing, established itself as Sina, China's number one Chinese portal, and was preparing to list on NASDAQ in the United States. No wonder Chinese Star needed some attention at that time.
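The word-frequency-based conversion of a continuous pinyin string, as described above for Ziguang Pinyin's intelligent word combination, can be sketched with a small dynamic program. This is a minimal illustration under assumptions, not the actual algorithm of Ziguang Pinyin or any other product: the lexicon, its frequencies and the function name `convert` are all hypothetical, and a simple unigram (independent word frequency) model is assumed.

```python
import math

# Hypothetical toy lexicon: pinyin spelling -> list of (word, relative frequency).
LEXICON = {
    "zhong": [("中", 50), ("种", 20)],
    "wen": [("文", 40), ("问", 30)],
    "zhongwen": [("中文", 80)],
    "shu": [("书", 40), ("数", 30)],
    "ru": [("入", 40), ("如", 35)],
    "shuru": [("输入", 90)],
    "fa": [("法", 60), ("发", 50)],
    "shurufa": [("输入法", 70)],
}
TOTAL = sum(freq for cands in LEXICON.values() for _, freq in cands)

def convert(pinyin):
    """Return the highest-scoring text for an unsegmented pinyin string, or None."""
    n = len(pinyin)
    # best[i] holds (log-probability, text) for the best conversion of pinyin[:i].
    best = [None] * (n + 1)
    best[0] = (0.0, "")
    for end in range(1, n + 1):
        for start in range(end):
            if best[start] is None:
                continue  # pinyin[:start] cannot be segmented at all
            for word, freq in LEXICON.get(pinyin[start:end], []):
                score = best[start][0] + math.log(freq / TOTAL)
                if best[end] is None or score > best[end][0]:
                    best[end] = (score, best[start][1] + word)
    return best[n][1] if best[n] else None

print(convert("zhongwenshurufa"))
```

Each lexicon hit multiplies in a probability below 1, so segmentations with fewer, more frequent words win, and the user does not have to mark word boundaries by hand. A practical system would also need syllable validation, learning from user corrections, and a far larger lexicon.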

The interface of Smart Kuangpin is quite good. Colors and fonts can be customized, and the input window can be resized freely like any Windows window. Correcting pinyin and selecting among duplicate codes are improved compared with Microsoft Pinyin, and the accuracy of pinyin-to-character conversion is also fairly good. Especially after it had learned a large number of classical poems and famous sayings, Smart Kuangpin was once known as the most intelligent input method, but its self-learning ability is not as good as that of Pinyin Star and Pinyin Jiajia. Self-learning shows in two ways. The first is entering a single pinyin string: if the result is wrong the first time, you correct it, and the next time you type the same pinyin or abbreviated pinyin you should get the required result. Traditional word input methods differ in how well they do this. The second is learning the relevant words from the sentence being entered, which is harder, and no current system does it satisfactorily. The obvious shortcoming of Smart Kuangpin is that it is too bulky. To raise conversion accuracy by 1% to 2%, it adds hundreds of megabytes of disk overhead, making an input method more bloated than an operating system. Probably only someone desperate could have thought of such a trick.

There is also a program called Natural Code, a veteran input method with many subtleties in its functional design. It encodes characters by combining sound and shape, using Shuangpin plus radicals or strokes, which provides a fast way to input Chinese characters, and it features a large lexicon. It was popular in the DOS era, and its program design was also quite distinctive. After the move to Windows, however, its development was slow, and its menu design was poorly thought out and rather messy. It had the same problems as the Universal Wubi and Smart Wubi introduced earlier.
In addition, an NT version was slow to arrive, so many long-time users reluctantly gave it up and threw themselves into the embrace of new input methods. In 2000, Natural Code was also influenced by the whole-sentence trend and introduced a whole-sentence input function that was slow to convert, low in accuracy and hard to correct, so difficult to use that it was not practical. The new version launched in 2001 greatly improved whole-sentence input. Its method of using Chinese radical codes to select among duplicate codes without switching is cleverly designed. If it can be refined further to reduce the complexity and ambiguity of operation, it still has a promising future.

Voice input and pen input

Keyboard input methods, in use for many years, suddenly came under fierce attack around 1998. The charges were simply that Wubi was too troublesome because the character roots had to be memorized, and that Pinyin was simple but had too many duplicate codes and was slow to type. For the real heroes, it was said, look to the present: voice and pen input. Major manufacturers including IBM, Microsoft, Motorola, Zhongzi and Ziguang launched their own speaker-independent voice input systems or cursive handwriting input systems, and for a while they were aggressive in marketing and media promotion. However, the author believes that these two input methods are acceptable but are not the proper way to input Chinese characters. If you doubt it, see how much share these two methods hold in a few years.

Chinese speech input is derived from speech recognition technology, which usually uses Markov models for statistical processing and rule-based methods for disambiguation. For example, in ordinary speech, when we say a single character, others may not understand it because of homophones. When we say a whole word, it becomes more likely that others will understand. When we say a whole sentence, everyone understands. This is because the characters and words in discourse are related to each other. By analyzing this relatedness statistically, we obtain quantitative relationships between the collocations of commonly used words. Based on these quantitative relationships, the computer can show "intelligence" within a certain range. To recognize recorded speech, it is sometimes necessary to add certain language rules to supplement the statistical methods and raise the machine's level of intelligence. Making machines "understand" what people say is a very beautiful but very difficult thing.
Research on it can drive the development of many technologies, and its results can be applied in many areas, such as voice control of instruments and, of course, the input of Chinese characters. In the mid-to-late 1990s, IBM finally launched ViaVoice, a speaker-independent continuous speech recognition system, which is currently the leader in speech recognition. In recent years, a group of Chinese researchers in Chinese speech recognition have joined foreign companies. Drawing on those companies' ample funds and on knowledge or results acquired in domestic research institutes or universities, they built a huge Chinese language database (also called a corpus) and launched a Mandarin voice input system that achieves high-speed input of more than 150 characters per minute. Similar systems exist in China as well. To prove that voice input was advanced and practical, several keyboard versus voice competitions were also held.
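The statistical idea described above, that a lone syllable is ambiguous but neighboring characters make the right reading more likely, can be illustrated with a first-order (bigram) Viterbi search over adjacent-pair counts. This is a minimal sketch under assumptions, not the method of ViaVoice or any other product: the candidate table, the pair counts and the names `pair_score` and `decode` are hypothetical, and smoothed log counts stand in for real probabilities estimated from a corpus.

```python
import math

# Hypothetical homophone table: syllable -> candidate characters.
CANDIDATES = {
    "shi": ["是", "市", "事"],
    "bei": ["北", "被"],
    "jing": ["京", "经"],
}

# Hypothetical counts of adjacent character pairs from a made-up corpus.
BIGRAM_COUNTS = {
    ("北", "京"): 90, ("北", "经"): 1,
    ("被", "京"): 1, ("被", "经"): 2,
    ("京", "市"): 40, ("京", "是"): 10, ("京", "事"): 2,
    ("经", "是"): 5, ("经", "市"): 1, ("经", "事"): 3,
}

def pair_score(prev, cur):
    """Log of an add-one smoothed pair count, so unseen pairs remain possible."""
    return math.log(BIGRAM_COUNTS.get((prev, cur), 0) + 1)

def decode(syllables):
    """Viterbi search for the character sequence with the highest total pair score."""
    # paths maps each possible last character to (score, characters so far).
    paths = {c: (0.0, c) for c in CANDIDATES[syllables[0]]}
    for syl in syllables[1:]:
        new_paths = {}
        for cur in CANDIDATES[syl]:
            # Keep only the best way of arriving at this character.
            score, text = max(
                (paths[prev][0] + pair_score(prev, cur), paths[prev][1])
                for prev in paths
            )
            new_paths[cur] = (score, text + cur)
        paths = new_paths
    return max(paths.values())[1]

print(decode(["bei", "jing", "shi"]))
```

With a single syllable every candidate scores the same, which mirrors the ambiguity of a lone character; the pair counts only start to separate candidates once context is available. Real systems combine such language statistics with acoustic scores and, as the text notes, with language rules.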

In the first voice versus keyboard input competition, held in 10 major cities in the first half of 1998, the fastest contestants using voice input were faster than those using the keyboard, which fully verified the saying that "the mouth is faster than the hands". For a time, the future of voice input looked brilliant. However, voice input has some weaknesses that are currently difficult to overcome. First, it requires a quiet environment and accurate, loud pronunciation. Because the system relies on context, one error triggers a chain of errors, and with an accent the results are even harder to imagine. If professional data entry clerks adopted this method, today's spacious computer rooms would have to be divided into small soundproof booths, and people would be exhausted from reading manuscripts aloud for hours on end. Non-professionals mostly work in a "think and type" way, composing directly on the computer while thinking. Voice input requires accurate and fluent speech, which leaves too little time for thinking. Second, the system must learn the user's pronunciation before it can be used normally, which adds complexity and inconvenience. Because language environments vary so widely, no corpus can be exhaustive, however large. The ability to learn and acquire new knowledge automatically needs to be strengthened, and a truly practical system is still a long way off.

Besides voice input, another hot spot is pen input. It had already begun to heat up a year before the launch of ViaVoice, although the heat came from manufacturers' promotion rather than from the market. In fact, basically practical handwritten Chinese character input has been available since 1997. It adopts a pattern recognition method based on semantics and syntax, starting from the four levels of pen segment, stroke, root and whole character.
To a certain extent, this solved the problem of the recognition rate for online handwritten Chinese characters. Outstanding products include "Hanwang 99" from China National Self-Group Co., Ltd. and "Huipi" from Motorola. However, input is slow, use is inconvenient, and the eyes tire especially quickly over long sessions. These are insurmountable obstacles for handwriting input. Because the writing tablet and the screen are separate, when the user looks at the tablet while writing, the strokes easily stray. In a Windows environment, the "pen" can easily lose the handle of the window being written in, and this is true even when writing in full-screen mode. If the user instead stares at the screen while writing, the eyes tire very easily, making it impossible to enter large amounts of text. Handwriting input will therefore be popular only among certain groups, such as people who are not familiar with computers and need to enter only a few Chinese characters, or those who need to sign their names. Handheld PDAs can also use pen input, because the small size of the device makes keyboard input inconvenient. At the time, manufacturers proclaimed "a pen for every machine" and claimed the "pen" would become standard equipment like the keyboard and mouse. There was an uproar for a while, and, just like last year's "Internet economy", a thick bubble formed. Over the past few years, pen input has almost disappeared from computers, especially the most widely used PCs. Instead, it has become popular in small, single-function handheld computers such as "Business Communication".

Looking across Chinese character input, pinyin input still struggles to achieve the best results.
Wubi input, helped by the lead it built among users who memorized the character roots, can still reap some benefits. Voice input, spoken into a microphone, and the "magic pens and celestial fingers" that "doodle" on a tablet seem to have achieved little that is new. Can the Chinese people overcome the input hurdle?

===================

Beyond the input methods mentioned in the article above, new input methods have appeared one after another since 2006, including Google Pinyin, Sogou Pinyin and QQ Pinyin. Together with the earlier Microsoft Pinyin, Ziguang (now Ziguang Huayu) Pinyin, Pinyin Jiajia, Pinyin Star, Natural Code and Smart Kuangpin, as well as the many Wubi input methods, glyph-based codes and sound-shape codes, they make up an era of Chinese character input in which a hundred flowers bloom in the Internet age, adding more convenience to our lives. Thanks are due to the inventors and developers of these input methods.