It is said that within a few decades the world will be a society in which two major languages, Chinese and English, coexist. I would say that if we keep to our current way of thinking, this may not come to pass.
For more than a decade, experts and the public debated Chinese character input, each with their own opinions; the debate then gradually cooled until no one paid it any attention. Yet one fundamental factor has gone unnoticed: the inconvenience of expressing Chinese characters orally.
The letters or components of any script need to be expressible in speech. As we all know, each Latin letter not only serves as a phoneme within words but can also be pronounced on its own. Naming a letter is easier than describing it in words. People of all ages who cannot speak English can still say "VCD", "America" and "CCTV" fluently, can "read out" letters over the telephone, and can spell foreign names and words clearly.
In contrast, some components of Chinese characters, such as wood, fire, earth, person and mouth, can of course be used independently, while others, such as Yi, cannot be expressed orally. People often say "bow-long Zhang" (弓长张) and "wood-child Li" (木子李), but characters such as "official", "servant" and "mouse" cannot be described by this kind of "component splitting".
The informatization of Chinese characters is inseparable from the expression of components. A component without a name cannot be described in words, and it is hard to assign it a character code. This is the root cause of the "galloping codes" phenomenon, the flood of competing input codes.
Quite a few middle school students can recite the periodic table of elements from memory, which naturally helps them learn chemistry. But I wonder how many teachers and students of Chinese are equally familiar with the list of Chinese character components. This is not, of course, to say that students of Chinese are not diligent.
What needs to be pointed out is this: if even Chinese majors cannot master a language norm, can it still be offered to the public?
If the public cannot master our Chinese character component specification, can the specification be used effectively for informatization? Chinese characters are assembled from components, so it is right for the formal component list used by Chinese specialists to be standardized strictly. But could there also be a short, practical component list, one suitable for children and helpful to foreigners learning Chinese? Language is used by the public, and so are language norms. Can language norms become more accessible in the information age?
Focusing on the process and ignoring the goal
For many years we have attached great importance to the digital representation of Chinese characters and to the codes assigned to them for computer input, treating this as the goal of Chinese character informatization. In fact, the digital conversion of Chinese characters is only a process. What really deserves attention is the thinking that goes on in the human brain during "man-machine dialogue"; that is the goal.
An informatization standard for Chinese characters should not only suit machine processing but also respect the process of "man-machine dialogue". The first consideration in language informatization is the people who speak and write; turning "strokes" into "characters" in a reasonable way must, of course, fit people's habits of thought.
Around the time of the May Fourth Movement, Chinese moved from the classical language, in which writing and speech were split in two, to the vernacular, and came out of the ivory tower into ordinary homes. This was a milestone in the history of the Chinese language. The vernacular created the conditions for popularizing Chinese, for producing "phonetic codes", and for the wide use of pinyin input.
In 1997 the State Language Commission issued "GB 13000.1 Character Set for Information Processing: Specification of Chinese Character Components", which stipulates the 560 components of the Chinese Character Basic Components List and the rules for their use.
The basic letters of any phonetic script are so few that children and foreigners can learn to recite them quickly. The "component table" of Chinese characters, together with the "component code tables" of the various stroke-based input methods, is vast and hard to memorize.
Why does an army need ranks? An army has many officers, and outsiders cannot remember all their names and their hundreds of positions. But as long as there are ranks, any officer can be addressed. Should we not likewise create more reasonable names and classifications for Chinese character components?
In social life, forms of address are constantly changing. As society develops, relationships between people change, and the ways people address one another naturally change with them. Forms of address carry the marks of both their era and their region, and they spread as people move and mix. The specification and naming of components should likewise change to meet the requirements of informatization.
To stand alongside the 26 Latin letters, Chinese characters must adapt their own rules and restore the names of their components. In the process of digitalization, Chinese characters must keep developing their own theory and keep innovating in order to hold their position.
Sorting benchmarks and sorting transformations for Chinese character components
Many informatization projects fail because data is misunderstood, yet the undaunted keep charging ahead one after another. They have only new models and new versions in mind; they cannot see how data is generated and how it changes, and of course they cannot see the people who manage the data. In the eyes of some bosses, the program is the decisive factor and the data is merely the operator's business.
The information age is an age of order and an age of the digital earth. Those who ignore objective laws and the meaning of "order" can only sink into a quagmire.
In addition to lacking names, Chinese character components also lack symbolic representations and rules for quick sorting. In fact, once components have names or symbol codes, the sorting problem is solved.
Sorting does not form queues arbitrarily; it relies on recognized benchmarks and standards.
The accepted sorting benchmarks are numerical order and Latin alphabetical order, each of which can be used in ascending or descending order.
Computer sorting always comes down to "adjacent competition", much like a duel between heroes. Each comparison is as simple as "1" versus "0" or "p" versus "q". Once every pair of adjacent items meets the sorting criterion, the whole queue is an ordered queue.
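The idea of "adjacent competition" can be sketched in a few lines of Python. This is only an illustration, not a method taken from the component specification: the function names `is_ordered` and `adjacent_sort` are hypothetical, and the ordering used is Python's default character order, in which digits come before lowercase Latin letters.

```python
from typing import List, Sequence


def is_ordered(items: Sequence[str]) -> bool:
    """Return True if every adjacent pair is in ascending order."""
    return all(a <= b for a, b in zip(items, items[1:]))


def adjacent_sort(items: Sequence[str]) -> List[str]:
    """Bubble sort: keep swapping adjacent pairs that are out of order."""
    result = list(items)
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(result) - 1):
            if result[i] > result[i + 1]:
                # The two neighbours "compete"; the smaller one moves forward.
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
    return result


queue = ["q", "p", "1", "0"]
print(is_ordered(queue))              # the raw queue is not ordered
print(adjacent_sort(queue))           # digits first, then letters
print(is_ordered(adjacent_sort(queue)))
```

The check in `is_ordered` looks only at neighbours, which is the point made above: if no adjacent pair violates the criterion, the whole queue is in order.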
It is true that many different comparison standards can be chosen for anything. People have, however, formed the habit of referring to recognized standards, and these are the sorting benchmarks.
Everything should be converted into corresponding numbers or characters as early as possible, and the conversion should be as intuitive and simple as possible.
An intuitive transformation means that a single basic feature of a thing is enough to convert it directly into numbers or characters and, at the same time, to fix its position in the queue relative to related things.
An intuitive transformation serves not only machines but also people's everyday expression.
In the existing Chinese character component specification, the strokes of each component must be counted one by one. Because component stroke counts range from 1 to 16, many components share the same count; there are, for example, 99 components with four strokes. To fix a component's position in the queue, after the strokes have been counted a further feature must be chosen to compare components with the same count. The ordering in the component specification is therefore neither a one-step nor an intuitive transformation.
Intuitive transformation is an important principle of digitalization. The existing Chinese character component specification is sorted by "stroke count" and "stroke shape", which suits only manual dictionary lookup and cannot meet the needs of digital processing.
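The difference between the two-step ordering and a one-step "intuitive transformation" can be made concrete with a small Python sketch. The five components, their stroke counts and their first-stroke classes (1 horizontal, 2 vertical, 3 left-falling, 4 dot, 5 turning) follow common convention; the single-letter codes in `HYPOTHETICAL_CODE` are invented purely for illustration and are not part of any standard.

```python
# (component, stroke count, first-stroke class)
COMPONENTS = [
    ("火", 4, 4),
    ("口", 3, 2),
    ("木", 4, 1),
    ("人", 2, 3),
    ("土", 3, 1),
]

# Hypothetical one-symbol codes, assumed only for this example.
HYPOTHETICAL_CODE = {"人": "A", "土": "B", "口": "C", "木": "D", "火": "E"}


def tied_groups(components):
    """Group components by stroke count and keep only the groups with ties."""
    groups = {}
    for name, strokes, _first in components:
        groups.setdefault(strokes, []).append(name)
    return {s: names for s, names in groups.items() if len(names) > 1}


def sort_by_strokes_only(components):
    """First step only: ties stay in whatever order they arrived in."""
    return sorted(components, key=lambda c: c[1])


def sort_by_strokes_then_shape(components):
    """Two steps: stroke count first, then first-stroke class to break ties."""
    return sorted(components, key=lambda c: (c[1], c[2]))


def sort_by_single_code(components):
    """One step: a single code fixes the position directly."""
    return sorted(components, key=lambda c: HYPOTHETICAL_CODE[c[0]])


print(tied_groups(COMPONENTS))
print([c[0] for c in sort_by_strokes_only(COMPONENTS)])
print([c[0] for c in sort_by_strokes_then_shape(COMPONENTS)])
print([c[0] for c in sort_by_single_code(COMPONENTS)])
```

With stroke count alone, the three-stroke and four-stroke components remain tied, so a second feature is needed. If each component carried one code, a single lookup would be enough to place it.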
Basic strokes of Chinese characters
For a long time, the study and application of Chinese characters has concentrated on the two ends of character structure, the stroke and the whole character. Stroke shape, stroke order, character form and meaning have become the main standards for teaching and assessing Chinese characters.
Because the writing tool was only ever a knife or a pen, the strokes and stroke order of characters were emphasized in the writing process, while the components were played down. Chinese characters have grown ever more "horizontal and vertical" since the original pictographs, owing to the widening scope of their use and to changes in writing tools. Most fundamentally, however, the progress of society and the rise in productivity demanded not only faster writing but also faster recording.
Fast writing requires simpler structures, fewer strokes, simpler forms and straighter strokes;
Distinguishing polysemous words requires finer distinctions and a larger number of characters, so that meaning can be expressed precisely.
Tools influence and determine the characteristics of what they produce. The knife that carved tortoise shells in ancient times, the brush of agricultural society, and the pencil, fountain pen and ballpoint pen of the industrial age all meant that Chinese characters are written stroke by stroke.
Chinese characters have been tempered for thousands of years, and every character has acquired a beautiful structure. The widespread use of Chinese characters created a demand for pens, and the writing brush was brought to its full potential in China. The brush and calligraphy are great contributions of the Chinese people to human civilization, organically combining writing, art and the writer's emotion.
Restoring the original information of components
The "components" that played an important role in the creation of Chinese characters were played down as the characters developed. Over the long years, a great deal of information about components and their origins was slowly lost, yet this did not affect the use or teaching of Chinese characters.
In the teaching of Chinese characters, the number and order of strokes became the main parameters for describing and summarizing characters. Some components, as radicals, also serve as labels for classifying Chinese characters, yet 10 of these radicals have "lost" their names.
Even within China, immersed in thousands of years of tradition, learning Chinese characters takes a long time. Teaching Chinese characters by computer is still a hard problem in China, and it is harder still abroad. So far there is no mature method for people overseas to learn Chinese characters on a computer.
The names of components are gradually being forgotten, so it is hard to describe a component directly. Fortunately, the Chinese habit of "explaining one character with another" is convenient and widespread, so components remain easy to explain. This in turn hastens the "forgetting" of component names.
For example, the character "official" differs from "Zhang" and "Li". The difference is that the name of one of its components, the lower half of "official", is missing, so the character cannot be described aloud by splitting it into components. Yet through thousands of years of writing "official" face to face with a pen, this caused no trouble in learning or understanding it.
For the purposes of communication and informatization, "official" cannot be described by splitting, the way "Li" and "Zhang" are. It is, however, just as accurate to identify "official" by association with a familiar word that contains it.
People can use associated words to describe Chinese characters, but they cannot hold a man-machine dialogue this way. Computers are far from matching human thought and judgment, and they cannot adapt to the differing habits of different people. We naturally wonder why we cannot restore a name for the lower part of "official", so that the character can be spelled out the way "wood-child Li" is.
The widespread use of computers has brought a qualitative change in the tools of "writing". A computer can input a component as a whole and so play down the individual strokes. In this way, the informatization of Chinese characters may "return" to the stage at which characters were created, bringing the overall image of each component to the fore.
The reference point for Chinese character informatization cannot be today's norms for modern Chinese character usage; it should be the period in which components were being most actively created. In other words, we cannot take the language habits of 2000, nor the level of language use of 1980, as the benchmark for Chinese character informatization; the benchmark should be sought in the time when the characters were created.
Therefore, the informatization of Chinese characters should restore the original information of their components and link the original components to the codes assigned today.
Components have become the fault line in Chinese character informatization
Some people speak of a "Chinese character tree", although the tree is in fact a notion used more often for computers and software. Computer input methods for Chinese characters can also be viewed as a tree. At the first level they divide into "shape codes" and "phonetic codes". "Shape codes" assign codes in one of two ways: based on components or based on strokes.
The former can express a Chinese character as a set of component names, which matches the way our ancestors thought when they created characters. Once the link between component names and character codes is abandoned, an input method loses the basic feature of language, the unity of sound and form. It is cut off from people's language habits, and people naturally find it hard to learn.
Stroke input was once considered tedious and found little favor on computers. Then came an unexpected turn: vast numbers of mobile phone users wanted to get online. Entering stroke codes on the phone's numeric keypad, with intelligent prompts on the screen, turned out to be the best choice. On a small keypad, attending only to strokes and ignoring components becomes an advantage. The keypad is small, but the intelligence in the chip makes up for the lack of space. Whatever the final verdict, this technology has been widely installed on mobile phones.
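How a numeric keypad and a little "intelligence in the chip" can work together may be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular phone: it assumes the common five-class stroke convention (key 1 horizontal, 2 vertical, 3 left-falling, 4 dot or right-falling, 5 turning), a tiny hand-made table `STROKE_CODES`, and a hypothetical function `candidates` that plays the role of the on-screen prompt.

```python
from typing import List

# Assumed key assignment for the five basic stroke classes.
STROKE_KEYS = {
    "1": "horizontal",
    "2": "vertical",
    "3": "left-falling",
    "4": "dot or right-falling",
    "5": "turning",
}

# Stroke-order codes for a few simple characters (illustrative table only).
STROKE_CODES = {
    "十": "12",
    "土": "121",
    "王": "1121",
    "木": "1234",
    "大": "134",
    "口": "251",
    "人": "34",
    "火": "4334",
}


def candidates(keys_pressed: str) -> List[str]:
    """List characters whose stroke code starts with the keys pressed so far.

    Shorter codes come first, so the simplest matching character leads the list.
    """
    matches = [
        (len(code), code, char)
        for char, code in STROKE_CODES.items()
        if code.startswith(keys_pressed)
    ]
    return [char for _length, _code, char in sorted(matches)]


print(candidates("1"))    # everything that starts with a horizontal stroke
print(candidates("12"))   # horizontal, then vertical
print(candidates("34"))   # left-falling, then right-falling
```

Each key press narrows the candidate list, so the user never has to think about components at all, which is the advantage described above.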
It may seem redundant to describe what everyone already knows. But as society develops rapidly, technology and the markets it serves advance in a spiral, and methods and technologies that are useless today may find a use tomorrow. True as this is, the value of a bright idea is all too often measured by the "status" of its owner.
The unity of cultural tradition and computer application remains our concern. Progress in tools can not only bring progress in thinking but also help us find, somewhere in our long history, the mode of writing best suited to computers. Can our ancestors' component-based way of creating characters be combined directly with the computer? We should not only locate components in that long history but also renew them in the process of informatization.
With the development of science and technology, people can extract whatever parameters they need from history, astronomy, archaeology, earthquake records, scientific dating and so on. For example, 200 experts worked together for five years to complete the Xia-Shang-Zhou chronology, which pushed the earliest dated year in Chinese history back by 1229 years. This not only extended the chronology of Chinese history by 1900 years but also deepened people's understanding of the subject. All the history, resources and culture on the earth are useful and must not be destroyed or discarded lightly.
Computers renew components
To find an input method for the public, we go back 6000 years. Components are the basis of the whole character, but when the pen is the tool, the characteristics of components do not stand out. The arrival of the computer can be said to give components a chance to be rejuvenated, or at least a chance to bring components and people into an organic correspondence.
During the few centuries of the industrial era, Western society began to use electromechanical equipment to encode and decode text. Transplanted to China, this approach could only yield the four-digit "telegraph code", which depends on rote memorization.
If the typewriter freed Western society from handwritten letters in careless cursive, and the teletypewriter (including, of course, the keyboards of computer terminals) took the public beyond Morse code and created an environment for free communication, then, because of the three-tier structure of Chinese characters, even introducing these devices does not make it possible to imitate Western text informatization directly. For Chinese characters to be entered by machine, it is in fact necessary to change a habit of writing with a pen that developed over thousands of years, and with it a habitual way of thinking. This point is rarely noticed in the process of informatization or digitalization.
"Fission" and "Aggregation" of Chinese Characters
After computers came into wide use in the mid-1980s, the input of Chinese characters became a hard problem. Whatever one thinks of the various codes, most people still find them "hard to learn". Why is it so hard to "marry" our own mother tongue to the computer? Not having to encode, and not having to memorize codes, is the wish of many Internet users, especially the large number who are middle-aged or elderly.
Paging stations use computers too, yet we need no training to "spell out characters" to the operators at a Chinese-language paging station. Everyone can make themselves understood. For example, we can say "wood-child Li"; we can also identify "official" by naming a familiar word that contains it.
Note that "spelling out characters" over the telephone has no gestures to help it and no screen or blackboard to show anything; it depends entirely on speech. It works because Chinese people can express themselves according to their own language habits, using one character to explain and illustrate another. There is both the fission of splitting a character apart and the aggregation of association. So can Chinese characters be entered into a computer directly by "speaking" them? The answer is yes, but the way characters are described must be standardized.
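The "fission" side of this idea can be sketched as a simple lookup: a spoken sequence of component names is turned into components, and the components are matched against a table of known compositions. This is only a minimal sketch under stated assumptions. The tables `SPOKEN_NAMES` and `COMPOSITIONS` and the function `character_from_speech` are hypothetical, the spoken names are written as toneless pinyin, and real speech recognition is left out entirely.

```python
from typing import Dict, Optional, Sequence, Tuple

# Standardized spoken names for components (toneless pinyin, assumed here).
SPOKEN_NAMES: Dict[str, str] = {
    "mu": "木",
    "zi": "子",
    "gong": "弓",
    "chang": "长",
}

# Known compositions: an ordered tuple of components gives one character.
COMPOSITIONS: Dict[Tuple[str, ...], str] = {
    ("木", "子"): "李",
    ("弓", "长"): "张",
}


def character_from_speech(spoken: Sequence[str]) -> Optional[str]:
    """Map a spoken sequence of component names to a character.

    Returns None when a component has no standardized name, or when the
    sequence of components does not match any known character.
    """
    components = []
    for name in spoken:
        component = SPOKEN_NAMES.get(name)
        if component is None:
            # A nameless component cannot be spoken, so input fails here.
            return None
        components.append(component)
    return COMPOSITIONS.get(tuple(components))


print(character_from_speech(["mu", "zi"]))       # wood-child Li
print(character_from_speech(["gong", "chang"]))  # bow-long Zhang
print(character_from_speech(["mu", "unnamed"]))  # fails: no name for the part
```

The failing case is the one this essay is concerned with: as long as a component has no agreed name, no spoken description of the character can reach the machine.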
The development of language springs from social progress, and language in turn promotes social progress. Keeping writing in step with speech plays an important role in standardizing and recording language.
As a nation rises and falls, and especially as science and technology develop, its language rises and falls with it. The future of Chinese characters cannot, of course, be taken lightly.
Coding must not abandon the rules of language
As society has changed and progressed, the meanings of many components are no longer widely known, but the components themselves survive and can be found even in many commonly used characters.
The character "brigade" (旅) contains a component that is not written as a character on its own. It should be the "flag" that the earliest creators of the character had in mind. In ancient times, people held flags and formed ranks to fight for the survival of the tribe.
Today "brigade" is used more often in the word for "tourism", and it so happens that tour guides also hold small flags as they lead their guests around those ancient battlefields. Our ancestors' thinking has come back to life in the modern world, a coincidence found in many Chinese characters.
A number of components that have lost their names do not hinder learning or writing, but they do create difficulties for the informatization of Chinese characters. In the modern language environment, encoding Chinese characters for public use by simply assigning codes to components is bound to run into difficulties.
"If names are not correct, speech will not flow": a nameless component is hard to express. To assign codes to common radicals and character-forming components, the lost original information must be found and recovered so that the informatization of Chinese characters is complete.
What the public needs is only "language"; "coding" does not belong to the public. This has held in all times and places, in China and abroad. The question is whether we can create a computer input method based on the rules of Chinese. The answer is yes.
First, the description of Chinese characters for informatization should be based on "natural language";
Second, the foundation of Chinese character informatization lies in the description of components;
Third, the names of components should be recovered and the missing information found.
Reviving a lively, vigorous and self-reliant national culture
British and American network experts and linguists also say that Chinese will become the most widely used language on the Internet, yet the obstacle to the development of Chinese web pages is the complexity of Chinese input.
Mr. Hu Shi pointed out in 1914: "Typewriters are made for writing; writing is not made for typewriters. To want to abolish a script because one cannot build a typewriter for it is ten million times more foolish than cutting off one's toes to fit the shoe. Besides, Chinese characters are not necessarily suitable for typewriters. " This passage by Mr. Hu still applies very well, even in the "computer age".
A few years ago, however, input methods tended to attend to the correspondence between the strokes of a Chinese character and the keys typed, while ignoring the thinking that goes with writing or departing from the natural habits of everyday speech. As a result, even the input methods people found acceptable were easily forgotten once they had gone unused for a while.
Faced with questions of civilization and culture, tradition and progress, can we take pride in a "progress" that abandons our traditional cultural heritage? China's long-standing culture is a powerful cohesive force for the Chinese nation. We have a responsibility to maintain and carry forward China's traditional culture in the digital revolution and to combine modernization organically with tradition, and in this the use of Chinese characters is the most important part.
The component specification is not intended to deal with "input method"
Symbols are not difficult to invent. If China really has the seeds of science, it will certainly create many simple symbols. China used to have only technology, not science. Western countries do not take all their symbols from their alphabets; they have also invented many others, such as + - * / =. Inventing symbols is a very simple matter, no harder than memorizing a few letters a day; the hardest part is getting society to accept them.
To solve the "input method" problem it is of course necessary to formulate an information standard for Chinese characters, but we must not forget that the problem of Chinese characters is not merely a problem of input methods. The standardization of Chinese character informatization is therefore aimed not only at amateur coding enthusiasts but also at meeting the needs of Chinese character informatization all over the world.