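Departmental Bulletin Paper: 漢字の構造分析に関わる問題 : 漢字字体の構造分解とコード化に基づく計量的分析 [Problems in the Analysis of Kanji Structure: A Quantitative Analysis Based on the Structural Decomposition and Coding of Kanji Forms]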
A Method of the Analysis of Kanji Structure: A New Approach Based on Structural Decomposition and Coding

Galina VOROBEVA, Victor VOROBEV

(9), pp. 215-236, 2015-07, 国立国語研究所 (National Institute for Japanese Language and Linguistics)
ISSN: 2186-134X (print), 2186-1358 (online)
In this study, we review previous research on the analysis of kanji structure and identify several important problems in the field. We then discuss a method of analysis based on a new approach and propose the following solutions. (1) We emphasize the need to develop a standardized system of elements that covers all 2136 Joyo kanji. If such a system were successfully developed, it would become possible to systematize the recognition and acquisition of kanji. (2) By analyzing how previous investigations treat radicals as part of an element system, we find that many of the systems developed so far do not use the traditionally defined set of radicals. (3) Quantitative analysis of kanji form becomes possible once the form is represented clearly and explicitly. To achieve this, we carry out a linear decomposition of kanji, develop a unique code system for kanji form, and construct alphabetic and symbolic code systems. The results of this coding make it possible to measure the frequency of kanji strokes and elements, and open up the possibility of creating a new indicator of the structural complexity of kanji. (4) We show that the hierarchical syntactic structure of sentences and words in English shares some similarities with the hierarchical structure of kanji. We carry out a hierarchical decomposition of kanji and present the resulting structure using tree diagrams and mathematical formulae. We also compare our hierarchical analysis and coding with those of Fujimura (1973) to show the practicality of our approach. (5) We define an indicator of the structural complexity of kanji and classify the 2136 Joyo kanji by their complexity.
