Note on the basic concepts of Wnn

This note introduces the general concepts of Wnn. It focuses mainly on two products: the free software Wnn4.2 and the proprietary Wnn6.

What is Wnn?

Wnn is a Kana-Kanji translation system developed by a joint project of Kyoto University, OMRON Corporation (formerly Tateishi Electronics Co.), and ASTEC Inc. The first public release of Wnn was made in 1987 for the UNIX operating system. The name "Wnn" is an acronym for the Japanese sentence "Watashino Namaeha Nakanodesu" (literally, "my name is Nakano"). It derives from a goal of the project: to develop a system powerful enough to translate such a whole sentence at once. In those days that goal was technically challenging.

The source code was written in C and distributed freely. Consequently, Wnn spread widely across workstation platforms and became a de facto standard Kana-Kanji translation system for UNIX operating systems.

One of the most significant points is that Wnn works in a client-server manner. The server portion of Wnn, jserver, acts as the Kana-Kanji translation engine for clients such as "xwnmo", an input system for the X Window System, and "Egg" for "Nemacs" or "Mule", which are discussed below.

Figure: Modules that construct Wnn.

In version 4.1, released in 1991, support for Chinese and some European languages was integrated into Wnn, making it multilingual. The latest free software line, Wnn4.2, added multiphrase translation for Korean as well as support for X11R6.

Figure: Wnn's history
(excerpted from [3] )

A commercial Wnn product is available from Omron Software Corporation for both UNIX and Microsoft Windows95 operating systems. Wnn6, released in 1995, is for some major variants of UNIX. Wnn95, released in 1996, is for Windows95. Wnn95 is released separately for each supported language: Japanese, Chinese, and Korean.


The client-server structure of Wnn

The Wnn system consists of a server part, jserver, and client parts such as uum. The server and clients communicate with each other over TCP/IP, so they can be hosted on different machines as long as they can reach each other over IP.

Figure: The client-server structure of Wnn
(excerpted from [3] )
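
The exchange between a client and the server over TCP/IP can be pictured with a small sketch. Everything here is an assumption made for illustration: the one-line-per-request text protocol, the names ToyHandler, start_toy_server and request_translation, and the fixed lookup table. It is not the real protocol spoken between jserver and its clients.

```python
import socket
import socketserver
import threading

# Hypothetical fixed answers; a real server consults dictionaries instead.
TOY_RESULTS = {"わたしのなまえはなかのです": "私の名前は中野です"}


class ToyHandler(socketserver.StreamRequestHandler):
    """Reads one UTF-8 line of Kana and writes back one line of result."""

    def handle(self):
        kana = self.rfile.readline().decode("utf-8").strip()
        # Unknown input is returned unchanged in this toy.
        reply = TOY_RESULTS.get(kana, kana)
        self.wfile.write((reply + "\n").encode("utf-8"))


def start_toy_server(host="127.0.0.1", port=0):
    """Start a toy server in a background thread (port 0 = any free port)."""
    server = socketserver.TCPServer((host, port), ToyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_translation(address, kana):
    """Act as a client: send one Kana line and return the server's reply."""
    with socket.create_connection(address) as conn:
        conn.sendall((kana + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()
```

A caller would start the server with start_toy_server(), pass its server_address to request_translation(), and finish with shutdown() and server_close(). Because the client only needs an address, the server could just as well run on another machine.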

The translation process works as follows. Assume that a user gives the string "わたしのなまえはなかのです", the pre-translation sentence (a sentence in Kana characters), to a Wnn client such as uum or egg. The client passes the string to its server. On receiving the request, the server tries to find an appropriate translation result by consulting its dictionaries together with other information such as a "frequency database". It then returns the result, say "私の名前は中野です", the translated sentence (a sentence consisting of Kana and Kanji characters), to the client.

[Due to the nature of its contents, the paragraph above contains Japanese characters.]
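
The server-side step described above, consulting dictionaries together with frequency information, can be sketched in miniature. This is a toy under stated assumptions: the dictionary entries, the frequency counts, and the greedy longest-match strategy are invented for illustration and are not Wnn's actual algorithm or data format.

```python
# Hypothetical toy dictionary: Kana reading -> list of (candidate, frequency).
# The entries and counts are made up for this example.
TOY_DICTIONARY = {
    "わたし": [("私", 10), ("渡し", 1)],
    "なまえ": [("名前", 8)],
    "なかの": [("中野", 5), ("仲野", 2)],
    "の": [("の", 20)],
    "は": [("は", 20), ("歯", 1)],
    "です": [("です", 20)],
}


def translate(kana: str) -> str:
    """Translate a Kana string with greedy longest-match dictionary lookup."""
    result = []
    pos = 0
    while pos < len(kana):
        # Try the longest reading that starts at the current position first.
        for end in range(len(kana), pos, -1):
            candidates = TOY_DICTIONARY.get(kana[pos:end])
            if candidates:
                # Among the candidates, prefer the most frequently used one.
                best, _frequency = max(candidates, key=lambda c: c[1])
                result.append(best)
                pos = end
                break
        else:
            # No reading matches here: pass the character through unchanged.
            result.append(kana[pos])
            pos += 1
    return "".join(result)
```

With this toy data, translate("わたしのなまえはなかのです") splits the input into わたし, の, なまえ, は, なかの, です and picks the most frequent candidate for each piece. Raising the count of a different candidate, as a frequency database would after user selections, changes which one is chosen.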


Clients of Wnn

The following is a list of known Wnn clients.
uum
The program uum is the standard client of the Wnn system. Note that "uum" is "wnn" rotated by 180 degrees.

xwnmo
Xwnmo is another Wnn client, which works on the X Window System.

egg
Egg is a part of Nemacs (Nihongo Emacs) and Mule (Multilingual Emacs). Egg provides an Emacs Lisp API for Kana-Kanji translation to these localized or internationalized versions of Emacs.

Wnn's 6 characteristics

Wnn has kept evolving and has gained many features. For instance, translation servers for Chinese, Korean, and Taiwanese, as well as their dictionaries, have been integrated. Wnn has also been employed as a standard multilingual input system for the X Window System. Nevertheless, 6 characteristics have been preserved since the first version of Wnn in 1987.

(1) Ability for multiphrase translation

During Wnn's early development stage in 1985, most other Kana-Kanji translation systems could provide only single-phrase translation. A single remark by one of the project's developers, Mr. Shuji Nakano (*), set the goal of the project: "What we shall develop is a system that can translate even a multiphrase sentence such as 'Watashino Namaeha Nakanodesu' ('my name is Nakano') at once!"
(*) Mr. Shuji Nakano, Omron Co.

(2) Client-Server scheme

By centralizing the translation server in a local area network (LAN) environment, the set of dictionaries and data that the server consults can also be unified. Consequently, users can access the same environment even when they are logged on to another machine. Moreover, because only one running server is needed in the LAN, resources such as memory are conserved.

(3) Enabling migration to other systems, or sharing data among users

The dictionaries and their formats have been kept publicly available.

(4) Availability on many platforms

Wnn has been portable because it is written in C.

(5) Letting users develop their own applications

The translation routines are provided as libraries, and their C API has been kept open. In fact, the entire source has been freely available. As a result, ETL (Electrotechnical Laboratory) was able to develop Egg independently of the Wnn consortium.

(6) Letting as many users as possible use it

The source code has been distributed free of charge.

Wnn6, on the other hand, is proprietary software that enhances characteristics 1 to 4 above and abandons 5 and 6. Wnn6 was developed at the Information Technology Research Center of OMRON Co. and is sold by OMRON Software Corporation.


Functions enhanced in Wnn6

Below we look at the features enhanced in Wnn6, following the list of Wnn's characteristics above.

(1) Ability for multiphrase translation
--> Efficiency has been increased by FI technology.

"FI" stands for Flexible Intelligence. Its essential mechanisms are (a) the FI translation mechanism, which takes into account during translation how phrases carrying case information connect to the predicate phrase, and (b) the FI learning mechanism, which learns the relations between phrases.

To make FI practical, the following databases are used: the FI system dictionary, which stores 2.7 million translation patterns, and the FI user dictionaries, one provided for each user. Besides the FI dictionaries, Wnn6 also refers to a system dictionary with a vocabulary of 200 thousand words (about 6 times the 35 thousand words of pubdic in Wnn4).

(2) Client-Server scheme
--> Taking advantage of the Client-Server scheme

Nowadays client-server systems are everywhere. The point, then, is how much advantage a system can take of the scheme. Wnn6 has the following functions, which are practical because it is a client-server system.

Offline learning
To improve translation efficiency and conserve resources, Wnn6 can rearrange the translation frequency database with the command wnnoffline.

Administrative tools
With wnnaccess, system administrators can control users' access to the server by setting permissions on a per-host and per-user basis.

Automatic parameter tuning
7 of the 17 parameters that are critical for translation can be optimized automatically.

More than one jserver runnable on a host
By assigning different port numbers, more than one jserver can run on a single machine.
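
The per-host and per-user permissions listed under Administrative tools above can be pictured as a small lookup table. The sketch below is only an illustration: the class AccessTable, its method names, and the host and user names are hypothetical, and it does not reproduce the command syntax or storage format of wnnaccess.

```python
from dataclasses import dataclass, field


@dataclass
class AccessTable:
    """Hypothetical permission table; not the real wnnaccess data format."""

    allowed_hosts: set = field(default_factory=set)
    allowed_users: set = field(default_factory=set)  # holds (host, user) pairs

    def allow_host(self, host: str) -> None:
        """Permit every user connecting from the given host."""
        self.allowed_hosts.add(host)

    def allow_user(self, host: str, user: str) -> None:
        """Permit one specific user on one specific host."""
        self.allowed_users.add((host, user))

    def is_allowed(self, host: str, user: str) -> bool:
        """Accept if the whole host or the exact (host, user) pair is listed."""
        return host in self.allowed_hosts or (host, user) in self.allowed_users
```

For example, after allow_host("ws1") and allow_user("ws2", "nakano"), any user on ws1 and only nakano on ws2 would be accepted. A server would run such a check once per incoming connection before serving translation requests.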

(3) Enabling migration to other systems, or sharing data among users
--> Enabling migration from other input systems.

Wnn6 includes a converter that translates user dictionaries used with other Japanese input systems, such as ATOK7, ATOK8, VJE-Delta, and EGBRIDGE, into a form Wnn6 can use. Of course, a Wnn4 dictionary can also be migrated to Wnn6.

(4) Availability on many platforms
--> Windows is supported in addition to Unix workstations.
Wnn95 for Windows95 has been released. There is also Wnn6 for Linux/FreeBSD.


References
[1] "UNIX no NIHONGO SHORI GA WAKARUHON SAISHIN Wnn KATSUYOU GAIDO" (in Japanese)
Yoshida, Tomoko et al. Nikkan Kougyou Shinbunsha, 1993.
[2] "KANA KANJI HENKAN SYSTEM" in "UNIX USER" 1995.7 (in Japanese)
Yoshida, Tomoko. SOFT BANK, 1995
[3] "Maruchi ringal kankyou no kouchiku" (CREATING A MULTILINGUAL ENVIRONMENT: Multilingualization Using X Windows, Wnn, Mule, and WWW Browsers) (in Japanese)
Nishikimi, Mikiko et al. Prentice Hall, 1996

Glossary
uum

uum is a front-end processor for Wnn, included as part of the Wnn distribution. uum works on a character terminal (either localized or internationalized). While uum is running, the bottom line of the terminal is reserved for displaying the user's input and the translation results. The Japanese runtime module is also named "uum". The uum programs for Simplified Chinese, Traditional Chinese, and Korean are named "cuum", "tuum", and "kuum" respectively. Although the runtime modules of these programs are distinct, their source code is unified, and compilation options determine which runtime module is built.


xwnmo

Xwnmo is also a front-end processor for Wnn and is likewise included in the Wnn distribution. Xwnmo works on the X Window System as a client in the X sense. As of Wnn version 4.1, when Chinese began to be supported, xwnmo also gained multilingual functionality. The xwnmo included in Wnn4.2 supports Chinese, Korean, and some European languages. Unlike uum, a single xwnmo runtime module can connect to different translation servers for different languages.


Nemacs
Nihongo Emacs

Nemacs is a localized editor based on GNU Emacs that handles English and Japanese. Nemacs was developed at the Electrotechnical Laboratory (ETL) of the Agency of Industrial Science and Technology (AIST), under the Ministry of International Trade and Industry (MITI). As of Nemacs version 2.1, released in June 1988, Egg, a system that communicates directly with jserver, is integrated to ease Japanese text input.

The Nemacs development project ended with the release of version 3.3.2 (codename Fujimusume) in June 1990. The project then turned to developing the multilingual Mule.


Mule
MULtilingual Enhancement to GNU Emacs

Mule is an internationalized editor based on GNU Emacs, also developed at ETL. Mule can handle many character sets, mainly but not only those defined in ISO2022. Users can also define additional character sets for Mule to handle. Development of Mule started in late 1991. Version 1.0 (codename Kiritsubo) was publicly released in August 1993. The latest version, 2.3 (codename Suetsumuhana), released on September 24th 1995, is based on GNU Emacs 19.28.

The work to merge Mule and GNU Emacs has been completed, and the result is available as emacs20. Its latest version is 20.2.

[Fujimusume, Kiritsubo, and Suetsumuhana are the titles of chapters of Genji-Monogatari (The Tale of Genji) by Murasaki Shikibu. Genji-Monogatari is known as the oldest work of literature by a woman in the world. All releases of Nemacs and Mule are named after these titles.]


Egg

Egg (TAMAGO) is an input method for Mule and Nemacs. By communicating with a translation server on the network, Egg provides input translation for these editors. The Egg system that is part of Mule2.3 is called the Egg TAKANA version. With Mule2.3, Egg serves as an input system for Japanese, Korean, and Simplified Chinese. The word Egg is the literal translation of its Japanese name TAMAGO, an acronym for the Japanese sentence "TAkusan MAtasete GOmennasai" (literally, "Sorry to have kept you waiting so long"). The word "TAKANA" comes from the sentence "TAmagoyo KAshikoku NAre" ("TAMAGO, be smarter").

This page has been created by Tomoko Yoshida.
Translation by M. Meiarashi.
Last modified: Wed Aug 25 18:00:41 JST 2004