Linking Chan/Seon/Zen Figures and Their Texts: Problems and Developments in the Construction of a Relational Database

Michel Mohr
Hanazono University, International Research Institute for Zen Buddhism
Tsubonouchicho 8-1, Nishinokyo, Nakagyoku, Kyoto 604-8456, Japan

*This file is encoded in Unicode. Should it not be displayed accurately by your browser, please try to switch the character-set to Universal UTF-8.
**This article is a slightly revised version of a presentation at the 2001 Electronic Buddhist Text Institute (EBTI) Conference in Seoul. The proceedings of the conference can be found in the Korean journal Jeon ja bul jeon 3 (2001) published by EBTI.


Issues related to the construction of a database on Buddhist historical figures and their written legacy are discussed in the paper, which deliberately takes the researcher's point of view, reviewing concrete examples rather than elaborating on technical issues. One part of the IRIZ "Zen Knowledge Base" project initiated by Urs App is to establish a unique ID number for each Chan/Seon/Zen figure, thereby enabling each author to be linked with the extant documents. The primary stages of this project having now been completed, the paper presents some initial results and working hypotheses [see endnote], and reflects on wider issues related to the digitization of Buddhist research materials.

1 Chan/Seon/Zen Figures

1.1 Case Study 1: Lineage Charts of the Zengaku daijiten

Everyone involved in Buddhist research is familiar with the Zengaku daijiten 禪學大辭典 ["Large Dictionary of Zen Studies"] compiled at Komazawa University. In many respects this dictionary is far from satisfactory, but it remains a major reference work. Among the materials included is a sequence of charts showing the main lineages of the Chan, Seon and Zen traditions (Zenshû hôkeifu 禪宗法系譜). Although these charts are based in part on legend rather than history, they nevertheless situate many of the major figures who contributed to the development of the Zen schools in China, Korea and Japan. Another distinctive feature is that each page of these charts carries a line number to facilitate the location of each figure listed in the index.

The presentation of these lineage charts actually suggests a matrix, where each particular location can be mathematically defined by its horizontal and vertical coordinates. This gave Urs App and Christian Wittern the idea of adding a number indicating the column. The digits for the page, line, and column would thus represent an ID number pointing to each Chan/Seon/Zen figure. The first section of these charts was included in ZenBase CD 1. For example, the Sixth Patriarch Caoxi Huineng 曹溪慧能 could be identified with the sequence of characters "ZGD-C-04-01-01" where "ZGD" stands for Zengaku daijiten and "C" for China, "04" for page 4, "01" for line 1, and "01" for column 1.
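The composition of such an identifier can be sketched in a few lines of Python. This is only an illustration of the scheme described above: the names ChartPosition, make_zgd_id and parse_zgd_id are hypothetical and do not come from the ZenBase data, and the two-digit padding is inferred from the example "ZGD-C-04-01-01".

```python
from typing import NamedTuple


class ChartPosition(NamedTuple):
    source: str   # "ZGD" for Zengaku daijiten
    region: str   # "C" for China ("K" and "J" occur in other IDs of this paper)
    page: int
    line: int
    column: int


def make_zgd_id(region: str, page: int, line: int, column: int) -> str:
    """Build an ID such as ZGD-C-04-01-01 from the chart coordinates."""
    return f"ZGD-{region}-{page:02d}-{line:02d}-{column:02d}"


def parse_zgd_id(figure_id: str) -> ChartPosition:
    """Split an ID back into source, region, page, line and column."""
    source, region, page, line, column = figure_id.split("-")
    return ChartPosition(source, region, int(page), int(line), int(column))


huineng = make_zgd_id("C", page=4, line=1, column=1)
print(huineng)
print(parse_zgd_id(huineng))
```

Because the ID encodes a position, it can be decoded again without consulting any other table, which is convenient but also ties the key to one particular source.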

If we think in terms of a relational database, one of the first requirements for linking different files (known as "tables" in the jargon of relational databases) is that each piece of information be identified by a unique set of characters (called a "key" or "primary key"). This unique set of characters serves as a common denominator between different types of information. A concrete example would be a database of proper names linked to another database of bibliographic references. To link them, a field (or "column") shared by both databases is required, and it must be unique; that is, there should be only one record (or "row") containing this information. This functions as the "key" for establishing the relation between the different databases. In the case of the Zengaku daijiten charts, the attribution of a unique ID number to each figure comes very close to meeting this requirement. There is only one minor obstacle: some figures appear more than once on the charts. This obstacle can easily be overcome by deciding in each case which occurrence will serve as the "main ID" and which one will serve as the "secondary ID", the latest occurrence usually being chosen as the main one. This choice is arbitrary, but the objective here is not historical accuracy; it is the establishment of a reference number that can function as a basis for relational purposes.
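The role of the key, and the handling of main and secondary IDs, might be sketched as follows with Python's built-in sqlite3 module. This is a minimal illustration and not the structure of the IRIZ database: the table and column names are assumptions, and the secondary ID "ZGD-C-99-99-99" is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE figures (
        main_id TEXT PRIMARY KEY,      -- the unique key: one row per figure
        name    TEXT NOT NULL
    );
    CREATE TABLE secondary_ids (
        secondary_id TEXT PRIMARY KEY, -- other chart positions of the same figure
        main_id      TEXT NOT NULL REFERENCES figures(main_id)
    );
    CREATE TABLE texts (
        title     TEXT NOT NULL,
        author_id TEXT NOT NULL REFERENCES figures(main_id)
    );
""")
conn.execute("INSERT INTO figures VALUES (?, ?)",
             ("ZGD-C-04-01-01", "Caoxi Huineng 曹溪慧能"))
# Invented secondary occurrence, for illustration only.
conn.execute("INSERT INTO secondary_ids VALUES (?, ?)",
             ("ZGD-C-99-99-99", "ZGD-C-04-01-01"))
conn.execute("INSERT INTO texts VALUES (?, ?)",
             ("Platform Sutra", "ZGD-C-04-01-01"))


def resolve(any_id: str) -> str:
    """Map a secondary ID to its main ID; a main ID is returned unchanged."""
    row = conn.execute(
        "SELECT main_id FROM secondary_ids WHERE secondary_id = ?", (any_id,)
    ).fetchone()
    return row[0] if row else any_id


def texts_for(any_id: str) -> list:
    """List the titles linked to a figure through the shared key."""
    rows = conn.execute(
        "SELECT title FROM texts WHERE author_id = ? ORDER BY title",
        (resolve(any_id),),
    ).fetchall()
    return [title for (title,) in rows]


print(texts_for("ZGD-C-99-99-99"))
```

Declaring main_id as a primary key means that a second row with the same ID is rejected, which is exactly the uniqueness requirement described above.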

It should be pointed out here that the requirement for a unique "key" exemplifies the gap that remains between information science and the humanities. Reality is always more complex than the schemes that attempt to describe it. For example, many texts include several layers of authorship, and it would be inaccurate to attribute them to a single author. The Biyanlu 碧巌録 ["The Blue Cliff Record"] is a notorious illustration of such multiple authorship, with a set of cases collected by Xuedou Zhongxian 雪竇重顯 (980-1052) followed by capping phrases added by his disciple Yuanwu Keqin 圜悟克勤 (1063-1135). This issue will have to be addressed when dealing more specifically with bibliographic data.

The first results achieved by this method include a set of Zen-related names from China, Korea and Japan, with each name having a specific ID number. This first group of raw data will help clarify the possibilities and limitations involved in building a relational database. One of our first conclusions drawn from use of these data is that the key should be constructed on the basis of information related to the author, because the human factor always comes first. This excludes the category of anonymous works, however. The scheme shown in Figure 1 might help in understanding this basic structure.

[Image: graphic showing the function of a key]

Figure 1. Key allowing Zen figures to be linked to their texts

An additional remark is necessary here, for some of the basic assumptions of the key scheme are being partially revised. Thanks to remarks by Fred Coulson of the Tibetan Buddhist Resource Center after my EBTI presentation, I have begun considering how it would be possible to establish a unique ID number or key, without having this number associated with a particular source or a particular meaning. In other words, the key should be automatically generated as a random number, the only condition being that the number is unique.

In the average case of one author having produced many works, these works being in turn the object of modern secondary scholarly literature, this approach has proved satisfactory, especially for the secondary literature. In the application used (FileMaker Pro 5.5), this implied the use of the "Random" function, with the simple operation "Random * 1000000000000000" to obtain a 15-digit number, which serves as the key for all publications linked to a specific Chan text. All records of classical literature will thus have their own "Classic ID" and a list of "random IDs" referring to secondary literature. The only drawbacks of this simple approach are:

  1. Duplicating a record involves duplicating the ID, which must then be manually changed.
  2. When the input is done by different users on different databases that are not connected through a LAN, there is no guarantee that identical randomly-generated IDs will not coexist. Checking for duplicates is possible, but it involves reformatting all the links if duplicates are found.
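Both drawbacks can be illustrated with a short Python sketch. It is not the FileMaker procedure itself: Python's random module stands in for the "Random" function, and the helper names new_random_key and duplicate_record are hypothetical.

```python
import random
import uuid


def new_random_key(existing: set) -> str:
    """Return a 15-digit random key that is not yet present in `existing`."""
    while True:
        key = f"{random.randrange(10**15):015d}"
        if key not in existing:        # only possible when all keys are known
            existing.add(key)
            return key


def duplicate_record(record: dict, existing: set) -> dict:
    """Copy a record and give the copy a fresh key (addresses drawback 1)."""
    copy = dict(record)
    copy["id"] = new_random_key(existing)
    return copy


used_keys = set()
original = {"id": new_random_key(used_keys), "title": "sample entry"}
clone = duplicate_record(original, used_keys)
print(original["id"], clone["id"])

# Drawback 2: unconnected databases cannot share `used_keys`. One possible
# mitigation (an assumption, not part of the project) is a 128-bit random
# identifier, for which an accidental clash is far less likely.
print(uuid.uuid4())
```

The duplicate check only works where every existing key is visible to the program generating the new one, which is precisely what is missing when input is done on separate machines.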

1.2 Case Study 2: Biographies in the Kinsei zenrin sôbôden

Most Zen researchers are familiar with the Kinsei zenrin sôbôden 近世禪林僧寶傳 (biographies of Tokugawa period Zen monks), which remains a major resource in the study of Tokugawa and Meiji Zen figures. Compiled by Dokuon Jôshu 獨園承珠 (Ogino 荻野, 1819-1895), who recognized the need for material on masters coming after the period covered in the Enpô dentôroku 延寶傳燈録 (the previous comprehensive biography, published in 1706), the Kinsei zenrin sôbôden is readily available in the indexed facsimile edition issued by Shibunkaku (Kyoto 1973).

The original edition, printed in 1890, includes 119 main figures. In 1938 Gyokugen Buntei 玉鉉文鼎 (Obata 1870-1945) wrote a sequel, the Zoku kinsei zenrin sôbôden (Sequel to the Kinsei zenrin sôbôden), which includes 417 figures. This work too is included in the facsimile edition mentioned above.

Producing an electronic version of this text presented several difficulties, one of them being its frequent use of non-standard forms of Chinese characters (itaiji 異体字). Using this electronic text, I began incorporating data into a database of proper names, aiming for the same results as with the Zengaku daijiten lineage charts.

The linear character of the biographies listed in the Kinsei zenrin sôbôden made it much more complicated to produce a unique ID number. This led to two decisions:

  1. when a figure appears in the Zengaku daijiten lineage charts, this ID number takes precedence;
  2. the attribution of a new ID number to Zen-related figures is to follow a systematic procedure beginning with the Zengaku daijiten lineage charts.

ID numbers for figures not included in those charts are constructed on the basis of the page numbers of a biographical source, if possible the earliest.

Consider one example, the biography of Kansô Zentei 乾叟禪貞 (1624-1680). Kansô Zentei appears in Kinsei zenrin sôbôden Vol. 2, pages 10-12, but is missing from the Zengaku daijiten lineage charts. We can thus attribute the ID number "KZS-2-010-012" to Kansô, with "KZS" indicating the Kinsei zenrin sôbôden, "2" the volume number, and "010-012" the page numbers.

Once this ID number has been established, it is possible to produce HTML filenames or HTML links automatically. A few examples are posted on the IRIZ Web site, in the Japanese section.
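As a rough sketch of this step, the following Python functions build the page-based ID and derive a filename and a link from it. The function names and the ".html" naming convention are assumptions made for the example; the files actually posted on the IRIZ site may be named differently.

```python
from html import escape


def make_kzs_id(volume: int, first_page: int, last_page: int) -> str:
    """Build an ID such as KZS-2-010-012 from volume and page range."""
    return f"KZS-{volume}-{first_page:03d}-{last_page:03d}"


def html_filename(figure_id: str) -> str:
    """Assumed convention: the ID itself, followed by '.html'."""
    return f"{figure_id}.html"


def html_link(figure_id: str, label: str) -> str:
    """Produce an HTML anchor pointing to the page of a figure."""
    return f'<a href="{html_filename(figure_id)}">{escape(label)}</a>'


kanso = make_kzs_id(volume=2, first_page=10, last_page=12)
print(kanso)
print(html_link(kanso, "Kansô Zentei 乾叟禪貞"))
```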

1.3 Lunar Calendar and Solar Calendar: Date Calculation Issues

When dealing with the Kinsei zenrin sôbôden or other traditional Chinese, Korean or Japanese sources, the dates of birth and death are often difficult to calculate. First, most dates are given according to imperial era names, which are based on the lunar calendar. In the case of Japan, this calendar remained in use until 1872 (Meiji 5), when the 3rd day of the 12th month was declared to be 1 January of Meiji 6 (1873) [1].

Until recently, accurate conversion between the two calendars required careful calculations based on the Japanese Chronological Tables, but many scholars simply transposed the traditional dates into the Gregorian calendar with no adjustments whatsoever. This is why so many inaccuracies exist in Japanese reference works, beginning with the Zengaku daijiten. Take the well-known example of Hakuin Ekaku 白隱慧鶴. Hakuin was born in the second year of the Jôkyô 貞享 era, twelfth month, twenty-fifth day. Generally speaking the year Jôkyô 2 corresponds to 1685, but Hakuin's birth date is actually 19 January 1686, owing to the gap between the lunar and solar calendars. The situation is the same with the year of his death, which occurred on the eleventh day of the twelfth month of Meiwa 明和 5. This corresponds to 18 January 1769 (Katô 1985: pp. 39 and 248). Further problems can emerge when giving someone's age during any particular year. The ages given in Hakuin's biography, for example, follow the traditional system, in which a person is considered to be one year of age in the year of his or her birth. Thus, Hakuin is said to have been one year old in Jôkyô 2; if one automatically assumes Jôkyô 2 to correspond to 1685 in the Western solar calendar system, one can easily conclude that he was age one at a time when actually he had yet to be born. For 1695 his biography gives his age as 11, although, again, by the solar calendar he would have been at most nine years old.

At Hanazono University, I have tried to point out this lack of accuracy, without success. To make the point clearer, I usually mention the case of Johann Sebastian Bach, whose year of birth, 1685, would generally be given as Jôkyô 2 in the traditional Japanese system. Who, then, was born first, Bach or Hakuin? Even without knowing the birthday of Bach, one can assert that he was born prior to Hakuin if one is aware that, due to the discrepancies between the two calendars, Hakuin's birth actually occurred only in 1686.

This story is just one illustration that care must be taken even with details. When it comes to calculating traditional dates, there are two points to remember. First, one should be careful when dates of birth or death occur in the eleventh or twelfth month of the lunar year, because there is a strong probability that they occur in the following year according to the solar calendar. Second, when only the person's age and year of death are known, there are often two possibilities for the year of birth. Since, as explained above, a person's year of age is calculated in accordance with the traditional system that gives one year of age at the time of birth, there can be two possible birth years depending upon which month of the lunar year that person's birth took place. There is no way to avoid this problem if the month of birth is unknown.
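The arithmetic behind these two points can be written out in a short Python sketch. It does not convert lunar dates (a tool such as the one mentioned below is needed for that); it only contrasts the traditional age count with the Western one, using the figures given above for Hakuin. The function names are hypothetical, and "nominal year" here means the Western year conventionally equated with the era year.

```python
from datetime import date


def traditional_age(nominal_birth_year: int, nominal_year: int) -> int:
    """Traditional count: age one in the year of birth, plus one each new year."""
    return nominal_year - nominal_birth_year + 1


def western_age(birth: date, on: date) -> int:
    """Completed years of age on a given Gregorian date."""
    had_birthday = (on.month, on.day) >= (birth.month, birth.day)
    return on.year - birth.year - (0 if had_birthday else 1)


def candidate_birth_years(nominal_year: int, age_in_that_year: int) -> tuple:
    """Gregorian birth years compatible with a traditional age in a given year.

    The first candidate is the nominal year of birth; the second covers a
    birth late in the lunar year, which falls in the next Gregorian year.
    """
    nominal_birth = nominal_year - age_in_that_year + 1
    return (nominal_birth, nominal_birth + 1)


hakuin_birth = date(1686, 1, 19)                      # Jôkyô 2, 12th month, 25th day
print(traditional_age(1685, 1695))                    # age by the biography's count
print(western_age(hakuin_birth, date(1695, 12, 31)))  # age at the very end of 1695
print(candidate_birth_years(1695, 11))
```

Without the month of birth, nothing in the data allows a choice between the two candidate years, which is why both should be recorded.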

Now, fortunately, there are convenient tools that spare the researcher the trouble of manually calculating the solar equivalents of traditional dates. A Japanese site provides an excellent DOS utility called WHEN.EXE that converts traditional dates into other systems. The conversion can also be done online.

1.4 Critical Assessment of Traditional Accounts

As in most fields that depend on historical sources, Buddhist studies is always confronted with the need to evaluate the reliability of written documents. In addition to dating sources, the researcher must always question the explicit or implicit agendas of those who composed these sources. This need is especially great in the case of biographies.

For example, the Kinsei zenrin sôbôden reflects, first of all, Dokuon Jôshu's and Gyokugen Buntei's criteria for choosing which figures should be included; each individual biography, furthermore, reveals which aspects of that person's life the compilers considered instructive for their readers. Such biographies therefore necessarily omit less prominent figures and leave out aspects of the biographies that appeared unimportant to the author. Although we cannot avoid using these accounts (often they are the only remaining documents), we should treat them essentially as hagiographies and not take their contents at face value. This methodological remark is aimed only at underlining the fact that the digitization of Buddhist material implies a certain amount of selection and evaluation, and cannot be considered a value-free mechanical task to be delegated to computer specialists.

2 Chan/Seon/Zen Texts

2.1 The Question of Sorting Sources

The first questions in sorting sources are "What is a Chan/Seon/Zen text?" and "Does this particular category of a Chan text have any relevance?"

The delimitation of Chan/Seon/Zen texts is less straightforward than it might seem. One simplistic way to define a Chan text would be to say that, since the prototype of a "school" centered on Chan practice began to emerge during the Tang dynasty, texts produced by representatives of this school are "Chan texts." However, a serious examination of which texts were actually used by representatives of the Chan tradition shows that much importance was given to scriptures usually ascribed to so-called "traditional Buddhism." All sutras may to a certain extent fit into this category, not to speak of non-Buddhist classics.

It seems therefore more productive to avoid establishing a rigid sectarian label of "Chan/Seon/Zen text". One pragmatic approach would be to say that every document used by individuals who claim for themselves the designation "Chan monk", "Chan nun", or "Chan lay practitioner" is a Chan/Seon/Zen text.

If we accept this wide and unrestricted way of handling Chan texts, is it still relevant to make a distinction between Chan texts and Buddhist texts?

Although there are certain distinctively Chan genres of literature, such as "recorded sayings" (yulu 語録) and "lamp histories" (dengshi 燈史), our broad definition clearly extends beyond these. If the premise is that what makes a category meaningful is its distinctiveness, then the question of whether a particular source material is a "Chan text" should perhaps be considered secondary. It might even be more productive to simply consider it as a "document", without precluding its origins or religious setting.

An alternative way of considering the issue is to regard all sectarian categories as fundamentally delusory, and as tending to prevent our correct understanding of specific phases in the "history of ideas". The discipline of religious studies is nevertheless required to respect claims made by the individuals who are the object of research; this means that self-proclaimed affiliations with such and such a branch of the Chan tradition can also be taken as valid information, as an indication that this person claims for himself or herself a link with a type of Buddhism that puts emphasis on meditation. 

We are now increasingly aware that there is no homogeneous "Chan tradition" and that all serious research must take into account the complex maze of reciprocal influences between different schools of meditation such as Tiantai. The range of sources can therefore be considered extendable depending on the focus of the researcher and upon his or her willingness to cope with a variety of "external documents".

Consider one example from my own experience in studying Tôrei Enji 東嶺圓慈 (1721-1792), an important disciple of Hakuin affiliated with the Japanese Rinzai tradition. Tôrei's broad erudition is conspicuous in all of his writings, and he often quotes from what are generally regarded as Shinto sources. For instance, he frequently refers to the apocryphal Sendai kuji hongi taiseikyô 先代舊事本紀大成經. In his Shûmon mujintôron 宗門無盡燈論, Tôrei also quotes from the Toyuke kôtaijin gochinza hongi 豊受皇大神御鎭座本記 and the Jingû gokuhi hôkihongi 神宮極祕寶基本紀, two texts belonging to the Five Scriptures of Ise (Shintô gobusho 神道五部書). Tôrei's familiarity with Shinto scriptures is, of course, rather exceptional among Zen people. Nevertheless, if our methodological rules are to be considered valid they must also apply to unusual cases. Here I would say that the study of Shinto scriptures is so important for the understanding of Tôrei's thought that these documents can, in a loose sense, be considered "Zen texts", not because of their content but because a prominent Zen figure like Tôrei employed them in his writings.

We are thus left with a very broad definition of "Chan/Seon/Zen texts". Now the problem is how to sort these texts in a way that would make them easily retrievable for researchers.

Widely used collections, such as the Taishô shinshû daizôkyô or the Manji zokuzôkyô, provide a simple means to classify texts according to the collections' volume numbers. The case of the Manji zokuzôkyô is a bit more complex due to its numerous printings, but the database recently published on our institute's Web site should help resolve this difficulty. Later I will give some practical examples using these collections.

The purely bibliographical aspect of these different texts will ultimately have to be resolved by librarians, but here I would like to share a simple "tip" on how to search texts in chronological order. The only requirement is to include date information in the name of the file to be searched. Certain texts cannot be accurately dated, but the completion dates of the major "lamp histories" are known. I would therefore have a folder including these texts, with filenames looking as follows:
missing 0790 Lidaifabaoji 歴代法寶記.TXT
missing 0952 Zutangji 祖堂集.TXT
0961 Zongjinglu 宗鏡録.TXT
1004 Jingde chuandenglu 景徳傳燈録.TXT
1036 Tiansheng guangdenglu 天聖廣燈録.TXT
missing 1101 Jianzhong jingguo xudenglu 建中靖國續燈録.TXT
1107 Linjianlu 林間録.TXT
missing 1135 Zongmen tongyaoji 宗門統要集.TXT
missing 1204 Jiatai pudenglu 嘉泰普燈録.TXT
and so on...

The dates above rely on Yanagida Seizan's "Zenseki kaidai", with "missing" indicating that there is no electronic version yet and that the text should be searched manually. The advantage is that a search over this folder reports all occurrences of a particular expression in chronological order. This is much better than any dictionary, and even enables the linguist to trace the evolution of a particular expression. I use Matt Brunk's SpeedSearch on a Macintosh, but the same can be done with Fgrep and a DOS batchfile (with shorter filenames) or with an editor including Grep search options, such as Hidemaru.
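For readers without these utilities, the same tip can be sketched in Python. This is a minimal illustration, assuming that each available file is UTF-8 text whose name begins with a four-digit year followed by a space; the function name search_chronologically is hypothetical, and the demonstration uses shortened ASCII filenames and dummy contents, not quotations from the texts.

```python
import re
import tempfile
from pathlib import Path

YEAR_PREFIX = re.compile(r"^(\d{4}) ")


def search_chronologically(folder, expression):
    """Yield (year, filename, count) for each file containing the expression,
    ordered by the year at the start of the filename."""
    dated = []
    for path in Path(folder).glob("*.TXT"):
        match = YEAR_PREFIX.match(path.name)
        if match:                      # files without a year prefix are skipped
            dated.append((int(match.group(1)), path))
    for year, path in sorted(dated):
        count = path.read_text(encoding="utf-8").count(expression)
        if count:
            yield year, path.name, count


with tempfile.TemporaryDirectory() as folder:
    # Dummy contents for illustration only.
    Path(folder, "1107 Linjianlu.TXT").write_text("甲乙丙。", encoding="utf-8")
    Path(folder, "0961 Zongjinglu.TXT").write_text("甲乙。甲乙。", encoding="utf-8")
    for hit in search_chronologically(folder, "甲乙"):
        print(hit)
```

Sorting on the parsed year rather than on the raw filename keeps the order chronological even if some names carry an extra prefix or a different number of digits.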

2.2 Basic Requirements for Philological Searches

This leads to a brief digression on a question no doubt familiar to every reader: what makes data truly useful for researchers? We now have a vast array of digitized Chinese, Korean and Japanese texts, which have helped make textual searches completely different from even 10 years ago. However, many of these digital texts are still far from meeting the basic conditions for use as research tools. One difficulty is in finding a "middle way" between "ready-made" applications that depend on one system and one specific type of software, and raw data that users with insufficient computer literacy have difficulty using.

Among the data proliferating on the Web, many are not accurately searchable simply because the inputters are unaware that a line-feed character prevents accurate searches for Chinese compounds. Even in the case of good quality data, such as the electronic texts recently included in Tendai CD2, one finds that the files have no headers and that the characters missing in the JIS codes are uniformly replaced by a "black star" (kuroboshi ★) character. The fact that these problems, though identified many years ago, remain unaddressed leads me to believe that the Electronic Buddhist Text Institute (EBTI) should establish guidelines and adopt a more active role in promoting them.

It should also be stressed again that data depending on one platform (usually Windows) or one type of environment or application are unacceptable. Such data are not only unavailable to users of less-common systems, but are a long-term preservation risk, since operating systems and even character encodings change quickly. In other words, the durability of such data is in question.

To be truly useful for researchers, data should therefore be retrievable on any machine or any operating system, and searchable with any search utility. In the case of Chinese texts, each line of the text files should end with a punctuation sign. Information about the sources used, the stage of correction, and the people in charge of the editing work should be clearly listed in the headers of each file. This seems obvious, but apparently it is not so to many researchers, since data released even by respected institutions fail to meet any of these requirements.
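The line-feed problem is easy to demonstrate. In the following Python sketch, a title mentioned earlier in this paper is split across two lines, so that a plain search misses it. The function names and the list of punctuation signs are assumptions made for the illustration, not an established standard.

```python
import re

# Assumed set of line-final punctuation signs; adjust to the edition at hand.
PUNCTUATION = "。、，；：？！」』"


def find_across_line_breaks(text: str, compound: str) -> int:
    """Count occurrences of a compound after removing all line breaks."""
    joined = re.sub(r"[\r\n]+", "", text)
    return joined.count(compound)


def lines_not_ending_in_punctuation(text: str) -> list:
    """Return the 1-based numbers of non-empty lines that break mid-phrase."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.strip() and line.strip()[-1] not in PUNCTUATION
    ]


sample = "景徳傳\n燈録。\n"
print("景徳傳燈録" in sample)                          # the plain search fails
print(find_across_line_breaks(sample, "景徳傳燈録"))
print(lines_not_ending_in_punctuation(sample))
```

Joining lines at search time is only a workaround, since it can also produce false matches across genuine boundaries; files whose lines already end with punctuation avoid the problem for every search utility.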

2.3 How to Identify a Single Figure Having Various Names

One of the difficulties faced by researchers who deal with Chan/Seon/Zen figures is the multiplicity of their names. Since the direct mention of a cleric's personal name was considered a lack of respect, it was already customary in China to use a place name or a temple name to indicate the identity of a monk or a nun.

For instance, in the case of Shishuang Chuyuan 石霜楚圓 (986-1039), "Shishuang" indicates the temple where he resided as abbot, the Shishuangshan Chongsheng chanyuan 石霜山崇勝禪院 in Hunan Prefecture, while "Chuyuan" is his ordination name (hui 諱). His other surname was Ciming 慈明. Since he resided at a number of temples at different times in his career, he came to be known variously as Ciming Chanshi 慈明禪師, Xinghua Ciming 興化慈明, Nanyuan Chuyuan 南源楚圓, and Xinghua Chuyuan 興化楚圓. Inasmuch as he is a well-known figure in Chan history, one of the simplest ways to identify him would be to assign him the ID number ZGD-C-06-05-01, derived from the Zengaku daijiten lineage charts.

In Japan and Korea the matter has become even more complicated because of the widespread use of shitsugô 室號, that is, names deriving from the Zen interview rooms of the respective teachers. Even today many Zen practitioners refer to their teacher by his shitsugô.

If we then add pre-ordination family names and imperially bestowed honorific names (shigô 諡號), it is not uncommon for the same person to be referred to by 10 different names. Assigning IDs to these figures therefore appears a priority, especially if we think in terms of establishing links between these figures and other elements of information, such as bibliographic data. Let us see how this could be done.
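As a first approximation, the relation between names and IDs can be pictured as an index from every known name to an ID. The Python sketch below uses the names of Shishuang Chuyuan listed above; the structure and the function names are hypothetical and only illustrate the principle.

```python
NAMES_BY_ID = {
    "ZGD-C-06-05-01": [
        "石霜楚圓",   # Shishuang Chuyuan
        "慈明禪師",   # Ciming Chanshi
        "興化慈明",   # Xinghua Ciming
        "南源楚圓",   # Nanyuan Chuyuan
        "興化楚圓",   # Xinghua Chuyuan
    ],
}


def build_name_index(names_by_id: dict) -> dict:
    """Invert the table: each name points to the set of IDs that use it."""
    index = {}
    for figure_id, names in names_by_id.items():
        for name in names:
            index.setdefault(name, set()).add(figure_id)
    return index


NAME_INDEX = build_name_index(NAMES_BY_ID)


def lookup(name: str) -> list:
    """Return every ID recorded for a name (empty if the name is unknown)."""
    return sorted(NAME_INDEX.get(name, set()))


print(lookup("慈明禪師"))
print(lookup("南源楚圓"))
```

Each name maps to a set of IDs rather than a single one, because a temple name or honorific may well be shared by several figures.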

2.4 Case Study 1: The Zenseki kaidai

The Zenseki kaidai, a reference work by Yanagida Seizan (1976), remains a convenient introduction to 329 essential texts related to the Chan tradition. An early electronic version was included in the ZenBase CD1 (App 1995), with a revised version in database format recently added to the IRIZ Web site. Among the texts included, 110 have at least one author or compiler whose ID is included in the lineage charts of the Zengaku daijiten and has already been input. This provides the necessary key to establish a link between the texts and their authors, enabling a researcher to go back and forth between this bibliographical database and the lineage charts described above. If one author has several works in the Zenseki kaidai, all of these texts will be accessible from the lineage charts.

One important feature in this case is that the relation is bidirectional, but not equivalent. On the lineage-chart side we have both individuals with texts and individuals without texts, and on the Zenseki kaidai side we have texts with multiple authors as well as texts whose authors are unknown.
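This asymmetry can be made concrete with a small Python sketch. All IDs below are placeholders invented for the illustration; none of them refers to an actual record in the Zenseki kaidai or the lineage charts.

```python
from collections import defaultdict

# Placeholder data, invented for the example.
AUTHORSHIP = [
    ("text-1", "fig-A"),
    ("text-1", "fig-B"),   # a text with two authors
    ("text-2", "fig-A"),   # a second text by the same author
]
ALL_TEXTS = {"text-1", "text-2", "text-3"}    # text-3: author unknown
ALL_FIGURES = {"fig-A", "fig-B", "fig-C"}     # fig-C: no text recorded

texts_by_figure = defaultdict(set)
figures_by_text = defaultdict(set)
for text_id, figure_id in AUTHORSHIP:
    texts_by_figure[figure_id].add(text_id)   # from lineage chart to texts
    figures_by_text[text_id].add(figure_id)   # from text to lineage chart

figures_without_texts = ALL_FIGURES - set(texts_by_figure)
texts_without_authors = ALL_TEXTS - set(figures_by_text)

print(sorted(texts_by_figure["fig-A"]))
print(sorted(figures_by_text["text-1"]))
print(figures_without_texts, texts_without_authors)
```

The link can be followed in both directions, but each side contains records that have no counterpart on the other, so neither table can be derived from the other.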

2.5 Case Study 2: The Taishô shinshû daizôkyô

The Taishô shinshû daizôkyô is probably the most commonly used reference work in Sino-Japanese Buddhist studies. Despite numerous defects in punctuation and other areas, its status as a standard edition makes it fit for the exchange of information between scholars. For example, even if there are better editions of a text, it often remains more convenient to identify it using the Taishô shinshû daizôkyô volume and page number, allowing other researchers to check the source.

The titles of the 3118 texts included in this collection provide a good example of how it is possible to codify classical texts and to attribute unique ID numbers. For example, the Chinese translation of the Lotus Sutra, usually identified as the Miaofa lianhuajing 妙法蓮華經 T. 9 No. 262, could be codified using the first letters of the title Taishô shinshû daizôkyô followed by the volume number and text number: TSD_09_0262. Since such arrays can be produced automatically, this system is quite convenient for naming HTML files or tagging.
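Producing such codes automatically takes a single line of formatting. In this Python sketch the function name make_taisho_id is hypothetical, and the padding (two digits for the volume, four for the text number) is inferred from the pattern described above.

```python
def make_taisho_id(volume: int, text_number: int) -> str:
    """Build a code of the form TSD_vv_nnnn from volume and text number."""
    # Volumes above 99 would simply take three digits with this format.
    return f"TSD_{volume:02d}_{text_number:04d}"


print(make_taisho_id(9, 262))      # Lotus Sutra, T. 9 No. 262
print(make_taisho_id(81, 2575))    # T. 81 No. 2575
```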

2.6 Case Study 3: The Works of Tôrei Enji

Let us now examine a more complicated case. Tôrei Enji has already been mentioned above, in section 2.1. I am currently preparing the collected works of Tôrei for publication, in electronic and/or printed form. Except for two publications of his included in the Taishô shinshû daizôkyô (T. 81 No. 2575 and No. 2576), most of his works remain in the form either of manuscripts or woodblock print documents. There was thus a need to codify his writings in a way that would make them easily identifiable even if unpublished.

Following the basic idea of categorizing bibliographic data according to author, I started with the simple hypothesis that even the most prolific author would hardly produce more than 899 works in a lifetime. I thus added 100 to the record number automatically produced when creating a new entry, so that the result is always a number of three digits between 101 and 999. For example, record #1 for Tôrei is his Bumo onnanpôkyô chûge 佛説父母恩難報經註解 [Annotated Commentary to The Sutra on the Difficulty of Repaying One's Debt of Gratitude to One's Parents], composed in 1770. Adding 100 to 1 gives the ID 101, which is appended to Tôrei's lineage chart ID (ZGD-J-49-01-02). Thus for this text the unique number is ZGD-J-49-01-02/101, which codifies the meaning "Tôrei's text no. 1".
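The rule can be sketched in Python as follows. The function name make_text_id is hypothetical; the limit of 899 records per author is the working hypothesis stated above, enforced here so that the work number always has three digits.

```python
def make_text_id(author_id: str, record_number: int) -> str:
    """Append the work number (record number + 100) to the author's ID."""
    if not 1 <= record_number <= 899:
        raise ValueError("record number must lie between 1 and 899")
    return f"{author_id}/{record_number + 100}"


print(make_text_id("ZGD-J-49-01-02", 1))   # Tôrei's text no. 1
```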

2.7 Creating the equivalent of an ISBN number for Classics

The procedure followed for Tôrei's works can easily be extended to almost every writer of a Chan/Seon/Zen text, the only requirement being the assignment of an ID number. For example, the Korean monk Kihwa 己和 (or Hamho Tukt'ong 涵虚得通, 1376-1433) is found in our database under the name Deuk-tong Gi-hwa 得通己和 or 득통기화, with the ID ZGD-K-23-04-20. His Commentary on the Sutra of Perfect Enlightenment 圓覺經疏 could accordingly be identified as ZGD-K-23-04-20/101, and this number could in turn serve as basis for linking this text with Charles Muller's English translation (Muller 1999).

The advantages of assigning a unique ID number to important Chan/Seon/Zen texts are obvious, and can be compared to the usefulness of having an ISBN number for modern books. I believe that building such a databank of ISBN numbers for classics could be of tremendous value not only to scholars but to general readers as well. However, care should be taken to account for textual variants, which are often important for their bearing on the contents of the work. For instance, the Platform Sutra should have at least half a dozen different numbers to reflect the variants. This means that the codification cannot be done mechanically: specialists on the scriptures must decide if different editions of a similar text should be handled under one label or should bear different identifications.

One alternative would be to leave space for subsets of particular IDs. The Platform Sutra is an especially intricate case, because its author(s) is/are difficult to ascertain, with most scholars now believing that it contains several layers with different authorship. Nevertheless, one could assign a number for the purpose of categorization, without any pretension to historical accuracy, that would identify the work as the first text under the ID of Caoxi Huineng. Thus we would have ZGD-C-04-01-01/101. Each variant would then be indicated with supplementary letters or numbers indicating its provenance. For example, we could choose ZGD-C-04-01-01/101/DHG1 to indicate the Dunhuang edition included in the Taishô shinshû daizôkyô (T. 48 No. 2007; Stein manuscript No. 5475). The final word on such matters should, of course, be given to librarians and archive specialists, such as the scholars involved in the Dunhuang project; my aim here is only to underline the utility of such encoding.
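One benefit of such nested identifiers is that a program can recognize two variants as belonging to the same work. The Python sketch below assumes the slash-separated layout of the examples above; the names TextId, parse_text_id and same_work are hypothetical, and the second variant label "XYZ1" is invented for the demonstration.

```python
from typing import NamedTuple, Optional


class TextId(NamedTuple):
    author_id: str            # e.g. ZGD-C-04-01-01
    record_number: int        # work number minus 100
    variant: Optional[str]    # e.g. DHG1, or None when no variant is given


def parse_text_id(text_id: str) -> TextId:
    """Split 'author/work' or 'author/work/variant' into its components."""
    parts = text_id.split("/")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected ID layout: {text_id}")
    variant = parts[2] if len(parts) == 3 else None
    return TextId(parts[0], int(parts[1]) - 100, variant)


def same_work(first: str, second: str) -> bool:
    """True when two IDs share author and work number, whatever the variant."""
    return parse_text_id(first)[:2] == parse_text_id(second)[:2]


print(parse_text_id("ZGD-C-04-01-01/101/DHG1"))
# "XYZ1" is an invented variant label, used only to show the comparison.
print(same_work("ZGD-C-04-01-01/101/DHG1", "ZGD-C-04-01-01/101/XYZ1"))
```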

Of course, another matter to consider is the need for standardization. As with modern ISBN numbers, a single institution would need to centralize and standardize the information. Since this idea is presently limited to texts belonging to the cultural area influenced by the Chinese language, one institution in the CJK area could probably assume the task, but it would need considerable resources.

2.8 The Need to Account for Nontextual Sources

After putting so much emphasis on the need to differentiate between various texts, a remark on the limitations of textual sources appears necessary. Thanks in particular to the efforts of Bernard Faure (1991, 1993), the field of Buddhist studies is now more conversant with the one-sidedness of historical approaches, and is increasingly receptive to methodological alternatives. In addition to structuralist or hermeneutical approaches, the application of anthropological or sociological methods often helps to perceive this field from a larger perspective. This is even more significant in the case of the different Chan/Seon/Zen traditions, with their strong oral tradition based upon use of the koans. Archeological sources or documents highlighted by art history frequently reveal information that is lacking from written accounts. There is no need to overstress this aspect here, but the work of Gregory Schopen (1997) is a telling example of how archaeology or epigraphy can help overcome preconceptions about Buddhist teachings and practice.

3 Linking Chan/Seon/Zen Figures and Their Texts

3.1 Simple Links and Their Limits

The above discussion on Chan/Seon/Zen figures and their texts suggests the complexity of the relationships involved. Disentangling this nexus of relationships to express the relevant data in mathematical terms necessarily involves simplification. The problem is to convey the necessity of distinguishing between precise biographical or bibliographical research and simplified reconstructions of reality that serve as tools.

Drawing sketches with arrows, writing manuscript annotations, or dog-earing books are common ways to note connections that occur in our brain and coincide, so to speak, with the exchanges of electric current between our synapses. Hyperlinks were devised to imitate this natural way of synthesizing distinct pieces of information, but they remain essentially uni- or bidirectional. Figure 2 tries to represent what happens in our minds regarding, say, Hakuin's Yasenkanna 夜船閑話 [Idle Talks in a Night Boat][2].

Figure 2. Example of mental associations with Hakuin's Yasenkanna

Each reader would, of course, have different associations depending on his or her background. The fact is, however, that even the relatively simple associations illustrated in Figure 2 would be extremely difficult to express either with hyperlinks or with the relational features of a database. The reasons are many, among them being the vagueness of "Daoist sources" or "Indian sutras." We are, in other words, in the realm of "fuzzy logic" here, but this is the way the human mind works.

The provisional conclusion I draw from this is that the very attempt to codify every element and relation into a database is an illusion. The scope of a database is limited, and those limitations must be clearly expressed from the outset.

3.2 Intertextuality as a Parameter of Religious Practice

When studying Chan/Seon/Zen figures one soon notices strong relations between the texts and the meditation practice, especially in schools using koans. The recent book on the koan edited by Heine and Wright (2000) illustrates from various perspectives the cardinal importance of intertextuality. For instance, in the Japanese monastic context a practitioner who reaches an insight into a specific koan is asked to find a verse in the Zenrinkushû 禪林句集 [Anthology from the Zen Forest] that relates to his or her understanding. This can be seen as a "literary exercise" or a "pedagogical device", but it also demonstrates that patterns of understanding follow certain tracks that are not purely accidental, and that can to a certain extent be mapped. Much about these patterns might be revealed by a systematic analysis of the metaphors contained in collections such as the Zenrinkushû. In this case too, proper digitization of the text could provide a tool aiding in-depth research.

3.3 Translations: How to Codify the Quality Criterion

A few decades ago there were so few modern-language translations of Chan/Seon/Zen texts that researchers could easily keep track of the quality of those that existed. However, the recent proliferation of new translations has created a growing need for some means of evaluating the quality of the various renditions. The task of building a database that includes translations of Chan/Seon/Zen texts can thus be seen as establishing three items:

  1. an ID for the original text;
  2. a derived ID for the translation, including a code representing the language;
  3. criteria to assist readers in choosing the appropriate translation (the first obvious criterion being whether a translation is partial or comprehensive)[3].

The third item runs the risk of becoming quite subjective if it is limited to the kind of rating used to award stars to hotels. Since the work of reading a translation and checking it against the original is equivalent to writing a book review, it might be best to establish links to the actual book reviews.
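
The three items listed above, together with the suggestion of linking to reviews, can be sketched as a simple record. This is an illustration under stated assumptions rather than a proposed standard: the class `Translation`, the function `complete_translations`, the `/TR-language-serial` suffix and the two-letter language codes are all hypothetical, and the sample records are placeholders that do not describe real publications.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Translation:
    original_id: str   # item 1: ID of the original text
    language: str      # language code (assumed here: two-letter codes such as "en")
    serial: int        # distinguishes several translations into one language
    complete: bool     # item 3, first criterion: comprehensive or partial
    review_refs: List[str] = field(default_factory=list)  # pointers to book reviews

    @property
    def derived_id(self) -> str:
        """Item 2: an ID derived from the original (this layout is an assumption)."""
        return f"{self.original_id}/TR-{self.language}-{self.serial}"


def complete_translations(records: List[Translation], language: str) -> List[Translation]:
    """Filter on objective criteria only; quality is left to the linked reviews."""
    return [r for r in records if r.language == language and r.complete]


# Placeholder records; they do not describe real publications or reviews.
records = [
    Translation("ZGD-C-04-01-01/101", "en", 1, complete=True,
                review_refs=["placeholder-review-A"]),
    Translation("ZGD-C-04-01-01/101", "en", 2, complete=False),
    Translation("ZGD-C-04-01-01/101", "fr", 1, complete=True),
]
selected = complete_translations(records, "en")
```

The record deliberately carries no numerical rating: the only built-in criterion is the partial or comprehensive distinction, while judgments of quality remain in the reviews that the record points to.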

3.4 Classify or Not? Parallel Treatment of Analog and Digital Information

Researchers often spend far more time collecting and storing information than they do analyzing it and using it to formulate new hypotheses. In the humanities, figures indicate that as much as 80 percent of our time is dedicated to mechanical tasks, among them the classification of documents. Noguchi Yukio 野口悠紀雄 argues that for the individual researcher, classification is an endless and fruitless task (1993, 1995, 1999, 2000), and proposes that library-type classification by subject be discarded in favor of chronological ordering (that is, ordering according to which document was used most recently). His method basically involves putting all material into A4 envelopes and placing the most recently used envelope at the end of the row. Having applied it to my own work for the past two years, I am completely free of the "lost child syndrome" ("Now where did I put that piece of paper!").
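
For readers who think more readily in terms of data structures, the envelope method can be modeled in a few lines. The sketch below is my own illustration, not Noguchi's formulation: the class `EnvelopeRow` and its method names are hypothetical, and it assumes that filing a document counts as using its envelope.

```python
from collections import OrderedDict


class EnvelopeRow:
    """Hypothetical model of a row of envelopes ordered by last use."""

    def __init__(self) -> None:
        self._row: "OrderedDict[str, list]" = OrderedDict()

    def file(self, label: str, document: str) -> None:
        """Put a document into its envelope; filing counts as use (assumption)."""
        self._row.setdefault(label, []).append(document)
        self._row.move_to_end(label)

    def use(self, label: str) -> list:
        """Take an envelope out and return it to the end of the row."""
        self._row.move_to_end(label)  # raises KeyError if no such envelope exists
        return self._row[label]

    def order(self) -> list:
        """Labels from least recently used to most recently used."""
        return list(self._row)


row = EnvelopeRow()
row.file("Hakuin", "notes on Yasenkanna")
row.file("Platform Sutra", "list of editions")
row.file("Zengaku daijiten", "lineage chart photocopies")
row.use("Hakuin")  # the Hakuin envelope moves to the end of the row
```

No subject categories appear anywhere in this model. The only ordering principle is recency of use, so material that is never consulted drifts toward the beginning of the row.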

Noguchi's ideas are largely inspired by discoveries related to the use of computers. He argues that although we have entered the age of digital information, our thinking is still largely conditioned by habits inherited from our long dependence on paper. We have been led by force of habit to believe that if information is not properly labeled or classified then it will be impossible to find when needed. Noguchi shows, however, that this is not necessarily the case.

Nevertheless, when building a database there seems to be no way to avoid using fields, which amounts to classifying. Similarly, the entire process of tagging, be it in SGML or XML formats, involves labeling items of knowledge, often for commercial purposes. The digitization of data in itself does not necessitate classifying, but the use of database applications compels it to a certain extent. Categories, even the most sophisticated ones, once used necessarily reflect the limits of our vocabulary and conceptual horizon.

Studying the history of religions implies the willingness to take on the viewpoint of the object of study. When the objects of study are Chan/Seon/Zen figures, this may sometimes demand that we, like Zen monks, impose silence upon our discursive minds and employ our more holistic abilities in order to grasp relationships which are difficult to codify. This should not be misconstrued as a negation of rational ways of thinking, but as an augmentation of them. In Buddhism, after all, the logic of equality precedes the logic of differentiation without invalidating it.


References in Japanese

References in Western languages


[abstract endnote] There are many issues in the digitization of Chan/Seon/Zen sources that should be covered in a more methodical way. To prevent the paper from becoming too tedious, I did not include listings of the structures used in the different databases, but I would be happy to communicate them to interested readers. More could certainly have been added concerning the construction of a general data model appropriate for this type of relational database. I also neglected the whole area of information retrieval systems, which might make it possible to avoid some of the issues related to the construction of a database of Chan/Seon/Zen figures. In many areas, I am willing to learn more from database specialists and would be grateful for any feedback on the matters discussed here.

[1] Tsuchihashi, Yachita (1952) Japanese Chronological Tables: From 601 to 1872 A.D., Monumenta Nipponica Monographs No. 11 (Tokyo: Sophia University Press), p. 4.

[2] For those not familiar with this text, see Mohr (1999), pp. 310-311.

[3] The first steps toward listing translations of Chan texts have appeared in a little-known Chinese translation of Yanagida's "Zenseki kaidai" with additions (Yanagida 1995, 1996 and 1998).