“BLUE DOTS VIEW” – A NEW WAY LOOKING INTO DIGITAL BUDDHIST CANON

Howie Lan

Introduction

Over more than two thousand years of history, Buddhist traditions have accumulated rich collections of written texts all over the world. Today, new technologies have brought society into the digital age. Buddhist traditions and the academic studies related to them are no exception: they too are taking part in this change of era.
Figure: Wood printing blocks of the Tripitaka Koreana

G. Utz Westermann and Ramesh Jain, “Toward a Common Event Model for Multimedia Applications,” IEEE Multimedia, 2006.

Traditional Buddhist Text

After nearly two decades of worldwide collaborative effort, such as that of the Electronic Buddhist Text Initiative (EBTI), a huge number of traditional Buddhist texts have been digitized, including the Buddhist canons in Han, Pali, and Sanskrit. This lays a new foundation for today’s Buddhist studies and practice. It is a giant step forward in history!
Figure: Digital version of the Tripitaka Koreana Buddhist Canon

New Challenges

While the digital version of the Buddhist texts is now easily accessible on the internet, along with a detailed descriptive catalogue, the new format has not yet produced any significant shift in research. The online version is widely used by scholars around the world and receives thousands of searches every day. These character-string searches allow users to find every example of a term and to see a menu of the occurrences. Valuable as this has been, it serves simply as a faster way of finding material that could also be found by manually scanning each page. There has been no attempt to use the digital format to analyze patterns that may not be discernible through reading and note-taking.
Figure: Example of a Google search with a very long list of returned string results

Project Description

Using abstract “blue color dots” instead of real character shapes as the means of visualization, the project explores their use for the important task of analyzing text structure and patterns for scholars in the humanities. For test data, we use the digital full text of the Tripiṭaka Koreana Buddhist Canon. For context metadata, we use the digital edition of Lewis Lancaster’s The Korean Buddhist Canon: A Descriptive Catalogue, converted by Charles Muller. The approach involves the following process.

(1) Converting the character glyphs that are displayed by fonts into colored dot forms. Here we see a page of the wood printing blocks of the Canon, on which there are 23 lines of 14 characters each. Each of the characters has a Unicode designation such as:
Each of these glyphs will be converted into a blue dot that will have the same metadata as the glyph.
The blue dots representing the glyphs are then arranged in the order of appearance on “pages”.
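The conversion described above can be sketched in a few lines of Python. This is a minimal illustration and not the project's implementation: the `Dot` record, the `text_to_dots` function, and the simple fill order are assumptions made for the example. Only the page grid of 23 lines by 14 characters comes from the description of the printing block.

```python
from dataclasses import dataclass

LINES_PER_PAGE = 23   # from the description of the printing block
CHARS_PER_LINE = 14

@dataclass
class Dot:
    char: str        # the original glyph, kept as metadata of the dot
    codepoint: str   # Unicode designation, e.g. "U+4F5B"
    page: int
    line: int
    column: int
    color: str = "blue"

def text_to_dots(text):
    """Turn each character into a dot placed on a fixed page grid."""
    per_page = LINES_PER_PAGE * CHARS_PER_LINE
    dots = []
    for i, ch in enumerate(text):
        page, offset = divmod(i, per_page)
        line, column = divmod(offset, CHARS_PER_LINE)
        dots.append(Dot(ch, f"U+{ord(ch):04X}", page, line, column))
    return dots

# Synthetic sample text, used only to exercise the layout.
dots = text_to_dots("佛說" * 200)
print(dots[0])
print(dots[322].page, dots[322].line, dots[322].column)
```

Because every dot keeps its character and code point, a viewer built on such records could switch between the abstract dot and the glyph it stands for. The actual reading direction of the blocks may differ from the fill order assumed here.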
In 3-D software, these abstracted “Blue Dots” images of the canon are now available to the user in a recognizable arrangement.

1. Searching for terms in the “Blue Dots” presentation

The use of the blue dots to search for patterns can start with a string search (similar to a Google search) of the digital full-text data. Once the target characters have been found, the search results can be forwarded to the 3-D view and seen as “signals” in the “pages”, because the blue dots of the target word are changed in color and size.

2. Identifying patterns in search results

At this point, the user can “see” all of the places where the target word is found. Rather than moving directly to the glyphs that make up the printed version, we remain in the abstract arena to explore further patterns that can be presented as images. A first pattern can be word clustering: the user can spot places where clusters are visually apparent. When we compare a Google-like search result with the colored-dot imaging, it is at once obvious that certain patterns are easily seen in the latter. It can take scholars many hours to follow string-only search results and note clustering.
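A minimal sketch of steps 1 and 2 follows, assuming the canon is available as one plain string and each dot is addressed by its character offset. The function names, the color values, and the `max_gap` threshold are hypothetical, and grouping hits by a fixed gap is only one simple way to make clusters explicit.

```python
def find_hits(text, term):
    """Return the start offset of every occurrence of term in text."""
    hits = []
    start = text.find(term)
    while start != -1:
        hits.append(start)
        start = text.find(term, start + 1)
    return hits

def highlight(n_dots, hits, term_length, color="red"):
    """Color the dots of each hit; all other dots stay blue."""
    colors = ["blue"] * n_dots
    for hit in hits:
        for i in range(hit, hit + term_length):
            colors[i] = color
    return colors

def cluster_hits(hits, max_gap):
    """Group sorted hit offsets whose neighbours lie within max_gap."""
    clusters = []
    for hit in hits:
        if clusters and hit - clusters[-1][-1] <= max_gap:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters

# Synthetic text: the term appears twice close together, then once far away.
term = "般若"
text = term + "〇" * 3 + term + "〇" * 20 + term
hits = find_hits(text, term)
colors = highlight(len(text), hits, len(term))
clusters = cluster_hits(hits, max_gap=10)
print(hits, clusters)
```

The `colors` list is what a 3-D view would consume; the `clusters` list makes explicit what the eye picks up from the image.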
3. Contextualizing the search results

With the metadata associated with the dots, the abstracted dots can be placed in context (When, Where, Who, etc.) for evaluation. A first step is to mark the “pages” that belong to each of the 1500 texts in the Canon. This allows the user to see where the search results fall within each text. Multiple kinds of context can then be connected to these “pages”.

When

Want to know which part of the Canon was translated between the 5th and 7th centuries? With the time-related metadata, “pages” translated in the 5th to 7th centuries can be seen easily by coloring them red.
Where

“Pages” can also be colored by the place where the translations were made, linking to the monastery and its location.
Who

Yet another display can be made for the authorship of the text, linking to the translator and information about him.
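The When, Where, and Who views can be thought of as the same operation with a different test on the catalogue metadata. The sketch below assumes a catalogue reduced to a list of dictionaries; the field names and all of the entries are placeholders invented for the example, not data from the descriptive catalogue.

```python
# Hypothetical catalogue entries; real values would come from the catalogue.
CATALOGUE = [
    {"text": "Text 1", "pages": range(0, 3), "century": 5,
     "place": "Place A", "translator": "Translator X"},
    {"text": "Text 2", "pages": range(3, 5), "century": 8,
     "place": "Place B", "translator": "Translator Y"},
    {"text": "Text 3", "pages": range(5, 9), "century": 7,
     "place": "Place A", "translator": "Translator Y"},
]

def color_pages(catalogue, matches, highlight="red", base="blue"):
    """Map every page number to a color based on its text's metadata."""
    colors = {}
    for entry in catalogue:
        color = highlight if matches(entry) else base
        for page in entry["pages"]:
            colors[page] = color
    return colors

when = color_pages(CATALOGUE, lambda e: 5 <= e["century"] <= 7)
where = color_pages(CATALOGUE, lambda e: e["place"] == "Place A")
who = color_pages(CATALOGUE, lambda e: e["translator"] == "Translator Y")

print([page for page, color in when.items() if color == "red"])
```

Passing the test as a function keeps the coloring step independent of which kind of context is being shown.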
We can thus also see whether some terms were used more by one translator than by another.
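Such a comparison amounts to tallying search hits against one metadata field. The following sketch assumes that each hit is known by its page number and that the catalogue maps pages to metadata; the entries, the field names, and the hit list are all invented for illustration.

```python
from collections import Counter

# Hypothetical catalogue entries and search hits, for illustration only.
CATALOGUE = [
    {"pages": range(0, 3), "century": 5, "translator": "Translator X"},
    {"pages": range(3, 5), "century": 8, "translator": "Translator Y"},
    {"pages": range(5, 9), "century": 7, "translator": "Translator Y"},
]

def count_hits_by(field, hit_pages, catalogue):
    """Tally search hits by a metadata field of the text each hit falls in."""
    page_to_value = {page: entry[field]
                     for entry in catalogue
                     for page in entry["pages"]}
    return Counter(page_to_value[page] for page in hit_pages)

hit_pages = [0, 0, 1, 2, 2, 2, 6]   # page number of each occurrence
by_translator = count_hits_by("translator", hit_pages, CATALOGUE)
by_century = count_hits_by("century", hit_pages, CATALOGUE)
print(by_translator)
print(by_century)
```

A `Counter` returns zero for values that never occur, so the absence of a term in a period or in a translator's work can be read off as directly as its presence.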
We can likewise see whether some terms were used more frequently in a certain time period.

With the above approach, the user can search for a target term and see it displayed as colored dots in contrast to the blue dots. This imagery allows the scholar to see quickly the number of occurrences and where the target is located, both in each text and in the canon as a whole. In some cases, it will be immediately apparent that the word is found regularly throughout the texts. By using the context, it may be seen that the word appeared regularly in the 5th-7th centuries and irregularly in the 9th-11th.

Patterns of Structure in Content

For a more complex use of the “Blue Dots” presentation, we can turn to a search for the structure of the content of each text.

1. Ring Composition

From Biblical and Classical studies, we are aware that ancient literature exhibits chiastic structure, that is, an inversion of the ordering of words. In this type of structure, the end corresponds to the beginning. It can be written as A B C D D’ C’ B’ A’. From this we have “ring” constructions, in which the story starts, proceeds to a turning point, and then repeats its elements in reverse order until it arrives at an end that corresponds to, or resembles, the beginning. A recent volume by Mary Douglas, Thinking in Circles: An Essay on Ring Composition, points out the need for cross-cultural studies of the particular literary phenomenon of chiastic structuring.
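The pattern A B C D D’ C’ B’ A’ can be expressed as a simple mirror test. The sketch below assumes that a passage has already been reduced to a sequence of theme labels and that a primed element such as A’ carries the same label as A. Both are simplifying assumptions, and the function names are hypothetical.

```python
def is_ring(themes):
    """True if the theme sequence reads the same forwards and backwards."""
    themes = list(themes)
    return len(themes) >= 2 and themes == themes[::-1]

def ring_pairs(themes):
    """Pair each position in the first half with its mirror position."""
    n = len(themes)
    return [(i, n - 1 - i) for i in range(n // 2)]

sequence = ["A", "B", "C", "D", "D", "C", "B", "A"]
print(is_ring(sequence))
print(ring_pairs(sequence))
```

In real texts the correspondence between A and A’ is one of theme rather than identical wording, so deciding which labels count as "the same" is the scholarly step that precedes any such test.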
Using the search capacity of the digital version of the Buddhist canon, we can begin to create a “Blue Dots View” that includes the places where this ring structure appears.

2. Interpreting Ring Composition

Since a Ring Composition is composed of a series of words, concepts, and minor rings that appear in a sequence which “turns”, after which the whole series is repeated backwards, we can look at target words in terms of their placement within the structure. If a concept appears as one of the thematic words for the parallels of the ring, that gives us an opportunity to deal with the concept in terms of its relationship to the other thematic words. That is, if the target word is B in the sequence A B C D C’ B’ A’ within an identified ring, we can expect the word to have a special relationship to A, C, and D.

The ring structure has a beginning, a turning point, and an end. We can identify the turning point as the place between the last item of the first sequence and its repetition in the second sequence. That is, we have A B C D (turning point) D’ C’ B’ A’. The turning point is the place where the general message of the ring is placed. In the space of D we expect to find the kernel of the ring. In the search for a target word, it would be of great interest to see whether that word appears as the kernel of a ring. That would give it greater weight than if it were just one of the thematic parallels. We would look to mark the turning point of the macro-structure.

Below, we see the imagery in which the analysis of word placement shows a possible large Ring Composition and also looks for the thematic content that repeats. It could be marked as below, with a large colored dot showing the first repetition of a word. The yellow dots are signals that we have a repeat structure. This would be the turning point of the ring.
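One way to locate a turning point and to tell a kernel from a thematic parallel is sketched below. It is an illustrative reading of the description above, not the project's algorithm: it assumes the text is already a sequence of theme labels, treats the first immediate repetition with the deepest mirrored span as the turning point, and treats the repeated item itself as the kernel. The function names and the `min_depth` parameter are hypothetical.

```python
def find_turning_point(themes, min_depth=2):
    """Return (index, depth); the turn lies between index and index + 1."""
    best = None
    for i in range(len(themes) - 1):
        depth = 0
        # Expand outwards while the two sides mirror each other.
        while (i - depth >= 0 and i + 1 + depth < len(themes)
               and themes[i - depth] == themes[i + 1 + depth]):
            depth += 1
        if depth >= min_depth and (best is None or depth > best[1]):
            best = (i, depth)
    return best

def role_of(target, themes, turning_point):
    """Classify a target as kernel, thematic parallel, or outside the ring."""
    index, depth = turning_point
    if themes[index] == target:
        return "kernel"
    first_half = themes[index - depth + 1:index + 1]
    return "thematic parallel" if target in first_half else "outside the ring"

themes = ["x", "A", "B", "C", "D", "D", "C", "B", "A", "y"]
turn = find_turning_point(themes)
print(turn)
print(role_of("D", themes, turn), role_of("B", themes, turn))
```

The returned index is where a viewer could place the large colored dot that marks the first repetition.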
Having found a Ring Composition and its turning point, where the last word of the first sequence is repeated as the first word of the second sequence, it is possible to mark off the words that make up this point. In some cases, the turning point may contain a long commentary, with the result that the ring is obscured in normal reading.

Word Clusters

Clustering is quite different from the appearance of a target word within a Ring Composition. The software uses the distance between occurrences of the target word to identify clustering. To understand the nature of a cluster, it is crucial to see a word-frequency context for the content. At present, word frequency is usually presented as a list of words ranked by their counted occurrences. In the “Blue Dots View”, we can see the frequency as an image, and this allows us to understand that the total number of occurrences may not be as important as their distribution. In the following example, we see that the clustering of our target word in one text and its absence from most other texts is an indication of the use of the word in the macro-structure.

Some of the Benefits

Using the abstracted “Blue Dots View” model to search for word strings, view clustering of terms, analyze Ring Construction, and view results by time, creator, and place will provide textual scholarship with a new approach. The following are some of the ways in which this strategy will bring novel results to the search and retrieval process.

1. Viewing the search results as abstracted images allows the user to quickly spot visual patterns in the occurrence of target words. Seeing an image of the result rather than hundreds of lines of reports from the present Google search will be a much-needed innovation.
When working with verbal indexing of the places that hold the target word, it takes the scholar many hours or even weeks to discern patterns of distribution when the search result contains hundreds of examples scattered across thousands of pages of text. A number of patterns of distribution are immediately visible when the search result is an image.

2. After receiving the data on the appearance of a string of characters, the task of the scholar is to understand those words within various contexts, for example: the specific text among the 1500 in the canonical list; the time when the text was created by translation; the relationship to all of the texts created by the same individual; and the place where the translation was completed. Since it will be possible to shift the image display to show the search results in all of these contexts, researchers will quickly gain control over the significance of the placement of each word string.

3. The abstraction in the “Blue Dots” format provides new ways of exploring each of the search results. The user may visually spot patterns of placement, relationship, density, and frequency that provide novel results not available in the usual search process. Because the ways of seeing the data are expanded, the reactions and responses of users may be multiplied far beyond the limitations of receiving the data as lines of text description.

4. Currently, searching for a word string in the canonical material yields only an indication of placement with some metadata attached. Using the “Blue Dots View” search, the word string appears in a fashion that allows the user to note its absence as well as its presence. The “dark matter” of absence often defines the patterns as clearly as the “presence.” With search results given as lines of text, we have no sense of the places where the string is absent.

5. Identifying clustering of the search string is crucial to our understanding of its distribution.
It may be that there are 1000 examples of the target word, but the density image might show that most of them appear in one text, in one time period, or in the works of one translator. Such density is quite different from a situation where the clustering appears globally across the whole of the canon. Imagery allows immediate recognition of the nature of the distribution.

6. Ring Composition research has been recognized and used in Biblical and Classical Studies. The significance of the “ring” lies in the fact that words appearing in different parts of the construction have different functions. There is the dual framework of terms that appear in one order and then change to the reverse. If a target word appears among these terms, that gives an indication of the way in which it has been combined with other ideas. On the other hand, if the term appears as the “kernel” of the “ring,” that is, at the turning point between the two halves of the framework, it will have a different function than one that is used in the two bounding structures.

7. Once the patterns have been discerned through the abstracted “Blue Dots”, the scholar can then see the glyphs and study the written form of the text. It will be possible to shift back and forth between the imagery of the glyphs and the “Blue Dots View” for visual context and patterning.

8. At the conclusion of the exercise described above, the scholar will be able to indicate the context of the term: where it has high frequency, where it is central to the Ring Construction, where it is related to other words, when each of these patterns is found in terms of the creation of the documents, where the document was physically made, and so on.
Seeing a word in all this contextualization can give the researcher the ability to understand the term, see the underlying structures of the texts (beyond chapters, pages, and paragraphs), and determine the significance of the target word in all its “places.”

Using new approaches such as the “Blue Dots View” of the Buddhist canon, we can expect that the digital format will finally break the bounds of our normal procedures and allow us to explore the horizons of the potential that rests with the new digital technologies.

References:

The Research Institute of Tripiṭaka Koreana, Digital Tripiṭaka Koreana, 2002.