卷首 Home
瞻彼旱麓,榛楛濟濟。豈弟君子,干祿豈弟。 瑟彼玉瓚,黃流在中。豈弟君子,福祿攸降。 鳶飛戾天,魚躍于淵。豈弟君子,遐不作人。 清酒既載,騂牡既備。以享以祀,以介景福。 瑟彼柞棫,民所燎矣。豈弟君子,神所勞矣。 莫莫葛藟,施于條枚。豈弟君子,求福不回。
Welcome to the Fanya Hanwen Corpus 泛亞漢文語料庫, a Literary Chinese corpus containing over twice as much text as the Siku Quanshu! By taking a pan-Asian, diachronic perspective, the Fanya Hanwen Corpus hopes to propel the Literary Chinese tradition into the modern age.
序 Introduction
Welcome to the Fanya Hanwen Corpus! It contains texts from every Chinese dynasty, the Korean peninsula, Japan, Vietnam, the Ryukyu Kingdom, Singapore, and myriad other historical countries and regions. Literary Chinese was and is a living language with a vibrant diversity of texts; in its heyday it enjoyed a status similar to that of Latin, and it should be treated as such. I have been working on this project alone for around a year. It has been a crushing amount of work, aimed at producing a website that meets three goals: (i) develop a region- and culture-first view of Literary Chinese, (ii) develop a modern educational model that works with this view, and (iii) develop an accessible website that operationalises the first two goals and makes the resulting information approachable for casual readers. I do not believe these goals have been met yet, but the project is progressing admirably.
語庫讀器 Corpus Viewer
One of the biggest wins of the project so far has been an elaborate Corpus Viewer, which aims to make reading Literary Chinese as accessible as possible for everyone. It includes:
- Pronunciation annotations for myriad Asian languages and dialects, from Mandarin, Japanese, Korean, and Vietnamese to Zhuang, Old National Pronunciation, Middle Chinese, Old Chinese, Okinawan, and more! Learn to recite Literary Chinese in your language!
- Options for punctuation (judou 句讀) or to strip it.
- Traditional underlining systems in original colours or colour-blind adjusted ones.
- Kanbun, hanmun, and hanvan annotations.
- Different fonts, including one for seal script.
- Vertical or horizontal orientation with adaptations.
- Conversion between different character traditions, including Simplified Chinese and scrapped initiatives.
- A robust dictionary containing Unihan data, plus further data I have collected myself.
- The option to submit translations.
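As a rough illustration of the punctuation-stripping option listed above, the following is a minimal sketch and not the Corpus Viewer's actual code. The function name `strip_punctuation` is hypothetical, and the approach (dropping every character in a Unicode punctuation category) is an assumption; a real viewer may well keep some marks or treat judou differently.

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Return text without characters in any Unicode punctuation category (P*).

    Hypothetical helper for illustration only; it removes both ASCII and
    CJK punctuation, which may be broader than a real viewer would want.
    """
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


# Opening lines of the poem quoted at the top of this page.
sample = "瞻彼旱麓,榛楛濟濟。豈弟君子,干祿豈弟。"
print(strip_punctuation(sample))
```

Working from Unicode categories rather than a hand-written list means newly encountered marks are handled without extra maintenance, at the cost of less control over which marks survive.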
字典 Dictionary
This website contains a fairly robust dictionary that aims to be as inclusive as possible. At present, it contains Unihan data, ruby text settings, a wide range of phonetic transcription conventions, and the option to switch between Traditional and Simplified Chinese (as well as the original text, in the case of mixed or variant forms) whilst viewing things. Additionally, the database contains variant-form research derived from independent research and OpenCC, which aims to map variants as exhaustively as possible with a view to an eventual text normaliser. Please give feedback on this!
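To make the idea of a variant-based text normaliser concrete, here is a minimal sketch under stated assumptions. The table `VARIANTS` and the function `normalise` are hypothetical and do not come from the corpus database; the three pairs are merely well-known variant relationships chosen for illustration. The sketch also assumes a one-to-one mapping, whereas real variant data is often one-to-many or context-dependent.

```python
# Hypothetical variant table: each key is a variant form, each value the
# standard form chosen for normalisation. Illustrative pairs only.
VARIANTS = {
    "韵": "韻",
    "畧": "略",
    "㣲": "微",
}

# str.maketrans builds a per-character translation table from the dict.
TABLE = str.maketrans(VARIANTS)


def normalise(text: str) -> str:
    """Replace each known variant character with its chosen standard form."""
    return text.translate(TABLE)


print(normalise("禮部韻畧"))  # prints 禮部韻略
```

A flat character table like this is cheap to apply to a whole corpus, but it cannot express cases where the correct standard form depends on the word or the period, so it is best read as a starting point.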
The main dictionary definitions, which can appear in popups, come from CC-CEDICT, Unihan, and the Kangxi Dictionary. The Kangxi Dictionary text is processed with regular expressions that insert line breaks at specific wording patterns in the hope of making it more readable, and so far this seems to have been successful. The Guangyun and Shuowen Jiezi have also been turned into browsable dictionaries, and their sorting mechanisms have been documented.
There are hopes to add more dictionaries for comparison, such as the following, which appear in the Siku Quanshu:
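The regex-based line breaking described for the Kangxi Dictionary could look roughly like the sketch below. The pattern is an assumption made for illustration, namely that a new sense begins with 又 immediately after a full stop; the site's real patterns are not documented here and may differ. The names `SENSE_BREAK` and `add_linebreaks` are hypothetical, and the sample entry is a made-up placeholder rather than real dictionary text.

```python
import re

# Assumed pattern: a zero-width position that follows a full stop (。)
# and precedes 又, taken here to mark the start of a new sense.
SENSE_BREAK = re.compile(r"(?<=。)(?=又)")


def add_linebreaks(entry: str) -> str:
    """Insert a newline at every assumed sense boundary in an entry."""
    return SENSE_BREAK.sub("\n", entry)


# Placeholder entry, not real Kangxi Dictionary text.
entry = "甲義。又乙義。又丙義。"
print(add_linebreaks(entry))
```

Using lookbehind and lookahead keeps the original characters untouched and only inserts the break, so the processed text can be mapped back to the source by simply removing newlines.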
- 切韻
- 重修廣韻
- 集韻
- 切韻指掌圖
- 韻補
- 附釋文互註禮部韻畧
- 増脩互注禮部韻畧
- 增修校正押韻釋疑
- 九經補韻
- 五音集韻
- 古今韻會舉要
- 四聲全形等子
- 經史正音切韻指南
- 洪武正韻
- 古音叢目
- 轉注古音略
- 屈宋古音義
- 御定音韻闡微
- 欽定同文韻統
- 欽定叶韻彚輯
- 欽定音韻述㣲
- 音論
- 詩本音
- 易音
- 唐韻正
- 古音表
- 韻補正
- 古今通韻
- 易韻
- 孫氏唐韻考
- 古韻標準
- 六藝綱目
Of course, more dictionaries are desired, such as those from non-Chinese nations. The aim is to make immersive learning of Literary Chinese through dictionaries as easy as possible within modern pedagogical frameworks, whilst also promoting scholarly study.
Text Submission
Want to add a new text to the corpus? Submitting a text creates a review ticket rather than writing straight into the corpus.
來想 Future Plans
There are hopes to implement a nation atlas, grammar wiki, and more detailed variant/corrected text items over time. Synopses, blurbs, and summaries are also desired.
所載 Publications
I (Llinos Evans) have used this corpus to produce a few publications, some of them before the corpus was fully established. They are as follows:
- Evans, L. (2025a). Fossil classifiers in Chinese chengyu and proverbs. Essex Student Journal, 17(1). https://doi.org/10.5526/ESJ.429
- Evans, L. (2025b). 鷖詣醫鷁憶: A One-Syllable Classical Chinese Poem in the Style of Yuen Ren Chao. Essex Student Journal, 17(1). https://doi.org/10.5526/ESJ.433
引據 References
The following texts were used in the making of this website, the corpus, and more.
- Aisin-Gioro H., Ji Y., & Lu X. (Eds). (2007). 四庫全書 [Complete Library of the Four Treasuries] (Vol. 1–36,381). Wikisource. https://zh.wikisource.org/wiki/%E5%9B%9B%E5%BA%AB%E5%85%A8%E6%9B%B8
- Aun, C. (2025). Cheeaun/chengyu-wordle [JavaScript]. https://github.com/cheeaun/chengyu-wordle (Original work published 2022)
- Behr, W. (2008). Dialects, diachrony, diglossia or all three? Tomb text glimpses into the language(s) of Chǔ [Lecture].
- Boltach, J. V. (2013). Ханмун: Вводный Курс [Hanmun: Introductory Course]. Гиперион [Hyperion].
- Chao Y. R. (1983). 通字方案 [A Project for General Chinese]. 商务印书馆 [Commercial Press].
- Editorial Board of the Encyclopedia of China. (2009). 《四庫七閣·四庫全書·四庫全書總目提要》. In 中国大百科全书 [Encyclopedia of China] (2nd ed). 中國大 [China Publishing House].
- Dai, L., Li, T., Yang, Y., Jia, L., Magically Asia Limited, TudorTech System Co., Ltd, & Founder Electronics Company Limited. (1999). Siku Quanshu Online (Version 3.0 (Wenyuange Ed.)) [Computer software]. Digital Heritage Publishing Ltd, East View Information Services. http://skqs.com/
- Guy, R. K. (1987). The emperor’s four treasuries: Scholars and the state in the late Chʻien-lung era. Council on East Asian Studies, Harvard University.
- Han, D. (2022). 朝鲜汉诗选 [Anthology of Korean Sinitic Poetry] (1st ed). 江西教育出版社 [Jiangxi Education Press].
- Lee, J., & Jung, Y. (2025). A Variant Character Dataset for Historical Narratives of Middle and Late Imperial China. Journal of Open Humanities Data, 11, 33. https://doi.org/10.5334/johd.325
- Li, S., Hu, R., & Wang, L. (2025). AI太炎 [AI Taiyan] (Version v2.20240912.19) [Computer software]. Beijing Normal University. https://t.shenshen.wiki/
- Lichtman, K., & VanPatten, B. (2021). Was Krashen right? Forty years later. Foreign Language Annals, 54(2), 283–305. https://doi.org/10.1111/flan.12552
- Matsumoto J. (2025). 日本漢文の世界 [Nihon kanbun no sekai]. Kambun.jp. https://kambun.jp/
- Meng, K. (2016). 孟子 [The Works of Mencius] (J. Legge & Y. Shi., Trans.; 1st ed.). 中州古籍出版社 [Zhongzhou Ancient Books Publishing House].
- Shi, K. (Ed.). (2024). 日本汉诗选 [Anthology of Japanese Sinitic Poetry] (1st ed, Vols 1–2). 江西教育出版社 [Jiangxi Education Press].
- Stimson, H. M. (1976). T’ang Poetic Vocabulary. Yale University Press.
- Sturgeon, D. (2020). Digitizing Premodern Text with the Chinese Text Project. Journal of Chinese History, 4(2), 486–498. https://doi.org/10.1017/jch.2020.19
- Sturgeon, D. (2021). Chinese Text Project: A dynamic digital library of premodern Chinese. Digital Scholarship in the Humanities, 36(Supplement_1), i101–i112. https://doi.org/10.1093/llc/fqz046
- Takekoshi, T. 竹越孝 (2011). 『兼滿漢語滿洲套話清文啓蒙』満洲文字注音一覧表. KOTONOHA, (101).
- The Unicode Consortium. (2025). CJK Unified Ideographs (Plus Extensions). In The Unicode Standard, Version 17.0 (17.0.0, pp. 553–1086). Unicode, Inc.
- Trilateral Cooperation Secretariat. (2015). 中日韩共同常用八百八汉字表 [Booklet of the 808 Commonly Used Chinese Characters in China, Japan and the ROK]. Trilateral Cooperation Secretariat. https://tcs-asia.org/data/etcData/PUB_1570754349.pdf
- untunt. (2025). Nk2028/zhongyuan-data [Python]. nk2028. https://github.com/nk2028/zhongyuan-data (Original work published 2023)
- untunt. (2025). Nk2028/menggu-ziyun-data [Python]. nk2028. https://github.com/nk2028/menggu-ziyun-data (Original work published 2025)
- Wang D., Liu C., Liu L., Liu J., Hu H., Shen S., & Li B. (2022). SikuBERT与SikuRoBERTa:面向数字人文的《四库全书》预训练模型构建及应用研究 [Construction and Application of Pre-trained Models of Siku Quanshu in Orientation to Digital Humanities]. 图书馆论坛 [Library Tribune], 42(6), 14. https://doi.org/10.3969/j.issn.1002-1167.2022.06.005
- Wang, D., Liu, C., & Zhu, Z. (2021). SikuBERT (Version 2.0) [Computer software]. Nanjing Agricultural University. https://huggingface.co/SIKU-BERT/sikubert, https://github.com/hsc748NLP/SikuBERT-for-digital-humanities-and-classical-Chinese-information-processing, https://gitee.com/onesleepyjoker/SikuBERT-for-digital-humanities-and-classical-Chinese-information-processing
- Wang, H., Shimizu, H., & Kawahara, D. (2023). Kanbun-LM: Reading and Translating Classical Chinese in Japanese Methods by Language Models. In A. Rogers, J. Boyd-Graber, & N. Okazaki (Eds), Findings of the Association for Computational Linguistics: ACL 2023 (pp. 8589–8601). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-acl.545
- Wu, L. 吴留营 (2022). 琉球汉诗选 [Anthology of Ryukyu Sinitic Poetry] (1st ed). 江西教育出版社 [Jiangxi Education Press].
- Wu, T. (2025). 文津宋体 [WenJin Mincho] (Version v1.100) [Lua]. Tianjin University. https://github.com/takushun-wu/WenJinMincho (Original work published 2024)
- Xu S. (2025). 說文解字 [Shuowen Jiezi] (Sturgeon D., Ed.; Digitised ed., Vols 1–15). Chinese Text Project. https://ctext.org/shuo-wen-jie-zi
- Huxley, T. H. (1898). 譯例言 [Translator's notes]. In 天演論 [Evolution and Ethics] (F. Yan, Trans.; Digitized ed.). Wikisource. https://zh.wikisource.org/wiki/%E5%A4%A9%E6%BC%94%E8%AB%96/%E8%AD%AF%E4%BE%8B%E8%A8%80
- Yan, Y. (2023). 越南汉诗选 [Anthology of Vietnamese Sinitic Poetry] (1st ed, Vols 1–2). 江西教育出版社 [Jiangxi Education Press].
- Yang, B. (2016). 文言语法 [Literary Chinese Grammar] (1st ed). 中华书局 [Zhonghua Book Company].
- Yi, B. (2019). 字統网 [zi.tools] [Browser]. https://zi.tools/
- Zhang, Y., & Chen, T. (with Aisin-Gioro, X.). (2015). 康熙字典 [Kāngxī Zìdiǎn] (H. Wang, Ed.; Revised Ed., Vols 1–2). 社会科学文献出版社 [Social Sciences Academic Press].