Making Yomitan Dictionaries

This document provides an overview of how to create your own Yomitan dictionary.

Tools
Read the Schemas
Packaging A Dictionary
Examples
Schema Validation
Conjugation
Tag Categories

Tools

Yomichan Dictionary Builder: A Node.js package that simplifies the process of making dictionaries, particularly useful for those using TypeScript or JavaScript.
hasUTF16SurrogatePairAt: Checks whether a kanji/hanzi is stored as a UTF-16 surrogate pair, which affects string operations such as length and indexing in JavaScript.
japanese-furigana-normalize: A utility function to normalize Japanese readings containing furigana, ensuring proper alignment with kanji characters.
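
To see why surrogate pairs matter, the sketch below counts UTF-16 code units, which is what JavaScript's String.length reports. It is written in Python purely for illustration; `utf16_units` and `is_surrogate_pair` are hypothetical helper names and are not part of the packages listed above.

```python
def utf16_units(text: str) -> int:
    """Count UTF-16 code units, i.e. what JavaScript's String.length reports."""
    return len(text.encode("utf-16-le")) // 2


def is_surrogate_pair(char: str) -> bool:
    """True if a single character lies outside the BMP and needs two UTF-16 code units."""
    return len(char) == 1 and ord(char) > 0xFFFF


# U+53F1 sits inside the Basic Multilingual Plane; U+20B9F does not.
for char in ("\u53f1", "\U00020b9f"):
    print(f"U+{ord(char):04X}", utf16_units(char), is_surrogate_pair(char))
```

A kanji outside the BMP therefore occupies two positions in a JavaScript string, so slicing or indexing by position can split it in half unless the code checks for surrogate pairs first.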

Read the Schemas

Familiarity with the Yomitan schemas is essential. These schemas define the structure of Yomitan dictionaries. Helpful resources for interpreting JSON schemas include codebeautify, json-schema-viewer, and jsonhero.

Below is a list of Yomitan dictionary schemas, their expected filenames, and their usage:

Schema	Expected Filename	Usage
`dictionary-index-schema.json`	`index.json`	Metadata about the dictionary. Please include as much detail as possible.
`dictionary-kanji-bank-v3-schema.json`	`kanji_bank_${number}.json`	Information used in the kanji viewer - meanings, readings, statistics, and codepoints.
`dictionary-kanji-meta-bank-v3-schema.json`	`kanji_meta_bank_${number}.json`	Stores kanji frequency data.
`dictionary-tag-bank-v3-schema.json`	`tag_bank_${number}.json`	Defines tags for kanji and term dictionaries, like parts of speech or kanken level.
`dictionary-term-bank-v3-schema.json`	`term_bank_${number}.json`	Stores dictionary readings, definitions, etc.
`dictionary-term-meta-bank-v3-schema.json`	`term_meta_bank_${number}.json`	Stores meta information about terms, such as frequency data and pitch accent data.
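
The filename-to-schema mapping can be expressed as a small lookup, which is handy when scripting checks over an unpacked dictionary. This is a minimal sketch: `schema_for` is a hypothetical helper, and it assumes `${number}` stands for one or more digits.

```python
import re
from typing import Optional

# Filename pattern -> schema file, following the table of expected filenames.
# Assumption: ${number} is one or more decimal digits.
SCHEMA_BY_PATTERN = {
    r"index\.json": "dictionary-index-schema.json",
    r"kanji_bank_\d+\.json": "dictionary-kanji-bank-v3-schema.json",
    r"kanji_meta_bank_\d+\.json": "dictionary-kanji-meta-bank-v3-schema.json",
    r"tag_bank_\d+\.json": "dictionary-tag-bank-v3-schema.json",
    r"term_bank_\d+\.json": "dictionary-term-bank-v3-schema.json",
    r"term_meta_bank_\d+\.json": "dictionary-term-meta-bank-v3-schema.json",
}


def schema_for(filename: str) -> Optional[str]:
    """Return the schema filename that applies to a dictionary file, or None."""
    for pattern, schema in SCHEMA_BY_PATTERN.items():
        if re.fullmatch(pattern, filename):
            return schema
    return None


print(schema_for("term_bank_1.json"))
print(schema_for("notes.txt"))
```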

Packaging A Dictionary

A dictionary zip file can contain several types of data files. After creating index.json and the relevant data files, zip them so that all data .json files sit in the root directory of the zip, not in subfolders. Use the highest compression level available to reduce the file size.
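
The packaging step can be sketched with Python's standard zipfile module. This is an illustration under assumptions, not an official build script: `write_dictionary_zip` is a hypothetical helper, and the index fields (`title`, `revision`, `format`) and the eight-item term row are assumed layouts that should be checked against the schemas.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def write_dictionary_zip(zip_path, files):
    """Write each {filename: data} pair as JSON at the root of the zip archive."""
    with zipfile.ZipFile(
        zip_path, "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9
    ) as archive:
        for filename, data in files.items():
            # Reject paths that would place a file inside a subfolder.
            if "/" in filename or "\\" in filename:
                raise ValueError(f"{filename} must sit in the zip root")
            archive.writestr(filename, json.dumps(data, ensure_ascii=False))


# Assumed minimal contents; confirm the fields against the schemas.
index = {"title": "Example Dictionary", "revision": "1", "format": 3}
term_bank = [["猫", "ねこ", "n", "", 0, ["cat"], 1, ""]]

zip_path = Path(tempfile.mkdtemp()) / "example_dictionary.zip"
write_dictionary_zip(zip_path, {"index.json": index, "term_bank_1.json": term_bank})
print(zip_path)
```

Level 9 is the strongest setting for the deflate method in zipfile, which matches the advice to compress as much as possible.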

Examples

The official test dictionary showcases the full range of features available in Yomitan dictionaries.

Schema Validation

To validate your dictionary files against the schemas, configure VSCode to do so or use a website such as jsonschemavalidator.

For VSCode validation, use the following settings JSON:

  "json.schemas": [
    {
      "fileMatch": ["kanji_bank_*.json"],
      "url": "https://github.com/themoeway/yomitan/raw/master/ext/data/schemas/dictionary-kanji-bank-v3-schema.json"
    },
    {
      "fileMatch": ["kanji_meta_bank_*.json"],
      "url": "https://github.com/themoeway/yomitan/raw/master/ext/data/schemas/dictionary-kanji-meta-bank-v3-schema.json"
    },
    {
      "fileMatch": ["tag_bank_*.json"],
      "url": "https://github.com/themoeway/yomitan/raw/master/ext/data/schemas/dictionary-tag-bank-v3-schema.json"
    },
    {
      "fileMatch": ["term_bank_*.json"],
      "url": "https://github.com/themoeway/yomitan/raw/master/ext/data/schemas/dictionary-term-bank-v3-schema.json"
    },
    {
      "fileMatch": ["term_meta_bank_*.json"],
      "url": "https://github.com/themoeway/yomitan/raw/master/ext/data/schemas/dictionary-term-meta-bank-v3-schema.json"
    }
  ],
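
Before running full schema validation, a quick structural pre-check can catch obvious mistakes in generated data. The sketch below assumes each term bank row has eight positional fields (term, reading, definition tags, rules, score, glossary, sequence, term tags). That layout is an assumption to confirm against dictionary-term-bank-v3-schema.json, and `check_term_row` is a hypothetical helper that does not replace the schema.

```python
def check_term_row(row):
    """Return a list of problems found in one term bank row (empty if none)."""
    if not isinstance(row, list) or len(row) != 8:
        return ["row must be a list of exactly 8 items"]

    term, reading, def_tags, rules, score, glossary, sequence, term_tags = row
    problems = []
    if not isinstance(term, str):
        problems.append("term must be a string")
    if not isinstance(reading, str):
        problems.append("reading must be a string")
    if def_tags is not None and not isinstance(def_tags, str):
        problems.append("definition tags must be a string or null")
    if not isinstance(rules, str):
        problems.append("rules must be a string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        problems.append("score must be a number")
    if not isinstance(glossary, list):
        problems.append("glossary must be a list")
    if isinstance(sequence, bool) or not isinstance(sequence, int):
        problems.append("sequence must be an integer")
    if not isinstance(term_tags, str):
        problems.append("term tags must be a string")
    return problems


print(check_term_row(["猫", "ねこ", "n", "", 0, ["cat"], 1, ""]))
print(check_term_row(["猫", "ねこ", "n", "", 0, "cat", 1, ""]))
```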

Conjugation

For Yomitan to conjugate Japanese terms, each term needs the appropriate part-of-speech tag. The part-of-speech labels are documented on the official JMdict page. For other languages, find the part-of-speech tags in ext/js/language/{language}/{language}-transforms.js under the conditions label, using only the labels that aren't prefixed with "Intermediate".
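
As an illustration, the rows below attach JMdict part-of-speech codes to Japanese terms. The codes `v1` (ichidan verb) and `adj-i` (i-adjective) come from JMdict. The assumption that the code belongs in the fourth field of an eight-item term bank row should be verified against the term bank schema.

```python
import json

# Assumed row layout:
# [term, reading, definition tags, rules, score, glossary, sequence, term tags]
RULES_FIELD = 3  # assumed position of the part-of-speech rule identifiers

rows = [
    ["食べる", "たべる", "v1", "v1", 0, ["to eat"], 1, ""],
    ["高い", "たかい", "adj-i", "adj-i", 0, ["high; tall; expensive"], 2, ""],
    ["猫", "ねこ", "n", "", 0, ["cat"], 3, ""],  # nouns do not conjugate, so no rule
]

for row in rows:
    print(row[0], repr(row[RULES_FIELD]))

# Serialized form, as it would appear in a term_bank_${number}.json file.
term_bank_json = json.dumps(rows, ensure_ascii=False)
```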

Tag Categories

The second item in each tag bank entry array determines the tag category, which affects the tag color in the user interface. The categories include:

name
expression
popular
frequent
archaism
dictionary
frequency
partOfSpeech
search
pronunciation-dictionary

You can view the tag colors here.
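
A tag bank entry can be built with a small helper that rejects unknown categories. This is a sketch under assumptions: `make_tag` is a hypothetical function, and apart from the category in second position, the field order (name, category, sort order, notes, score) is assumed and should be checked against the tag bank schema.

```python
# Categories from the list above.
TAG_CATEGORIES = {
    "name", "expression", "popular", "frequent", "archaism", "dictionary",
    "frequency", "partOfSpeech", "search", "pronunciation-dictionary",
}


def make_tag(name, category, order=0, notes="", score=0):
    """Build one tag bank entry, with the category as the second item."""
    if category not in TAG_CATEGORIES:
        raise ValueError(f"unknown tag category: {category}")
    return [name, category, order, notes, score]


tag_bank = [
    make_tag("v1", "partOfSpeech", notes="ichidan verb"),
    make_tag("arch", "archaism", notes="archaic term"),
]
print(tag_bank)
```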

Community Contributions

If you have any questions, need help, or want to share a new dictionary, feel free to pop into the Yomitan Discord server. We're happy to help you get started!