author marv <rotrobmin@gmail.com> 2024-06-25 11:17:36 -0700
committer GitHub <noreply@github.com> 2024-06-25 18:17:36 +0000
commit 2a92a0b98c2bb08c2adaca24ff8af3322874ef59
tree b5f14d8b30768f65f9f92990acdf90553f30bb42 /docs/making-yomitan-dictionaries.md
parent b7a73d17964f7f4ae56a313ff64afa94e24efe68
Add Documentation for Making Dictionaries (#1103)
* Add basic info copied over
* Add link to making dicts doc in readme
* Update docs/making-yomitan-dictionaries.md
  Co-authored-by: Kuuuube <61125188+Kuuuube@users.noreply.github.com>
  Signed-off-by: marv <rotrobmin@gmail.com>
* Update docs/making-yomitan-dictionaries.md
  Co-authored-by: Kuuuube <61125188+Kuuuube@users.noreply.github.com>
  Signed-off-by: marv <rotrobmin@gmail.com>
* Update docs/making-yomitan-dictionaries.md
  Co-authored-by: Kuuuube <61125188+Kuuuube@users.noreply.github.com>
  Signed-off-by: marv <rotrobmin@gmail.com>
* Remove unnecessary sections
* Add POS documentation
* Add community contributions line

---------

Signed-off-by: marv <rotrobmin@gmail.com>
Co-authored-by: Kuuuube <61125188+Kuuuube@users.noreply.github.com>
Diffstat (limited to 'docs/making-yomitan-dictionaries.md')
-rw-r--r-- docs/making-yomitan-dictionaries.md | 97
1 files changed, 97 insertions, 0 deletions
diff --git a/docs/making-yomitan-dictionaries.md b/docs/making-yomitan-dictionaries.md
new file mode 100644
index 00000000..79769f4f
--- /dev/null
+++ b/docs/making-yomitan-dictionaries.md
@@ -0,0 +1,97 @@
+# Making Yomitan Dictionaries <!-- omit in toc -->
+
+This document provides an overview of how to create your own Yomitan dictionary.
+
+- [Tools](#tools)
+- [Read the Schemas](#read-the-schemas)
+- [Packaging A Dictionary](#packaging-a-dictionary)
+- [Examples](#examples)
+- [Schema Validation](#schema-validation)
+- [Conjugation](#conjugation)
+- [Tag Categories](#tag-categories)
+
+## Tools
+
+- [Yomichan Dictionary Builder](https://github.com/MarvNC/yomichan-dict-builder/): A node package that simplifies the process of making dictionaries, particularly useful for those using TypeScript or JavaScript.
+- [hasUTF16SurrogatePairAt](https://www.npmjs.com/package/@stdlib/assert-has-utf16-surrogate-pair-at): Checks whether a kanji/hanzi is encoded as a UTF-16 surrogate pair, which affects string indexing and length in JavaScript.
+- [japanese-furigana-normalize](https://github.com/MarvNC/japanese-furigana-normalize): A utility function to normalize Japanese readings containing furigana, ensuring proper alignment with kanji characters.
+
+## Read the Schemas
+
+Familiarity with the [Yomitan schemas](https://github.com/themoeway/yomitan/tree/master/ext/data/schemas) is essential. These schemas define the structure of Yomitan dictionaries. Helpful resources for interpreting JSON schemas include [codebeautify](https://codebeautify.org/jsonviewer/), [json-schema-viewer](https://json-schema-viewer.vercel.app/), and [jsonhero](https://jsonhero.io/).
+
+Below is a list of Yomitan dictionary schemas, their expected filenames, and their usage:
+
+| Schema | Expected Filename | Usage |
+| ---------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------- | -------------------------------------------------------------------------------------- |
+| [`dictionary-index-schema.json`](https://github.com/themoeway/yomitan/tree/master/ext/data/schemas/dictionary-index-schema.json) | `index.json` | Metadata about the dictionary. Please include as much detail as possible. |
+| [`dictionary-kanji-bank-v3-schema.json`](https://github.com/themoeway/yomitan/tree/master/ext/data/schemas/dictionary-kanji-bank-v3-schema.json) | `kanji_bank_${number}.json` | Information used in the kanji viewer - meanings, readings, statistics, and codepoints. |
+| [`dictionary-kanji-meta-bank-v3-schema.json`](https://github.com/themoeway/yomitan/tree/master/ext/data/schemas/dictionary-kanji-meta-bank-v3-schema.json) | `kanji_meta_bank_${number}.json` | Stores kanji frequency data.                                                           |
+| [`dictionary-tag-bank-v3-schema.json`](https://github.com/themoeway/yomitan/tree/master/ext/data/schemas/dictionary-tag-bank-v3-schema.json) | `tag_bank_${number}.json` | Defines tags for kanji and term dictionaries, like parts of speech or kanken level. |
+| [`dictionary-term-bank-v3-schema.json`](https://github.com/themoeway/yomitan/tree/master/ext/data/schemas/dictionary-term-bank-v3-schema.json) | `term_bank_${number}.json` | Stores dictionary readings, definitions, etc. |
+| [`dictionary-term-meta-bank-v3-schema.json`](https://github.com/themoeway/yomitan/tree/master/ext/data/schemas/dictionary-term-meta-bank-v3-schema.json) | `term_meta_bank_${number}.json` | Stores meta information about terms, such as frequency data and pitch accent data. |
+
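As a rough illustration of how the filenames in the table map to actual data, the Python sketch below writes a minimal `index.json` and one `term_bank_1.json`. The title, revision string, description, and sample entry are invented for illustration, and the eight-item entry layout is an assumption based on the term bank v3 schema, so verify it against the schema before relying on it.

```python
import json
import tempfile
from pathlib import Path


def write_minimal_dictionary(out_dir: Path) -> list[str]:
    """Write index.json and term_bank_1.json into out_dir; return the file names."""
    out_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical metadata; the values are placeholders, not a real dictionary.
    index = {
        "title": "Example Dictionary",
        "revision": "2024-06-25",
        "format": 3,
        "description": "A tiny dictionary written by an example script.",
    }

    # Assumed entry layout: [term, reading, definition tags, rules, score,
    #                        definitions, sequence number, term tags]
    term_bank = [
        ["猫", "ねこ", "", "", 0, ["cat"], 1, ""],
    ]

    files = {"index.json": index, "term_bank_1.json": term_bank}
    for name, data in files.items():
        # ensure_ascii=False keeps kana and kanji readable in the output file.
        (out_dir / name).write_text(
            json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
        )
    return sorted(files)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(write_minimal_dictionary(Path(tmp)))
```

Larger dictionaries would split their entries across `term_bank_1.json`, `term_bank_2.json`, and so on, following the `${number}` pattern in the table.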
+## Packaging A Dictionary
+
+A single dictionary zip file can contain several kinds of data (terms, kanji, tags, and their metadata banks). After creating an `index.json` and the relevant data files, zip them so that every `.json` file sits in the root directory of the archive, not in a subfolder. Use the highest compression level available to keep the file small.
+
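One way to script the packaging step is sketched below using Python's standard `zipfile` module. The function name `package_dictionary` and the directory layout are hypothetical; the only behaviour taken from the text above is that all `.json` files land in the archive root and that the strongest compression setting is used.

```python
import tempfile
import zipfile
from pathlib import Path


def package_dictionary(source_dir: Path, zip_path: Path) -> list[str]:
    """Zip every .json file in source_dir into the root of zip_path."""
    json_files = sorted(source_dir.glob("*.json"))
    if not any(f.name == "index.json" for f in json_files):
        raise FileNotFoundError("index.json is missing from " + str(source_dir))

    # ZIP_DEFLATED with compresslevel=9 is the strongest deflate setting zipfile offers.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED, compresslevel=9) as archive:
        for json_file in json_files:
            # arcname=json_file.name drops the directory part,
            # so the file is stored in the zip root rather than in a subfolder.
            archive.write(json_file, arcname=json_file.name)
        return archive.namelist()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "my-dictionary"
        src.mkdir()
        (src / "index.json").write_text(
            '{"title": "Example", "revision": "1", "format": 3}', encoding="utf-8"
        )
        (src / "term_bank_1.json").write_text("[]", encoding="utf-8")
        print(package_dictionary(src, Path(tmp) / "my-dictionary.zip"))
```

Passing `arcname` explicitly is the design choice that matters here: without it, `zipfile` stores the path as given, which can place the files inside a folder in the archive.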
+## Examples
+
+- The [official test dictionary](https://github.com/themoeway/yomitan/tree/master/test/data/dictionaries/valid-dictionary1) showcases the full range of features available in Yomitan dictionaries.
+
+## Schema Validation
+
+To validate your dictionary files against the schemas, configure [VSCode](https://code.visualstudio.com/docs/languages/json#_json-schemas-and-settings) to do so or use a website such as [jsonschemavalidator](https://www.jsonschemavalidator.net/).
+
+For VSCode validation, add the following to your settings JSON:
+
+```json
+  "json.schemas": [
+    {
+      "fileMatch": ["kanji_bank_*.json"],
+      "url": "https://github.com/themoeway/yomitan/raw/master/ext/data/schemas/dictionary-kanji-bank-v3-schema.json"
+    },
+    {
+      "fileMatch": ["kanji_meta_bank_*.json"],
+      "url": "https://github.com/themoeway/yomitan/raw/master/ext/data/schemas/dictionary-kanji-meta-bank-v3-schema.json"
+    },
+    {
+      "fileMatch": ["tag_bank_*.json"],
+      "url": "https://github.com/themoeway/yomitan/raw/master/ext/data/schemas/dictionary-tag-bank-v3-schema.json"
+    },
+    {
+      "fileMatch": ["term_bank_*.json"],
+      "url": "https://github.com/themoeway/yomitan/raw/master/ext/data/schemas/dictionary-term-bank-v3-schema.json"
+    },
+    {
+      "fileMatch": ["term_meta_bank_*.json"],
+      "url": "https://github.com/themoeway/yomitan/raw/master/ext/data/schemas/dictionary-term-meta-bank-v3-schema.json"
+    }
+  ],
+```
+
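If you validate from a script instead of an editor, the same filename-to-schema mapping can be expressed in a few lines. The sketch below is only a lookup helper, not a validator: `schema_url_for` is a hypothetical name, and the patterns and schema names are copied from the table and settings in this document. The resulting URL could then be handed to whichever JSON Schema validator you prefer.

```python
from fnmatch import fnmatchcase
from typing import Optional

SCHEMA_BASE_URL = "https://github.com/themoeway/yomitan/raw/master/ext/data/schemas/"

# Filename patterns and schema names as listed in this document.
SCHEMA_BY_PATTERN = {
    "index.json": "dictionary-index-schema.json",
    "kanji_bank_*.json": "dictionary-kanji-bank-v3-schema.json",
    "kanji_meta_bank_*.json": "dictionary-kanji-meta-bank-v3-schema.json",
    "tag_bank_*.json": "dictionary-tag-bank-v3-schema.json",
    "term_bank_*.json": "dictionary-term-bank-v3-schema.json",
    "term_meta_bank_*.json": "dictionary-term-meta-bank-v3-schema.json",
}


def schema_url_for(filename: str) -> Optional[str]:
    """Return the schema URL matching a dictionary file name, or None if unknown."""
    for pattern, schema in SCHEMA_BY_PATTERN.items():
        # fnmatchcase compares case-sensitively on every platform.
        if fnmatchcase(filename, pattern):
            return SCHEMA_BASE_URL + schema
    return None


if __name__ == "__main__":
    for name in ("index.json", "term_meta_bank_3.json", "notes.txt"):
        print(name, "->", schema_url_for(name))
```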
+## Conjugation
+
+For Yomitan to recognize conjugated forms of Japanese terms, each term needs the appropriate part of speech tag. The part of speech labels are documented on the [official JMDict page](http://www.edrdg.org/jmdictdb/cgi-bin/edhelp.py?svc=jmdict&sid=#kw_pos). For other languages, find the part of speech tags in `ext/js/language/{language}/{language}-transforms.js` under the `conditions` label; use the labels that aren't prefixed with "Intermediate".
+
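To make this concrete, here is a hedged Python sketch of term bank rows that carry a part of speech. It assumes the v3 term bank layout in which the fourth item is a space-separated rules string, and it assumes the identifiers `v1`, `v5`, and `adj-i` for ichidan verbs, godan verbs, and i-adjectives; confirm both against the term bank schema. `make_term_entry` is a hypothetical helper, not part of Yomitan.

```python
import json
from typing import Sequence


def make_term_entry(
    term: str,
    reading: str,
    rules: str,
    definitions: Sequence[str],
    score: int = 0,
    sequence: int = 0,
) -> list:
    """Build one term bank row; `rules` is a space-separated part-of-speech string."""
    # Assumed layout: [term, reading, definition tags, rules, score,
    #                  definitions, sequence number, term tags]
    return [term, reading, "", rules, score, list(definitions), sequence, ""]


entries = [
    make_term_entry("食べる", "たべる", "v1", ["to eat"]),       # ichidan verb
    make_term_entry("読む", "よむ", "v5", ["to read"]),           # godan verb
    make_term_entry("高い", "たかい", "adj-i", ["high; tall"]),    # i-adjective
    make_term_entry("猫", "ねこ", "", ["cat"]),                   # noun: left empty
]

if __name__ == "__main__":
    print(json.dumps(entries, ensure_ascii=False, indent=2))
```

Leaving the rules string empty for the noun reflects the assumption that uninflected words need no part of speech for this purpose.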
+## Tag Categories
+
+The second item of each entry array in a tag bank determines the tag category, which affects the tag color in the user interface. The categories include:
+
+- name
+- expression
+- popular
+- frequent
+- archaism
+- dictionary
+- frequency
+- partOfSpeech
+- search
+- pronunciation-dictionary
+
+You can view the tag colors [here](https://github.com/themoeway/yomitan/blob/48f1d012ad5045319d4e492dfbefa39da92817b2/ext/css/display.css#L136-L149).
+
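A small Python sketch of tag bank rows follows. The five-item layout (name, category, sort order, notes, score) is an assumption based on the tag bank v3 schema, and `make_tag` and `has_known_color` are hypothetical helpers. The category set is the list above; the sketch treats any other category string as valid but without a dedicated color, which is also an assumption.

```python
import json

# Categories listed in this document as affecting the tag color.
KNOWN_CATEGORIES = {
    "name", "expression", "popular", "frequent", "archaism",
    "dictionary", "frequency", "partOfSpeech", "search",
    "pronunciation-dictionary",
}


def has_known_color(category: str) -> bool:
    """Return True if the category is one of the documented color categories."""
    return category in KNOWN_CATEGORIES


def make_tag(name: str, category: str, order: int, notes: str, score: int = 0) -> list:
    """Build one tag bank row; the category is the second item."""
    return [name, category, order, notes, score]


tag_bank = [
    make_tag("v1", "partOfSpeech", -3, "ichidan verb"),
    make_tag("arch", "archaism", -4, "archaic term"),
]

if __name__ == "__main__":
    for tag in tag_bank:
        if not has_known_color(tag[1]):
            print("warning: no documented color for category", tag[1])
    print(json.dumps(tag_bank, ensure_ascii=False, indent=2))
```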
+# Community Contributions
+
+If you have any questions, need help, or want to share a new dictionary, feel free to pop into the [Yomitan Discord server](/README.md#yomitan). We're happy to help you get started!