Learning with Texts - Fork 2.4.0-fork
Learn foreign languages with texts
| Language | RegExp Split Sentences | RegExp Word Characters | Make each character a word | Remove spaces |
|---|---|---|---|---|
| Languages with a Latin-derived alphabet (English, French, German, etc.) | .!?:; | a-zA-ZÀ-ÖØ-öø-ȳ | No | No |
| Languages with a Cyrillic-derived alphabet (Russian, Bulgarian, Ukrainian, etc.) | .!?:; | a-zA-ZÀ-ÖØ-öø-ȳЀ-ӹ | No | No |
| Greek | .!?:; | \x{0370}-\x{03FF}\x{1F00}-\x{1FFF} | No | No |
| Hebrew (Right-To-Left = Yes) | .!?:; | \x{0590}-\x{05FF} | No | No |
| Thai | .!?:; | ก-๛ | No | Yes |
| Chinese | .!?:;。！？：； | 一-龥 | Yes or No | Yes |
| Japanese (Without MeCab) | .!?:;。！？：； | 一-龥ぁ-ヾ | Yes or No | Yes |
| Japanese (With MeCab) | .!?:;。！？：； | mecab | Yes or No | Yes |
| Korean | .!?:;。！？：； | 가-힣ᄀ-ᇂ | No | No or Yes |
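
To make the four settings more concrete, here is a minimal Python sketch of how they might combine when a text is parsed. It is not LWT's actual parser: the `LanguageSettings` class and both functions are hypothetical names introduced for illustration, and the CJK example assumes the sentence terminators include the full-width forms 。！？：；. The `\x{0370}` notation in the table is PCRE syntax; in Python's `re` module the same code point would be written `\u0370`.

```python
import re
from dataclasses import dataclass


@dataclass
class LanguageSettings:
    """Hypothetical container mirroring the four table columns."""
    split_chars: str                  # "RegExp Split Sentences"
    word_chars: str                   # "RegExp Word Characters" (character-class body)
    each_char_is_word: bool = False   # "Make each character a word"
    remove_spaces: bool = False       # "Remove spaces"


def split_sentences(text: str, s: LanguageSettings) -> list[str]:
    """Split text into sentences, keeping the terminator with each sentence."""
    if s.remove_spaces:
        text = re.sub(r"\s+", "", text)
    body = re.escape(s.split_chars)
    # A sentence: one or more non-terminators, then any trailing terminators.
    pattern = "[^" + body + "]+[" + body + "]*"
    return [m.strip() for m in re.findall(pattern, text) if m.strip()]


def extract_words(sentence: str, s: LanguageSettings) -> list[str]:
    """Return the words of a sentence according to the word-character class."""
    quantifier = "" if s.each_char_is_word else "+"
    return re.findall("[" + s.word_chars + "]" + quantifier, sentence)


latin = LanguageSettings(".!?:;", "a-zA-ZÀ-ÖØ-öø-ȳ")
chinese = LanguageSettings(".!?:;。！？：；", "一-龥",
                           each_char_is_word=True, remove_spaces=True)

print(split_sentences("Hello world! How are you?", latin))
# ['Hello world!', 'How are you?']
print(extract_words("Hello world!", latin))
# ['Hello', 'world']
print(split_sentences("我爱你。你好吗？", chinese))
# ['我爱你。', '你好吗？']
print(extract_words("我爱你。", chinese))
# ['我', '爱', '你']
```

With "Make each character a word" set to No, runs of word characters are grouped into one term, which suits scripts that separate words with spaces. With it set to Yes, every matching character becomes its own term, a common fallback for Chinese or Japanese when no segmenter such as MeCab is available.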