Complex Script Functionality

This page relates to Bug 0001547: Support for Indic Scripts and Metabug 0003965: Support for non-Latin languages.

Things we require for testing Complex Script Functionality

 * Sample text files encoded in UTF-8
 * Sample screenshots of Scribus 1.4.1 or 1.5.0svn rendering these incorrectly
 * Freely available sample fonts that are suitable for DTP
 * Sample screenshots of any application rendering the same text correctly


 * An important point to note is we need comparable information. Same texts, same fonts etc and screenshots (png or tiff please) of said texts and fonts. --Cbradney 18:40, 25 Feb 2005 (UTC)


 * It would also be helpful to add information about the writing direction(s).--C schaefer (talk) 06:52, 22 December 2012 (CET)
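
One way to collect the writing direction for a sample text is to look at the Unicode bidirectional class of its characters. The following is a minimal sketch, assuming Python's standard `unicodedata` module is an acceptable source for that data. The function name `dominant_direction` is hypothetical and is not part of Scribus.

```python
import unicodedata


def dominant_direction(text: str) -> str:
    """Guess the main writing direction of `text` from Unicode bidi classes.

    Characters with bidi class "R" or "AL" (for example Hebrew and Arabic
    letters) count as right-to-left, class "L" counts as left-to-right.
    Everything else (marks, digits, spaces, punctuation) is ignored.
    """
    rtl = 0
    ltr = 0
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):
            rtl += 1
        elif bidi == "L":
            ltr += 1
    if rtl > ltr:
        return "RTL"
    if ltr > rtl:
        return "LTR"
    return "neutral"


if __name__ == "__main__":
    print(dominant_direction("\u05dc\u05db\u05df"))  # three Hebrew letters
    print(dominant_direction("Scribus"))
```

This only reports the horizontal direction of the majority of letters. It says nothing about mixed-direction paragraphs, which need the full Unicode Bidirectional Algorithm.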

 * Another useful contribution would be a mapping of the languages you're interested in to their script/language pair in OpenType font style.
Example: French:latn:FRA

To help, you can have a look at http://www.microsoft.com/typography/developers/OpenType/scripttags.aspx for script tags and http://www.microsoft.com/typography/developers/OpenType/languagetags.aspx for lang tags. --Pmarchand 13:46, 8 April 2007 (CEST)
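
The Language:script:language-system entries used on this page (such as French:latn:FRA) can be checked mechanically before they are added. The sketch below assumes only that each entry has exactly three colon-separated fields and that OpenType tags are at most four characters long. The names `TagTriple` and `parse_tag_triple` are hypothetical.

```python
from typing import NamedTuple


class TagTriple(NamedTuple):
    language: str  # human-readable language name, e.g. "French"
    script: str    # OpenType script tag, e.g. "latn"
    langsys: str   # OpenType language system tag, e.g. "FRA"


def parse_tag_triple(entry: str) -> TagTriple:
    """Parse an entry such as 'French:latn:FRA' into its three fields."""
    parts = [part.strip() for part in entry.split(":")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected Language:script:langsys, got {entry!r}")
    language, script, langsys = parts
    # Script tags on this page are four characters; language system tags
    # are three or four characters.
    if len(script) != 4:
        raise ValueError(f"script tag must be 4 characters: {script!r}")
    if not 1 <= len(langsys) <= 4:
        raise ValueError(f"language system tag too long: {langsys!r}")
    return TagTriple(language, script, langsys)


if __name__ == "__main__":
    print(parse_tag_triple("French:latn:FRA"))
    print(parse_tag_triple("Tibetan: tibt:dflt"))
```

The check is purely structural. Whether a tag is actually registered still has to be confirmed against the tag lists linked above.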

Please try to list only the best fonts for each script for use in printed publications (not screen/web fonts).

I don't know anything about quality, but Fedora 9 has many of these fonts available in their repository (meaning they must be free): arabic, bengali, gujarati, hindi, malayalam, oriya, punjabi, sinhala, tamil, and telugu. – Greg P

Arabic & Urdu
Font: Free fonts for Arabic / Urdu are available from Crulp and from PakType. The PakType fonts are licensed under the GPL, and Crulp also uses a very loose license.

Bengali
Font

 * SolaimanLipi is a free Bengali font available on SourceForge

Language, Script & Language System tags

 * Assamese:beng:ASM
 * Bengali:beng:BEN

OpenType features used for Bengali script
 * See: Developing OpenType Fonts for Bengali Script - at MS Typography

Screenshots
Screenshot needed here. Please expand this section

Devanagari
The Devanagari script is used for writing many languages in India and Nepal including Sanskrit, Hindi and Nepali.

Fonts

 * Chandas - Devanagari Unicode Open Type font with 4347 glyphs: 325 half-forms, 960 half-forms context-variations, 2743 ligature-signs. It is designed especially for Vedic and Classical Sanskrit but can also be used for Hindi, Nepali and other modern Indian languages. The font includes Vedic accents and many additional signs and so provides maximal support for Devanagari script. GPL License.


 * Madan - OpenType font with Nepali glyph sets. Developed by Madan Puraskar Pustakalaya and released under the GPL.

Language, Script & Language System tags

 * Bhojpuri:deva:BHO
 * Hindi:deva:HIN
 * Marathi:deva:MAR
 * Nepali:deva:NEP
 * Sanskrit:deva:SAN

OpenType features used for Devanagari script
 * See Developing OpenType Fonts for Devanagari Script - at Microsoft Typography

Screenshots




Gujarati
Please expand this section

Kannada
Kannada is used in south India

Fonts
Lohit Kannada https://fedorahosted.org/lohit (GPL Font)

Language, Script & Language System tags
 * Kannada:knda:KAN

Screenshots




Khmer
Khmer is used in Cambodia.

Fonts

 * Khmer OS System: http://selapa.net/khmerfonts/fontinfo.php?font=89
 * Other Khmer font: http://selapa.net/khmerfonts/

Language, Script & Language System tags

 * Khmer:khmr:KHM

Screenshots




Lao
Please expand this section

Malayalam
This issue is related to bug report #7140.

Fonts

 * Lohit Malayalam
 * AnjaliOldLipi (GPL fonts)

Language, Script & Language System tags

 * Malayalam:mlym:MAL

Myanmar
Please expand this section

Language, Script & Language System tags

 * Burmese:mymr:BRM

OpenType Features used for Myanmar script:

Oriya
Font

 * See Oriya

Language, Script & Language System tags

 * Oriya:orya:ORI

OpenType features used for Oriya script
 * See: Developing OpenType Fonts for Oriya Script - at MS Typography

Sample text (UTF-8)
 * ଜ୍ଞାନକୋଷ
 * ଭାଷାରେ

Screenshots




Sinhala
Fonts

 * Typing order: Left ==> Right
 * Unicode Range: 0D80 - 0DFF Sinhala Unicode Range
 * Fonts freely available in both Linux and MS Windows from here

Language, Script & Language System tags

Sinhala Unicode Issues for Scribus Version 1.3.3.12
Issues in MS Windows
 * Sinhala characters are not supported
 * Example characters:
 * අ 0D85, ෆ 0DC6
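
When reporting characters that fail to render, it helps to confirm their code points and that they fall inside the expected Unicode range. The following is a small sketch that uses only the three ranges quoted on this page (Sinhala, Tibetan, Hebrew). The table `PAGE_RANGES` and the function `block_of` are hypothetical names, and the table is not a complete list of Unicode blocks.

```python
from typing import Optional

# Ranges as quoted on this page (inclusive, hexadecimal code points).
PAGE_RANGES = {
    "Hebrew": (0x0590, 0x05FF),
    "Sinhala": (0x0D80, 0x0DFF),
    "Tibetan": (0x0F00, 0x0FFF),
}


def block_of(ch: str) -> Optional[str]:
    """Return the name of the listed range containing `ch`, or None."""
    code_point = ord(ch)
    for name, (first, last) in PAGE_RANGES.items():
        if first <= code_point <= last:
            return name
    return None


if __name__ == "__main__":
    # The two example characters above, written as escapes.
    for ch in ("\u0d85", "\u0dc6"):
        print(f"U+{ord(ch):04X}", block_of(ch))
```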

Issues in Linux
 * Please expand this

Sinhala Unicode Issues for Scribus Version 1.3.6SVN
Issues in MS Windows
Issues in Linux

Tamil
Please expand this section

Telugu
Please expand this section

Thai
Please expand this section

Fonts
Note: This does not work with Tahoma (the tone marks and vowels overlay each other). It seems to work fine with a variety of other fonts (compiled for OSX), including the following:
 * JS-Jukaphan
 * PSLxText
 * PSL TextSP
 * Thonburi (OSX default font for Thai)
 * TH Krub
 * TH K2D July8

Tibetan & Dzongkha
This relates to Bug 0004452: No support for Tibetan Script - sample UTF-8 test file and PDF file attached to that bug.

Tibetan Script:

 * Unicode Range: 0F00 - 0FFF Tibetan Unicode Range
 * Typing order: Left ==> Right
 * OpenType Specs: Creating and Supporting OpenType Fonts for Tibetan Script

Fonts:

 * Jomolhari, a free Tibetan script font suitable for publishing is available from the Free Tibetan Fonts project - OFL License. The font was originally designed for publishing traditional Buddhist texts in Classical Tibetan (chos skad) - but can also be used for modern Tibetan, Dzongkha and Ladakhi text. The font works well in OO.org 2.4.x & Inkscape.


 * DDC Uchen - OFL License. Freely available from the Dzongkha Development Commission

Language Script & Language System tags:

 * Tibetan: tibt:dflt (default)
 * Tibetan: tibt:TIB
 * Dzongkha: tibt:DZN
 * Sanskrit: tibt:SAN
 * Ladakhi: tibt:LDK
 * Tshangla: tibt:TSJ
 * Khengkha: tibt:XKF
 * Bumthangkha: tibt:KJZ
 * Balti: tibt:BFT

OpenType Features
The following OpenType features should be processed to support Tibetan script:
 * ccmp - Glyph Composition / Decomposition (Composition = GSUB lookup type 4 & Decomposition = GSUB lookup type 2).
 * blws - Below-base Substitutions (GSUB lookup type 4)
 * abvs - Above-base Substitutions (GSUB lookup type 4)
 * calt - Contextual Alternates (GSUB lookup type 6)
 * blwm - Below-base Mark Positioning (GPOS lookup type 4,5)
 * abvm - Above-base Mark Positioning (GPOS lookup type 4,5)
 * kern - Kerning - (GPOS lookup type 2 or 8)

Note: The current version of the Jomolhari font uses only the ccmp, blws, abvs, calt, and kern features. Within kern, only GPOS lookup type 2 (pair adjustment) is used, so this is the only kind of GPOS lookup in the font.
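
The relationship between the feature list and the note about Jomolhari can be written down as a simple set comparison. This is only an illustration built from the data on this page. It does not read a font file, and the names `REQUIRED_FEATURES`, `JOMOLHARI_FEATURES`, and `unused_features` are hypothetical.

```python
# Feature tag -> OpenType table it lives in, as listed above.
REQUIRED_FEATURES = {
    "ccmp": "GSUB",
    "blws": "GSUB",
    "abvs": "GSUB",
    "calt": "GSUB",
    "blwm": "GPOS",
    "abvm": "GPOS",
    "kern": "GPOS",
}

# Features that the current Jomolhari font uses, according to the note.
JOMOLHARI_FEATURES = {"ccmp", "blws", "abvs", "calt", "kern"}


def unused_features(font_features):
    """Return the listed Tibetan features that a font does not use."""
    return sorted(set(REQUIRED_FEATURES) - set(font_features))


if __name__ == "__main__":
    for tag in unused_features(JOMOLHARI_FEATURES):
        print(tag, REQUIRED_FEATURES[tag])
```

For Jomolhari the result contains only the two mark-positioning features, so a test with this font exercises GSUB processing and pair kerning but not GPOS mark attachment.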

Test of ccmp feature used for Decomposition (GSUB lookup type 2):
The string U+0F43 U+0F77 (གྷཷ) should render like the first example:

However, in Scribus, U+0F43 U+0F77 (གྷཷ) currently renders like this:

In this case, the ccmp feature should decompose U+0F77 into separate glyph components that go below and above the glyph for the base character (U+0F43).

Test of ccmp feature used for Composition (GSUB lookup type 4):
The string U+0F42 U+0FB7 U+0F0B U+0F4C U+0FB7 U+0F0B U+0F51 U+0FB7 U+0F0B U+0F56 U+0FB7 U+0F0B U+0F5B U+0FB7 U+0F0B U+0F40 U+0FB5 (གྷ་ཌྷ་དྷ་བྷ་ཛྷ་ཀྵ) should render as follows:

However, in Scribus, that string currently renders like this:

In this case, if the ccmp feature is applied, the ligatures should be properly composed.

Furthermore, the strings U+0F42 U+0FB7 U+0F0B U+0F4C U+0FB7 U+0F0B U+0F51 U+0FB7 U+0F0B U+0F56 U+0FB7 U+0F0B U+0F5B U+0FB7 U+0F0B U+0F40 U+0FB5 (གྷ་ཌྷ་དྷ་བྷ་ཛྷ་ཀྵ) and U+0F43 U+0F0B U+0F4D U+0F0B U+0F52 U+0F0B U+0F57 U+0F0B U+0F5C U+0F0B U+0F69 (གྷ་ཌྷ་དྷ་བྷ་ཛྷ་ཀྵ) should render identically, as they are canonically equivalent. (U+0F42 U+0FB7 = U+0F43; U+0F4C U+0FB7 = U+0F4D; U+0F51 U+0FB7 = U+0F52; U+0F56 U+0FB7 = U+0F57; U+0F5B U+0FB7 = U+0F5C; and U+0F40 U+0FB5 = U+0F69.)
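
The canonical equivalence of these two strings can be checked independently of any font or layout engine by comparing their normalized forms. The sketch below uses Python's standard `unicodedata` module. The helper names `from_codepoints` and `canonically_equivalent` are hypothetical.

```python
import unicodedata


def from_codepoints(spec: str) -> str:
    """Turn 'U+0F42 U+0FB7' notation into the corresponding string."""
    return "".join(chr(int(token[2:], 16)) for token in spec.split())


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Base letter plus subjoined letter, separated by tsheg (U+0F0B).
DECOMPOSED = from_codepoints(
    "U+0F42 U+0FB7 U+0F0B U+0F4C U+0FB7 U+0F0B U+0F51 U+0FB7 U+0F0B "
    "U+0F56 U+0FB7 U+0F0B U+0F5B U+0FB7 U+0F0B U+0F40 U+0FB5"
)

# The same syllables written with the precomposed letters.
PRECOMPOSED = from_codepoints(
    "U+0F43 U+0F0B U+0F4D U+0F0B U+0F52 U+0F0B U+0F57 U+0F0B U+0F5C "
    "U+0F0B U+0F69"
)

if __name__ == "__main__":
    print(DECOMPOSED == PRECOMPOSED)  # raw code points differ
    print(canonically_equivalent(DECOMPOSED, PRECOMPOSED))
```

Normalizing the text before shaping is one way an application could guarantee identical rendering for both inputs. Whether Scribus does this is not stated on this page.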

Test of blws feature (GSUB lookup type 4)
The test strings consist of the following Unicode characters: U+0F62 U+0F0B U+0F62 U+0F90 U+0F0B U+0F62 U+0F90 U+0FB1 U+0F0B U+0F62 U+0F90 U+0FB1 U+0F74 U+0F62 U+0F92 U+0F0B U+0F62 U+0F92 U+0FB1 U+0F0B U+0F62 U+0F92 U+0FB1 U+0F74 U+0F0D (ར་རྐ་རྐྱ་རྐྱུརྒ་རྒྱ་རྒྱུ།) and U+0F66 U+0F0B U+0F66 U+0FA6 U+0F0B U+0F66 U+0FA6 U+0FB2 U+0F0B U+0F66 U+0FA6 U+0FB2 U+0F74 U+0F0B U+0F66 U+0FA8 U+0F0B U+0F66 U+0FA8 U+0FB2 U+0F0B U+0F66 U+0FA8 U+0FB2 U+0F74 U+0F0D (ས་སྦ་སྦྲ་སྦྲུ་སྨ་སྨྲ་སྨྲུ།)
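
Because stacked Tibetan text is hard to compare by eye, a sample file can be dumped back into U+XXXX notation to confirm that it contains exactly the intended sequence. The following is a minimal sketch for producing and checking UTF-8 sample files. The function names and the file name `blws-sample.txt` are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def to_codepoints(text: str) -> str:
    """Render a string as space-separated 'U+XXXX' code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def write_sample(path: str, text: str) -> None:
    """Write `text` to `path` as UTF-8 (without a byte order mark)."""
    Path(path).write_text(text, encoding="utf-8")


def read_sample_codepoints(path: str) -> str:
    """Read a UTF-8 sample file and return its code point listing."""
    return to_codepoints(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # The first two syllables of the first test string above.
    sample = "\u0f62\u0f0b\u0f62\u0f90\u0f0b"
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "blws-sample.txt")
        write_sample(path, sample)
        print(read_sample_codepoints(path))
```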

UTF-8 Hebrew Text
A pangram of Hebrew text (from Zephaniah 3:8): לָכֵ֤ן חַכּוּ־לִי֙ נְאֻם־יְהוָ֔ה לְי֖וֹם קוּמִ֣י לְעַ֑ד כִּ֣י מִשְׁפָּטִי֩ לֶאֱסֹ֨ף גּוֹיִ֜ם לְקָבְצִ֣י מַמְלָכ֗וֹת לִשְׁפֹּ֨ךְ עֲלֵיהֶ֤ם זַעְמִי֙ כֹּ֚ל חֲר֣וֹן אַפִּ֔י כִּ֚י בְּאֵ֣שׁ קִנְאָתִ֔י תֵּאָכֵ֖ל כָּל־הָאָֽרֶץ׃

Note: Hebrew script is written RTL (right to left). The above text is formatted with the following span and div tags:

Hebrew Fonts

 * Unicode range: 0590–05FF
 * OpenType script and language tags: Hebrew:hebr:IWR
 * The Cardo font (Cardoi99.otf) by Fonts for Scholars is an OpenType font under an open source license that includes the full set of Hebrew with its diacritics
 * The Open Source Unicode Hebrew Font pack (A comprehensive collection of free/libre licensed fonts including Ezra SIL, Cardo, and Culmus Project fonts)
 * Ezra SIL (an open font licensed, Unicode Hebrew font supporting the full range of Hebrew diacritics)