Complex Script Functionality

This page relates to Bug 0001547: Support for Indic Scripts and Metabug 0003965: Support for non-latin languages.

Things we require for testing Complex Script Functionality

 * Sample text files encoded in UTF-8
 * Sample screenshots of Scribus 1.3.12, 1.3.4 or 1.3.5cvs rendering these incorrectly.
 * Sample fonts that are freely available and suitable for DTP.
 * Sample screenshots of any application rendering the same text correctly

An important point: we need comparable information. That means the same texts and the same fonts, plus screenshots (PNG or TIFF please) of said texts and fonts. --Cbradney 18:40, 25 Feb 2005 (UTC)

 * Another useful contribution would be a mapping of each language you're interested in to its script/language system pair, in OpenType font style. Example: French:latn:FRA

To help, you can have a look at http://www.microsoft.com/typography/developers/OpenType/scripttags.aspx for script tags and http://www.microsoft.com/typography/developers/OpenType/languagetags.aspx for lang tags. --Pmarchand 13:46, 8 April 2007 (CEST)
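The Language:script:LANGSYS notation used on this page is easy to process mechanically. The following is only a minimal sketch, assuming entries are written exactly as in the French example; the names LangSystem, parse_entry and group_by_script are made up for illustration and are not part of Scribus.

```python
from typing import NamedTuple


class LangSystem(NamedTuple):
    """One Language:script:LANGSYS entry in the notation used on this page."""
    language: str  # human-readable language name, e.g. "French"
    script: str    # OpenType script tag, e.g. "latn"
    langsys: str   # OpenType language system tag, e.g. "FRA"


def parse_entry(entry: str) -> LangSystem:
    """Split 'Name:script:LANG' into three parts and sanity-check the tags."""
    language, script, langsys = (part.strip() for part in entry.split(":"))
    # OpenType tags are at most four characters long.
    if not (1 <= len(script) <= 4 and 1 <= len(langsys) <= 4):
        raise ValueError(f"bad tag length in entry: {entry!r}")
    return LangSystem(language, script, langsys)


def group_by_script(entries):
    """Map each script tag to the language system tags listed for it."""
    grouped = {}
    for entry in entries:
        parsed = parse_entry(entry)
        grouped.setdefault(parsed.script, []).append(parsed.langsys)
    return grouped


# Entries taken from this page.
ENTRIES = ["French:latn:FRA", "Hindi:deva:HIN", "Nepali:deva:NEP", "Tibetan:tibt:TIB"]
print(group_by_script(ENTRIES))
```

Grouping by script tag is a deliberate choice here: a shaping engine selects its behaviour by script first and by language system second.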

Please list only the best fonts for each script for use in printed publications (not screen/web fonts).

I don't know anything about quality, but Fedora 9 has fonts for many of these scripts and languages available in its repository (meaning they must be free): Arabic, Bengali, Gujarati, Hindi, Malayalam, Oriya, Punjabi, Sinhala, Tamil, and Telugu. – Greg P

Arabic & Urdu
Font: Free fonts for Arabic / Urdu are available from CRULP and from PakType. PakType fonts are licensed under the GPL, and CRULP also uses a very loose license.

Bengali
Font: SolaimanLipi is a free Bengali font available on SourceForge.

Language, Script & Language System tags:
 * Assamese:beng:ASM
 * Bengali:beng:BEN

Open Type features used for Bengali script
 * See: Developing OpenType Fonts for Bengali Script - at MS Typography

Screenshots:

Please expand this section

Devanagari
The Devanagari script is used for writing many languages in India and Nepal including Sanskrit, Hindi and Nepali.

Fonts:
 * Chandas - Devanagari Unicode Open Type font with 4347 glyphs: 325 half-forms, 960 half-forms context-variations, 2743 ligature-signs. It is designed especially for Vedic and Classical Sanskrit but can also be used for Hindi, Nepali and other modern Indian languages. The font includes Vedic accents and many additional signs and so provides maximal support for Devanagari script. GPL License.


 * Madan - Open Type font with Nepali glyph sets. Developed by Madan Puraskar Pustakalaya and released under the GPL.

Language, Script & Language System tags:
 * Bhojpuri:deva:BHO
 * Hindi:deva:HIN
 * Marathi:deva:MAR
 * Nepali:deva:NEP
 * Sanskrit:deva:SAN

OpenType features used for Devanagari script
 * See: Developing OpenType Fonts for Devanagari Script - at Microsoft Typography

Screenshots:





Gujarati
Fonts: Language, Script & Language System tags: Screenshots:
 * Please expand this section

Kannada
Fonts: Language, Script & Language System tags: Screenshots:
 * Please expand this section

Khmer
Fonts: Language, Script & Language System tags: Screenshots:
 * Please expand this section

Lao
Fonts: Language, Script & Language System tags: Screenshots:
 * Please expand this section

Malayalam
Fonts:
 * Lohit Malayalam https://fedorahosted.org/lohit (GPL)
 * AnjaliOldLipi http://varamozhi.sourceforge.net/fonts/AnjaliOldLipi-0.710.ttf (GPL)

Language, Script & Language System tags:
 * Malayalam:mlym:MAL

Screenshots:
 * Scribus rendering: http://sabdabodha.googlepages.com/scribus_ml.png
 * Desired rendering: http://sabdabodha.googlepages.com/des_re_ml.png
 * The issue is related to the bug report http://bugs.scribus.net/view.php?id=7140

Myanmar
 Please expand this section 

Font:

Language, Script & Language System tags:
 * Burmese:mymr:BRM

OpenType Features used for Myanmar script:

Screenshots:

Oriya
Fonts: Language, Script & Language System tags: Screenshots:
 * Please expand this section

Sinhala
Fonts: Language, Script & Language System tags: Screenshots:
 * Linux and MS Windows font downloads: Official Sinhala Unicodes

Tamil
Fonts: Language, Script & Language System tags: Screenshots:
 * Please expand this section

Telugu
Fonts: Language, Script & Language System tags: Screenshots:
 * Please expand this section

Thai
Fonts: Language, Script & Language System tags: Screenshots:
 * Please expand this section

Tibetan & Dzongkha
This relates to Bug 0004452: No support for Tibetan Script - sample UTF-8 test file and PDF file attached to that bug.

Fonts for Tibetan Script:
Jomolhari, a free Tibetan script font suitable for publishing, is available from the Jomolhari font page under the OFL license. The font was originally designed for publishing traditional Buddhist texts in Classical Tibetan (chos skad), but it can also be used for modern Tibetan, Dzongkha and Ladakhi text. The font works well in OO.org 2.4.x & Inkscape.

Language, Script & Language System tags:

 * Tibetan:tibt:TIB
 * Dzongkha:tibt:DZN
 * Ladakhi:tibt:LDK

OpenType Features
The following OpenType features should be processed to support Tibetan script:
 * ccmp - Glyph Composition / Decomposition (Composition = GSUB lookup type 4; Decomposition = GSUB lookup type 2)
 * blws - Below-base Substitutions (GSUB lookup type 4)
 * abvs - Above-base Substitutions (GSUB lookup type 4)
 * calt - Contextual Alternates (GSUB lookup type 6)
 * blwm - Below-base Mark Positioning (GPOS lookup type 4, 5)
 * abvm - Above-base Mark Positioning (GPOS lookup type 4, 5)
 * kern - Kerning (GPOS lookup type 2 or 8)

Note: The current version of the Jomolhari font uses only the ccmp, blws, abvs, calt, and kern features. Its kern feature uses only GPOS lookup type 2 (pair adjustment), so that is the only kind of GPOS lookup in this font.
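To make the gap between the full feature list and what Jomolhari actually uses explicit, the comparison can be sketched as a small table lookup. This is only an illustration: the feature data is copied from this page, and the names REQUIRED_FEATURES, unused_features and split_by_table are hypothetical helpers, not Scribus or font-library APIs.

```python
# Features this page lists for Tibetan: tag -> (table, lookup types).
REQUIRED_FEATURES = {
    "ccmp": ("GSUB", (2, 4)),
    "blws": ("GSUB", (4,)),
    "abvs": ("GSUB", (4,)),
    "calt": ("GSUB", (6,)),
    "blwm": ("GPOS", (4, 5)),
    "abvm": ("GPOS", (4, 5)),
    "kern": ("GPOS", (2, 8)),
}

# Features the note above says the current Jomolhari font uses.
JOMOLHARI_FEATURES = {"ccmp", "blws", "abvs", "calt", "kern"}


def unused_features(font_features):
    """Return the listed features that the given font does not use."""
    return sorted(set(REQUIRED_FEATURES) - set(font_features))


def split_by_table(font_features):
    """Group the font's features by the OpenType table they live in."""
    tables = {"GSUB": [], "GPOS": []}
    for tag in sorted(font_features):
        table, _lookup_types = REQUIRED_FEATURES[tag]
        tables[table].append(tag)
    return tables


print(unused_features(JOMOLHARI_FEATURES))  # the two mark-positioning features
print(split_by_table(JOMOLHARI_FEATURES))
```

For testing purposes this suggests that Jomolhari mainly exercises GSUB processing, with kern as its only GPOS feature; a font that uses blwm and abvm would be needed to test mark positioning.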

Test of ccmp feature used for Decomposition (GSUB lookup type 2):
The string U+0F43 U+0F77 (གྷཷ) should render like the first example:

However, in Scribus, U+0F43 U+0F77 (གྷཷ) currently renders like this:

In this case, the ccmp feature should decompose U+0F77 into separate glyph components that go below and above the glyph for the base character (U+0F43).
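As a rough character-level analogy, Unicode compatibility decomposition (NFKD) also splits U+0F77 into components. This is not what the font's ccmp lookup does internally, since that operates on glyphs, and the font may decompose differently; the sketch below, using Python's standard unicodedata module, only shows which components one might expect to see. The helper name to_codepoints is made up for illustration.

```python
import unicodedata


def to_codepoints(text):
    """Format a string in the U+XXXX notation used on this page."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# The decomposition test string from this page.
TEST_STRING = "\u0F43\u0F77"

# NFKD applies canonical and compatibility decompositions recursively.
decomposed = unicodedata.normalize("NFKD", TEST_STRING)

for ch in decomposed:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

In the output, the components that come from U+0F77 are a subjoined letter and two vowel signs, which matches the description of parts that go below and above the base glyph.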

Test of ccmp feature used for Composition (GSUB lookup type 4):
The string U+0F42 U+0FB7 U+0F0B U+0F4C U+0FB7 U+0F0B U+0F51 U+0FB7 U+0F0B U+0F56 U+0FB7 U+0F0B U+0F5B U+0FB7 U+0F0B U+0F40 U+0FB5 (གྷ་ཌྷ་དྷ་བྷ་ཛྷ་ཀྵ) should render as follows:

However, in Scribus, that string currently renders like this:

In this case, if the ccmp feature is applied, the ligatures should be composed properly.

Furthermore, the strings U+0F42 U+0FB7 U+0F0B U+0F4C U+0FB7 U+0F0B U+0F51 U+0FB7 U+0F0B U+0F56 U+0FB7 U+0F0B U+0F5B U+0FB7 U+0F0B U+0F40 U+0FB5 (གྷ་ཌྷ་དྷ་བྷ་ཛྷ་ཀྵ) and U+0F43 U+0F0B U+0F4D U+0F0B U+0F52 U+0F0B U+0F57 U+0F0B U+0F5C U+0F0B U+0F69 (གྷ་ཌྷ་དྷ་བྷ་ཛྷ་ཀྵ) should render identically, as they are canonically equivalent (U+0F42 U+0FB7 = U+0F43; U+0F4C U+0FB7 = U+0F4D; U+0F51 U+0FB7 = U+0F52; U+0F56 U+0FB7 = U+0F57; U+0F5B U+0FB7 = U+0F5C; and U+0F40 U+0FB5 = U+0F69).
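The canonical equivalence of the two strings can be checked outside Scribus. The following is a minimal sketch using Python's standard unicodedata module; the helper names from_codepoints and canonically_equivalent are made up for illustration.

```python
import unicodedata


def from_codepoints(notation):
    """Turn 'U+0F42 U+0FB7 ...' notation into a Python string."""
    return "".join(chr(int(token[2:], 16)) for token in notation.split())


# The two test strings from this page.
DECOMPOSED = from_codepoints(
    "U+0F42 U+0FB7 U+0F0B U+0F4C U+0FB7 U+0F0B U+0F51 U+0FB7 U+0F0B "
    "U+0F56 U+0FB7 U+0F0B U+0F5B U+0FB7 U+0F0B U+0F40 U+0FB5"
)
PRECOMPOSED = from_codepoints(
    "U+0F43 U+0F0B U+0F4D U+0F0B U+0F52 U+0F0B U+0F57 U+0F0B U+0F5C U+0F0B U+0F69"
)


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


print(canonically_equivalent(DECOMPOSED, PRECOMPOSED))
```

A layout engine that normalizes its input before shaping, or a font whose ccmp feature handles both forms, should therefore produce the same glyphs for either string.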

Test of blws feature (GSUB lookup type 4)
The test string consists of the following Unicode characters: U+0F62 U+0F0B U+0F62 U+0F90 U+0F0B U+0F62 U+0F90 U+0FB1 U+0F0B U+0F62 U+0F90 U+0FB1 U+0F74 U+0F62 U+0F92 U+0F0B U+0F62 U+0F92 U+0FB1 U+0F0B U+0F62 U+0F92 U+0FB1 U+0F74 U+0F0D (ར་རྐ་རྐྱ་རྐྱུརྒ་རྒྱ་རྒྱུ།) and U+0F66 U+0F0B U+0F66 U+0FA6 U+0F0B U+0F66 U+0FA6 U+0FB2 U+0F0B U+0F66 U+0FA6 U+0FB2 U+0F74 U+0F0B U+0F66 U+0FA8 U+0F0B U+0F66 U+0FA8 U+0FB2 U+0F0B U+0F66 U+0FA8 U+0FB2 U+0F74 U+0F0D (ས་སྦ་སྦྲ་སྦྲུ་སྨ་སྨྲ་སྨྲུ།)