How to write a Padma conversion file?
What is Padma?
Simply put, it’s a text transformation utility that comes as a Firefox extension. It transforms text encoded with a custom font (text written using a proprietary font such as Eenadu, which doesn’t adhere to standards) to Unicode (a standard designed to support all the world’s scripts), provided it is given the required mapping file. There are already around 70 such conversion/mapping files bundled with Padma, which means it can transform text encoded in about 70 different ways to Unicode.
This tutorial covers everything you need to start writing a new conversion/mapping file for Padma.
A mapping file maps each hex code in the font (a hexadecimal code assigned to some letter; for example, the letter “ki” can be \u0045) to its counterpart in the standard. Let’s take eenadu.ttf as an example: we want to transform text written using this font to Unicode. First we need to set up our development environment. This has to be done only once, not each time you write a new conversion/mapping file.
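To make the \u notation concrete, here is a minimal JavaScript sketch. It is not part of Padma, and the helper name toEscape is made up for this illustration. It only shows that "\u0045" is an ordinary string holding the character whose code is hex 45, and how to get back from a string to that notation.

```javascript
// Hypothetical helper (not part of Padma): write each UTF-16 code unit
// of a string in the \uXXXX notation used in mapping files.
function toEscape(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var hex = text.charCodeAt(i).toString(16).toUpperCase();
    // Left-pad to four hex digits.
    out += "\\u" + ("0000" + hex).slice(-4);
  }
  return out;
}

// In a standard encoding, hex 45 is the Latin letter E. A custom font
// simply draws some other glyph (for example a Telugu letter) in that slot.
console.log("\u0045" === "E"); // true
console.log(toEscape("E"));    // \u0045
```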
Setting up development environment
- Install Padma. Don’t forget to restart Firefox.
- Run the following commands. Replace <whatever> with an appropriate string in all the steps of this tutorial.
cd ~/.mozilla/firefox/<whatever>.default/extensions/{3e*/chrome
unzip padma.jar
vim ~/.mozilla/firefox/<whatever>.default/extensions/{3e*/chrome.manifest
- Find the line that reads: content padma jar:chrome/padma.jar!/content/ and change it to: content padma chrome/content/
- Save the file.
- Install FontForge, a tool for opening font files. On Fedora, just run yum -y install fontforge and on Ubuntu, run sudo apt-get install fontforge
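If you would rather script the chrome.manifest change than do it in vim, it boils down to swapping one line. The sketch below only works on a string in memory; the function name unpackManifest and the sample text are my own, and reading and writing the real file is left out.

```javascript
// Sketch only: the two lines come from the step above; everything else
// (function name, sample text) is hypothetical.
var OLD_LINE = "content padma jar:chrome/padma.jar!/content/";
var NEW_LINE = "content padma chrome/content/";

// Return the manifest text with the jar-based content line replaced by
// the unpacked-directory one. All other lines are left untouched.
function unpackManifest(manifestText) {
  return manifestText
    .split("\n")
    .map(function (line) {
      return line.trim() === OLD_LINE ? NEW_LINE : line;
    })
    .join("\n");
}

// Made-up sample: a comment line followed by the line we care about.
var sample = "# other lines stay as they are\n" + OLD_LINE;
console.log(unpackManifest(sample));
```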
Writing a new mapping/conversion file
The setup is complete! The steps below have to be repeated each time you write a new conversion/mapping file. Just to give an overview of what’s going on, we’ll be doing these things…
- Create a new mapping file from an existing mapping file for the same language.
- Update Padma’s config files with this new mapping file (the filename and the class we implement in that file)
- Write the mappings
- Test the new mapping file
- Repeat steps 3 and 4 until you are satisfied with the results!
I’ll use Eenadu/Telugu as the font/language in our running example.
Step 1 : Create a new mapping file
- Go to ~/.mozilla/firefox/<whatever>.default/extensions/{3e*/chrome/content/encodings/Telugu/
- Make a copy of an existing converter under the name we will register in padma.xul in Step 2, which is Eenadu.js in our example. (The line there will read <script type="application/x-javascript" src="encodings/Telugu/Eenadu.js"/>.) I have copied ShreeTel0900.js to Eenadu.js
- Open our newly created file, Eenadu.js
- Replace every occurrence of Shree_Tel_0900 with Eenadu. This is the class name we will register in Transformer.js in Step 2.
- Change fontFace and displayName to appropriate names, both “Eenadu” in our example.
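The class rename is a plain search and replace. As a rough sketch, assuming the copied file is available as a string, it could look like this. The helper renameClass is hypothetical, and the one-line excerpt is only written in the style of a mapping file, not copied from ShreeTel0900.js.

```javascript
// Hypothetical helper: replace every occurrence of the old class name.
// split/join avoids having to escape regular-expression characters.
function renameClass(source, oldName, newName) {
  return source.split(oldName).join(newName);
}

// Illustrative excerpt only; the real file has many more lines.
var excerpt =
  "Shree_Tel_0900.toPadma[Shree_Tel_0900.combo_NII] = " +
  "Padma.consnt_NA + Padma.vowelsn_II;";

console.log(renameClass(excerpt, "Shree_Tel_0900", "Eenadu"));
```

fontFace and displayName are still easier to change by hand, since their values depend on the font you are adding.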
Step 2 : Update padma’s config files
- Open ~/.mozilla/firefox/<whatever>.default/extensions/{3e*/chrome/content/padma.xul
- You will see a few lines like the following
<script type="application/x-javascript" src="encodings/Telugu/TeluguLipi.js"/>
<script type="application/x-javascript" src="encodings/Telugu/TCSMith.js"/>
<script type="application/x-javascript" src="encodings/Telugu/TeluguFont.js"/>
<script type="application/x-javascript" src="encodings/Telugu/SuriTln.js"/>
- These lines tell Padma the paths to the mapping files. Append a line with the same syntax, changing the filename appropriately. After doing this, it should look something like the following
<script type="application/x-javascript" src="encodings/Telugu/TeluguLipi.js"/>
<script type="application/x-javascript" src="encodings/Telugu/TCSMith.js"/>
<script type="application/x-javascript" src="encodings/Telugu/TeluguFont.js"/>
<script type="application/x-javascript" src="encodings/Telugu/SuriTln.js"/>
<script type="application/x-javascript" src="encodings/Telugu/Eenadu.js"/>
- Save the file. Do the same for padmaMailOverlay.xul
- Open ~/.mozilla/firefox/<whatever>.default/extensions/{3e*/chrome/content/transformers/Transformer.js
- You’ll find a few lines like the following
Transformer.dynFont_AAADurgax = 63;
Transformer.dynFont_AAADurgaxx = 64;
Transformer.dynFont_Amudham = 65;
Transformer.dynFont_ShreeDev0714 = 66;
Transformer.dynFont_Unknown = 67;
- Locate the line with Transformer.dynFont_Unknown and note the number assigned to it. Just before this line, add a new line with an appropriate name in place of Unknown and assign it the number you noted. Then increment the number on the Transformer.dynFont_Unknown line by one. After doing that, it looks something like the following.
Transformer.dynFont_AAADurgax = 63;
Transformer.dynFont_AAADurgaxx = 64;
Transformer.dynFont_Amudham = 65;
Transformer.dynFont_ShreeDev0714 = 66;
Transformer.dynFont_Eenadu = 67;
Transformer.dynFont_Unknown = 68;
- In the same file, you’ll find lines which tell padma what classes implement which font-mappings, like…
Transformer.dynFont_Class[Transformer.dynFont_ShreeTel0900] = Shree_Tel_0900;
Transformer.dynFont_Class[Transformer.dynFont_Hemalatha] = Hemalatha;
Transformer.dynFont_Class[Transformer.dynFont_ShreeTel0902] = Shree_Tel_0902;
Transformer.dynFont_Class[Transformer.dynFont_Tikkana] = Tikkana;
- Include another line with our new class, Eenadu in our example. So, it becomes…
Transformer.dynFont_Class[Transformer.dynFont_ShreeTel0900] = Shree_Tel_0900;
Transformer.dynFont_Class[Transformer.dynFont_Hemalatha] = Hemalatha;
Transformer.dynFont_Class[Transformer.dynFont_ShreeTel0902] = Shree_Tel_0902;
Transformer.dynFont_Class[Transformer.dynFont_Tikkana] = Tikkana;
Transformer.dynFont_Class[Transformer.dynFont_Eenadu] = Eenadu;
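To see why the numbers matter, here is a cut-down sketch of the registration done above. The Transformer object, the empty Eenadu class and the dynFont_Class array are stand-ins I made up so the snippet runs on its own; the real Transformer.js defines far more, and I am only assuming the table is indexed by these numbers. The check findDuplicateIds is also hypothetical. It catches the easy mistake of forgetting to bump Unknown.

```javascript
// Stand-in objects so the sketch is self-contained.
var Transformer = {};
Transformer.dynFont_ShreeDev0714 = 66;
Transformer.dynFont_Eenadu = 67;   // takes the number Unknown used to have
Transformer.dynFont_Unknown = 68;  // incremented by one

function Eenadu() {}               // placeholder for the class in Eenadu.js

Transformer.dynFont_Class = [];
Transformer.dynFont_Class[Transformer.dynFont_Eenadu] = Eenadu;

// Hypothetical sanity check: list any number assigned to two names.
function findDuplicateIds(registry) {
  var seen = {};
  var duplicates = [];
  Object.keys(registry).forEach(function (key) {
    var id = registry[key];
    if (typeof id !== "number") return; // skip dynFont_Class itself
    if (seen[id]) duplicates.push(seen[id] + " / " + key);
    else seen[id] = key;
  });
  return duplicates;
}

console.log(findDuplicateIds(Transformer)); // [] when every number is unique
```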
Step 3 : Writing the mappings
- Go back to our new mapping file, Eenadu.js in our example. Reading through it, you’ll see that the letters to be mapped fall into several categories: vowels, consonants and special combinations of them. A combination is formed by bringing together one or more consonants and vowels. A combination can also be represented by a single code in the font file, as we’ll see.
- Run the following command to open our font file, eenadu.ttf in our example.
fontforge /path/to/eenadu.ttf
- In the menu of the FontForge window, go to Encoding and select Compact. Then, in View, select 48 pixel outline. This is just to make the window more readable.
- Now select any letter. You’ll find some information about it, in red, just below the FontForge window menu. I have selected “NII”. It says something like…
39 (0x0027) U+0046 F LATIN CAPITAL LETTER F
- We only need the third column, and also the pronunciation of the letter, in this case “NII”. You can find the pronunciations of Indian language letters spelled here.
- Find the spelling of the letter you have selected in the mapping file you created, which is Eenadu.js in our example. Searching for “NII”, I landed on the line that starts with Eenadu.combo_NII. Now assign it the code we observed, U+0046 in our example. So the line becomes…
Eenadu.combo_NII = "\u0046";
- For combinations, take YAA as an example. It is a combination of the YA consonant and the AA sign. So the mapping that corresponds to it is…
Eenadu.combo_YAA = "\u00A7\u00D6";
where 00A7 corresponds to the YA consonant and 00D6 corresponds to the AA sign
- Do this for all the characters that appear in the font file opened in FontForge. That completes the assignment part. Now, in the same file, we have to let Padma know what each mapping in the font file corresponds to in the actual standard.
- Mappings of this kind can be found towards the end of the file. For the two example mappings above, they are
Eenadu.toPadma[Eenadu.combo_NII] = Padma.consnt_NA + Padma.vowelsn_II;
Eenadu.toPadma[Eenadu.combo_YAA] = Padma.consnt_YA + Padma.vowelsn_AA;
- Do this for all the mappings you made earlier from the font file. It’s that simple! It’s enough to know which letters are vowels, vowel signs and consonants. The rest follows from intuition.
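To show how the two kinds of lines fit together, here is a small self-contained sketch. Everything in it is a stand-in: the Padma constants are set to the Telugu Unicode code points just for the demo (Padma’s own values may well differ), and the convert function is a simple longest-match lookup I wrote for illustration, not Padma’s actual conversion code.

```javascript
// Stand-ins for Padma's constants, set here to Telugu Unicode code points.
var Padma = {
  consnt_NA: "\u0C28",
  consnt_YA: "\u0C2F",
  vowelsn_AA: "\u0C3E",
  vowelsn_II: "\u0C40"
};

// The two example mappings from this step.
var Eenadu = { toPadma: {} };
Eenadu.combo_NII = "\u0046";
Eenadu.combo_YAA = "\u00A7\u00D6";
Eenadu.toPadma[Eenadu.combo_NII] = Padma.consnt_NA + Padma.vowelsn_II;
Eenadu.toPadma[Eenadu.combo_YAA] = Padma.consnt_YA + Padma.vowelsn_AA;

// Hypothetical converter: at each position try the longest key first,
// and copy the character through unchanged when nothing matches.
function convert(text, table) {
  var maxLen = 0;
  Object.keys(table).forEach(function (key) {
    maxLen = Math.max(maxLen, key.length);
  });
  var out = "";
  var i = 0;
  while (i < text.length) {
    var matched = false;
    for (var len = Math.min(maxLen, text.length - i); len > 0; len--) {
      var chunk = text.substring(i, i + len);
      if (Object.prototype.hasOwnProperty.call(table, chunk)) {
        out += table[chunk];
        i += len;
        matched = true;
        break;
      }
    }
    if (!matched) {
      out += text.charAt(i);
      i++;
    }
  }
  return out;
}

// Font-encoded "NII" followed by "YAA".
console.log(convert("\u0046\u00A7\u00D6", Eenadu.toPadma));
```

Trying the longest key first matters once one mapping is a prefix of another, which is common when a font has separate codes for a consonant and for its combinations.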
Step 4 : Testing
- To test the file, collect some encoded data in an HTML file. Wrap all that data in a font tag like…
<font face="Eenadu"> ---->data<---- </font>
- Replace Eenadu with whatever fontFace name you have previously specified in the mapping/conversion file.
- Open that file in Firefox. Right-click and select “Transform to unicode”. See if it works.
- If there are any errors, you can try locating what letters are causing those errors.
- khexedit is a handy tool for showing which hex codes are causing the problem. You can install it on Fedora by running yum -y install khexedit and on Ubuntu with sudo apt-get install khexedit
- Paste the wrongly converted or unconverted text in a file and open it with khexedit. It shows you the hexcodes corresponding to them. You can go back and correct them in the mapping file.
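If you don’t have a hex editor at hand, a few lines of JavaScript can do a similar job. The sketch below is my own and not part of Padma: the hypothetical findUnmapped lists, in \u notation, every character of a piece of text that does not occur in any key of a mapping table. The small table in it only repeats the two example keys from Step 3, with placeholder values.

```javascript
// Hypothetical helper: report characters of `text` that appear in no key
// of `table`, written as \uXXXX so they can be looked up in FontForge.
function findUnmapped(text, table) {
  var known = {};
  Object.keys(table).forEach(function (key) {
    for (var i = 0; i < key.length; i++) {
      known[key.charAt(i)] = true;
    }
  });
  var missing = [];
  for (var j = 0; j < text.length; j++) {
    var ch = text.charAt(j);
    if (known[ch]) continue;
    var hex = ch.charCodeAt(0).toString(16).toUpperCase();
    var code = "\\u" + ("0000" + hex).slice(-4);
    if (missing.indexOf(code) === -1) missing.push(code);
  }
  return missing;
}

// Keys from the Step 3 examples; the values do not matter for this check.
var table = { "\u0046": "NII", "\u00A7\u00D6": "YAA" };
console.log(findUnmapped("\u0046\u00A7\u00D6\u00E9", table).join(" ")); // \u00E9
```

Note that spaces and punctuation show up in the result too, unless the table has entries for them.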
Once you are satisfied with the result, you can submit the file here to be included in the next version of Padma! Note that you have to submit a new mapping file by reporting its absence as a bug.
In case you find any of the above instructions difficult to follow or incomplete, please do let me know. Best wishes!

Comments
Do you know how I can update the Unicode character set for Malayalam in Padma. I'm trying to convert Manorama's chillu characters into Unicode chillus. The current mapping is old and converts to a sequence of Unicode characters that are no longer valid.
Thanks in advance!
You can contact the Unicode standards organization at Proposals and Updates section at http://unicode.org
Excellent...article on padma conversion file,needless to say anything more.Thanks.