tamil-itrans is no good, and Gnome’s
Tamil (phonetic (m17n)) stutters when used with Emacs! What should a Tamil-user do?
Tamil is my mother tongue.
In the past, there have been occasions where I had to compose lengthy text in Tamil. On those occasions, either I used to go with Emacs’
tamil-itrans input method, or Gnome’s
Tamil (phonetic (m17n)) input method.
Each of these methods has its own set of problems.
- Emacs’ very own
tamil-itransmethod feels very contrived.Many of the mappings from English-letters to Tamil-letters are not at all intuitive.What this means is that I spend more time correcting text with
BACKSPACEs, or looking up completions with
TAB, or consulting the input mapping table with
M-x describe-input-method), rather than composing the article.
- When I use the Gnome Input Method , there is a noticeable lag between the typing of keys in Emacs and the appearance of the corresponding Tamil letter in the Emacs buffer.This lag is so large that it is simply impossible to get immediate feedback on the Tamil letter you have typed.
The answer to earlier question is …. “Use
tamil-phonetic input method!”
Given my above troubles , I was pleasantly surprised to discover that there is
tamil-phonetic input method available for Emacs(1). The package
tamil-phonetic is authored by a native Tamil speaker, and caters admirably to needs of users who want to type Tamil text in Emacs.
The highlights of the package are:
- the default mapping table mimics Gnome’s
Tamil (phonetic (m17n))method
- you can customize the mapping table
(1) above means that you get better defaults than
tamil-itrans and you can start from what you already know.
(2) above means that you can tweak the mapping table to your liking, and you don’t have to ever exert yourself in recollecting the right set of keys.
In this article, I am documenting how I use the package.
If you had composed Tamil text in the past using Emacs, and were as frustrated as I was, then read on …
How to set up
Tamil Phonetic input method in Emacs
Step 1: Install
Emacs Lisp libraries that
tamil-phonetic depends on
dash is available from GNU ELPA.
s is available on MELPA.
Configure your Emacs to have the right set of ELPAs:
(custom-set-variables '(package-archives '(("gnu" . "https://elpa.gnu.org/packages/") ("nongnu" . "https://elpa.nongnu.org/nongnu/") ("melpa" . "https://melpa.org/packages/"))))
M-x package-refresh-contents, followed by
M-x package-install RET dash RET and
M-x package-install RET s RET.
Step 2: Download and install
Assuming that you have downloaded (or checked out) the
~/src/elisp/tamil-phonetic directory, add the following to your
(add-to-list 'load-path "~/src/elisp/tamil-phonetic/") (require 'tamil-phonetic)
Step 3: Configure Tamil fonts, if you haven’t already
I prefer Noto fonts(1).
(set-fontset-font "fontset-default" 'tamil "Noto Sans Tamil")
Step 4: Set
tamil-phonetic as your preferred input method
(custom-set-variables '(default-input-method "tamil-phonetic"))
Step 5: Customize
tamil-phonetic; roll out your own mapping table
Set up the input mapping(1) as below and restart your Emacs. (Don’t forget this step.)
tamil-phonetic installs whatever mapping table it sees when it starts; and ignores all changes to the mapping tables once it gets running. This means that your
(require 'tamil-phonetic) must go after the custom mapping table, and not before it.
(custom-set-variables '(tamil-phonetic-consonants '(("க்" "k" "g") ("ங்" "ng") ("ச்" "c" "s" "ch") ("ஞ்" "nj" "gn") ("ட்" "t" "d") ("ண்" "n") ("த்" "th" "dh") ("ந்" "nh" "nd" "nnn") ("ப்" "p" "b") ("ம்" "m") ("ய்" "y") ("ர்" "r") ("ல்" "l") ("வ்" "v") ("ழ்" "z" "zh") ("ள்" "L" "ll") ("ற்" "tr" "R" "rr") ("ன்" "N" "nn") ("ஜ்" "j") ("ஷ்" "sh") ("ஸ்" "S") ("ஶ்" "Z") ("ஹ்" "h") ("க்ஷ்" "ksh") ("க்ஷ்" "ksH"))))
Step 6: Open up a buffer, toggle the input method, and start typing
M-x toggle-input-method (
You have already told Emacs that
tamil-phonetic is your
default-input-method. So, Emacs wouldn’t prompt you with a list of input methods it knows about.
At this point, you are all set.
Here is a overview of how it all looks.
tamil-phonetic input method in action
If you are trying out
tamil-phonetic input method for the first time, you may want to review the input mapping table. You can do this by
C-h C-\ (
Down below you see visual representation of the input mapping in effect. Note in particular that most vowels and consonants can be produced in mutiple ways.
Input mapping tables for
Input mapping tables for
Notes on the Input Mapping Table
Consider these mappings
ண் <-–—> n
ன் <-–—> N
ல் <-–—> l
ள் <-–—> L
ர் <-–—> r
ற் <-–—> R
In the above mappings,
- the “small” / “சின்ன” Tamil-letters map to the corresponding lowercase English-letters, and
- the “big” / “பெரிய” Tamil-letters map to the uppercase English-letters.
The practical problem with uppercase mappings is that you need to remember to engage the
SHIFT key while typing them. And if you are going to type a long-form article, then you may have to keep engaging and dis-engaging
SHIFT very often. And this constant switching is error-prone, even for a touch typist.
The letter “ந்” poses a problem of its own. It is neither “small” / “சின்ன” or “big” / “பெரிய”, nor does it have a unique sound of its own.
As an aside, Gnome’s
Tamil (phonetic (m17n)) method maps
"ந்" to letters
- Use of
"ந்"is outright condemnable. Who in their right mind would set up such a mapping, and yet have the audacity to label the input method as phonetic?
- Use of
"n-"means that you have to stray away from the home row. Even though I am a touch-typer, I cannot input any of the punctuation characters without also looking at the keyboard.
Fortunately, the strategy used for mapping the tamil vowels provide a nice way out. That is, when typing the vowels, to move from “short-form” / “குறில்” to “long-form” / “நெடில்”, you press the corresponding letter once again. That is, if you want
"ஆ", you press
"a" twice over.
I have adopted the same strategy in the following mappings:
ண் <-–—> n
ன் <-–—> nn
ந் <-–—> nnn
ல் <-–—> l
ள் <-–—> ll
ர் <-–—> r
ற் <-–—> rr
You keep pressing the
r) until you arrive at the right Tamil-letter.
The following mappings
ந் <-–—> nd
ற் <-–—> tr
and the mappings I have chosen for rest of the consonants are quite intuitive. Each “sound” they make, maps to their “natural” English counterparts.
Problems posed with multiple mappings and how to overcome them
One of the problems posed by the above mappings is that there is a ground for ambiguity.
The ambiguity is best explained with an example. Try typing out the word
"கண்டமிதில்", which occurs in “தமிழ்த் தாய் வாழ்த்து”. If you input
"kand", you will get
"கந்" in the buffer. If you think about it, Emacs is not doing anything wrong; you want some other word. The way out of the problem is to explicitly tell Emacs when you are “done” with a particular letter and move on to the next letter. You do this with the command
quail-select-current, which is mapped to
<kp-enter>. That is, to get
"கண்டமிதில்", you type
"kanC-SPCda" —note the use of
C-SPC here—and you will get
"கண்ட", instead of the previous
C-SPC is useful for resolving ambiguities
Bonus features: Transliteration from English to Tamil, and vice versa
tamil-phonetic input method, like all its sister indian input methods, supports transliteration. This is quite useful.
For example, imagine that you are teaching a non-Tamilian, say a Kannadiga or a Malayalee, how to read, and write Tamil. Assuming that your medium of instruction is English, you are likely to give him a transliterated version of a Tamil text as an aid for practice.
You can transliterate from Tamil to English, by marking the Tamil text, and invoking
M-x tamil-phonetic-transiterate-tamil->english on it.
Down below you see the result of Tamil—> English transliteration of “தமிழ்த் தாய் வாழ்த்து”.
“தமிழ்த் தாய் வாழ்த்து”, as transliterated in to English
You can move back from English text to Tamil, if you do
As an aside … the transliteration feature is not unique to the
tamil-phonetic input method. Most indian languages have an
itrans input method, and all the
itans methods support transliteration. For example, if you have parents who have the habit of reciting Shlokas, but cannot read Devanagari, you can transliterate a Devanagari shloka first from Devanagari to English and then from English to your parents’ native tongue. (Try it. Theoretically speaking, it should work.)
What should the input method for Minibuffer be? Should it be English or Tamil …
There is a problem that I noticed when
tamil-phonetic was active. The minibuffer also inherits the buffer’s input method. This “inherit the buffer’s input method”-behaviour may or may not be what you desire. Given my specific workflow—the task of researching for this article—I wanted the minibuffer to be in English, and not in Tamil. (To appreciate what I am saying, try typing an
M-x find-library RET, when
tamil-phonetic input method is active in the buffer).
To avoid constant toggling of input method as and when I was invoking
M-x, I added the following hook.
(add-hook 'minibuffer-setup-hook (defun my-deactivate-all-input-methods () (if current-transient-input-method (deactivate-transient-input-method) (deactivate-input-method))))
You most likely do not want this hook in your
user-init-file. If you have this hook, then you wouldn’t be able to
isearch-forward) in Tamil (without also toggling the input method first).
(As an aside,
M-: doesn’t seem to inherit the buffer’s input method, and
M-x should indicate the currently active input method like
I am happy that I stumbled upon
tamil-phonetic. Typing in Tamil would no longer be a hassle.
I am eagerly looking forward to the day when the author of
tamil-phonetic makes good of his promise(1) and puts his package in ELPA or right in Emacs.
There is one another wish:
Support for input method completions in
helm, or in any other completion packages
Input Method Completion is unlike other completions. Under normal circumstances, completion frameworks suggest candidates based on a prefix match on typed input. This “complete based on partial string match”-behaviour—the exact behaviour that Emacs
Quail library implements—may not be that helpful for a user. This is particularly so if a user is exploring a input method for the first time, and hopes for better assisstance.
To appreciate the above remark, consider the case of
tamil-phonetic input method. When a user types
"n", this is what he sees
The sheer amount of suggestions is a bit overwhelming. You most likely want to get the “மெய்யெழுத்து” / “consonant” part right, before moving on to “உயிரெழுத்து” / “vowel”. In other words, the suggestions can be pruned.
Too many suggestions; It could be improved …
I also imagine that phonetic input methods (in other languages) could benefit from suggestions that are based on nearness of sound as nearness of typed text.