How to configure a Gentoo Linux system to use UTF-8 character encoding and Portuguese (Portugal) localization (language and keyboard).
UTF-8 is a variable-length character encoding that uses 1 to 4 bytes per character. Single-byte values (0 to 127) encode ASCII exactly as ASCII itself does, so UTF-8 is fully backwards compatible with ASCII: every ASCII file is already valid UTF-8. Bytes with the high bit set belong to multi-byte sequences, which encode everything else; the accented Latin letters used in Portuguese, such as ç and ã, take 2 bytes each. Text that is mostly ASCII or Latin therefore grows very little when stored as UTF-8.
UTF-8 allows you to work in a standards-compliant and internationally accepted multilingual environment, with a comparatively low data redundancy. UTF-8 is the preferred way for transmitting non-ASCII characters over the Internet, through Email, IRC or almost any other medium.
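To make the byte lengths concrete, here is a small sketch that counts how many bytes a few characters occupy in UTF-8. The helper name utf8_bytes is made up for this illustration, and the characters are written as octal escapes so the result does not depend on how the script file itself is saved.

```shell
#!/bin/sh
# utf8_bytes (hypothetical helper): print the number of bytes in its argument.
utf8_bytes() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Each character is spelled out as its UTF-8 byte sequence in octal.
ascii_a=$(printf 'a')                 # U+0061, plain ASCII
c_cedilla=$(printf '\303\247')        # U+00E7, c with cedilla
euro=$(printf '\342\202\254')         # U+20AC, euro sign
emoji=$(printf '\360\237\230\200')    # U+1F600, outside the basic plane

echo "a:      $(utf8_bytes "$ascii_a") byte(s)"     # 1
echo "c-ced:  $(utf8_bytes "$c_cedilla") byte(s)"   # 2
echo "euro:   $(utf8_bytes "$euro") byte(s)"        # 3
echo "emoji:  $(utf8_bytes "$emoji") byte(s)"       # 4
```

ASCII stays at one byte, which is why existing ASCII files need no conversion.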
A locale is a set of information that most programs use to determine country- and language-specific settings. The locales and their data are part of the system library and can be found at
/usr/share/locale on most systems. A locale name generally has the form ab_CD, where ab is your two (or three) letter language code (as specified in ISO-639) and
CD is your two letter country code (as specified in ISO-3166). Variants are often appended to locale names, e.g. pt_PT.UTF-8 (a character set) or pt_PT@euro (a modifier).
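As a rough illustration of how a locale name breaks down, the sketch below splits a name of the form ab_CD, optionally followed by a character set and a modifier, into its parts. The function name split_locale is hypothetical and exists only for this example; it is not part of any system tool.

```shell
#!/bin/sh
# split_locale (hypothetical helper): print the parts of a locale name
# written as language[_territory][.codeset][@modifier].
split_locale() {
    name=$1
    modifier=""
    codeset=""
    territory=""
    case $name in
        *@*) modifier=${name#*@}; name=${name%%@*} ;;   # strip "@modifier"
    esac
    case $name in
        *.*) codeset=${name#*.}; name=${name%%.*} ;;    # strip ".codeset"
    esac
    case $name in
        *_*) territory=${name#*_} ;;                    # part after "_"
    esac
    language=${name%%_*}
    printf 'language=%s territory=%s codeset=%s modifier=%s\n' \
        "$language" "$territory" "$codeset" "$modifier"
}

split_locale 'pt_PT.UTF-8'   # language=pt territory=PT codeset=UTF-8 modifier=
split_locale 'pt_PT@euro'    # language=pt territory=PT codeset= modifier=euro
```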
Specify the locales we will need in /etc/locale.gen:
# vi /etc/locale.gen
en_GB ISO-8859-1
en_GB.UTF-8 UTF-8
pt_PT ISO-8859-1
pt_PT@euro ISO-8859-15
pt_PT.UTF-8 UTF-8
pt_PT.UTF-8@euro UTF-8
The next step is to run
locale-gen. It will generate all the locales we have specified in the /etc/locale.gen file.
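Once the locales have been generated, it can be worth checking that they are really available. The sketch below compares the wanted names against the output of `locale -a`. It assumes glibc's habit of listing UTF-8 locales as `utf8`, so both sides are normalised before comparing. The helper names normalize_locale and has_locale are invented for this example.

```shell
#!/bin/sh
# normalize_locale (hypothetical helper): lowercase a locale name and drop
# hyphens, so "pt_PT.UTF-8" and "pt_PT.utf8" compare equal.
normalize_locale() {
    printf '%s\n' "$1" | tr 'A-Z' 'a-z' | tr -d '-'
}

# has_locale (hypothetical helper): read a list of locale names on stdin and
# succeed if the wanted name is among them.
has_locale() {
    wanted=$(normalize_locale "$1")
    while IFS= read -r line; do
        if [ "$(normalize_locale "$line")" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Only query the system if the locale command is present.
if command -v locale >/dev/null 2>&1; then
    for name in pt_PT.UTF-8 en_GB.UTF-8; do
        if locale -a 2>/dev/null | has_locale "$name"; then
            echo "$name: available"
        else
            echo "$name: missing"
        fi
    done
fi
```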
There is one environment variable that needs to be set in order to use our new UTF-8 locales:
LC_CTYPE (or optionally
LANG, if you want to change the system language as well). Set the locale globally in /etc/env.d/02locale:
# vi /etc/env.d/02locale
LANG="pt_PT.UTF-8@euro"
Now update the environment after the change:
# env-update && source /etc/profile
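After reloading the environment, a quick check shows whether the character type locale in effect names UTF-8. The sketch below follows the standard precedence, in which LC_ALL overrides LC_CTYPE and LC_CTYPE overrides LANG, with empty values treated as unset. The function names effective_ctype and is_utf8_locale are hypothetical.

```shell
#!/bin/sh
# effective_ctype (hypothetical helper): print the locale name that governs
# character handling, following LC_ALL > LC_CTYPE > LANG, defaulting to "C".
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    else
        printf '%s\n' "${LANG:-C}"
    fi
}

# is_utf8_locale (hypothetical helper): succeed if the name carries a UTF-8
# character set, spelled either "UTF-8" or "utf8".
is_utf8_locale() {
    case $1 in
        *.[Uu][Tt][Ff]-8*|*.[Uu][Tt][Ff]8*) return 0 ;;
        *) return 1 ;;
    esac
}

current=$(effective_ctype)
if is_utf8_locale "$current"; then
    echo "character type locale $current uses UTF-8"
else
    echo "character type locale $current does not name UTF-8"
fi
```

Running the plain `locale` command gives the same information in full, one line per category.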
The keyboard layout used by the console is set in
/etc/conf.d/keymaps by the
KEYMAP variable. For a Portuguese keyboard, use pt-latin1 or pt-latin9. Also set EXTENDED_KEYMAPS to add extras such as "euro".
# vi /etc/conf.d/keymaps
KEYMAP="pt-latin9"
SET_WINDOWKEYS="yes"
EXTENDED_KEYMAPS="backspace keypad euro"
To enable UTF-8 on the console, you need to edit
/etc/rc.conf and set UNICODE="yes":
# vi /etc/rc.conf
UNICODE="yes"
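To double-check the console settings without opening each file, the values can be read back with a small helper. This is only a sketch: conf_value is a made-up name, and it assumes simple NAME="value" assignments on one line, as in the listings above.

```shell
#!/bin/sh
# conf_value FILE NAME (hypothetical helper): print the value of the last
# uncommented NAME="value" line in FILE. Assumes one assignment per line.
conf_value() {
    sed -n "s/^[[:space:]]*$2=\"\{0,1\}\([^\"#]*\)\"\{0,1\}.*\$/\1/p" "$1" | tail -n 1
}

# Report the console keymap and the Unicode switch, if the files are readable.
for pair in /etc/conf.d/keymaps:KEYMAP /etc/rc.conf:UNICODE; do
    file=${pair%%:*}
    var=${pair#*:}
    if [ -r "$file" ]; then
        printf '%s: %s=%s\n' "$file" "$var" "$(conf_value "$file" "$var")"
    else
        printf '%s: not readable\n' "$file"
    fi
done
```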
The keyboard layout to be used by the X server is specified in
/etc/X11/xorg.conf by the XkbLayout option:
# vi /etc/X11/xorg.conf
Section "InputDevice"
    Identifier "Keyboard0"
    Driver "kbd"
    Option "XkbLayout" "pt"
    ...
EndSection
There is also an additional localisation variable called LINGUAS, which controls which localisation files get installed by gettext-based programs and selects the localisation used by some specific software packages, such as
app-office/openoffice. The variable takes a space-separated list of language codes, and the suggested place to set it is /etc/make.conf:
# vi /etc/make.conf
LINGUAS="pt pt_PT en en_GB"
And that's it! Hopefully your system is now running with full UTF-8 and Portuguese support. Good linuxing ;)