Czech Pmail (Cesky Pmail)
Lubos Pavlicek
pavlicek at vse.vse.cz
Wed Feb 9 12:02:41 CET 1994
>>
>>2) Transporting mail with diacritics. A solution for carrying 8-bit
>> information over existing e-mail systems has long been available and is
>> called MIME (Multipurpose Internet Mail Extensions).
> I know there is RFC 1345, which deals with encoding the kind of
> characters we need, and the result can still be read in 7-bit form
> with a little self-denial. Is MIME built on RFC 1345, or is it
> something completely different?
RFC 1345 is informational only. It essentially contains an incomplete list
of codes from ISO 10646 and an overview of some code tables (those that are
part of a national or international standard). Each character is assigned a
mnemonic: characters from ISO 646 (essentially US ASCII) are not encoded, and
the others usually take 2 characters, occasionally more. This encoding (the
mnemonics) is used in the IDA version of sendmail.
The MIME standard has nothing to do with this RFC.
The MIME standard separates the choice of character set from 7-bit
transport. The character set is given as a special parameter (charset) when
specifying the content of the message (or of a message part, since in MIME a
message can contain several parts). The character sets allowed so far are
US ASCII and the sets of the ISO 8859 series.
Two encodings are used for 7-bit transport:
- base64 for binary data,
- quoted-printable for text in an 8-bit character set.
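A minimal sketch of the two transfer encodings, using Python's standard quopri and base64 modules. The sample word and the choice of ISO 8859-2 as the charset are my assumptions for illustration; the byte values after each = depend on which 8-bit character set the text is in.

```python
import base64
import quopri

# Text in an 8-bit character set (ISO 8859-2 is assumed here).
text = "rozkošných"
raw = text.encode("iso-8859-2")

# quoted-printable: ASCII stays readable, each 8-bit byte becomes =XX.
qp = quopri.encodestring(raw)
print(qp)                      # b'rozko=B9n=FDch'

# base64: intended for binary data, the result is not human-readable.
b64 = base64.b64encode(bytes([0, 1, 2]))
print(b64)                     # b'AAEC'

# Both encodings are reversible once the charset is known.
decoded = quopri.decodestring(qp).decode("iso-8859-2")
print(decoded == text)         # True
```

The charset itself travels separately from the transfer encoding, in a header such as Content-Type: text/plain; charset=iso-8859-2.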
>> The only ones out of luck will be
>> those whose mailers cannot handle MIME (but there will be fewer and fewer of them)=
>> Reading such rozko=BFn=FFch ("delightful") texts will surely soon force
>> them to get something better, or to pester the program's author until
>> MIME support is added.
> According to RFC 1345 it is not that dreadful: rozkos2ny'ch, where
> caron = 2, acute = ' , ring = 0, diaeresis = " .
> Neni' to z2a'dna' para'da, ale c2i'st se to da' bez velky'ch vysve2tlivek.
> (It is no beauty, but it can be read without lengthy explanations.)
> Automatic conversion is of course highly desirable.
RFC 1345 uses the character < for the caron, not the character 2. The author
of RFC 1345 (Keld Simonsen) usually builds the two-character mnemonic codes
from the corresponding US ASCII letter followed by one of these characters:
Exclamation mark     !  Grave
Apostrophe           '  Acute accent
Greater-Than sign    >  Circumflex accent
Question Mark        ?  Tilde
Hyphen-Minus         -  Macron
Left parenthesis     (  Breve
Full Stop            .  Dot Above
Colon                :  Diaeresis
Comma                ,  Cedilla
Underline            _  Underline
Solidus              /  Stroke
Quotation mark       "  Double acute accent
Semicolon            ;  Ogonek
Less-Than sign       <  Caron
Zero                 0  Ring above
Two                  2  Hook
Nine                 9  Horn
Equals               =  Cyrillic
Asterisk             *  Greek
Percent sign         %  Greek/Cyrillic special
Plus                 +  smalls: Arabic, capitals: Hebrew
Three                3  some Latin/Greek/Cyrillic letters
Four                 4  Bopomofo
Five                 5  Hiragana
Six                  6  Katakana
The Czech characters then look like this:
( The format of the table is:
1st field is the character mnemonic (mostly 2 characters).
2nd field is the ISO 2DIS 10646 code in hexadecimal.
3rd field is the long descriptive name of ISO 2DIS 10646. )
A' 00c1 LATIN CAPITAL LETTER A WITH ACUTE
A: 00c4 LATIN CAPITAL LETTER A WITH DIAERESIS
E' 00c9 LATIN CAPITAL LETTER E WITH ACUTE
E: 00cb LATIN CAPITAL LETTER E WITH DIAERESIS
I' 00cd LATIN CAPITAL LETTER I WITH ACUTE
O' 00d3 LATIN CAPITAL LETTER O WITH ACUTE
O> 00d4 LATIN CAPITAL LETTER O WITH CIRCUMFLEX
O: 00d6 LATIN CAPITAL LETTER O WITH DIAERESIS
U' 00da LATIN CAPITAL LETTER U WITH ACUTE
U> 00db LATIN CAPITAL LETTER U WITH CIRCUMFLEX
U: 00dc LATIN CAPITAL LETTER U WITH DIAERESIS
Y' 00dd LATIN CAPITAL LETTER Y WITH ACUTE
a' 00e1 LATIN SMALL LETTER A WITH ACUTE
a: 00e4 LATIN SMALL LETTER A WITH DIAERESIS
e' 00e9 LATIN SMALL LETTER E WITH ACUTE
e: 00eb LATIN SMALL LETTER E WITH DIAERESIS
i' 00ed LATIN SMALL LETTER I WITH ACUTE
i> 00ee LATIN SMALL LETTER I WITH CIRCUMFLEX
o' 00f3 LATIN SMALL LETTER O WITH ACUTE
o> 00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
o: 00f6 LATIN SMALL LETTER O WITH DIAERESIS
u' 00fa LATIN SMALL LETTER U WITH ACUTE
u> 00fb LATIN SMALL LETTER U WITH CIRCUMFLEX
u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
y' 00fd LATIN SMALL LETTER Y WITH ACUTE
C< 010c LATIN CAPITAL LETTER C WITH CARON
c< 010d LATIN SMALL LETTER C WITH CARON
D< 010e LATIN CAPITAL LETTER D WITH CARON
d< 010f LATIN SMALL LETTER D WITH CARON
E< 011a LATIN CAPITAL LETTER E WITH CARON
e< 011b LATIN SMALL LETTER E WITH CARON
L' 0139 LATIN CAPITAL LETTER L WITH ACUTE
l' 013a LATIN SMALL LETTER L WITH ACUTE
L< 013d LATIN CAPITAL LETTER L WITH CARON
l< 013e LATIN SMALL LETTER L WITH CARON
N< 0147 LATIN CAPITAL LETTER N WITH CARON
n< 0148 LATIN SMALL LETTER N WITH CARON
R' 0154 LATIN CAPITAL LETTER R WITH ACUTE
r' 0155 LATIN SMALL LETTER R WITH ACUTE
R< 0158 LATIN CAPITAL LETTER R WITH CARON
r< 0159 LATIN SMALL LETTER R WITH CARON
S< 0160 LATIN CAPITAL LETTER S WITH CARON
s< 0161 LATIN SMALL LETTER S WITH CARON
T< 0164 LATIN CAPITAL LETTER T WITH CARON
t< 0165 LATIN SMALL LETTER T WITH CARON
U0 016e LATIN CAPITAL LETTER U WITH RING ABOVE
u0 016f LATIN SMALL LETTER U WITH RING ABOVE
Z< 017d LATIN CAPITAL LETTER Z WITH CARON
z< 017e LATIN SMALL LETTER Z WITH CARON
Personally, I recommend using this encoding when carons or acute accents
need to be expressed in text written in US ASCII, i.e. for addresses, names ...
These mnemonics cannot be used for general-purpose encoding, because the
conversion cannot be reversed (IDA sendmail adds a special flag before each
mnemonic).
For example, my name then looks like this: Lubos< Pavli'c<ek
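The mapping above can be sketched in a few lines of Python. This is only an illustration, not the code of any existing tool: the function name to_mnemonics and the table MARK_TO_SUFFIX are hypothetical, only the five accents that occur in the table above are handled, and the approach (Unicode decomposition, then replacing each combining accent with its RFC 1345 second character) is my assumption about a convenient way to do it.

```python
import unicodedata

# Combining accent -> second character of the RFC 1345 mnemonic.
# Only the accents needed for the table above are covered (assumption).
MARK_TO_SUFFIX = {
    "\u0301": "'",  # acute
    "\u030c": "<",  # caron
    "\u030a": "0",  # ring above
    "\u0308": ":",  # diaeresis
    "\u0302": ">",  # circumflex
}

def to_mnemonics(text):
    """Write accented letters as base letter + mnemonic character."""
    # NFD splits each accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    # Marks not in the table are passed through unchanged.
    return "".join(MARK_TO_SUFFIX.get(ch, ch) for ch in decomposed)

print(to_mnemonics("Luboš Pavlíček"))   # Lubos< Pavli'c<ek

# Why the conversion cannot be reversed without an extra flag:
# an accented letter and a plain ASCII pair give the same output.
print(to_mnemonics("č") == to_mnemonics("c<"))   # True
```

The last line shows the ambiguity: once encoded, "c<" may stand for a c with caron or for a literal c followed by a less-than sign, which is why a marker before each mnemonic is needed for round-trip use.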
I use this encoding, for example, in our custom finger daemon on the machine
vse.vse.cz, where for every user on our network I print, next to the name in
US ASCII, the name encoded with these mnemonics (at this stage,
unfortunately, this has not yet been filled in for all users).
Lubos Pavlicek
pavlicek at vse.cz