Czech Pmail

Lubos Pavlicek pavlicek at vse.vse.cz
Wed Feb 9 12:02:41 CET 1994


>>
>>2) Transferring mail with diacritics. A solution for carrying 8-bit data
>>   over existing e-mail systems has long been available and is called MIME
>>   (Multipurpose Internet Mail Extensions).

> I know that there is RFC 1345, which deals with encoding the kind of
> characters we need, and with a little self-denial the result can
> still be read in 7-bit form.  Is MIME based on RFC 1345, or is it
> something entirely different?

RFC 1345 is informational only. It essentially contains an incomplete list
of codes from ISO 10646 and an overview of some code tables (those that are
part of a national or international standard). Each character is assigned a
mnemonic: characters from ISO 646 (essentially US ASCII) are not encoded,
and the others usually take 2 characters, occasionally more. This encoding
(the mnemonics) is used in the IDA version of sendmail.

The MIME standard has nothing to do with this RFC.
In the MIME standard, the choice of character set is kept separate from
7-bit transport. The character set is given as a special parameter (charset)
when specifying the content of the message (or of a message part, since a
MIME message may contain several parts). The character sets allowed so far
are US ASCII and the character sets of the ISO 8859 series.
Two encodings are used for 7-bit transport:
 - base64 for binary data,
 - quoted-printable for text in an 8-bit character set.
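
As a rough illustration of the two transfer encodings and the charset
parameter, here is a minimal Python sketch using only the standard library.
The choice of ISO 8859-2 as the character set and the sample word are
assumptions made for this example; they are not taken from the MIME
standard itself.

```python
import base64
import quopri
from email.mime.text import MIMEText

# The Czech word "rozkosnych" with its diacritics, written with escapes
# so that this source file itself stays 7-bit clean.
text = "rozko\u0161n\u00fdch"

# Assumption for this sketch: the 8-bit character set is ISO 8859-2.
raw = text.encode("iso-8859-2")

# Quoted-printable: ASCII stays readable, each 8-bit byte becomes =XX.
qp = quopri.encodestring(raw)

# base64: meant for binary data; nothing stays readable.
b64 = base64.b64encode(raw)

# The standard library can also build the MIME headers, charset included.
msg = MIMEText(text, "plain", "iso-8859-2")

print(qp.decode("ascii"))
print(b64.decode("ascii"))
print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
```

The quoted-printable form keeps the unaccented letters legible, which is
why it suits mostly-ASCII text, while base64 treats every byte alike.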

>>   The only ones out of luck will be those
>>   whose mail programs cannot handle MIME (but there will be fewer and
>>   fewer of those). Reading such delightful (rozko=BFn=FFch) texts will
>>   surely soon force them to get something better, or to pester the
>>   program's author until MIME support is added.

> According to RFC 1345 it is not that terrible: rozkos2ny'ch, with
> caron (hacek) = 2,  acute = ' , ring = 0, diaeresis = " .
> Neni' to z2a'dna' para'da, ale c2i'st se to da' bez velky'ch vysve2tlivek.
> (It is no great beauty, but it can be read without much explanation.)
> Automatic conversion is of course highly desirable.

In RFC 1345 the caron (hacek) is written with the character <, not with 2.
The author of RFC 1345 (Keld Simonsen) usually builds the two-letter
mnemonics from the corresponding US ASCII character followed by one of
these characters:

     Exclamation mark           ! Grave
     Apostrophe                 ' Acute accent
     Greater-Than sign          > Circumflex accent
     Question Mark              ? tilde
     Hyphen-Minus               - Macron
     Left parenthesis           ( Breve
     Full Stop                  . Dot Above
     Colon                      : Diaeresis
     Comma                      , Cedilla
     Underline                  _ Underline
     Solidus                    / Stroke
     Quotation mark             " Double acute accent
     Semicolon                  ; Ogonek
     Less-Than sign             < Caron
     Zero                       0 Ring above
     Two                        2 Hook
     Nine                       9 Horn
     Equals                     = Cyrillic
     Asterisk                   * Greek
     Percent sign               % Greek/Cyrillic special
     Plus                       + smalls: Arabic, capitals: Hebrew
     Three                      3 some Latin/Greek/Cyrillic letters
     Four                       4 Bopomofo
     Five                       5 Hiragana
     Six                        6 Katakana

The Czech characters then look like this:
  ( The format of the table is:
      1st field is the character mnemonic (mostly 2 characters).
      2nd field is the ISO 2DIS 10646 code in hexadecimal.
      3rd field is the long descriptive name of ISO 2DIS 10646. )

 A'     00c1    LATIN CAPITAL LETTER A WITH ACUTE
 A:     00c4    LATIN CAPITAL LETTER A WITH DIAERESIS
 E'     00c9    LATIN CAPITAL LETTER E WITH ACUTE
 E:     00cb    LATIN CAPITAL LETTER E WITH DIAERESIS
 I'     00cd    LATIN CAPITAL LETTER I WITH ACUTE
 O'     00d3    LATIN CAPITAL LETTER O WITH ACUTE
 O>     00d4    LATIN CAPITAL LETTER O WITH CIRCUMFLEX
 O:     00d6    LATIN CAPITAL LETTER O WITH DIAERESIS
 U'     00da    LATIN CAPITAL LETTER U WITH ACUTE
 U>     00db    LATIN CAPITAL LETTER U WITH CIRCUMFLEX
 U:     00dc    LATIN CAPITAL LETTER U WITH DIAERESIS
 Y'     00dd    LATIN CAPITAL LETTER Y WITH ACUTE
 a'     00e1    LATIN SMALL LETTER A WITH ACUTE
 a:     00e4    LATIN SMALL LETTER A WITH DIAERESIS
 e'     00e9    LATIN SMALL LETTER E WITH ACUTE
 e:     00eb    LATIN SMALL LETTER E WITH DIAERESIS
 i'     00ed    LATIN SMALL LETTER I WITH ACUTE
 i>     00ee    LATIN SMALL LETTER I WITH CIRCUMFLEX
 o'     00f3    LATIN SMALL LETTER O WITH ACUTE
 o>     00f4    LATIN SMALL LETTER O WITH CIRCUMFLEX
 o:     00f6    LATIN SMALL LETTER O WITH DIAERESIS
 u'     00fa    LATIN SMALL LETTER U WITH ACUTE
 u>     00fb    LATIN SMALL LETTER U WITH CIRCUMFLEX
 u:     00fc    LATIN SMALL LETTER U WITH DIAERESIS
 y'     00fd    LATIN SMALL LETTER Y WITH ACUTE
 C<     010c    LATIN CAPITAL LETTER C WITH CARON
 c<     010d    LATIN SMALL LETTER C WITH CARON
 D<     010e    LATIN CAPITAL LETTER D WITH CARON
 d<     010f    LATIN SMALL LETTER D WITH CARON
 E<     011a    LATIN CAPITAL LETTER E WITH CARON
 e<     011b    LATIN SMALL LETTER E WITH CARON
 L'     0139    LATIN CAPITAL LETTER L WITH ACUTE
 l'     013a    LATIN SMALL LETTER L WITH ACUTE
 L<     013d    LATIN CAPITAL LETTER L WITH CARON
 l<     013e    LATIN SMALL LETTER L WITH CARON
 N<     0147    LATIN CAPITAL LETTER N WITH CARON
 n<     0148    LATIN SMALL LETTER N WITH CARON
 R'     0154    LATIN CAPITAL LETTER R WITH ACUTE
 r'     0155    LATIN SMALL LETTER R WITH ACUTE
 R<     0158    LATIN CAPITAL LETTER R WITH CARON
 r<     0159    LATIN SMALL LETTER R WITH CARON
 S<     0160    LATIN CAPITAL LETTER S WITH CARON
 s<     0161    LATIN SMALL LETTER S WITH CARON
 T<     0164    LATIN CAPITAL LETTER T WITH CARON
 t<     0165    LATIN SMALL LETTER T WITH CARON
 U0     016e    LATIN CAPITAL LETTER U WITH RING ABOVE
 u0     016f    LATIN SMALL LETTER U WITH RING ABOVE
 Z<     017d    LATIN CAPITAL LETTER Z WITH CARON
 z<     017e    LATIN SMALL LETTER Z WITH CARON
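
The rule "base letter followed by an accent character" can be sketched in
Python. This sketch does not look up the table above. As a shortcut that
is my own assumption, it takes the base letter and the accent from Unicode
canonical decomposition (NFD) and maps only the five accents that occur in
the table. The names ACCENT_CHAR and to_mnemonics are hypothetical.

```python
import unicodedata

# Combining mark -> second character of the mnemonic.
# Only the accents that occur in the table above are covered.
ACCENT_CHAR = {
    "\u0301": "'",   # acute
    "\u0302": ">",   # circumflex
    "\u0308": ":",   # diaeresis
    "\u030a": "0",   # ring above
    "\u030c": "<",   # caron
}


def to_mnemonics(text: str) -> str:
    """Replace accented letters by base letter + accent character."""
    out = []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] in ACCENT_CHAR:
            out.append(decomposed[0] + ACCENT_CHAR[decomposed[1]])
        else:
            # Plain ASCII and anything not covered passes through unchanged.
            out.append(ch)
    return "".join(out)


# A name with caron and acute, written with escapes to stay 7-bit clean.
print(to_mnemonics("Lubo\u0161 Pavl\u00ed\u010dek"))  # Lubos< Pavli'c<ek
```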

Personally I recommend this encoding for cases where carons or acute
accents have to be expressed in text written in US ASCII, i.e. for
addresses, names ...
These mnemonics cannot be used for general encoding, because the conversion
cannot be reversed (in IDA/sendmail a special flag is added before each
mnemonic).
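
Why the reverse conversion fails without such a flag can be shown with a
small Python sketch. The three-entry table and the function naive_decode
are hypothetical and serve only as an illustration. A decoder that
replaces every two-character mnemonic it finds cannot tell an encoded
letter from ordinary ASCII text that happens to contain the same pair.

```python
# Small hypothetical subset of the mnemonic table, for illustration only.
MNEMONICS = {"c<": "\u010d", "s<": "\u0161", "i'": "\u00ed"}


def naive_decode(text: str) -> str:
    """Greedily replace every two-character mnemonic found in the text."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in MNEMONICS:
            out.append(MNEMONICS[pair])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


# Intended: the accented name is restored.
print(naive_decode("Lubos< Pavli'c<ek"))
# Not intended: plain ASCII "c<d" (c less than d) is altered as well.
print(naive_decode("if c<d then"))
```

An explicit marker in front of each mnemonic, as mentioned for
IDA/sendmail, removes this ambiguity at the cost of readability.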

For example, my name then looks like this: Lubos< Pavli'c<ek
I use this encoding, for instance, in our custom finger daemon on the
machine vse.vse.cz, where for every user on our network I print, next to
the name in US ASCII, the name encoded with these mnemonics (at this stage,
unfortunately, it has not yet been filled in for all users).

Lubos Pavlicek
pavlicek at vse.cz


