From: Petr Lampa <lampa@FEE.VUTBR.CZ>
Subject: Re: Wordia
Date: Sun, 22 Oct 1995 19:54:40 +0100
> > > ... even in the localized Word. Perhaps someone managed to
> > > convince the development team that here we write HTML pages
> > > without diacritics and also in English, where ISO Latin 1 is
> > > sufficient.
> >
> > I am not sure whether this is meant seriously: who convinced the
> > team of that? Personally, I prefer Czech with diacritics when I
> > have the choice :-). Yes, I write "also in English", but the point
> > is "also", not "only".
> >
> > Honza Vejvalka
>
> It is meant seriously: Czech is not supported in HTML, only ISO Latin 1.
> ISO Latin 2 will come only with the Unicode version, at a date not yet
> known, hopefully soon.

If this means that HTML 2.0 and 3.0 define no entity codes for ISO 8859-2
characters (i.e. &ecaron, í, etc.), then it is of course true. Otherwise,
however, the definition of the HTML language restricts the use of arbitrary
eight-bit (i.e. not multi-byte) codes only as follows:

   Character sets

   The charset parameter (as defined in section 7.1.1 of RFC 1521) may be
   used with the text/html content type to specify the encoding used to
   represent the HTML document as a sequence of bytes. Normally, text/*
   media types specify a default of US-ASCII for the charset parameter.
   However, for text/html, if the byte stream contains data that is not
   in the 7-bit US-ASCII set, the HTML interpreting agent should assume a
   default charset of ISO-8859-1.

   When an HTML document is encoded using US-ASCII, the mechanisms of
   numeric character references and character entity references may be
   used to encode additional characters from ISO-8859-1. Character entity
   references are needed for symbols such as math and greek characters
   from other unspecified character sets.

   Other values for the charset parameter are not defined in this
   specification, but may be specified in future versions of HTML.

   It is envisioned that HTML will use the charset parameter to allow
   support for non-Latin characters such as Arabic, Hebrew, Cyrillic and
   Japanese, rather than relying on any SGML mechanism for doing so.
This passage says only that the default encoding is ISO 8859-1 and that a
charset value other than 8859-1 is not defined in HTML 2.0 (3.0), i.e. a
client is not required to understand it. It does not, however, forbid the
charset value from being something else, nor the client from understanding
it. The HTTP 1.0 protocol specification permits the following values for
charset:

   charset = "US-ASCII"
           | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3"
           | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6"
           | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9"
           | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR"
           | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8"
           | token

So if the server supplies the headers

   Content-Type: text/html; charset=ISO-8859-2
   Content-Language: cz

nothing prevents the text from being in Czech and the client from
displaying it correctly (but which client can actually do that?). This
does not solve everything, of course; that is addressed only by the
HTML 2.1 draft (see e.g. ftp://pub/WWW/draft-ietf-html-i18n-01.txt).
However, since a full implementation of Unicode 1.1 interpretation on both
the client and the server side will take some time, it is more sensible to
make use of what is possible today.

Has anyone here looked into making some client understand the charset
parameter (or into making it generate the Accept-Charset and
Accept-Language headers)?

Petr Lampa
--
Technical University of Brno                      E-mail: lampa@fee.vutbr.cz
Faculty of El. Engineering and Comp. Science      Phone: (+42 5) 7275/111,225
Department of Computer Science and Engineering    Fax: (+42 5) 41211141
Bozetechova 2, 612 66 Brno, Czech Republic
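
To illustrate what "understanding the charset parameter" would mean on the client side, here is a minimal sketch in Python. It is not taken from any existing browser; the function names charset_from_content_type and decode_body are hypothetical, and the parsing is deliberately simplified (it assumes a well-formed Content-Type value and ignores quoted strings that contain semicolons). It reads the charset parameter, falls back to ISO-8859-1 as the quoted HTML text prescribes, and decodes the body bytes accordingly.

```python
def charset_from_content_type(value, default="ISO-8859-1"):
    """Return the charset parameter of a Content-Type value.

    Falls back to `default` (ISO-8859-1, per the quoted HTML text)
    when no charset parameter is present. Simplified parsing: an
    assumption for illustration, not a full RFC 1521 parser.
    """
    # Everything after the first ";" is a list of name=value parameters.
    for part in value.split(";")[1:]:
        name, _, val = part.partition("=")
        if name.strip().lower() == "charset":
            return val.strip().strip('"') or default
    return default


def decode_body(body, content_type):
    """Decode the raw body bytes using the charset the server announced."""
    return body.decode(charset_from_content_type(content_type))


if __name__ == "__main__":
    czech = "Příliš žluťoučký kůň"
    raw = czech.encode("iso-8859-2")  # bytes as a server would send them

    # With the header from the message the text survives intact.
    print(decode_body(raw, "text/html; charset=ISO-8859-2"))

    # Without a charset the client assumes ISO-8859-1 and garbles it.
    print(decode_body(raw, "text/html"))
```

The second call shows the situation described above: the same bytes interpreted under the ISO-8859-1 default come out with the wrong accented letters, which is why the server-supplied charset matters.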