From: The Radio Prague Staff of Highly Skilled Experts <barry@RADIO.CZ>
Subject: Re: several messages about Netscape and charsets
Date: Wed, 20 Dec 1995 16:10:47 +0100
Next Article (by Date): Re: Cestina ??? Stanislav Koci
Previous Article (by Date): Cestina, hacky, carky, donekonecna? "Vagoun, Mr Voyta"
Top of Thread: Re: several messages about Netscape and charsets guest
Another weekend, another followup...

On Fri, 15 Dec 1995, kremžská HOŘČICE wrote:

> > > (But wouldn't these tags confuse
> > > browsers, which use non-ISO-Latin2 fonts, like Netscape for MS-Windows?
>
> [...M]y opinion is that Netscape should recognize
> this encoding and be able to translate from this to CP1250

I thought I would pass along part of the following message, which was
recently sent to the IETF (Internet Engineering Task Force) SMTP
discussion group, because it discusses HTTP charsets even though its
main focus was slightly different...

From: david_goldsmith@taligent.com (David Goldsmith)
Newsgroups: info.ietf.smtp
Subject: Re: Character set registration
Date: 19 Dec 95 04:10:35 GMT

The following sections from the latest (version of September 5, 1995)
HTTP 1.0 spec seem to be relevant:

-----------------------------------------
[*snip*]

HTTP also redefines the default character set for text media in an
entity body. If a textual media type defines a charset parameter with a
registered default value of "US-ASCII", HTTP changes the default to be
"ISO-8859-1". Since the ISO-8859-1 [18] character set is a superset of
US-ASCII [17], this has no effect upon the interpretation of entity
bodies which only contain octets within the US-ASCII set (0 - 127). The
presence of a charset parameter value in a Content-Type header field
overrides the default.

It is recommended that the character set of an entity body be labelled
as the lowest common denominator of the character codes used within a
document, with the exception that no label is preferred over the labels
US-ASCII or ISO-8859-1.
---------------------------

and (from 3.4):

--------------------
HTTP character sets are identified by case-insensitive tokens. The
complete set of tokens are defined by the IANA Character Set registry
[15]. However, because that registry does not define a single,
consistent token for each character set, we define here the preferred
names for those character sets most likely to be used with HTTP
entities. These character sets include those registered by RFC 1521 [5]
-- the US-ASCII [17] and ISO-8859 [18] character sets -- and other names
specifically recommended for use within MIME charset parameters.

    charset = "US-ASCII"
            | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3"
            | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6"
            | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9"
            | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR"
            | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8"
            | token
----------------------

In other words, HTTP specifically allows the use of multibyte character
sets which do not use the CRLF sequence, more specifically 16-bit
Unicode (unicode-1-1). It also recognizes that this differs from the
behavior specified by MIME.

David Goldsmith
Senior Scientist
Taligent, Inc.
10201 N. DeAnza Blvd.
Cupertino, CA 95014-2233
david_goldsmith@taligent.com

What this means is that Slovak and Czech documents should not be served
with any charset label other than ISO-8859-2, and that if no charset
label is provided, the browser should assume ISO-8859-1.

It also means that effort should not go into serving HTTP in a variety
of different encodings. It should instead go into making browsers
support ISO-8859-2 even when the display charset is something else (such
as Windows CP1250, DOS CP852, or whatever a Mac uses).

So Netscape 2.0 should itself map from ISO-8859-2 to the Windows or Mac
display charset, instead of expecting the document to be supplied in the
unregistered X-MAC-CE charset, or in CP1250, which, judging from the
documentation, it does not seem to recognize.
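To make the idea concrete, here is a rough sketch in Python of what I mean by the browser doing the mapping. It is only an illustration under my own assumptions: the function names http_charset and to_display_bytes are made up for this example, the codecs are the ones that ship with Python, and none of it describes how Netscape actually works.

```python
from email.message import Message

# Default from the HTTP 1.0 excerpt above: no charset label means ISO-8859-1.
HTTP_DEFAULT_CHARSET = "iso-8859-1"


def http_charset(content_type):
    """Return the charset label of a Content-Type value, lower-cased.

    Charset tokens are case-insensitive, so the label is normalised.
    Falls back to the HTTP default when no charset parameter is present.
    (Hypothetical helper, named for this example only.)
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(HTTP_DEFAULT_CHARSET)


def to_display_bytes(body, content_type, display_charset="cp1250"):
    """Re-encode an entity body from its labelled charset to the display charset.

    Raises LookupError if the label names a charset Python has no codec for.
    Characters missing from the display charset are replaced with '?'.
    (Hypothetical helper, named for this example only.)
    """
    text = body.decode(http_charset(content_type))
    return text.encode(display_charset, errors="replace")


if __name__ == "__main__":
    # "kremžská" as ISO-8859-2 octets, relabelled for a CP1250 display.
    latin2_body = b"krem\xbesk\xe1"
    print(to_display_bytes(latin2_body, "text/html; charset=ISO-8859-2"))
```

Run as a script, this prints b'krem\x9esk\xe1': the z-with-caron moves from octet BE in ISO-8859-2 to 9E in CP1250, while the a-with-acute stays at E1 in both. The server only ever has to send one properly labelled encoding, and the client picks whatever display charset it has.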
Now that I have brought this up, can somebody point me to some resources
on Central European support for the Macintosh, such as the font
encoding, and to sources of Mac fonts with ISO-8859-2 encoding, if such
exist, for use with an application that does not translate to the
internal display charset? I would appreciate it.

Diky (thanks),
Barry Bouwsma <barryb@tuke.sk>
veselé vánoce (Merry Christmas)