MSIE5 and Czech characters in the path

Martin Macok martin.macok at underground.cz
Tue Mar 7 23:56:55 CET 2000


On Tue, Mar 07, 2000 at 05:25:25PM +0100, Bohumil Michal wrote:
> When I unchecked the item "Always send URLs as UTF-8" under
> Tools->Network Options->Advanced in MSIE 5.01, it worked as well. So
> the only problem is that the default setting does not suit this
> purpose. That does not make MSIE d**b - or would you call a browser
> d**b because it does not come with a PROXY server preconfigured,
> without which you cannot get onto the Internet?

I am probably asking for a flame, but in my opinion the choice of WHICH
format the browser uses to send HTTP requests to the server should not be
HIDDEN under "Tools" (what tools???) -> "Network Options" (what is that???
That should not have anything to do with the browser at all) -> "Advanced"
(ahem) ...

Unless it ends up under e.g. "Settings" -> "HTTP Protocol" or something
like that, MSIE may not be dumb, but it is definitely weird ...

Anyway, I took the trouble and spent about half an hour digging through
the RFCs on HTTP 1.1 (rfc2616) and URI Generic Syntax (rfc2396), and I came
to the conclusion that nowhere is it stated precisely that only such-and-such
a character set may be used, nor how exactly it is to be transmitted over
the network. Probably the most relevant thing I found is this:

RFC 2396                   URI Generic Syntax                August 1998
...
   For original character sequences that contain non-ASCII characters,
   however, the situation is more difficult. Internet protocols that
   transmit octet sequences intended to represent character sequences
   are expected to provide some way of identifying the charset used, if
   there might be more than one [RFC2277].  However, there is currently
   no provision within the generic URI syntax to accomplish this
   identification. An individual URI scheme may require a single
   charset, define a default charset, or provide a way to indicate the
   charset used.

   It is expected that a systematic treatment of character encoding
   within URI will be developed as a future modification of this
   specification.
...

RFC 2616                        HTTP/1.1                       June 1999

   HTTP character sets are identified by case-insensitive tokens. The
   complete set of tokens is defined by the IANA Character Set registry
   [19].

       charset = token

   Although HTTP allows an arbitrary token to be used as a charset
   value, any token that has a predefined value within the IANA
   Character Set registry [19] MUST represent the character set defined
   by that registry. Applications SHOULD limit their use of character
   sets to those defined by the IANA registry.

...

   Characters other than those in the "reserved" and "unsafe" sets (see
   RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.

   For example, the following three URIs are equivalent:

      http://abc.com:80/~smith/home.html
      http://ABC.com/%7Esmith/home.html
      http://ABC.com:/%7esmith/home.html
...
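
The equivalence rule quoted above can be sketched in Python. This is only
an illustration under stated assumptions: normalize() and
decode_unreserved() are hypothetical helpers, not anything defined by the
RFCs, and the set of "unreserved" characters is taken from RFC 2396,
section 2.3.

```python
import re
from urllib.parse import urlsplit

# "unreserved" characters as listed in RFC 2396, section 2.3
UNRESERVED = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789-_.!~*'()"
)

def decode_unreserved(path):
    """Decode %HH escapes only where the result is an unreserved character."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        # Other escapes stay encoded; only the case of the hex digits is unified.
        return char if char in UNRESERVED else match.group(0).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, path)

def normalize(uri):
    """Hypothetical helper: reduce an http URI to a comparable tuple."""
    parts = urlsplit(uri)
    host = parts.hostname or ""   # urlsplit already lowercases the host name
    port = parts.port or 80       # a missing or empty port means the default, 80
    path = decode_unreserved(parts.path) or "/"
    return (parts.scheme.lower(), host, port, path)

uris = [
    "http://abc.com:80/~smith/home.html",
    "http://ABC.com/%7Esmith/home.html",
    "http://ABC.com:/%7esmith/home.html",
]
forms = {normalize(u) for u in uris}
print(len(forms))  # all three URIs collapse to a single normalized form
```

Note that an escaped reserved character such as %2F is deliberately left
alone, because the rule only covers characters outside the "reserved" and
"unsafe" sets.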

That is probably everything that is relevant. If I understood it correctly,
the URLs can be sent in almost any encoding listed in the "IANA Character
Set registry" (which contains nearly every charset, known or obscure), but
you have to specify it in "charset = ...". However, expecting servers to
implement all of this (the charsets) is probably impossible ...
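
Why this is a problem for the server can be shown with a short Python
sketch. It assumes a hypothetical file name containing Czech characters
and compares the octets a client would send when it encodes the path as
UTF-8 versus ISO-8859-2; nothing in the request path itself says which of
the two was used.

```python
from urllib.parse import quote, unquote

# Hypothetical path with Czech characters (c with caron, s with caron)
path = "/\u010de\u0161tina.html"

as_utf8 = quote(path, encoding="utf-8")         # what a UTF-8 client sends
as_latin2 = quote(path, encoding="iso-8859-2")  # what an ISO-8859-2 client sends

print(as_utf8)    # /%C4%8De%C5%A1tina.html
print(as_latin2)  # /%E8e%B9tina.html

# A server that guesses the wrong charset either fails to decode the path...
try:
    unquote(as_latin2, encoding="utf-8", errors="strict")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# ...or decodes it into a different name than the one the client meant.
misread = unquote(as_utf8, encoding="iso-8859-2")
```

Both requests refer to the same name on the client side, yet the server
sees two unrelated octet sequences and has to guess the charset.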

If anyone knows more, I would be glad to hear about it ...

Have a nice day

-- 
< Martin Mačok        martin.macok at underground.cz           <iso-8859-2> 
  \\  http://kocour.ms.mff.cuni.cz/~macok/  http://underground.cz/  //
    \\\             -=  t.r.u.s.t  n.0  o.n.e  =-                ///



