character encoding - UTF-8 and CP-1252
Hi all,
Longtime X/Xtreme user here, and new to Web Designer. Never needed to ask a question before!
I'm creating a website for the Council for British Archaeology East Midlands group - one small section of a much bigger national site. Pages here: www.britarch.ac.uk/cbaem
Thing is, the main CBA webmaster has asked for all pages to be UTF-8 encoded:
"All pages should be UTF-8; use of ISO-8859-1(5) and Windows CP-1252 encodings is strongly discouraged in order to avoid unexpected character set issues. Please ensure that you are using the correct character encoding."
Inspecting the page source of my Web Designer-created pages shows CP-1252 encoding. Is it possible to change this to UTF-8? How?
Thanks,
Dave
Re: character encoding - UTF-8 and CP-1252
All the content text exported by WD is Unicode (UTF-8) encoded, so you can simply change the declaration in the resulting HTML to UTF-8.
WD declares the code page of your OS (which is not necessarily 1252) because that code page is used for all system strings (filenames, title and alt attributes, meta tags, etc.). If you use only 7-bit ASCII characters for these, the output will be byte-for-byte identical to UTF-8.
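To illustrate the point about 7-bit ASCII, here is a minimal Python sketch (not part of WD; the function name and sample strings are made up for the example) showing that plain ASCII text produces the same bytes in UTF-8 and CP-1252, while an accented character does not:

```python
# Illustrative only: compares how the same text is encoded
# under UTF-8 and under CP-1252 (Windows-1252).
def same_bytes(text: str) -> bool:
    """Return True if text encodes to identical bytes in UTF-8 and CP-1252."""
    try:
        return text.encode("utf-8") == text.encode("cp1252")
    except UnicodeEncodeError:
        # The text contains a character CP-1252 cannot represent at all.
        return False


ascii_title = "CBA East Midlands - Home"   # hypothetical page title
accented_title = "Café menu"               # contains a non-ASCII character

print(same_bytes(ascii_title))     # True
print(same_bytes(accented_title))  # False
```

So as long as titles, filenames and alt text stay within ASCII, relabelling the page as UTF-8 does not change how any byte is interpreted.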
Re: character encoding - UTF-8 and CP-1252
Thanks, covoxer, for your speedy reply.
Quote:
Originally Posted by
covoxer
So you can simply change the declaration in the resulting HTML to UTF-8.
That's easy enough, if a bit of a faff. Presumably every time I re-save a WD website and republish it, I will have to manually change each page's declaration.
Quote:
Originally Posted by
covoxer
If you use only 7-bit ASCII characters for these, the output will be byte-for-byte identical to UTF-8.
..which I presume means 'just normal text' in my idiot-speak. There are no special characters on these pages, just text.
Thanks again for your reply - I was a bit worried I might have to switch back to the comparative horrors of FrontPage or Nvu because of an insurmountable technicality!
Re: character encoding - UTF-8 and CP-1252
Quote:
Originally Posted by
speedsixdave
That's easy enough, if a bit of a faff. Presumably every time I re-save a WD website and republish it, I will have to manually change each page's declaration.
Yes. You can also use an automated find-and-replace to speed the process up a bit.
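One possible way to automate that replacement is a small script run over the exported folder after each publish. This is only a sketch under assumptions: the function names and the folder path are hypothetical, and it assumes the exported files are valid UTF-8 (that is, titles, alt text and so on contain only ASCII, as discussed above).

```python
import re
from pathlib import Path

# Matches the charset value inside a <meta ...> tag, case-insensitively.
# Group 1 is everything up to the value, group 2 is the charset name.
META_CHARSET = re.compile(
    r'(<meta\b[^>]*?charset\s*=\s*["\']?)([\w.:-]+)',
    re.IGNORECASE,
)


def set_declared_charset(html: str, charset: str = "utf-8") -> str:
    """Rewrite the charset named in any <meta> declaration in html."""
    return META_CHARSET.sub(lambda m: m.group(1) + charset, html)


def fix_site(folder: str, charset: str = "utf-8") -> int:
    """Rewrite every .htm/.html file under folder; return how many changed."""
    changed = 0
    for path in Path(folder).rglob("*.htm*"):
        # Assumption: the file content is already UTF-8, so it is read
        # and written back as UTF-8 without any conversion.
        original = path.read_text(encoding="utf-8")
        updated = set_declared_charset(original, charset)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Calling `fix_site("path/to/exported/site")` would then update every page in one go; it is worth trying it on a copy of the site first.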
Quote:
..which I presume means 'just normal text' in my idiot-speak. There are no special characters on these pages, just text.
Yes, generally speaking you can use English letters, digits and the usual punctuation. For the full list, see for example http://en.wikipedia.org/wiki/Ascii, in particular the table of ASCII printable characters (codes 32-126).
In the page text itself, however, you can use any Unicode characters.
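If you want to check a title, filename or alt text against that range, something like the following Python sketch would do it (the function name is made up for the example):

```python
from typing import List


def non_printable_ascii(text: str) -> List[str]:
    """Return the characters in text that fall outside ASCII codes 32-126."""
    return [ch for ch in text if not 32 <= ord(ch) <= 126]


print(non_printable_ascii("index.htm"))  # []
print(non_printable_ascii("Café"))       # ['é']
```

An empty list means the string is safe to use for filenames, titles and alt attributes regardless of which of the two encodings is declared.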
Re: character encoding - UTF-8 and CP-1252
John,
I live in the USA. My WD pages all indicate encoding Windows-1252. When I run a Markup Validation on any of my WD sites I get one warning:
Character Encoding mismatch!
The character encoding specified in the HTTP header (utf-8) is different from the value in the <meta> element (windows-1252). I will use the value from the HTTP header (utf-8) for this validation.
Is this something I can change (registry?) on future pages so the default encoding always indicates utf-8? How important is this?
Jim
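For reference, the mismatch the validator reports can be reproduced with a short Python sketch. This is not how the validator itself is implemented; the function names are hypothetical and the regular expression is only a rough approximation of real HTML parsing.

```python
import re
from email.message import Message

# Rough pattern for the charset value inside a <meta ...> tag.
META_CHARSET = re.compile(
    r'<meta\b[^>]*?charset\s*=\s*["\']?([\w.:-]+)',
    re.IGNORECASE,
)


def header_charset(content_type: str):
    """Extract the charset parameter from an HTTP Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased name, or None


def meta_charset(html: str):
    """Extract the charset declared in the first matching <meta> tag."""
    match = META_CHARSET.search(html)
    return match.group(1).lower() if match else None


def charsets_agree(content_type: str, html: str) -> bool:
    """Return False only when both declarations exist and differ."""
    from_header = header_charset(content_type)
    from_meta = meta_charset(html)
    if from_header is None or from_meta is None:
        return True  # nothing to compare
    return from_header == from_meta


page = '<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">'
print(charsets_agree("text/html; charset=utf-8", page))  # False
```

In the case described above the server sends utf-8 in the header while the page declares windows-1252, so the check fails in the same way as the validator warning.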
Re: character encoding - UTF-8 and CP-1252
Read here: http://www.talkgraphics.com/showthread.php?t=45105
Quote:
9. Another request was to allow overriding of the declared character set. This is done by giving any object a name that begins with "charset=". For example: "charset=UTF-8". This affects all pages in the site.
Re: character encoding - UTF-8 and CP-1252
Covoxer, I have a charset=windows-1253 problem with my site: even though the meta tag is declared and accepted in the code, the Greek characters are not readable in the code, which is not good for Google.
This is troubling me a lot, and your help would be of great value.
Thanks in advance
Thanos