I sent this message to help-gnu-emacs, but got no response; thought I might get better answers here.
[First: I'm running GNU Emacs 23.2.1 (i386-mingw-nt6.1.17601), on 64-bit Windows 7 Professional SP1.]
I would like new buffers to default to utf-8 encoding, and I would like indeterminate files (like text files, especially source code files) also to use utf-8, unless the -*- line specifies a different coding system.
By default, when I create a new buffer that isn't associated with any file, the coding system is set to 'iso-latin-1-dos'. When I visit an existing (text) file, its coding system is set to 'undecided-dos'.
I tried to change this by evaluating (prefer-coding-system 'utf-8). After that, when I create a new file, the coding system in the new buffer is set to 'utf-8'. However, when I open an existing file, Emacs still sets its coding system to 'undecided-dos'.
Frustrated, I then tried (setq-default buffer-file-coding-system 'utf-8). That didn't help; existing source code files still came up with a coding system of 'undecided-dos' when I visited them.
Digging further, it seems that this is controlled by the variable file-coding-system-alist. If a file name does not match any of the patterns in that list, the function find-buffer-file-type-coding-system (in dos-w32.el) is invoked to determine what coding system to use for the file.
That function *always* returns 'undecided' for text files, or 'no-conversion' for files it determines are binary. The only time it uses the default value for buffer-file-coding-system is if the file doesn't yet exist!
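For anyone who wants to check this on their own setup, here is a minimal sketch of how the lookup can be queried from *scratch* or M-:. It asks Emacs which coding systems file-coding-system-alist yields for a given file name. The path is a made-up placeholder, not a file from my system, and I'm assuming find-operation-coding-system behaves the same way in other Emacs versions.

```elisp
;; Hypothetical path; substitute a text file that actually exists.
(let ((file "c:/tmp/example.c"))
  ;; Consults `file-coding-system-alist' for the first entry whose
  ;; pattern matches FILE.  Returns a cons (DECODING . ENCODING),
  ;; or nil if nothing matches.  If the matching entry names a
  ;; function (such as find-buffer-file-type-coding-system), that
  ;; function is called to produce the result.
  (find-operation-coding-system 'insert-file-contents file))
```

Evaluating (describe-variable 'file-coding-system-alist) alongside this shows which entry is doing the matching.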
Am I reading this right? There is no way to set a preferred coding system for existing files under Windows? 'prefer-coding-system' only works in *nix environments? I have to either add every source and text file name pattern to file-coding-system-alist, or manually change the buffer coding every time I visit an existing file?
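To make the first alternative concrete, this is roughly what the pattern-by-pattern workaround would look like in an init file. It is only a sketch: the list of extensions is a hypothetical example, and it would have to grow with every new file type I edit, which is exactly what I'm hoping to avoid.

```elisp
;; Hypothetical extension list; adjust to the file types actually in use.
;; `modify-coding-system-alist' with target type `file' adds an entry to
;; `file-coding-system-alist' for file names matching the regexp.
(modify-coding-system-alist
 'file
 "\\.\\(txt\\|el\\|c\\|h\\|py\\)\\'"
 'utf-8)
```

As far as I understand, a -*- coding: ... -*- line in the file is examined before file-coding-system-alist is consulted, so it should still take precedence over an entry like this, but I haven't verified that on this build. Using plain 'utf-8 rather than 'utf-8-dos or 'utf-8-unix leaves the end-of-line convention to be detected per file.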