Re: opening files with unicode characters in the file name on windows
From: Mathias Dahl
Subject: Re: opening files with unicode characters in the file name on windows
Date: 04 Aug 2004 16:27:05 +0200
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50
"Eli Zaretskii" <eliz@gnu.org> writes:
> Your original message said ``file names with Unicode characters''.
> Can you tell what characters are those, and why do you think they
> are encoded in some Unicode-related encoding, like UTF-16? Can you
> look at the file's name as recorded in the directory with some
> low-level tool that actually shows the byte values that encode the
> file's name?
I have done some investigation and I am pretty sure UTF-16 is the
encoding used. The following VBScript program (sorry for pasting
non-Emacs stuff here) loops through all files in a folder and, for
each file name that contains a character value > 255, displays a list
of the name's characters with their Unicode code point values:
' -- TestUnicodeFileNames.vbs ---
Option Explicit
' --------- Main program starts
Dim oFSO
Dim oFile
Set oFSO = CreateObject("Scripting.FileSystemObject")
' Check the name of every file in the folder.
For Each oFile In oFSO.GetFolder("c:\document\my docs").Files
    checkUnicodeFileName oFile.Name
Next
Set oFSO = Nothing
' --------- Main program ends
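' Show a message box for a file name that contains at least one
' character above U+00FF.
Private Sub checkUnicodeFileName(fileName)
    Dim i
    Dim c
    Dim n
    For i = 1 To Len(fileName)
        c = Mid(fileName, i, 1)
        n = AscW(c)
        ' AscW returns a signed 16-bit value, so characters from
        ' U+8000 upwards come back negative; map them to 32768..65535.
        If n < 0 Then n = n + 65536
        If n > 255 Then
            MsgBox "File name contains unicode characters: " & _
                Chr(10) & Chr(10) & _
                "File name: " & fileName & _
                Chr(10) & Chr(10) & _
                "Characters and their unicode code points:" & _
                Chr(10) & Chr(10) & _
                getStringInfo(fileName)
            Exit Sub
        End If
    Next
End Sub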
' Build one line per character: the character, a tab, and its
' 16-bit value as four hex digits.
Private Function getStringInfo(s)
    Dim i
    Dim n
    Dim c
    Dim h
    Dim result
    result = "Char" & Chr(9) & "U+NNNN" & Chr(10) & Chr(10)
    For i = 1 To Len(s)
        c = Mid(s, i, 1)
        n = AscW(c)
        h = Hex(n)
        result = result & c & Chr(9) & Right("0000" & h, 4) & Chr(10)
    Next
    getStringInfo = result
End Function
' -- TestUnicodeFileNames.vbs ends here ---
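For readers without Windows Script Host, the same check can be sketched in Python. This is only a rough equivalent, not part of the original investigation: the names non_latin1_report and scan_folder are made up for illustration, and Python's ord() returns full code points rather than the 16-bit values that AscW reports, so characters outside the Basic Multilingual Plane would print with more than four hex digits.

```python
import os


def non_latin1_report(file_name):
    """Return a per-character report if any code point exceeds 255, else None."""
    if not any(ord(ch) > 255 for ch in file_name):
        return None
    lines = ["Char\tU+NNNN"]
    for ch in file_name:
        # Pad to at least four hex digits, like Right("0000" & h, 4).
        lines.append(f"{ch}\t{ord(ch):04X}")
    return "\n".join(lines)


def scan_folder(folder):
    """Yield (name, report) for each file in folder whose name needs a report."""
    for entry in os.scandir(folder):
        if entry.is_file():
            report = non_latin1_report(entry.name)
            if report is not None:
                yield entry.name, report
```

Calling scan_folder on the folder used in the script above and printing each report would give the same kind of table that the message box displays.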
The output of the script looks like this (a message box does not
normally show the actual characters; I see them because I use a
"Unicode font" for message boxes):
File name contains unicode characters:
File name: pravda_правда.txt
Characters and their unicode code points:
Char U+NNNN
p 0070
r 0072
a 0061
v 0076
d 0064
a 0061
_ 005F
п 043F
р 0440
а 0430
в 0432
д 0434
а 0430
. 002E
t 0074
x 0078
t 0074
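To relate the table to the question about byte values: if the names are indeed stored as UTF-16 (little-endian), as I suspect, each character above would appear on disk as two bytes, low byte first. The following Python sketch shows what a low-level tool would be expected to display under that assumption. The helper name utf16le_units is hypothetical, and the sketch only encodes a string in memory; it does not read the directory itself.

```python
def utf16le_units(name):
    """Return (character, UTF-16LE bytes as hex) pairs for a file name."""
    return [(ch, ch.encode("utf-16-le").hex(" ").upper()) for ch in name]


if __name__ == "__main__":
    for ch, hex_bytes in utf16le_units("pravda_правда.txt"):
        print(f"{ch}\t{hex_bytes}")
```

Under this assumption "p" (U+0070) shows up as the bytes 70 00 and "п" (U+043F) as 3F 04, so a code point above 255 is exactly a character whose second byte is non-zero.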
/Mathias