Public | Unicode characters in file and folder names will not appear in output
Closed, Resolved | Public


Tristan wrote in ticket:

I am opening this ticket because some documents with Chinese characters don't show properly when I run the save-to-disk function to get the document list of a certain folder.

Here's what I tested:

Event Timeline

Joe created this task. Jan 4 2019, 5:26 PM
Joe created this object in space S5 Public.
Joe created this object with visibility "Public (No Login Required)".
Joe closed this task as Wontfix. Jan 15 2019, 11:50 AM

Directory Printer is ASCII through and through, starting with the folder browser. Fixing this before a .NET rewrite isn't worth the effort. Time to break off and rewrite this thing.

Joe added a comment. Feb 28 2019, 8:13 PM

Chris wrote in about Greek characters.

Joe triaged this task as Wishlist priority. Feb 28 2019, 8:14 PM
Joe reopened this task as Open.
Joe added a comment. Apr 14 2019, 10:49 AM

Contrary to what I used to believe, VB6 does store strings in full wide-character format, so the file and folder names with special characters should be in there. Does the output's inability to contain certain characters then come down to the code page? Notice here that Asian characters and Hebrew are excluded, but some special Latin-ish characters are displayed:

Joe added a comment. Apr 20 2020, 10:09 AM

Vince wrote to support:

I am having an issue where, when I process a file with double-byte characters, the output to the text file is all ???????.txt

Joe added a comment. Edited Apr 20 2020, 2:58 PM

Shiver me timbers!

Now saving Unicode bytes to disk instead of calling the Print # statement, which converts the string to the local ANSI code page and turns any character that code page cannot represent into a question mark.

From a comment on "How To Write Out Unicode Text Files in VBA":

Internally, all VB strings are in Unicode format anyway. The "put" statement converts strings to ANSI *unless* you pass the string as a byte array.

Sub WriteUnicodeTextFile(ByRef Path As String, ByRef Value As String)
   Dim Buffer() As Byte
   Dim FileNum As Integer
   ' Convert string to an array of bytes, preserving Unicode (2 bytes per character)
   Buffer = Value
   FileNum = FreeFile
   Open Path For Binary As FileNum
   Put FileNum, , Buffer
   Close FileNum
End Sub

Wayne Phillips
Joe added a comment. Apr 21 2020, 10:57 PM

Now also able to print Unicode. Had to replace calls to the Printer object's Print method with a call to Win32 TextOutW -- that also means CurrentX and CurrentY have to be incremented manually.
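
For reference, a minimal sketch of what the TextOutW swap can look like in VB6. This is not the actual Directory Printer code: PrintUnicodeLine and the TextExtent type are hypothetical names, and the sketch assumes a page has already been started on the Printer object, that ScaleLeft and ScaleTop are zero, and that every line starts at the left edge.

```vb
Option Explicit

' Hypothetical name for the Win32 SIZE structure filled in by GetTextExtentPoint32W.
Private Type TextExtent
    cx As Long
    cy As Long
End Type

' lpString is declared As Long so a raw pointer can be passed via StrPtr.
' Declaring it As String would make VB6 convert the text to ANSI first.
Private Declare Function TextOutW Lib "gdi32" ( _
    ByVal hdc As Long, ByVal x As Long, ByVal y As Long, _
    ByVal lpString As Long, ByVal nCount As Long) As Long

Private Declare Function GetTextExtentPoint32W Lib "gdi32" ( _
    ByVal hdc As Long, ByVal lpString As Long, ByVal cbString As Long, _
    ByRef lpSize As TextExtent) As Long

' Hypothetical helper: draws Text at the printer's current position,
' then moves the position to the start of the next line.
Public Sub PrintUnicodeLine(ByRef Text As String)
    Dim xPixels As Long
    Dim yPixels As Long
    Dim Extent As TextExtent

    ' TextOutW takes device units (pixels); CurrentX/CurrentY use the Printer's ScaleMode.
    xPixels = Printer.ScaleX(Printer.CurrentX, Printer.ScaleMode, vbPixels)
    yPixels = Printer.ScaleY(Printer.CurrentY, Printer.ScaleMode, vbPixels)

    ' StrPtr hands over the address of the string's wide-character data.
    TextOutW Printer.hDC, xPixels, yPixels, StrPtr(Text), Len(Text)

    ' Unlike Printer.Print, TextOutW leaves CurrentX/CurrentY untouched,
    ' so measure the text height and advance the position by hand.
    GetTextExtentPoint32W Printer.hDC, StrPtr(Text), Len(Text), Extent
    Printer.CurrentX = 0
    Printer.CurrentY = Printer.CurrentY + _
        Printer.ScaleY(Extent.cy, vbPixels, Printer.ScaleMode)
End Sub
```

Calling PrintUnicodeLine once per output line would then stand in for each Printer.Print call. Page breaks and margins are left out here and would need their own handling.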

Joe added a comment. Apr 22 2020, 12:30 PM

Checked that the tab-separated values (.tsv) format is outputting Unicode.

Joe closed this task as Resolved. Apr 22 2020, 1:51 PM

Now saving the output UTF-8 encoded instead of UTF-16.

And... done
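
For reference, one way the switch from UTF-16 to UTF-8 can be done in VB6 is to run the string through the Win32 WideCharToMultiByte function before writing the bytes. The sketch below is an assumption about the approach, not the code that shipped: WriteUtf8TextFile is a hypothetical name, and it writes no byte-order mark.

```vb
Option Explicit

Private Const CP_UTF8 As Long = 65001

' Both string parameters are declared As Long so raw pointers can be passed.
Private Declare Function WideCharToMultiByte Lib "kernel32" ( _
    ByVal CodePage As Long, ByVal dwFlags As Long, _
    ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, _
    ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, _
    ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long

' Hypothetical helper: writes Value to Path as UTF-8, replacing any existing file.
Public Sub WriteUtf8TextFile(ByRef Path As String, ByRef Value As String)
    Dim Buffer() As Byte
    Dim ByteCount As Long
    Dim FileNum As Integer

    ' Open For Binary does not truncate an existing file,
    ' so open it For Output once to reset it to zero length.
    FileNum = FreeFile
    Open Path For Output As #FileNum
    Close #FileNum

    If Len(Value) = 0 Then Exit Sub

    ' First call: ask how many bytes the UTF-8 form needs.
    ByteCount = WideCharToMultiByte(CP_UTF8, 0&, StrPtr(Value), Len(Value), 0&, 0&, 0&, 0&)
    If ByteCount = 0 Then Exit Sub

    ' Second call: convert into the buffer.
    ReDim Buffer(0 To ByteCount - 1)
    WideCharToMultiByte CP_UTF8, 0&, StrPtr(Value), Len(Value), _
        VarPtr(Buffer(0)), ByteCount, 0&, 0&

    ' As with the UTF-16 version, passing a byte array to Put writes the bytes as-is.
    FileNum = FreeFile
    Open Path For Binary Access Write As #FileNum
    Put #FileNum, , Buffer
    Close #FileNum
End Sub
```

Plain ASCII text comes out byte-for-byte the same as it did from the old ANSI output, which is one practical reason to prefer UTF-8 over UTF-16 for text and .tsv files.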