Writing UTF-8 Text Files with VB

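Our system needs to be internationalized, but all the strings and their translations live in an Excel spreadsheet, and copying them out one by one would be very tedious. So my boss asked me to write a converter that exports the data from the Excel spreadsheet and writes it to text files with the .po extension, encoded as UTF-8.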

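UTF-8 is one of the encodings of Unicode. It stores a commonly used Chinese character in three bytes. I won't go into the details here, since there is plenty of material online. VB can draw on many ready-made components, for example for connecting to Excel, connecting to databases, and working with ADODB streams, and all of them are easy to reference from a project.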
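To make the three-byte claim concrete, here is a minimal sketch that measures how many bytes a string takes up in UTF-8 by writing it into an ADODB stream. The function name `Utf8ByteCount` is hypothetical, and the sketch assumes the stream prepends a 3-byte BOM on the first write (which is what the rest of this post is about), so it subtracts those bytes.

```vb
' Hypothetical helper: number of bytes s occupies when encoded as UTF-8.
Public Function Utf8ByteCount(ByVal s As String) As Long
    Dim stm As Object
    If Len(s) = 0 Then Exit Function      ' nothing to encode, returns 0
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = 2                          ' 2 = adTypeText
    stm.Charset = "UTF-8"
    stm.Open
    stm.WriteText s
    ' Assumption: Size counts the 3-byte BOM written ahead of the text.
    Utf8ByteCount = stm.Size - 3
    stm.Close
End Function
```

Calling `Utf8ByteCount("A")` should give 1, and `Utf8ByteCount(ChrW(&H4E2D))` (the character 中) should give 3. `ChrW` is used so the example does not depend on the code page of the source file.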

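What bothered me a little while writing it was that the text files I wrote with UTF-8 encoding all started with a BOM (byte order mark). That made the converted files fail when the system build compiled them, so the BOM has to be removed. The code below shows how to do it.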
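Before removing anything, it can help to confirm that a file really starts with a BOM. The UTF-8 BOM is the byte sequence EF BB BF, so a small check only has to read the first three bytes. This is a sketch; the function name `HasUtf8Bom` is hypothetical.

```vb
' Hypothetical helper: True if the file at path starts with EF BB BF.
Public Function HasUtf8Bom(ByVal path As String) As Boolean
    Dim f As Integer
    Dim head(0 To 2) As Byte
    If FileLen(path) < 3 Then Exit Function   ' too short to hold a BOM
    f = FreeFile
    Open path For Binary Access Read As #f
    Get #f, 1, head                           ' read the first 3 bytes
    Close #f
    HasUtf8Bom = (head(0) = &HEF And head(1) = &HBB And head(2) = &HBF)
End Function
```

Running it on a file before and after the BOM has been stripped is a quick way to verify that the conversion did what was intended.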

    ' Requires project references to the Microsoft Excel Object Library
    ' and the Microsoft ActiveX Data Objects Library.
    Dim app As Excel.Application
    Dim eworkbook As Excel.Workbook
    Dim eworksheet As Excel.Worksheet
    Dim eworksheet_count As Integer
    Dim sheetName As String
    Dim obj As ADODB.Stream          ' text stream that encodes as UTF-8
    Dim binStream As ADODB.Stream    ' binary stream that receives the bytes after the BOM
    Dim file_path As String          ' path of the Excel workbook; set it before opening
    Dim j As Integer
    Dim filepath_save As String      ' output folder
    Dim out_path As String           ' output file for the current worksheet
    filepath_save = "d:\"
    Set app = New Excel.Application  ' connect to Excel
    Set eworkbook = app.Workbooks.Open(file_path)
    eworksheet_count = eworkbook.Worksheets.Count
    For j = 1 To eworksheet_count
        out_path = filepath_save & j & ".txt"
        Set eworksheet = eworkbook.Worksheets(j)
        sheetName = eworksheet.Name
        Set obj = New ADODB.Stream   ' set up the ADODB stream
        With obj
            .Type = adTypeText
            .Charset = "UTF-8"
            .Open
            .WriteText "helloworld", adWriteLine   ' placeholder text
            .Position = 3            ' remove the UTF-8 BOM: skip its 3 bytes (EF BB BF)
        End With
        Set binStream = New ADODB.Stream
        With binStream
            .Type = adTypeBinary
            .Open
            obj.CopyTo binStream     ' copy everything after the BOM
            .SaveToFile out_path, adSaveCreateOverWrite
            .Close
        End With
        obj.Close
        Set binStream = Nothing
        Set obj = Nothing
    Next j
    Set eworksheet = Nothing
    eworkbook.Close False            ' close without saving changes
    Set eworkbook = Nothing
    app.Quit
    Set app = Nothing
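The listing writes a placeholder string, while the real converter writes entries in the .po format. As a rough sketch of that step, the helpers below build one `msgid`/`msgstr` pair from a source string and its translation. The names `PoEscape` and `PoEntry` are hypothetical, and the escaping covers only backslashes, double quotes, and line breaks.

```vb
' Hypothetical helper: escape a string for use inside a PO string literal.
Public Function PoEscape(ByVal s As String) As String
    s = Replace(s, "\", "\\")        ' backslash first, so later escapes stay intact
    s = Replace(s, """", "\""")      ' double quote becomes \"
    s = Replace(s, vbCrLf, "\n")
    s = Replace(s, vbLf, "\n")       ' Excel cells use LF for in-cell line breaks
    PoEscape = s
End Function

' Hypothetical helper: one PO entry followed by a blank separator line.
Public Function PoEntry(ByVal source As String, ByVal translation As String) As String
    PoEntry = "msgid """ & PoEscape(source) & """" & vbLf & _
              "msgstr """ & PoEscape(translation) & """" & vbLf & vbLf
End Function
```

Inside the loop, the result could be passed to `WriteText` in place of the placeholder, for example `PoEntry(eworksheet.Cells(r, 1).Value, eworksheet.Cells(r, 2).Value)`, assuming the source string is in column 1 and the translation in column 2 of row `r`. That column layout is an assumption, not something the spreadsheet described here is known to follow.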

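This is part of the key code. If the UTF-8 encoding is not set, the file is generally written as ANSI, which is also Notepad's default. I hadn't written VB in a long time, but it felt pretty good to use again this time. So it pays to learn as much as you can at school.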
