by Devin Yang
(This article was automatically translated.)

Published - 2 years ago ( Updated - 2 years ago )

Do you have a large number of Big5-encoded PHP, JS, or HTML pages that need to be converted to UTF-8?
Here is the transcoding method I originally used, which converts the files with a PHP program.

Transcoding a database that stores Big5 data through latin1 is more complicated, so this article will not discuss it.

The most important thing before transcoding is to put your target folder under version control, so that after transcoding you can compare the files and confirm that the conversion succeeded.
If something goes wrong, you can also revert the changes.

The program below is very simple, and you should be able to guess what it does by looking at the code: it loops over every file under the directory passed as a parameter, including subdirectories, and for each PHP, JS, or HTML file it converts the encoding to UTF-8 and replaces the charset declaration inside.

To run this program, make sure PHP is installed on your computer. For example, my php executable is at /usr/bin/php. If yours is elsewhere, change the #!/usr/bin/php line at the top of the script to the correct location.

You can find the location of php on your computer with the which php command. For example, a possible result on macOS is:

 which php
/opt/homebrew/bin/php

Of course, your result might also be:

which php
php not found
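
If php lives somewhere other than /usr/bin/php, one common alternative is to let env look it up on your PATH. The following is a minimal sketch and not part of the original script; the file name check.php is hypothetical. It prints which PHP binary is running and reports whether the mbstring extension, which provides the mb_* functions used later in this article, is loaded.

```php
#!/usr/bin/env php
<?php
// check.php (hypothetical name): sanity-check the PHP CLI before transcoding.

// PHP_BINARY holds the path of the PHP executable running this script.
echo "PHP binary: ", PHP_BINARY, "\n";
echo "PHP version: ", PHP_VERSION, "\n";

// mb_check_encoding() and mb_convert_encoding() come from the mbstring extension.
$hasMbstring = extension_loaded('mbstring');
echo "mbstring loaded: ", $hasMbstring ? "yes" : "no", "\n";
```

If the last line reports "no", the transcoding script would fail on its first mb_* call, so it is worth checking this first.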


If PHP is not installed on your computer, you can also use Docker and mount the directory. That is a digression, so this article will not go into it.

Now to the main topic. Let's start!

First, create a file called iconv.php

#!/usr/bin/php
<?php
// Usage: ./iconv.php <directory>
// Recursively converts Big5-encoded .php, .js and .html files to UTF-8 in place.
if (count($argv) > 1) {
    $iterator = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($argv[1]));
    foreach ($iterator as $item) {
        // Skip directories and anything that is not a php, js or html file.
        if (!$item->isFile() || !preg_match('/\.(php|js|html)$/', $item->getFilename())) {
            continue;
        }
        $file = $item->getPathname();
        $output = file_get_contents($file);
        // Content that is not valid UTF-8 is assumed to be Big5.
        if (!mb_check_encoding($output, 'UTF-8')) {
            echo sprintf("Convert big5 file to utf8: %s\n", $item->getFilename());
            $output_utf8 = mb_convert_encoding($output, "UTF-8", "BIG5");
            // Update the declared charset to match the new encoding (case-insensitive).
            $output_utf8 = preg_replace("/charset=big5/i", "charset=utf-8", $output_utf8);
            file_put_contents($file, $output_utf8);
        }
    }
} else {
    echo sprintf("%s [converted directory]\n", $argv[0]);
}
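
The script decides whether to convert a file by asking mb_check_encoding() whether its bytes are already valid UTF-8. The sketch below, using made-up sample strings and a hypothetical helper named needsConversion(), illustrates how that check behaves. It is only an illustration of the heuristic, not an addition to the script.

```php
<?php
// Sample byte strings (hypothetical test data, not taken from real files).
$big5  = "\xA4\xA4\xA4\xE5";          // the characters 中文 encoded as Big5
$utf8  = "\xE4\xB8\xAD\xE6\x96\x87";  // the same characters encoded as UTF-8
$ascii = '<meta charset=big5>';       // pure ASCII, no Chinese characters

// Same test the script applies: anything that is not valid UTF-8 gets converted.
function needsConversion(string $bytes): bool
{
    return !mb_check_encoding($bytes, 'UTF-8');
}

var_dump(needsConversion($big5));   // bool(true)
var_dump(needsConversion($utf8));   // bool(false)
var_dump(needsConversion($ascii));  // bool(false)
```

One consequence worth knowing: a file that contains only ASCII characters is already valid UTF-8, so the script skips it even if it declares charset=big5. Running the script twice is also safe, because files converted on the first run pass the UTF-8 check on the second.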

Second, make this PHP file executable. chmod +x gives the file execute permission.

chmod +x iconv.php


Third, now that we have the transcoding program, we can run it in the terminal.
The following is an example run. Here, the files I want to transcode are in the /var/www/html directory.

./iconv.php /var/www/html
Convert big5 file to utf8: abc.php
Convert big5 file to utf8: test.php
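
If you would like to see which files would be touched before anything is rewritten, a dry run is one option. The following is a rough sketch under the same assumptions as the script (php, js, and html files; anything that is not valid UTF-8 is treated as Big5). The function name findBig5Candidates() and the "Would convert" message are hypothetical and not part of the original program.

```php
<?php
// Dry-run sketch: list files that would be converted, without writing anything.
function findBig5Candidates(string $dir): array
{
    $candidates = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $item) {
        // Only regular files with a php, js or html extension are considered.
        if (!$item->isFile() || !preg_match('/\.(php|js|html)$/', $item->getFilename())) {
            continue;
        }
        $contents = file_get_contents($item->getPathname());
        // Same heuristic as the script: not valid UTF-8 means "probably Big5".
        if ($contents !== false && !mb_check_encoding($contents, 'UTF-8')) {
            $candidates[] = $item->getPathname();
        }
    }
    sort($candidates);
    return $candidates;
}

// When a directory is given on the command line, print the candidates.
if (isset($argv[1]) && is_dir($argv[1])) {
    foreach (findBig5Candidates($argv[1]) as $path) {
        echo "Would convert: $path\n";
    }
}
```

Because this variant never calls file_put_contents(), it can be run on the target directory as often as you like.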

Fourth, after the transcoding is complete, go to the target directory (/var/www/html in my case) and use git diff to check whether anything looks abnormal.
As you can see, this transcoding program not only converts the file encoding but also updates the charset declaration in the code.

diff --git a/abc.php b/abc.php
index 3bee6ed..5797c23 100644
--- a/abc.php
+++ b/abc.php
@@ -2,13 +2,13 @@
<html>
<head>
<title>Page Title</title>
-<meta http-equiv="content-type" content="text/html; charset=big5">
+<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body>
<h1>This is a Heading</h1>
<p>This is a paragraph.</p>
-<A4><A4><A4><E5>
+中文
</body>
</html>
diff --git a/test.php b/test.php
index 3bee6ed..5797c23 100644
--- a/test.php
+++ b/test.php
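
To make the removed line in the abc.php diff less mysterious: the bytes shown as <A4><A4><A4><E5> are the Big5 encoding of the two characters 中文 ("Chinese"). The short sketch below, which is illustrative only, feeds those four bytes through the same mb_convert_encoding() call the script uses.

```php
<?php
// The four bytes that git displayed as <A4><A4><A4><E5> in the removed line.
$big5Bytes = "\xA4\xA4\xA4\xE5";

// Interpret them as Big5 and re-encode them as UTF-8, as the script does.
$utf8 = mb_convert_encoding($big5Bytes, 'UTF-8', 'BIG5');

echo bin2hex($utf8), "\n"; // e4b8ade69687
echo $utf8, "\n";          // 中文
```

The text is the same before and after. Only the bytes used to store it change, which is why a UTF-8 terminal can display the added line but not the removed one.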

 

Tags: big5

