How to convert an existing MySQL database's character set

  • 7012293
  • 05-Feb-2010
  • 29-May-2013

Environment

Novell Service Desk 6.5

Situation

If the language and character set of Novell Service Desk needs to be changed, after a database has already been in use, rather than start with a fresh database, the following steps can be used to convert a MySQL database from one character set to another.

Resolution

In this example, we are converting from latin1 (AKA iso-8859-1) to utf-8. If converting to/from different versions, the appropriate substitutions need to be made.
 
Please read through this before starting this due to the replacing step as this may take a while if there is the need to install cygwin or find a program that will deal with this:
 
Stop the application server. If installed using the Installer, this is the Novell Service Desk service. Otherwise, this is the Tomcat, JBoss, or whatever application server is in use.
 
Make a backup of the database using the company's standard processes, as a precaution.
 
The MySQL client tools need to be installed so that mysql and mysqldump can be run.
 
A command/ssh/terminal prompt should be opened and the following entered, where username is the database username and Novell Service Desk is the database name:
 
(The first line is only for Windows users as they may need to change to the folder where the commands are located. In the case of MySQL server, the following line would do this, where x.y needs to be replaced with the version number.
 
Mac users may also need to include the path before the mysql commands depending on whether it has been added to the computer's PATH. ie. if the mysqldump line cannot find mysqldump. The path of this depends on how MySQL was installed but the default path to put in front of the mysql commands is: /usr/local/mysql/bin/)
 
cd "C:\Program Files\MySQL\MySQL Server x.y\bin"
 
mysqldump -u username -p Novell Service Desk > Novell Service Desk_latin1.sql
 
This will generate a mysql dump file. Note that mysql dumps or backups can be created in a number of ways. However, the way described above should not include the database creation steps at the start of the script so it will not overwrite the database this came from unless specified so when restoring the database.
 
To double check this load the Novell Service Desk_latin1.sql and look for CREATE DATABASE. If this is in the file then this is the wrong dump file type. The safest way to ensure the correct dump file type is created is to follow the command usage stated above precisely.
 
The dump file created from this (Novell Service Desk_latin1.sql) will need loading into a program that can replace all occurrences of CHARSET=latin1 with CHARSET=[the desired character set]. So for this example, we need to replace this with CHARSET=utf8. However it is important that whatever replacing tool used, doesn't damaging the file and leaves the file as plain text. Some text editors may not cope with the long lines that can be involved with MySQL dumps so this needs to be considered.
 
The unix/linux based command called sed can be used for this. This is a native command to Linux and Mac based operating systems and is an efficient way to achieve this goal.
 
To do this on Windows, this can be done using a program called cygwin. This is a freely available program from http://www.cygwin.com/. There are other alternatives to this such as sed conversions for Windows.
 
If using CygWin the following command will change the directory to the bin folder, if this is where the database dump was placed in the previous command, replacing x.y again accordingly:
 
cd c:/Program\ Files/MySQL/MySQL\ Server\ x.y/bin
 
The sed command can be used as follows:
 
sed 's/CHARSET=latin1/CHARSET=utf8/g' Novell Service Desk_latin1.sql > Novell Service Desk_utf8.sql
 
To confirm this has worked, there should be a slightly bigger file called Novell Service Desk_utf8.sql in the folder.
 
If a collation (sort order) has been set if you view the start of this file there will be instructions to create the first table that begin as follows:
 
CREATE TABLE `alert` (
  `ID` int(11) NOT NULL,
  `CONTENT` longtext COLLATE latin1_general_ci,
 
If the 'COLLATE' term is used in this file, which will appear as above for text based fields (as shown) and as 'COLLATE=' at the end of each table then the value for this will probably need updating to the appropriate collation too. In our case, this is utf8_general_ci so the sed command could then be used as follows to replace the collation references:
 
sed 's/latin1_general_ci/utf8_general_ci/g' Novell Service Desk_utf8.sql > Novell Service Desk_utf8b.sql

Note that general refers to the sorting order here and this could be set to a language that the MySQL version supports. To see the collations possible, run the following MySQL query:
 
SHOW COLLATION LIKE 'utf8%';
 
Note that in the above line we are replacing the collation references in the file we have already replaced the character set in, creating another file with a b appended to it's name, which will have both character set and collation updated accordingly.
 
In some rare cases there are separate references to the character set for fields in each table. If this is the case the MySQL command: show create table INCIDENT would show items like this:
 
`CC_LIST` varchar(255) character set latin1
 
If this is true, we would also need to replace those references too.
 
sed 's/character\ set\ latin1/character\ set\ utf8/g' Novell Service Desk_utf8b.sql > Novell Service Desk_utf8c.sql
 
Note that in the above line we are now replacing these extra character set references in the file we replaced the collation references in, creating a 3rd utf8 file. If the collation step wasn't done then replace Novell Service Desk_utf8b.sql with the original Novell Service Desk_utf8.sql file, as appropriate.
 
Next we need to create the new database to replace this. If MySQL has been configured to use a default of utf-8 for new databases, this can be dome using a command/ssh/terminal prompt, by typing the following:
 
mysqladmin -h ServerIPOrName -u root -p create Novell Service Desk_utf8
 
Alternatively, a tool that allows the character encoding to be specified, during creation, such as Navicat, could be used.

Now type in the command/ssh/terminal prompt (Remember to use the last file created from the above steps - if the collation and/or field character set references were replaced, this will be 2nd or 3rd versions as appropriate) :
 
mysql -h ServerIPOrName -u root -p -D Novell Service Desk_utf8 < Novell Service Desk_utf8.sql
 
Now the application server needs to be started, Novell Service Desk needs to be set to use this database under Setup > Database and then the application server restarted again, when prompted.
 
Upon access of Novell Service Desk again at http://your-address/Novell Service Desk/WebObjects/Novell Service Desk.woa/ (note: Do not include Login on the end of this url at this point), it might show the Database selection screen again.
 
If this is the case, please ensure the browser that is set to utf8 encoding and the default English language (either switch browser or set the browser settings back to English if they were set to the other language), and then click Save. This will apply the settings and the login prompt should then appear.
 
The browser can now be set back to the required language and encoding as required.
 
Note that, it may be the case that previously entered characters may need to be re-entered and saved in Novell Service Desk for the full effects of this change to be seen.

Additional Information

The improved diverse character set handling and utilization of utf-8 defaults since version 6.5, makes moving the database to a utf-8 based character set a first thing to try when character set issues are encountered.

Formerly known as 1000901