场景1:
将A库数据迁移到B库,其中A库字符集ZHS16GBK,B库字符集AL32UTF8;
场景2:
修改A库字符集,由ZHS16GBK改成AL32UTF8。
问题:
如何提前知道在以上两个场景下,哪些数据会出现问题,例如字段长度溢出等问题?
解决方案:
Oracle有自带的Character Set Scanner(CSSCAN)工具,可以提前收集转换字符集出现错误的数据。
如下:
Oracle使用CSSCAN工具,检查某用户或表字符集转换后是否有错误

一:执行csminst.sql脚本

SQL> @?/rdbms/admin/csminst.sql

二:csscan使用语法:

[oracle@cjcos01 ~]$ csscan -help
Character Set Scanner v2.2 : Release 11.2.0.4.0 - Production on Wed Jun 17 13:46:29 2020
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
You can let Scanner prompt you for parameters by entering the CSSCAN        
command followed by your username/password:                                 
                                                                            
  Example: CSSCAN \"SYSTEM/MANAGER AS SYSDBA\"                          
                                                                            
Or, you can control how Scanner runs by entering the CSSCAN command         
followed by various parameters. To specify parameters, you use keywords:    
                                                                            
  Example:                                                                  
    CSSCAN \"SYSTEM/MANAGER AS SYSDBA\" FULL=y TOCHAR=utf8 ARRAY=1024000 PROCESS=3  
                                                                            
Keyword    Default Prompt Description                                       
---------- ------- ------ ------------------------------------------------- 
USERID             yes    username/password                                 
FULL       N       yes    scan entire database                              
USER               yes    owner of tables to be scanned                     
TABLE              yes    list of tables to scan                            
COLUMN             yes    list of columns to scan                            
EXCLUDE                   list of tables to exclude from scan               
TOCHAR             yes    new database character set name                   
FROMCHAR                  current database character set name               
TONCHAR                   new national character set name                   
FROMNCHAR                 current national character set name               
ARRAY      1024000 yes    size of array fetch buffer                        
PROCESS    1       yes    number of concurrent scan process                 
MAXBLOCKS                 split table if block size exceed MAXBLOCKS        
CAPTURE    N              capture convertible data                          
SUPPRESS                  maximum number of exceptions logged for each table
FEEDBACK                  report progress every N rows                      
BOUNDARIES                list of column size boundaries for summary report 
LASTRPT    N              generate report of the last database scan         
LOG        scan           base file name of report files                    
PARFILE                   parameter file name                               
PRESERVE   N              preserve existing scan results                    
LCSD       N       no     enable language and character set detection       
LCSDDATA   LOSSY   no     define the scope of the detection                 
HELP       N              show help screen (this screen)                    
QUERY      N              select clause to scan subset of tables or columns 
---------- ------- ------ ------------------------------------------------- 
Scanner terminated successfully.

三:使用csscan工具,生成scan.err等信息

[oracle@cjcos01 ~]$ csscan 
Character Set Scanner v2.2 : Release 11.2.0.4.0 - Production on Wed Jun 17 11:47:21 2020
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Username: sys as sysdba
Password: 
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
(1)Full database, (2)User, (3)Table, (4)Column: 1 > 2
Current database character set is ZHS16GBK.
Enter new database character set name: > AL32UTF8
Enter array fetch buffer size: 1024000 > 
Enter number of scan processes to utilize(1..): 1 > 
Enter user name to scan: > CJC
......

四:检查转换字符集后哪些表出现错误,单独处理

[oracle@cjcos01 ~]$ ls
scan.err scan.out scan.txt

五:修改数据库字符集

首先要确定修改后的字符集应该是修改前的超集。
修改数据库字符集方式:

  • 1:执行csalter.plb脚本
  • 2:ALTER DATABASE CHARACTER ...
  • 3: 修改prop$等表(此方法有风险,严禁使用)
    参考Changing the NLS\_CHARACTERSET to AL32UTF8 / UTF8 (Unicode) in 8i, 9i , 10g and 11g (文档 ID 260192.1)
    Oracle 单机修改数据库字符集
10.a.2) Execute the following commands in the database home sqlplus connected as "/ AS SYSDBA":
Sqlplus / as sysdba
SPOOL Nswitch.log
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER SYSTEM ENABLE RESTRICTED SESSION;
ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
ALTER SYSTEM SET AQ_TM_PROCESSES=0;
ALTER DATABASE OPEN;
-- Do not do these steps on a 10g or 11g system
ALTER DATABASE CHARACTER SET INTERNAL_USE AL32UTF8;
SHUTDOWN IMMEDIATE;
-- You NEED to restart the database before doing ANY operation on this database
-- exit this session now do not use the session that did alter database for other operations
EXIT
-- reconnect to the database and start the database
Sqlplus / as sysdba
STARTUP;
-- in 8i you need to do the startup/shutdown 2 times
SHUTDOWN;
STARTUP;

Oracle RAC 修改数据库字符集

If you are using RAC you will need to start the database in single instance with CLUSTER_DATABASE = FALSE
Run in the database home sqlplus connected as "/ AS SYSDBA":
Sqlplus / as sysdba
-- Make sure the CLUSTER_DATABASE parameter is set
-- to false or it is not set at all.
-- If you are using RAC you will need to start the database in single instance
-- with CLUSTER_DATABASE = FALSE
sho parameter CLUSTER_DATABASE
-- if you are using spfile
SELECT DECODE(value, NULL, 'PFILE', 'SPFILE') "Init File Type" FROM sys.v_$parameter WHERE name = 'spfile';
-- note the values for
sho parameter job_queue_processes
sho parameter aq_tm_processes
-- (this is Bug 6005344 fixed in 11g )
-- then do
shutdown
startup restrict
SPOOL Nswitch.log
PURGE DBA_RECYCLEBIN
/
-- next select should only give ONE row - your sqlplus connection
-- If more then one session is connected Csalter will fail and report "Sorry only one session is allowed to run this script".
SELECT sid, serial#, username, status,  osuser, machine, process, program FROM v$session WHERE username IS NOT NULL;
-- do this alter system or you might run into "ORA-22839: Direct updates on SYS_NC columns are disallowed"
-- This is only needed in 11.1.0.6, fixed in 11.1.0.7, not applicable to 10.2 or lower
-- ALTER SYSTEM SET EVENTS '22838 TRACE NAME CONTEXT LEVEL 1,FOREVER';
-- then run Csalter.plb:
 @?/rdbms/admin/csalter.plb
-- Csalter will aks confirmation - do not copy paste the whole actions on one time
-- sample Csalter output:
-- 3 rows created.
...
-- This script will update the content of the Oracle Data Dictionary.
-- Please ensure you have a full backup before initiating this procedure.
-- Would you like to proceed (Y/N)?y
-- old 6: if (UPPER('&conf') <> 'Y') then
-- New 6: if (UPPER('y') <> 'Y') then
-- Checking data validility...
-- begin converting system objects
-- PL/SQL procedure successfully completed.
-- Alter the database character set...
-- CSALTER operation completed, please restart database
-- PL/SQL procedure successfully completed.
...
-- Procedure dropped.
-- if you are using spfile then you need to also
-- ALTER SYSTEM SET job_queue_processes=<original value> SCOPE=BOTH;
-- ALTER SYSTEM SET aq_tm_processes=<original value> SCOPE=BOTH;
SHUTDOWN IMMEDIATE;
-- You NEED to restart the database before doing ANY operation on this database
-- exit this session now. do not use the session where Csalter was runned for other operations.
EXIT
-- reconnect to the database and start the database
Sqlplus / as sysdba
STARTUP;

作者:chenoracle

最后修改:2022 年 02 月 22 日
如果觉得我的文章对你有用,请随意赞赏