处理生僻字乱码:JPA框架对于Oracle的NVarchar2,NChar,NClob类型支持

2024-07-16#coding#oracle#生僻字#JPA#java阅读时间约 7 分钟

一、背景说明

oracle数据库因字符集问题,varchar2类型无法保存部分生僻汉字,保存后生僻汉字会变成“?”。

解决方案:

oracle数据库在解决这个问题上,可采用如下方法解决:

1.将varchar2,clob,char类型改成nvarchar2,nclob,nchar类型

2.另外在保存生僻字时需要将字符串转为unicode编码再进行保存,否则生僻字保存进去仍然不会生效。

保存方法:

因保存生僻字需要转换unicode编码再保存,所以nvarchar2,nchar及nclob保存时与一般方法有一定差异。

1)sql保存方法如下:

在字段前加n

updateset 字段 = n'值' where 条件 = ''

2)jdbc保存方法

// nvarchar2类型使用setNString()方法
java.sql.PreparedStatement.setNString();
 
// nchar类型同样使用setNString()方法
java.sql.PreparedStatement.setNString();
 
// nclob类型使用setNClob()方法
java.sql.PreparedStatement.setNClob();

二、存在问题

因JPA底层已经封死,当判断类型为String类型时,会默认都使用java.sql.PreparedStatement.setString()方法写入,导致就算数据库类型已经修改,还是不能正确写入生僻字。

下面介绍两种处理方案。

三、JPA框架支持N类型处理方案

1.自定义TypeDescriptor和自定义方言(全局处理,推荐该方案)

1.1.自定义TypeDescriptor

Hibernate在最终设值时会通过TypeDescriptor类进行设值,可参考类:VarcharTypeDescriptor.getBinder(…)

展开查看:VarcharTypeDescriptor部分源码
public class VarcharTypeDescriptor implements SqlTypeDescriptor {
    @Override
    public <X> ValueBinder<X> getBinder(final JavaTypeDescriptor<X> javaTypeDescriptor) {
        return new BasicBinder<X>( javaTypeDescriptor, this ) {
            @Override
            protected void doBind(PreparedStatement st, X value, int index, WrapperOptions options) throws SQLException {
                // 通过该逻辑给sql设值
                st.setString( index, javaTypeDescriptor.unwrap( value, String.class, options ) );
            }
 
            @Override
            protected void doBind(CallableStatement st, X value, String name, WrapperOptions options)
                    throws SQLException {
                st.setString( name, javaTypeDescriptor.unwrap( value, String.class, options ) );
            }
        };
    }
}

那么,我们只需要继承这个类,加入我们的逻辑,通过判断是否生僻字,即可在setString和setNString中切换。

首先我们增加一个用于判断编码是否属于生僻字的工具类

NstringSupportUtils

展开查看:NstringSupportUtils工具类源码
import lombok.SneakyThrows;
 
public class NstringSupportUtils {
 
    /**
     * <p>编码是否属于生僻字</p>
     * @author VincentHo
     * @param str
     * @return boolean
     */
    public static boolean asUnicode(String str) {
        return isLessUseWord(str);
    }
 
    @SneakyThrows
    private static boolean isLessUseWord(String str) {
        return !str.equals(new String(str.getBytes("gb18030"), "gbk"));
    }
 
}

然后我们通过继承VarcharTypeDescriptor来增加生僻字的支持,通过工具类判断该走setNString()还是setString()

生僻字判断核心代码片段:

// doBind()方法内的部分片段
if(value != null && NstringSupportUtils.asUnicode(value.toString())) {
    st.setNString( index, javaTypeDescriptor.unwrap( value, String.class, options ) );
} else {
    st.setString( index, javaTypeDescriptor.unwrap( value, String.class, options ) );
}

全部代码如下:

展开查看源码
import com.gdbyway.bpg.ip.common.persistence.infrastructure.util.NstringSupportUtils;
import org.hibernate.type.descriptor.ValueBinder;
import org.hibernate.type.descriptor.WrapperOptions;
import org.hibernate.type.descriptor.java.JavaTypeDescriptor;
import org.hibernate.type.descriptor.sql.BasicBinder;
import org.hibernate.type.descriptor.sql.VarcharTypeDescriptor;
import java.sql.*;
 
public class CustomVarcharTypeDescriptor extends VarcharTypeDescriptor {
    public static final CustomVarcharTypeDescriptor INSTANCE = new CustomVarcharTypeDescriptor();
 
    public CustomVarcharTypeDescriptor() {
        super();
    }
 
    @Override
    public <X> ValueBinder<X> getBinder(final JavaTypeDescriptor<X> javaTypeDescriptor) {
        return new BasicBinder<X>( javaTypeDescriptor, this ) {
            @Override
            protected void doBind(PreparedStatement st, X value, int index, WrapperOptions options) throws SQLException {
                // 用于判断如果属于生僻字就走setNString(),否则走setString()
                if(value != null && NstringSupportUtils.asUnicode(value.toString())) {
                    st.setNString( index, javaTypeDescriptor.unwrap( value, String.class, options ) );
                } else {
                    st.setString( index, javaTypeDescriptor.unwrap( value, String.class, options ) );
                }
            }
 
            @Override
            protected void doBind(CallableStatement st, X value, String name, WrapperOptions options)
                    throws SQLException {
                if(value != null && NstringSupportUtils.asUnicode(value.toString())) {
                    st.setNString( name, javaTypeDescriptor.unwrap( value, String.class, options ) );
                } else {
                    st.setString( name, javaTypeDescriptor.unwrap( value, String.class, options ) );
                }
 
            }
        };
    }
 
}

其余的nchar和nclob同样道理,不再赘述,直接贴代码:

展开查看源码:CustomCharTypeDescriptor.java
import org.hibernate.type.descriptor.ValueBinder;
import org.hibernate.type.descriptor.java.JavaTypeDescriptor;
import org.hibernate.type.descriptor.sql.CharTypeDescriptor;
 
public class CustomCharTypeDescriptor extends CharTypeDescriptor {
    public static final CustomCharTypeDescriptor INSTANCE = new CustomCharTypeDescriptor();
 
    public CustomCharTypeDescriptor() {
        super();
    }
 
    @Override
    public <X> ValueBinder<X> getBinder(final JavaTypeDescriptor<X> javaTypeDescriptor) {
        // char直接用varchar的处理方法
        return CustomVarcharTypeDescriptor.INSTANCE.getBinder(javaTypeDescriptor);
    }
 
}
展开查看源码:CustomClobTypeDescriptor.java
import com.gdbyway.bpg.ip.common.persistence.infrastructure.util.NstringSupportUtils;
import org.hibernate.engine.jdbc.CharacterStream;
import org.hibernate.type.descriptor.WrapperOptions;
import org.hibernate.type.descriptor.java.JavaTypeDescriptor;
import org.hibernate.type.descriptor.sql.BasicBinder;
import org.hibernate.type.descriptor.sql.ClobTypeDescriptor;
import java.sql.*;
 
public class CustomClobTypeDescriptor extends ClobTypeDescriptor {
    public static final CustomClobTypeDescriptor INSTANCE = new CustomClobTypeDescriptor();
 
    public CustomClobTypeDescriptor() {
        super();
    }
 
    @Override
    public <X> BasicBinder<X> getClobBinder(final JavaTypeDescriptor<X> javaTypeDescriptor) {
        return new BasicBinder<X>( javaTypeDescriptor, this ) {
            @Override
            protected void doBind(PreparedStatement st, X value, int index, WrapperOptions options)
                    throws SQLException {
                if ( options.useStreamForLobBinding() ) {
                    if(value != null && NstringSupportUtils.asUnicode(value.toString())) {
                        bindNForStream(javaTypeDescriptor, st, value, index, options);
                    } else {
                        bindForStream(javaTypeDescriptor, st, value, index, options);
                    }
                } else {
                    if(value != null && NstringSupportUtils.asUnicode(value.toString())) {
                        st.setNClob( index, javaTypeDescriptor.unwrap( value, NClob.class, options ) );
                    } else {
                        st.setClob( index, javaTypeDescriptor.unwrap( value, Clob.class, options ) );
                    }
                }
            }
 
            @Override
            protected void doBind(CallableStatement st, X value, String name, WrapperOptions options)
                    throws SQLException {
                if ( options.useStreamForLobBinding() ) {
                    if(value != null && NstringSupportUtils.asUnicode(value.toString())) {
                        bindNForStream(javaTypeDescriptor, st, value, name, options);
                    } else {
                        bindForStream(javaTypeDescriptor, st, value, name, options);
                    }
                } else {
                    if(value != null && NstringSupportUtils.asUnicode(value.toString())) {
                        st.setNClob( name, javaTypeDescriptor.unwrap( value, NClob.class, options ) );
                    } else {
                        st.setClob( name, javaTypeDescriptor.unwrap( value, Clob.class, options ) );
                    }
                }
            }
        };
    }
 
    private <X> void bindForStream(final JavaTypeDescriptor<X> javaTypeDescriptor, PreparedStatement st, X value, int index, WrapperOptions options) throws SQLException {
        final CharacterStream characterStream = buildCharacterStream(javaTypeDescriptor, value, options);
        st.setCharacterStream( index, characterStream.asReader(), characterStream.getLength() );
    }
 
    private <X> void bindForStream(final JavaTypeDescriptor<X> javaTypeDescriptor, CallableStatement st, X value, String name, WrapperOptions options) throws SQLException {
        final CharacterStream characterStream = buildCharacterStream(javaTypeDescriptor, value, options);
        st.setCharacterStream( name, characterStream.asReader(), characterStream.getLength() );
    }
 
    private <X> void bindNForStream(final JavaTypeDescriptor<X> javaTypeDescriptor, PreparedStatement st, X value, int index, WrapperOptions options) throws SQLException {
        final CharacterStream characterStream = buildCharacterStream(javaTypeDescriptor, value, options);
        st.setNCharacterStream( index, characterStream.asReader(), characterStream.getLength() );
    }
 
    private <X> void bindNForStream(final JavaTypeDescriptor<X> javaTypeDescriptor, CallableStatement st, X value, String name, WrapperOptions options) throws SQLException {
        final CharacterStream characterStream = buildCharacterStream(javaTypeDescriptor, value, options);
        st.setNCharacterStream( name, characterStream.asReader(), characterStream.getLength() );
    }
 
    private final <X> CharacterStream buildCharacterStream(final JavaTypeDescriptor<X> javaTypeDescriptor, X value, WrapperOptions options) {
        final CharacterStream characterStream = javaTypeDescriptor.unwrap(
                value,
                CharacterStream.class,
                options
        );
        return characterStream;
    }
 
}

1.2.自定义方言

那么自定义TypeDescriptor都完成了,如何应用到呢?

hibernate为了适应各种数据库的方言,有一个方言的定义类:Dialect,所有数据库方言都继承于这个类进行扩展,这个类上有一个方法:getSqlTypeDescriptorOverride(int sqlCode),就是用于在处理类型时返回对应的TypeDescriptor

所以,我们只需要定义一个自定义方言,重写一下这个方法,就可以使用我们自定义的TypeDescriptor,代码如下:

展开查看源码:CustomOracleDialect.java
import com.gdbyway.bpg.ip.common.persistence.infrastructure.type.CustomCharTypeDescriptor;
import com.gdbyway.bpg.ip.common.persistence.infrastructure.type.CustomClobTypeDescriptor;
import com.gdbyway.bpg.ip.common.persistence.infrastructure.type.CustomVarcharTypeDescriptor;
import org.hibernate.dialect.Oracle10gDialect;
import org.hibernate.type.descriptor.sql.SqlTypeDescriptor;
import java.sql.Types;
 
// 这里继承的方言类建议跟项目原来的方言类一致,避免出现一些兼容问题
public class CustomOracleDialect extends Oracle10gDialect {
 
    public CustomOracleDialect() {
        super();
    }
 
    @Override
    protected SqlTypeDescriptor getSqlTypeDescriptorOverride(int sqlCode) {
        switch (sqlCode) {
            // 根据类型返回对应的自定义TypeDescriptor
            case Types.VARCHAR : return CustomVarcharTypeDescriptor.INSTANCE;
            case Types.CHAR : return CustomCharTypeDescriptor.INSTANCE;
            case Types.CLOB : return CustomClobTypeDescriptor.INSTANCE;
            default: return super.getSqlTypeDescriptorOverride(sqlCode);
        }
    }
}

1.3.配置并启用

最后,我们只需要配置上我们上面定义的自定义方言,就可以全局启用生僻字识别了。

展开查看application.yaml配置
spring:
    jpa:
      properties:
        hibernate:
          dialect: com.gdbyway.bpg.ip.common.persistence.infrastructure.dialect.CustomOracleDialect

2.增加@Nationalized注解(不推荐)

在持久实体(PO)的nvarchar类型的字段上增加@Nationalized注解。

如:

import org.hibernate.annotations.Nationalized;
 
@Entity
@Table(name = "ipn_visit")
public class VisitPO {
 
    /** 主键 */
    @Id
    private String pid;
     
    /** 患者姓名,此字段类型为nvarchar2,增加@Nationalized注解后保存生僻字就不会乱码 */
    @Nationalized
    private String patName;
}

该方案改动虽简单,但是维护相当繁琐,因为每次修改数据库字段后,都需要相应的修改程序发布,不能自动全局识别。