04 Apr 2020

String

本篇将根据String.java进行记录String中需要注意的方法、变量。

String基本内容

属性

String类中存储一个声明为value的char数组，int类型的hash，long类型的序列化ID

/** The value is used for character storage. */
private final char value[];

/** Cache the hash code for the string */
private int hash; // Default to 0

/** use serialVersionUID from JDK 1.0.2 for interoperability */
private static final long serialVersionUID = -6849794470754667710L;

构造函数

无参构造函数，实例化一个空字符串。
可以直接传入StringBuffer进行实例化String对象。

String.charAt（int index）

String的length()，isEmpty()函数比较基础，就是对char数据的判断。

charAt() 会对index进行判断，如果超出了value的下标范围，就会抛出异常。

    public char charAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index];
    }

String.codePointAt（int index）

返回Unicode编码数

String s = "123";
System.out.println(s.codePointAt(1));
// 50

codePointBefore(int index)

返回前一个数的unicode编码。
codePointCount(int beginIndex, int endIndex)

返回开始至末尾的码点数。

String.getChars&getBytes

String.getChars()

/**
* The characters are copied into the
* subarray of {@code dst} starting at index {@code dstBegin}
* and ending at index:
* <blockquote><pre>
*     dstBegin + (srcEnd-srcBegin) - 1
* </pre></blockquote>
*/

从String的dstBegin到dstBegin + (srcEnd-srcBegin) - 1位置复制到dst数组中。

getChars(int srcBegin, int srcEnd, char dst[], int dstBegin)

从String的dstBegin开始的位置复制到dst数组中。

getChars(char dst[], int dstBegin)

String.getBytes()

返回字符对应的字节，默认采用UTF-8

/**
* Encodes this {@code String} into a sequence of bytes using the named
* charset, storing the result into a new byte array.
*/
public byte[] getBytes(String charsetName)
    throws UnsupportedEncodingException {
    if (charsetName == null) throw new NullPointerException();
    return StringCoding.encode(charsetName, value, 0, value.length);
}

String.equals()

重写了Object的方法，通过对每个字符进行比较。

contentEquals（StringBuffer）: 直接与StringBuffer进行比较。
equalsIgnoreCase(String anotherString) 忽略大小写比较。

String.hashCode()

重写了Object的方法。

String hash值计算：h = 31 * h + val[i];

至每个位的hash值为== {31 * 上一个位的hash + 当前的ASCII值}

public int hashCode() {
    int h = hash; // default == 0
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

参考：

关于Java中String类的hashCode方法

String.compareTo()

/*
	Compares two strings lexicographically.
*/
public int compareTo(String anotherString)
public int compareToIgnoreCase(String str) 

返回字典序的比较。

String.regionMatches()

/*
The substring of this{@code String} object to be compared begins at index {@code toffset}
and has length {@code len}. The substring of other to be compared
begins at index {@code ooffset} and has length {@code len}.
调用方的toffset开始与other字符串的ooffset开始，比较至len
*/
public boolean regionMatches(boolean ignoreCase, int toffset,
        String other, int ooffset, int len)
// ignoreCase == true 则忽略大小写。

regionMatches() 方法用于比较两个字符串在一个区域内是否相等。

String.startsWith()

public boolean startsWith(String prefix, int toffset)

比较从toffset起，是否比配前缀prefix

endsWith(String suffix)

比较后缀。

String.indexOf()

所有方法都是获得某个字符或者字符串对应的第一个或者最后一个下标。

共有14个相关方法。

public int indexOf(String str) 

String.substring()

public String substring(int beginIndex, int endIndex)
public String substring(int beginIndex)
public CharSequence subSequence(int beginIndex, int endIndex)

获取 [ beginIndex ， endIndex/length) 位置的子串

==后续补充==

String.split()

Java split()用法

单个符号作为分割：

String[] splitAddress=address.split("\\"); // `\`
String[] splitAddress=address.split("\\|"); // `|`
String[] splitAddress=address.split("\\*"); // `*`;`^`;`:`;`.`;`@`;`,` 同理，前面需要加\\

多符号作为分割：

如果使用多个分隔符则需要借助符号，但需要转义符的仍然要加上分隔符进行处理

String address="上海^上海市@闵行区#吴中路";
String[] splitAddress=address.split("\\^|@|#"); //分隔符为^ @ #