做每日一题看到题解需要用到前缀树，之前完全没听说过。于是就打算好好学习一下。

前缀树定义

Leetcode #208

Trie（发音类似 “try”）或者说前缀树是一种树形数据结构，用于高效地存储和检索字符串数据集中的键。这一数据结构有相当多的应用情景，例如自动补完和拼写检查。

请你实现 Trie 类：
Trie() 初始化前缀树对象。
void insert(String word) 向前缀树中插入字符串 word 。
boolean search(String word) 如果字符串 word 在前缀树中，返回 true（即，在检索之前已经插入）；否则，返回 false 。
boolean startsWith(String prefix) 如果之前已经插入的字符串 word 的前缀之一为 prefix ，返回 true ；否则，返回 false 。

前缀树实现

不看题解真的不太能理解前缀树到底干了啥，但是看了题解之后就发现前缀树也挺简单的。它可以理解成为一个26叉树，每一个叉对应一个字符。并且每个节点要记录是不是结尾，来判断当前路径能不能构成一个字符串（一个节点可以是结尾，也可以继续指向下一个Trie）

class Trie {
private:
    bool isEnd;
    Trie* next[26];
public:
    /** Initialize your data structure here. */
    Trie() {
        isEnd = false;
        memset(next, 0, sizeof(next)); // 将next数组初始化为0
    }
    
    /** Inserts a word into the trie. */
    void insert(string word) {
        Trie* node = this;
        for (char c : word){
            if (node -> next[c - 'a'] == NULL){
                node -> next[c - 'a'] = new Trie();
            }
            node = node -> next[c - 'a'];
        }
        node -> isEnd = true;
    }
    
    /** Returns if the word is in the trie. */
    bool search(string word) {
        Trie* node = this;
        for (char c : word){
            node = node -> next[c - 'a'];
            if (node == NULL)
                return false;
        }
        return node -> isEnd;
    }
    
    /** Returns if there is any word in the trie that starts with the given prefix. */
    bool startsWith(string prefix) {
        Trie* node = this;
        for (char c : prefix){
            node = node -> next[c - 'a'];
            if (node == NULL)
                return false;
        }
        return true;
    }
};

在官方题解评论中看到一个更好的版本。因为search和startWith前面其实做的事情差不多，所以可以写一个searchPrefix函数来搜索前缀，简化下代码。并且可以用一个数组来记录new生成的node，最后在析构的时候delete掉。

struct Node{
    bool isEnd;
    Node* children[26];
    Node():isEnd(false){
        memset(children,0,sizeof children);
    }
};
class Trie {
private:
    vector<Node*>pool;
    Node* searchPrefix(string prefix) {
        Node* node = pool[0];
        for (char ch : prefix) {
            ch -= 'a';
            if (node->children[ch] == nullptr) {
                return nullptr;
            }
            node = node->children[ch];
        }
        return node;
    }

public:
    Trie() : pool(1) {
        pool[0] = new Node;
    }
    ~Trie(){
        for(auto&&t:pool){
            delete t;
        }
    }
    //每次入新的表需要把它们加入到内存池，方便最后析构函数的清理
    void insert(string word) {
        Node* node = pool[0];
        for (char ch : word) {
            ch -= 'a';
            if (node->children[ch] == nullptr) {
                node->children[ch] = new Node;
                pool.emplace_back(node->children[ch]);
            }
            node = node->children[ch];
        }
        node->isEnd = true;
    }

    bool search(string word) {
        Node* node = this->searchPrefix(word);
        return node != nullptr && node->isEnd;
    }
    bool startsWith(string prefix) {
        return this->searchPrefix(prefix) != nullptr;
    }
};

前缀树应用

单词的压缩编码

Leetcode #820

单词数组 words 的有效编码由任意助记字符串 s 和下标数组 indices 组成，且满足：
words.length == indices.length
助记字符串 s 以 ‘#’ 字符结尾
对于每个下标 indices[i] ，s 的一个从 indices[i] 开始、到下一个 ‘#’ 字符结束（但不包括 ‘#’）的子字符串恰好与 words[i] 相等
给你一个单词数组 words ，返回成功对 words 进行编码的最小助记字符串 s 的长度。

示例 1：

输入：words = [“time”, “me”, “bell”]
输出：10
解释：一组有效编码为 s = “time#bell#” 和 indices = [0, 2, 5] 。
words[0] = “time” ，s 开始于 indices[0] = 0 到下一个 ‘#’ 结束的子字符串，如加粗部分所示 “time#bell#”
words1 = “me” ，s 开始于 indices1 = 2 到下一个 ‘#’ 结束的子字符串，如加粗部分所示 “time#bell#”
words2 = “bell” ，s 开始于 indices2 = 5 到下一个 ‘#’ 结束的子字符串，如加粗部分所示 “time#bell#”

示例 2：
输入：words = [“t”]
输出：2
解释：一组有效编码为 s = “t#” 和 indices = [0] 。

提示：
1 <= words.length <= 2000
1 <= words[i].length <= 7
words[i] 仅由小写字母组成

这题可以用前缀树来做，不过要把单词反转过来，并且要将单词从长到短排序加入到前缀树中。

class Trie{
private:
    Trie* next[26];
public:
    Trie(){
        memset(next, 0, sizeof(next));
    }
    void insert(string word){
        Trie* node = this;
        for (char c : word){
            c -= 'a';
            if (node -> next[c] == NULL) node -> next[c] = new Trie();
            node = node -> next[c];
        }
    }
    bool startWith(string word){
        Trie* node = this;
        for (char c : word){
            c -= 'a';
            node = node -> next[c];
            if (node == NULL)
                return false;
        }
        return true;
    }
};

class Solution {
public:
    int minimumLengthEncoding(vector<string>& words) {
        auto cmp=[&](string &a, string &b){
            return a.size() > b.size();
        };
        sort(words.begin(), words.end(), cmp);
        Trie root;
        int res = 0;
        for (string s : words){
            string r(s);
            reverse(r.begin(),r.end());
            if (root.startWith(r)) continue;
            root.insert(r);
            res += r.size() + 1;
        }
        return res;
    }
};

参考资料

前缀树介绍和实现：
https://leetcode-cn.com/problems/implement-trie-prefix-tree/solution/trie-tree-de-shi-xian-gua-he-chu-xue-zhe-by-huwt/
https://blog.csdn.net/m0_46202073/article/details/107253959
Merkle Patricia Tree:
https://ethfans.org/toya/articles/588

我的「精神家园」

数据结构与算法-前缀树

前缀树定义

前缀树实现

前缀树应用

单词的压缩编码

参考资料