使用 bitset 进行字符串匹配

2025/7/10 OI 字符串 bitset

背景

2025-07-10 的时候，随机跳题跳到一道字符串的题，一开始以为是 FFT，思考后未果，瞟了眼题解发现了这个神秘的字符串匹配方法。

题面描述

P4465 [国家集训队] JZPSTR
维护一个字符串，需要支持区间删除，从某个位置插入字符串，查询某个区间与给定字符串的匹配数量。

这道题本身正解是 SAM+分块链表，很显然我不会，代码长达 9kb。但是有神仙发现 bitset 可过，于是便有了这篇博客。
~~原本是黑题，似乎因为这个做法降紫了。~~

具体做法

首先，我们发现此题字符集很小，仅有 \(m=10\) 种字符。我们对每种字符开一个长度为 \(N\) 的 bitset，记作 \(b_{i}\)，然后我们遍历字符串 \(S\)，将 \(b_{S_{i}}\) 的第 \(i\) 位设为True。

插入操作

设插入的字符串为 \(S\)。不妨暴力一点，我们对于每种字符，我们将 \(x\) 及以后的位置均往后挪动 \(\left|S\right|\) 位。然后再按照前面所说的暴力去设置新的位置的值即可。
总共的时间复杂度为 \(O(\dfrac{N\sum\left|S\right|}{w})\)。实际上还要小很多。

删除操作

类似于前面，我们只需要把前 \(x-1\) 位提取出来，再把第 \(y\) 位及以后的拿出来，两个拼到一起就行了。时间复杂度 \(O(\dfrac{N}{w})\)。

匹配

设匹配的字符串为 \(S\)。

我们一样暴力一点，尝试一位一位来进行匹配，我们先设置一个 bitset，称作 \(ans\)，其 \([x,y-\left|S\right|+1]\) 为True，其他位置均为False，True 表示以该位置为起点可以匹配上。

然后我们枚举 \(S\) 的每个字符 \(S_{i}\)，不难发现假如某个起始位置能匹配上该字符，当且仅当 \(b_{S_{i}}\) 右移 \(i\) 位后与上该起始位置仍为True。下标从 \(0\) 开始。

最后整个 \(ans\) 中True的数量即为匹配数量。

时间复杂度 \(O(\dfrac{N\sum\left|S\right|}{w})\)。

一些细节

以上操作涉及到对 bitset 以及位运算的较熟练运用。

比如将 \([x,N-1]\) 这段区间提取出来，可以先设置一个全为True的 bitset，将其左移 \(x\) 位再与上原数组便是这段区间。
类似的，提取 \([0,x-1]\) 就是上述操作，但是左移后先取反再与。

数True的数量可以用 bitset 的count()函数简单实现。

代码

// Problem: P4465 [国家集训队] JZPSTR
// Contest: Luogu
// URL: https://www.luogu.com.cn/problem/P4465
// Memory Limit: 250 MB
// Time Limit: 3000 ms
// 
// Powered by CP Editor (https://cpeditor.org)

#include<bits/stdc++.h>
using namespace std;
const int MAXN=1e6+10;
const int mod1=1e9+7;
const int mod2=998244353;
const int inf_int=0x7f7f7f7f;
const long long inf_long=0x7f7f7f7f7f7f7f7f;
const double eps=1e-9;
char Buf[1<<23],*P1=Buf,*P2=Buf;
#define getchar() (P1==P2&&(P2=(P1=Buf)+fread(Buf,1,1<<23,stdin),P1==P2)?EOF:*P1++)
template<typename type>
inline void read(type &x){
    x=0;
    bool f=false;
    char ch=getchar();
    while(ch<'0'||ch>'9') f|=ch=='-',ch=getchar();
    while(ch>='0'&&ch<='9') x=x*10+(ch^48),ch=getchar();
    if(f) x=-x;
}
template<typename type,typename... args>
inline void read(type &x,args&... y){
    read(x),read(y...);
}

int m;
bitset<MAXN> b[10],all;

signed main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);
    cin>>m;
    for(int i=0;i<=1000000;i++) all[i]=1;
    for(int i=1;i<=m;i++){
        int opt,x,y;string s;
        cin>>opt;
        if(opt==0){
            cin>>x>>s;
            for(int i=0;i<10;i++) b[i]=(((all<<x)&b[i])<<s.length())|((~(all<<x))&b[i]);
            for(int i=0;i<s.length();i++) b[s[i]-'0'][i+x]=1;
        }
        if(opt==1){
            cin>>x>>y;
            for(int i=0;i<10;i++) b[i]=(((all<<y)&b[i])>>(y-x))|((~(all<<x))&b[i]);
        }
        if(opt==2){
            cin>>x>>y>>s;
            if(y-x<s.length()){
                cout<<0<<'\n';
                continue;
            }
            bitset<MAXN> ans;
            ans=(all<<x)^(all<<y-s.length()+1);
            for(int i=0;i<s.length();i++) ans&=(b[s[i]-'0']>>i);
            cout<<ans.count()<<'\n';
        }
    }
    return 0;
}

感觉现在怎么越写题解越不像人话了。（？
有看不懂或者或者发现错误的，可以在评论区指出。

LOADING