如何从discuz6.0迁移至blogger(通过wordpress中转)

由于之前的网站域名为个人备案,根据规定,个人备案不能放置论坛和博客:

个人备案不能做博客,不能出现“博客”字眼,不能有”评论“、“登录”等功能。 个人备案网站名必须包含备案时填写的网站名,必须完整,建议后跟网站备案名。 主体下所有备案域名必须解析到主体所在的服务商的服务器IP。 网站首页底部必须添加备案号及版权所有。 控制页面内容,不要出现违法关键词。 域名持有人与备案主体人信息一致。

也因此,为了满足个人备案的条件,决定把三思屋社区看天博客迁走,所以有了这篇文章。

本来没打算写这篇文章,这样的经历估计没有多少人会有——我习惯于用比较古老的discuz6.0,看了下目前网络上已经没有了相关的资料,只能自己动手处理了。本文先介绍一下如何从discuz6.0转入blogger

处理discuz迁移主要思路是,从数据库中导出符合wordpress格式的csv文件,导入wordpress中转,然后再通过wordpress的插件,导入至blogger

准备工作

安装wordpress,并安装really-simple-csv-importer插件和export-to-blogger插件

接着如果discuz所在的mysql版本是5.x,需要准备mysql 8.0库,并将discuz导入新库

数据整理

执行sql对discuz的cdb_posts表进行查询,导出新表

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
SELECT 
p.tid post_id
,CONCAT('thread-',p.tid,'-1-1') post_name
,p.author post_author
,FROM_UNIXTIME(p.dateline) post_date
,'post' post_type
,'publish' post_status
,p.`subject` post_title
,REPLACE(REPLACE(p.message, char(13), '<br />'), char(10), '<br />') post_content
,f.`name` post_category
,'' post_tags
,useip custom_field
FROM cdb_posts p
LEFT JOIN cdb_forums f ON f.fid = p.fid
WHERE p.first = 1 AND invisible = 0 AND  p.fid IN (28,24,27,25,17,16,12,99,18,26,96)

对discuz文章中的ubb标签通过mysql的REGEXP_REPLACE正则替换函数,转为html标签 这一步需要在mysql8.0库中进行

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
-- 测试语句
SELECT post_content,
REGEXP_REPLACE(post_content,'(\[img\])(.*?)(\[/img\])','<img src="$2" />') post_content2
FROM cdb_posts
WHERE post_content like '%[img%';

-- 实际替换ubb符号
UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[url\])(.*?)(\[/url\])','<a href="$2" target="_blank">$2</a>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[url=(.*?)\])(.*?)(\[/url\])','<a href="$2" target="_blank">$3</a>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[img\])(.*?)(\[/img\])','<img src="$2" />');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[img=(.*?),(.*?)\])(.*?)(\[/img\])','<img width="$2" height="$3" src="$4" />');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[quote\])(.*?)(\[/quote\])','<quote>$2</quote>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[color=(.*?)\])(.*?)(\[/color\])','$3');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[size=(.*?)\])(.*?)(\[/size\])','$3');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[font=(.*?)\])(.*?)(\[/font\])','$3');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[align=(.*?)\])(.*?)(\[/align\])','<p align="$2">$3</p>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[b\])(.*?)(\[/b\])','<strong>$2</strong>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[i\])(.*?)(\[/i\])','<i>$2</i>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[u\])(.*?)(\[/u\])','<u>$2</u>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[list\])(.*?)(\[/list\])','<ul>$2</ul>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[list=1\])(.*?)(\[/list\])','<ul type="1" style="list-style-type: decimal">$2</ul>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[\*\])','<li>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[free\])(.*?)(\[/free\])','<div class="free">$2</div>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[code\])(.*?)(\[/code\])','<pre><code>$2</code><pre>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[hide\])(.*?)(\[/hide\])','<div>此部分内容未通过审核已移除</div>');

UPDATE cdb_posts
SET post_content = REGEXP_REPLACE(post_content,'(\[hide=(.*?)\])(.*?)(\[/hide\])','<div>此部分内容未通过审核已移除</div>');

将处理后的cdb_posts表导出为csv

导入wordpress中转

将上一步中导出的csv文件,通过really-simple-csv-importer插件导入wordpress

接着使用export-to-blogger插件导出适用于blogger的xml文件

导入blogger

将上一步中导出的xml文件,从blogger后台导入。

好了,去喝杯咖啡吧😄

坚持原创技术分享,您的支持将鼓励我继续创作!