最近做一个项目,因为采集错误,和采集程序过滤不严,导致重复插入多条记录。
本来应该四万条数据,结果插入了十三万数据。下面是我处理的方法和SQL语句。
查询重复的记录数:
select user_name,count(*) as count from user_table group by user_name having count>1;
批量删除重复记录:
delete t from user_table t left join(select user_name,min(id) as min_id from user_table group by user_name) t1 on t.id=t1.min_id where t1.min_id is null;
user_name 是查询的字段名 user_table 是查询的表名