Using boost::wregex in VC++ to regex-match Chinese text without garbled output, wrapped up as a function.

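I'm new to C++ and spent a whole day fighting with regular expressions. I tried everything: C regex, C++ regex, Boost regex. It gave me a real headache.

Matching Chinese content kept producing garbled text, but persistence paid off and I finally worked out an approach that works. I'm writing it up here to share.

Experts, please go easy on me.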
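For context, one likely reason a byte-oriented regex garbles Chinese text is that every Chinese character occupies several bytes in a narrow string, so `.` or a character class can match only part of a character. The sketch below illustrates this with the standard library's `std::regex` and `std::wregex` rather than Boost, and it assumes UTF-8 bytes for portability (under the GBK code page the character would be two bytes, with the same effect). The function and constant names are made up for this illustration.

```cpp
#include <regex>
#include <string>

// Returns true if `pattern` matches the whole narrow (byte) string.
bool narrow_full_match(const std::string& text, const std::string& pattern) {
    std::regex re(pattern);
    return std::regex_match(text, re);
}

// Returns true if `pattern` matches the whole wide string.
bool wide_full_match(const std::wstring& text, const std::wstring& pattern) {
    std::wregex re(pattern);
    return std::regex_match(text, re);
}

// The character U+4E2D as three UTF-8 bytes (assumed encoding for this demo).
const std::string kNarrowZhong = "\xE4\xB8\xAD";
// The same character as a single wide character.
const std::wstring kWideZhong = L"\u4E2D";
```

With these definitions, `narrow_full_match(kNarrowZhong, ".")` is false because `.` consumes a single byte, while `wide_full_match(kWideZhong, L".")` is true. That is why the function in this post converts both the pattern and the input to wide strings before matching.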

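#include <locale.h>
#include <stdlib.h>
#include <string>
#include <vector>
#include <windows.h>
#include <boost/regex.hpp>

wchar_t* AnsiToUnicode( const char* szStr );  // defined below

// Matches the whole of str against regstr and returns capture group 1.
// Returns an empty string if a conversion fails or there is no match.
std::string getvalue(const char *regstr, const char *str)
{
    // Use the system default locale so mbstowcs/wcstombs follow the ANSI code page.
    setlocale( LC_CTYPE, "" );

    // Convert the pattern to wide characters and free the new[] buffer right away.
    wchar_t *regexstr = AnsiToUnicode(regstr);
    if (regexstr == NULL)
        return std::string();
    std::wstring wsPattern(regexstr);
    delete [] regexstr;
    boost::wregex wrg(wsPattern);

    // Convert the input to wide characters; the first call only measures.
    size_t iWLen = mbstowcs( NULL, str, 0 );
    if (iWLen == (size_t)-1)
        return std::string();
    std::vector<wchar_t> wbuf(iWLen + 1);
    mbstowcs( &wbuf[0], str, iWLen + 1 );  // +1 so the terminator is written
    std::wstring wsToMatch(&wbuf[0]);

    boost::wsmatch wsm;
    if (!boost::regex_match( wsToMatch, wsm, wrg ) || wsm.size() < 2)
        return std::string();

    // Convert capture group 1 back to a multibyte string.
    std::wstring wsResult = wsm[1].str();
    size_t iLen = wcstombs( NULL, wsResult.c_str(), 0 );
    if (iLen == (size_t)-1)
        return std::string();
    std::vector<char> buf(iLen + 1);
    wcstombs( &buf[0], wsResult.c_str(), iLen + 1 );  // +1 so the '\0' is written
    return std::string(&buf[0]);
}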
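To show the matching step on its own, here is a minimal sketch of pulling capture group 1 out of a wide string. It uses the standard library's `std::wregex` as a stand-in for `boost::wregex` so that it builds without Boost; `first_group` is a hypothetical helper name, not part of the function above.

```cpp
#include <regex>
#include <string>

// Returns capture group 1 if `pattern` matches the whole of `text`,
// otherwise an empty string.
std::wstring first_group(const std::wstring& text, const std::wstring& pattern) {
    std::wregex re(pattern);
    std::wsmatch m;
    if (std::regex_match(text, m, re) && m.size() > 1) {
        return m[1].str();
    }
    return std::wstring();
}
```

For example, `first_group(L"name=\u6B27\u9633", L"name=(.+)")` returns the two Chinese characters after `name=`. Note that `regex_match` only succeeds when the pattern covers the entire input; to find a pattern inside a longer string, `regex_search` (available in both the standard library and Boost) is probably the better fit.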

// Converts a single-byte (ANSI) char* string to a wide wchar_t* string.
// The caller owns the returned buffer and must release it with delete[].
wchar_t* AnsiToUnicode( const char* szStr )
{
    int nLen = MultiByteToWideChar( CP_ACP, MB_PRECOMPOSED, szStr, -1, NULL, 0 );
    if (nLen == 0)
    {
        return NULL;
    }
    wchar_t* pResult = new wchar_t[nLen];
    MultiByteToWideChar( CP_ACP, MB_PRECOMPOSED, szStr, -1, pResult, nLen );
    return pResult;
}
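If you would rather not hand raw `new[]` buffers around, the same measure-then-convert pattern can be wrapped in helpers that return `std::wstring` and `std::string`. This is only a sketch: `to_wide` and `to_narrow` are hypothetical names, the conversion follows whatever C locale is currently set (so `setlocale` still has to be called first for Chinese text), and it relies on `mbstowcs`/`wcstombs` accepting a NULL destination to report the required length, which the Microsoft CRT and POSIX both document.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Converts a multibyte string to wide characters using the current C locale.
// Returns false if the input is not valid in that locale.
bool to_wide(const std::string& in, std::wstring& out) {
    std::size_t n = std::mbstowcs(NULL, in.c_str(), 0);  // measure only
    if (n == static_cast<std::size_t>(-1)) return false;
    std::vector<wchar_t> buf(n + 1);                     // room for the terminator
    std::mbstowcs(&buf[0], in.c_str(), n + 1);
    out.assign(&buf[0], n);
    return true;
}

// Converts a wide string back to multibyte using the current C locale.
// Returns false if some character cannot be represented.
bool to_narrow(const std::wstring& in, std::string& out) {
    std::size_t n = std::wcstombs(NULL, in.c_str(), 0);  // measure only
    if (n == static_cast<std::size_t>(-1)) return false;
    std::vector<char> buf(n + 1);                        // room for the '\0'
    std::wcstombs(&buf[0], in.c_str(), n + 1);
    out.assign(&buf[0], n);
    return true;
}
```

Because the buffers live in `std::vector`, nothing leaks on an early return, and the caller never has to remember a matching `delete[]`.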

About 欧阳逍遥
