Convert ASCII and UTF-8 to non-special characters with one function

Developer https://www.devze.com 2023-04-08 11:40 Source: web

So I'm building a website that is using a database feed that was already set up and has been used by the client for all their other websites for quite some time.

They fill this database through an external program, and I have no way to change the way I get my data.

Now I have the following problem: sometimes I get strings in UTF-8 and sometimes in ASCII (I hope I've got these terms right, they're still a bit vague to me sometimes).

So I could get either Sc&eacute;nic or Scénic.

The problem is that I have to convert this to non-special characters (so it would become Scenic) for URLs.

I don't think there's a function for converting é to e (if there is, do tell), so I'll probably need to create an array containing all the sources and destinations. The bigger problem is converting &eacute; to é without breaking é when it comes through that function.

Or should I just create an array containing everything (for example, array('é'=>'e','&eacute;'=>'e'), etc.)?

I know how to get &eacute; to é by doing utf8_encode(html_entity_decode('&eacute;')); however, putting é through this same function will return Ã©.

Maybe I'm approaching this the wrong way, but in that case I'd love to know how I should approach it.
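
The double encoding described above happens because utf8_encode() assumes its input is ISO-8859-1, so text that is already UTF-8 gets encoded a second time. A minimal sketch of one way around it, assuming the two incoming forms really are HTML entities and raw UTF-8 (which the use of html_entity_decode suggests): decode entities straight to UTF-8 and skip utf8_encode() entirely. The helper name normalizeToUtf8 is hypothetical, not part of the original code.

```php
<?php
// Hypothetical helper: bring both input forms to the same UTF-8 string.
function normalizeToUtf8(string $input): string
{
    // With 'UTF-8' as the target charset, entities such as &eacute; are
    // decoded to UTF-8 bytes, and text that is already UTF-8 passes through.
    return html_entity_decode($input, ENT_QUOTES, 'UTF-8');
}

$fromEntity = normalizeToUtf8('Sc&eacute;nic');
$fromUtf8   = normalizeToUtf8('Scénic'); // assumes this file is saved as UTF-8

var_dump($fromEntity === $fromUtf8); // bool(true)
```

Once both forms are the same UTF-8 string, stripping the accents becomes a single, separate step.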


Thanks to @XzKto and this comment on PHP.net I changed my slug function to the following:

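static function slug($input){
    // Decode HTML entities straight to UTF-8; text that is already UTF-8 is left alone.
    $string = html_entity_decode($input, ENT_COMPAT, "UTF-8");

    // Remember the current locale (passing '0' only queries it) so it can be restored.
    $oldLocale = setlocale(LC_CTYPE, '0');

    // The result of //TRANSLIT depends on LC_CTYPE, so switch to a UTF-8 locale for the call.
    setlocale(LC_CTYPE, 'en_US.UTF-8');
    $string = iconv("UTF-8", "ASCII//TRANSLIT", $string);
    setlocale(LC_CTYPE, $oldLocale);

    // Replace every run of non-alphanumeric characters with one hyphen, then lowercase.
    return strtolower(preg_replace('/[^a-zA-Z0-9]+/', '-', $string));
}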

I feel like the setlocale part is a bit dirty, but this works perfectly for translating special characters to their 'normal' equivalents.

The input 'a áñö ïß éèé' returns 'a-ano-iss-eee'.
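
If switching the locale feels too intrusive, the lookup-table idea from the question is one alternative. The following is only a sketch under stated assumptions: slugWithMap is a hypothetical name, and the table is deliberately incomplete and would need an entry for every character the feed can contain. It swaps iconv's transliteration for strtr(), so it does not depend on which locales are installed.

```php
<?php
// Hypothetical helper: builds a slug without touching the process locale.
function slugWithMap(string $input): string
{
    // Decode entities straight to UTF-8; existing UTF-8 text passes through.
    $string = html_entity_decode($input, ENT_QUOTES, 'UTF-8');

    // Illustrative, incomplete table (assumes this file is saved as UTF-8).
    $map = [
        'á' => 'a', 'à' => 'a', 'é' => 'e', 'è' => 'e',
        'ï' => 'i', 'ñ' => 'n', 'ö' => 'o', 'ß' => 'ss',
    ];
    $string = strtr($string, $map);

    // Replace every run of other characters with one hyphen, then lowercase.
    return strtolower(preg_replace('/[^a-zA-Z0-9]+/', '-', $string));
}

echo slugWithMap('a áñö ïß éèé'), "\n"; // a-ano-iss-eee
echo slugWithMap('Sc&eacute;nic'), "\n"; // scenic
```

The trade-off is maintenance: any character missing from the table is turned into a hyphen, whereas iconv covers far more characters at the cost of the locale dependency.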
