
How to remove special characters in WordPress?

Developer https://www.devze.com 2023-04-06 08:40 Source: Internet

I am using Topsy, which returns the titles of the highest-ranking articles on my website. It gives me an RSS file containing the post titles with their links. For now I am only taking the post name, and I am trying to use the post title to look the post up in the MySQL database with a function like this:

get_post_by_title($postTitle,'post');

The problem is that Topsy returns the post title with some special characters added in the RSS file; for example, " ' " is replaced with " ’ ". Because of this, get_post_by_title() does not return the post for that title.

EDIT: It returns a post title like this:

iPad Applications In Bloom’s Taxonomy NEXT

Here the single quote is the special character.

Please help me. Thanks


First let's clear up a misconception: that character in your example is not a "special" character. It is Unicode code point U+2019, "RIGHT SINGLE QUOTATION MARK." Its HTML entity reference is &rsquo;. It's an ordinary character - it just happens to be an ordinary character that has no representation in ASCII. Before getting to an answer to your specific question, I need to tell you to read Joel Spolsky's article "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" - it is just what it says on the tin, and unless you absorb at least a little more knowledge about Unicode, you will keep running into problems like this. Don't fret too much: everyone runs into problems like this until they learn how to deal with text. Unicode isn't "hard" so much as it is "prone to exposing unconscious assumptions we make about how text works." †
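To make that concrete, here is a minimal sketch of what the character looks like from PHP's point of view. It assumes PHP 7 or later (for the `\u{...}` escape) and a UTF-8 encoded feed; the variable names are only for illustration.

```php
<?php
// U+2019 RIGHT SINGLE QUOTATION MARK, written as a PHP 7+ Unicode escape.
$rsquo = "\u{2019}";

// In UTF-8 it occupies three bytes: E2 80 99.
$hex = bin2hex($rsquo);
$byteLength = strlen($rsquo); // strlen() counts bytes, not characters

// The ASCII apostrophe is a different character: one byte, 0x27.
$apostropheHex = bin2hex("'");

// So the title from the feed and the title typed with an ASCII
// apostrophe are simply different strings.
$fromFeed = "iPad Applications In Bloom\u{2019}s Taxonomy";
$typed    = "iPad Applications In Bloom's Taxonomy";
$sameTitle = ($fromFeed === $typed);
```

Here `$hex` is `e28099`, `$apostropheHex` is `27`, and `$sameTitle` is `false`, which is why an exact title lookup finds nothing when the stored title and the feed title use different quote characters.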

Now, to your question.

If I'm reading you right, what's happening to you is that you have posts with non-ASCII characters in their titles such as ’ which aren't showing up when you search for them with get_post_by_title() (it seems like you're using something similar to the accepted answer on this question - is that right?) There are two paths to a solution: store the titles in a format that's easier for you to search, or use a searching method that can find non-ASCII characters.

Storing the titles differently would require that you run them through PHP's built-in htmlentities() function before storing them in your WordPress DB - you would also want to make sure that you convert characters with no HTML entity equivalent to '\xNN' form, and to make sure that your DB's collation/charset is set to UTF-8 or another Unicode-aware encoding. This will be a nontrivial amount of effort. ‡
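As a rough illustration of the htmlentities() step only (not of the WordPress storage path, which I have not seen), the sketch below encodes the example title and decodes it again. It assumes the input is valid UTF-8; the flag choice is mine, not something from your code.

```php
<?php
// Example title from the question, with U+2019 as the apostrophe.
$title = "iPad Applications In Bloom\u{2019}s Taxonomy";

// Convert every character that has a named HTML 4.01 entity.
// The third argument tells PHP how to interpret the input bytes.
$encoded = htmlentities($title, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// html_entity_decode() reverses the conversion.
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML401, 'UTF-8');
```

`$encoded` comes out as `iPad Applications In Bloom&rsquo;s Taxonomy`, and `$decoded` is identical to the original `$title`. Whichever form you store, the search side has to apply the same conversion, or the lookup will still miss.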

Using a different searching method doesn't require tinkering with your DB or digging into WordPress internals, but it does require very careful fiddling with the search string. You'll need to either use the exact character you're looking for in a search, expressed as a '\xNN' character reference if necessary, or use wildcards carefully in the search.
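One way to read the wildcard option is to turn each non-ASCII character into the single-character SQL `LIKE` wildcard `_`. The helper below is a hypothetical sketch (the name `title_to_like_pattern` is mine, not a WordPress function), and it assumes the title is valid UTF-8.

```php
<?php
/**
 * Build a LIKE pattern from a title: existing LIKE metacharacters are
 * escaped, then every non-ASCII code point becomes the `_` wildcard.
 * Hypothetical helper, not part of WordPress.
 */
function title_to_like_pattern(string $title): string
{
    // Escape _, % and backslash so they match literally in LIKE.
    $escaped = addcslashes($title, '_%\\');

    // With the /u modifier each non-ASCII code point (not each byte)
    // is replaced by one `_`. preg_replace() returns null on invalid
    // UTF-8, in which case we fall back to the escaped title.
    return preg_replace('/[^\x00-\x7F]/u', '_', $escaped) ?? $escaped;
}

$pattern = title_to_like_pattern("iPad Applications In Bloom\u{2019}s Taxonomy");
$literal = title_to_like_pattern("100% Pure");
```

`$pattern` is `iPad Applications In Bloom_s Taxonomy`, which matches the stored title whether it contains `'` or `’`. The pattern would then go into a `post_title LIKE %s` query, for example through `$wpdb->prepare()`. The trade-off is that `_` matches any character, so a very short title could match more than one post.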

Either way, good luck. It may be possible to offer more specific advice if more of your code is visible.



†: By the way, your life with regards to Unicode will also get much, much easier if you use better languages than PHP and better databases than MySQL. WordPress is inextricably tied to PHP and MySQL: PHP & MySQL are both woefully, horrendously, hilariously bad at handling Unicode issues correctly. Your life as a programmer will get better if you extirpate PHP & MySQL from it.

‡: Seriously, PHP is atrociously bad at this, and MySQL is in a shoelaces-tied-together state of fumbling. Avoid them.


Remove (or comment out) these lines in wp-config.php:

//define('DB_CHARSET', 'utf8');

//define('DB_COLLATE','utf8_unicode_ci');


You can easily remove special characters using preg_replace, see this post -> http://code-tricks.com/filter-non-ascii-characters-using-php/
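A minimal sketch of that idea, under my own assumptions rather than the linked post's exact code: map the common typographic quotes to their ASCII equivalents first, then drop whatever non-ASCII characters remain. The function name `normalize_title` is hypothetical.

```php
<?php
/**
 * Replace curly quotes with ASCII quotes, then strip any remaining
 * character outside printable ASCII. Hypothetical helper.
 */
function normalize_title(string $title): string
{
    // Map typographic quotes to their plain ASCII counterparts.
    $map = [
        "\u{2018}" => "'", "\u{2019}" => "'",
        "\u{201C}" => '"', "\u{201D}" => '"',
    ];
    $title = strtr($title, $map);

    // Remove everything that is not printable ASCII (0x20 to 0x7E).
    // Falls back to the mapped title if the input is not valid UTF-8.
    return preg_replace('/[^\x20-\x7E]/u', '', $title) ?? $title;
}

$clean    = normalize_title("iPad Applications In Bloom\u{2019}s Taxonomy");
$stripped = normalize_title("Caf\u{00E9}");
```

`$clean` is `iPad Applications In Bloom's Taxonomy`, while `$stripped` is `Caf`. The second result shows the cost of this approach: stripping discards information, so it only helps if the titles stored in the database are plain ASCII as well.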
