{"id":105,"date":"2014-01-16T08:32:29","date_gmt":"2014-01-16T08:32:29","guid":{"rendered":"http:\/\/www.eminozlem.com\/?p=105"},"modified":"2015-06-06T21:30:17","modified_gmt":"2015-06-06T21:30:17","slug":"detecting-the-crazy-letters","status":"publish","type":"post","link":"https:\/\/eminozlem.com\/tr\/detecting-the-crazy-letters\/","title":{"rendered":"Detecting the &#8216;crazy&#8217; letters"},"content":{"rendered":"<p>It&#8217;s always a pain to work with multibyte characters; many fonts only support the ISO-8859-1 (Latin-1) letters. I was working with ImageMagick, but some of the strings contained Arabic, Chinese, Russian and even some Korean letters, and I did not want to end up with ???&#8217;s in place of unsupported characters all over the place.<\/p>\n<p>The solution is easy if you only want to work with plain ASCII, because in a UTF-8 string every character outside ASCII takes more than one byte:<\/p>\n<pre class=\"lang:default decode:true\">if (strlen($string) === mb_strlen($string, 'UTF-8')) {\r\n\/\/ strlen() (bytes) and mb_strlen() (characters) return the same count, so there are no multibyte characters here.\r\n}<\/pre>\n<p>Or, if you want to detect a string that consists only of foreign letters:<\/p>\n<pre class=\"lang:default decode:true\"> if ( ! preg_match(\"\/\\p{Latin}+\/u\", $string) ) {\r\n\/\/ The string does not contain any Latin characters; all crazy letters here.\r\n }<\/pre>\n<p>But I wanted to keep extended Latin characters such as the Nordic, Turkish and German letters. When the string is mixed it gets a little tricky. What I ended up doing is first checking for any Latin character, since there is no point in checking further if there is none. Then I check for the other common &#8220;crazy&#8221; scripts that might come into play. If the string contains any of them, I just skip it.<\/p>\n<pre class=\"lang:default decode:true \"> if ( ! 
preg_match(\"\/\\p{Latin}+\/u\", $string) ) {\r\n          echo \"No Latin characters here\";\r\n          return;\r\n }\r\n else {\r\n    \/\/ Latin letters are present; bail out if any of these other scripts is mixed in.\r\n    $crazies = array(\"Han\",\"Hangul\",\"Hebrew\",\"Arabic\",\"Cyrillic\",\"Greek\",\"Khmer\");\r\n    foreach($crazies as $crazie) {\r\n        if ( preg_match(\"\/\\p{\".$crazie.\"}+\/u\", $string) ) {\r\n            echo \"returned because has crazy: \" . $crazie . \" letters.\";\r\n            return;\r\n        }\r\n    }\r\n }<\/pre>\n<p>I&#8217;m still not sure if that&#8217;s the most efficient way to do it, but it works for me.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It&#8217;s always a pain to work with multibyte characters; many fonts only support the ISO-8859-1 (Latin-1) letters. I was working with ImageMagick, but some of the strings contained Arabic, Chinese, Russian and even some Korean letters, and I did not want to end up with ???&#8217;s in place of unsupported characters all over the place. The&#8230;<\/p>\n","protected":false},"author":1,"featured_media":106,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","footnotes":""},"categories":[7],"tags":[28,9],"class_list":["post-105","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-coding","tag-multibyte-characters","tag-php"],"acf":[],"_links":{"self":[{"href":"https:\/\/eminozlem.com\/tr\/wp-json\/wp\/v2\/posts\/105","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/eminozlem.com\/tr\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/eminozlem.com\/tr\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/eminozlem.com
\/tr\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/eminozlem.com\/tr\/wp-json\/wp\/v2\/comments?post=105"}],"version-history":[{"count":0,"href":"https:\/\/eminozlem.com\/tr\/wp-json\/wp\/v2\/posts\/105\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/eminozlem.com\/tr\/wp-json\/wp\/v2\/media\/106"}],"wp:attachment":[{"href":"https:\/\/eminozlem.com\/tr\/wp-json\/wp\/v2\/media?parent=105"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/eminozlem.com\/tr\/wp-json\/wp\/v2\/categories?post=105"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/eminozlem.com\/tr\/wp-json\/wp\/v2\/tags?post=105"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}