
Using RegEx to highlight Arabic text

My database contains Arabic text with diacritics (tashkeel). Users type their search terms without diacritics, and I can search successfully using full-text search statements, but I am unable to highlight the search term using regular expressions:

$str="اِنَّ الَّذِیۡنَ اٰمَنُوۡا وَ عَمِلُوا الصّٰلِحٰتِ وَ اَخۡبَتُوۡۤا اِلٰی رَبِّہِمۡ ۙ اُولٰٓئِکَ اَصۡحٰبُ الۡجَنَّۃِ ۚ ہُمۡ فِیۡہَا خٰلِدُوۡنَ";

$ptr="عملوا";

$result = preg_replace("/$ptr/", '<span style="background:yellow">' . $ptr . '</span>', $str);

echo $result;

Any ideas on how to resolve this?


Answer

Your string contains extra characters (tashkeel), but the term you want to match has none, so the plain pattern never matches. The solution is to strip those extra characters from each word before comparing, so that both strings have the same form:

<?php
function stripDiacritics($str) {
    $diacritic = array("ِ" ,"ٰ" ,"ّ" ,"ۡ" ,"ٖ" ,"ٗ" ,"ؘ" ,"ؙ" ,"ؚ" ,"ٍ" ,"َ" ,"ُ", "ٓ" ,"ْ" , "ٌ" , "ٍ",  "ً",  "ّ", "ۤ");
    $str = str_replace($diacritic, '', $str); 
    return $str;       
}

$str="اِنَّ الَّذِیۡنَ اٰمَنُوۡا وَ عَمِلُوا الصّٰلِحٰتِ وَ اَخۡبَتُوۡۤا اِلٰی رَبِّہِمۡ ۙ اُولٰٓئِکَ اَصۡحٰبُ الۡجَنَّۃِ ۚ ہُمۡ فِیۡہَا خٰلِدُوۡنَ";
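// Search term, typed without diacritics.
$ptr = "عملوا";
$words = explode(" ", $str);
$highlighted = array();
foreach ($words as $word) {
    // Compare the diacritic-free form, but keep the original word in the output.
    if (stripDiacritics($word) === $ptr) {
        $highlighted[] = '<span style="background:yellow">' . $word . '</span>';
    } else {
        $highlighted[] = $word;
    }
}
// Rejoin with single spaces (avoids the leading space of naive concatenation).
echo implode(' ', $highlighted);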
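
Since the question asked for a regular expression, one possible alternative is to make the pattern itself tolerate diacritics instead of comparing word by word. The sketch below allows any run of nonspacing marks (`\p{Mn}`) after each letter of the search term. The function names `buildTolerantPattern` and `highlightTerm` are hypothetical, and the approach assumes the text is valid UTF-8 and that every tashkeel character in your data is classified as a Unicode nonspacing mark.

```php
<?php
// Hypothetical helper: turn a bare search term into a pattern that
// accepts optional diacritics (Unicode nonspacing marks) after each letter.
function buildTolerantPattern($term) {
    // Drop any diacritics the user may have typed in the term itself.
    $bare = preg_replace('/\p{Mn}+/u', '', $term);
    if ($bare === null || $bare === '') {
        return null; // invalid UTF-8 or nothing to search for
    }
    // Split into individual Unicode characters.
    $letters = preg_split('//u', $bare, -1, PREG_SPLIT_NO_EMPTY);
    $parts = array();
    foreach ($letters as $letter) {
        $parts[] = preg_quote($letter, '/') . '\p{Mn}*';
    }
    return '/' . implode('', $parts) . '/u';
}

// Hypothetical helper: wrap every match in a highlighting span,
// keeping the original diacritics inside the span.
function highlightTerm($text, $term) {
    $pattern = buildTolerantPattern($term);
    if ($pattern === null) {
        return $text;
    }
    $result = preg_replace($pattern, '<span style="background:yellow">$0</span>', $text);
    return $result === null ? $text : $result;
}

$str = "وَ عَمِلُوا الصّٰلِحٰتِ";
echo highlightTerm($str, "عملوا");
```

Unlike the word-by-word comparison, this version also highlights the term when it appears inside a longer word, which may or may not be what you want.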


User contributions licensed under: CC BY-SA