How to check if letter is upper or lower in PHP?
I have UTF-8 texts that also contain diacritic characters, and I would like to check whether the first letter of a text is upper case or lower case. How can I do this?
@Elizabeth Buckwalter Because I derive other text from this text, and if the first letter is upper case then I must do the same with the second one.
11 Answers
function starts_with_upper($str)
Note that mb_substr is necessary to correctly isolate the first character.
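The body of the function above did not survive in this copy. The following is a minimal sketch of what such a function might look like, assuming the approach is to isolate the first character with mb_substr and compare it with its lowercase form. The body is an assumption, not the original answer's code, and it requires the mbstring extension.

```php
<?php
// Hypothetical completion: only the function name comes from the answer above.
function starts_with_upper(string $str): bool
{
    // Take the first character (not the first byte) of the UTF-8 string.
    $chr = mb_substr($str, 0, 1, 'UTF-8');
    // If lowercasing changes the character, it has a lowercase mapping,
    // so it is treated as uppercase here.
    return mb_strtolower($chr, 'UTF-8') !== $chr;
}

var_dump(starts_with_upper('Éé'));   // bool(true)
var_dump(starts_with_upper('âa'));   // bool(false)
var_dump(starts_with_upper('1abc')); // bool(false)
```

As the comments below point out, a check of this kind reports "not upper" for uppercase characters that have no lowercase mapping.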
Doesn’t always work. There are Unicode characters that are capital letters (i.e., category Lu) but don’t have a lowercase mapping. Mostly, the mathematical bold/italic/double-struck letters.
@dan04 That’s an excellent point. On top of that, there’s title case (Lt). However, the mbstring extension does not expose functions to userspace to test for those properties. It’s a pity because the functionality is there; see svn.php.net/viewvc/php/php-src/trunk/ext/mbstring/…
To clarify, «There are over 100 lowercase letters in the Unicode Standard that have no direct uppercase equivalent.» — unicode.org/faq/casemap_charprop.html
Use ctype_upper to check for upper case:
$a = array("Word", "word", "wOrd");
foreach ($a as $w) {
    if (ctype_upper($w[0])) {
        print $w;
    }
}
Those are latin chars. ctype_upper doesn’t work with non-ASCII chars (including those nordic latins, as well as many other latin, and especially non-latin chars).
Thank you for both comments! But the question says «UTF-8 with diacritic characters» and it works fine. If you need a function for other characters, use the answer from Artefacto.
This answer is incorrect for two reasons because you failed to test multibyte characters as the question clearly states. 1. You cannot grab a multibyte character by the 0 byte offset — you will only access the first byte of the letter. 2. ctype_ doesn’t provide the necessary multibyte support for this task.
It is my opinion that making a preg_ call is the most direct, concise, and reliable call versus the other posted solutions here.
echo preg_match('~^\p{Lu}~u', $string) ? 'upper' : 'lower';
~       # starting pattern delimiter
^       # match from the start of the input string
\p{Lu}  # match exactly one uppercase letter (unicode safe)
~       # ending pattern delimiter
u       # enable unicode matching
Please take notice when ctype_ and < 'a' fail with this battery of tests.
$tests = ['âa', 'Bbbbb', 'Éé', 'iou', 'Δδ'];
foreach ($tests as $test) {
    echo "\n{$test}:";
    echo "\n\tPREG: " , preg_match('~^\p{Lu}~u', $test) ? 'upper' : 'lower';
    echo "\n\tCTYPE: " , ctype_upper(mb_substr($test, 0, 1)) ? 'upper' : 'lower';
    echo "\n\t
âa:
    PREG: lower
    CTYPE: lower
    < a: lower
    MB: lower
Bbbbb:
    PREG: upper
    CTYPE: upper
    < a: upper
    MB: upper
Éé:
If anyone needs to differentiate between uppercase letters, lowercase letters, and non-letters see this post.
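The linked post is not reproduced here. As a rough sketch of the idea, the first character can be tested against \p{Lu} and \p{Ll} in turn, with everything else falling through to a third bucket. The function name and the returned labels are hypothetical.

```php
<?php
// Hypothetical helper: classifies the first character of a UTF-8 string.
function first_char_case(string $str): string
{
    if (preg_match('~^\p{Lu}~u', $str) === 1) {
        return 'upper';
    }
    if (preg_match('~^\p{Ll}~u', $str) === 1) {
        return 'lower';
    }
    // Digits, punctuation, caseless letters, the empty string,
    // and invalid UTF-8 (where preg_match returns false) all land here.
    return 'other';
}

echo first_char_case('Δδ'), "\n";      // upper
echo first_char_case('iou'), "\n";     // lower
echo first_char_case('9 lives'), "\n"; // other
```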
It may be extending the scope of this question too far, but if your input characters are especially squirrelly (they might not exist in a category that Lu can handle), you may want to check if the first character has case variants:
\p{L&} or \p{Cased_Letter}: a letter that exists in lowercase and uppercase variants (combination of Ll, Lu and Lt).
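A small sketch of that check, assuming PCRE's \p{L&} shorthand (Ll, Lu, or Lt) is the property being used. The function name is hypothetical.

```php
<?php
// Hypothetical helper: true when the first character is a cased letter
// (general category Lu, Ll, or Lt), regardless of which case it is.
function first_char_is_cased(string $str): bool
{
    return preg_match('~^\p{L&}~u', $str) === 1;
}

var_dump(first_char_is_cased('Éé'));  // bool(true)
var_dump(first_char_is_cased('漢字')); // bool(false), category Lo
var_dump(first_char_is_cased('7up')); // bool(false), category Nd
```

This only says that the character belongs to a cased category; it does not say which case it is, so it would normally be combined with a \p{Lu} or \p{Ll} test.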
To include Roman Numerals ("Number Letters") with SMALL variants, you can add that extra range to the pattern if necessary.
echo preg_match('~^[\p{Lu}\x{2160}-\x{216F}]~u', $test) ? 'upper' : 'not upper';
$str = 'the text to test';
if ($str[0] === strtoupper($str[0])) {
    echo 'yepp, its uppercase';
} else
$str{0} is the same as $str[0]. Sometimes substr(string, start, length) is useful with start or length being negative.
This answer is incorrect for two reasons because you failed to test multibyte characters as the question clearly states. 1. You cannot grab a multibyte character by the 0 byte offset -- you will only access the first byte of the letter. 2. strtoupper doesn't provide the necessary multibyte support for this task.
As used in Kohana 2 autoloader function:
When two single-character strings are compared, PHP compares their byte values, which for ASCII characters are the ASCII codes. In the ASCII table the control characters and some others come first, then the uppercase letters of the Latin alphabet, and then the lowercase letters of the Latin alphabet. You can therefore check whether the code of a letter is smaller or larger than that of the lowercase Latin letter a.
BTW, this is around twice as fast as a solution with regular expressions.
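The autoloader code itself is missing from this copy. The comparison described above can be sketched as follows, assuming it amounts to comparing the first byte with 'a'. The function name is hypothetical, and the sketch also shows where the approach breaks down.

```php
<?php
// ASCII-only sketch: compares the first *byte* of the string with 'a'.
function first_byte_before_a(string $str): bool
{
    return $str !== '' && $str[0] < 'a';
}

var_dump(first_byte_before_a('Bbbbb')); // bool(true)
var_dump(first_byte_before_a('iou'));   // bool(false)
var_dump(first_byte_before_a('1abc'));  // bool(true): digits also sort before 'a'
var_dump(first_byte_before_a('Éé'));    // bool(false): first UTF-8 byte is 0xC3
```

The last two cases are why the comment below calls this unsuitable for the question: anything that sorts before 'a' counts as "upper", and multibyte characters are judged by their lead byte alone.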
This answer is incorrect because you failed to test multibyte characters as the question clearly states.
Note that PHP provides the ctype family like ctype_upper.
You have to set the locale correctly via setlocale() first to get it to work with UTF-8.
See the comment on ctype_alpha for instance.
Doesn't work on UTF-8. That comment on php.net has -2 (down) votes. Try: setlocale(LC_ALL, 'ru_RU.utf-8'); return ctype_upper('П') === false;
Getting the setlocale() setting right in a dynamic environment can be a hassle. More importantly, you cannot access a whole multibyte character by the first byte offset. This answer is incorrect/unstable. 3v4l.org/38R6f
I didn't want numbers and others to be an upper char, so I use:
This answer is incorrect because: 1. It is not checking the first character, it is checking the last character. 2. It is not attempting to match multibyte characters as the question clearly states.
If you want it in a nice function, I've used this:
function _is_upper ($in_string)
Your solution is inappropriate/incorrect for the question asked. Your solution does not provide support for "diacritic characters" as clearly stated in the question.
Another possible solution in PHP 7 is using IntlChar
IntlChar provides access to a number of utility methods that can be used to access information about Unicode characters.
$tests = ['âa', 'Bbbbb', 'Éé', 'iou', 'Δδ'];
foreach ($tests as $test) {
    echo "{$test}:\t";
    echo IntlChar::isUUppercase(mb_substr($test, 0, 1)) ? 'upper' : 'lower';
    echo PHP_EOL;
}
âa: lower
Bbbbb: upper
Éé: upper
iou: lower
Δδ: upper
While @mickmackusa's first pattern ( ~^\p{Lu}~u ) is good, it will give the wrong result for different general category values (other than "Lu" uppercase letter category). *Note, he has since extended the pattern at the bottom of his answer to include Roman Numerals.
For example
var_dump(preg_match('~^\p{Lu}~u', 'Ⅷ') ? 'upper' : 'lower'); // Result: lower
var_dump(preg_match('~^\p{Lu}~u', 'ⅷ') ? 'upper' : 'lower'); // Result: lower
var_dump(IntlChar::isUUppercase(mb_substr('Ⅷ', 0, 1)) ? 'upper' : 'lower'); // Result: upper
var_dump(IntlChar::isUUppercase(mb_substr('ⅷ', 0, 1)) ? 'upper' : 'lower'); // Result: lower
Make sure to use IntlChar::isUUppercase rather than IntlChar::isupper if you want to check for characters that are also uppercase but have a different general category value.
Note: This library depends on intl (Internationalization extension)
Check if a String is mostly ALL CAPS in PHP
What is the best way to see if a string contains mostly capital letters? The string may also contain symbols, spaces, and numbers, so I would still want it to return true in those cases. For example: I can check if a string is ALL-CAPS by something similar to this:
I can loop through all the characters, checking them individually, but that seems a bit wasteful imho. Is there a better way?
Count the number of uppercase letters. Divide that by the total length. Test if this is more than the threshold.
You could use a regexp with preg_match_all() to get all the uppercase letters. Sum the length of these to get the total number.
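The two comments above can be sketched roughly as follows. Note one deliberate departure, which is an assumption on my part: the count is divided by the number of letters rather than by the total length, so that digits, spaces and symbols do not drag the ratio down. The function name and default threshold are hypothetical.

```php
<?php
// Hypothetical helper: share of uppercase letters among all letters.
function is_mostly_upper(string $str, float $threshold = 0.5): bool
{
    // preg_match_all returns the number of matches (or false on error).
    $letters = preg_match_all('/\p{L}/u', $str);
    if (!$letters) {
        return false; // no letters at all, or invalid UTF-8
    }
    $upper = preg_match_all('/\p{Lu}/u', $str);
    return $upper / $letters > $threshold;
}

var_dump(is_mostly_upper('THE 15 SMALL BROWN FOXES JUMP INTO THE BURNING barn!', 0.8)); // bool(true), 36 of 40 letters
var_dump(is_mostly_upper('Hello World', 0.8)); // bool(false), 2 of 10 letters
```

Dividing by mb_strlen($str) instead would follow the first comment literally; which denominator is right depends on whether non-letters should count against the string.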
"I can loop though all the characters, checking them individually, but thats seems a bit wasteful imho." How do you think most functions are implemented internally? In PHP there's no faster way than to loop the chars imho. In other languages you could get a pointer to the internal bytes and then use CPU instructions to compare 8 bytes at the same time and even that would take a lot of custom code. Don't use Regex if you care about performance.
4 Answers
$countUppercase = strlen(preg_replace('/[^A-Z]+/', '', $str)); // or: mb_strlen(...)
... and then divide by strlen($str)
Seems like a legitimate use of preg to me. For a better result, I would just do it twice: once for lower, once for upper.
A simple for loop should give the best performance
$numUpper = 0;
for ($i = 0; $i < strlen($str); $i++) {
    if (ctype_upper($str[$i])) {
        $numUpper++;
    }
}
return $numUpper;
Or at least a more controlled outcome. Combine with ctype_alpha to make a determination and not just count.
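One way to read that suggestion is to count letters and uppercase letters in the same pass, so the caller can compute a ratio over letters only. This is a sketch under that assumption; the function name and the returned array shape are hypothetical, and because it works byte by byte with ctype functions it is only meaningful for ASCII text.

```php
<?php
// Hypothetical helper: ASCII-only counts of letters and uppercase letters.
function count_case_ascii(string $str): array
{
    $upper = 0;
    $letters = 0;
    $len = strlen($str); // computed once instead of on every iteration
    for ($i = 0; $i < $len; $i++) {
        if (ctype_alpha($str[$i])) {
            $letters++;
            if (ctype_upper($str[$i])) {
                $upper++;
            }
        }
    }
    return ['upper' => $upper, 'letters' => $letters];
}

print_r(count_case_ascii('Hello World 42')); // upper => 2, letters => 10
```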
Another option could be using preg_match_all which returns the number of full pattern matches and mb_strlen.
The pattern \p{Lu} matches an uppercase letter that has a lowercase variant.
function mostlyUpperInString($s, $threshold) {
    return preg_match_all("/\p{Lu}/u", $s) / mb_strlen($s) > $threshold;
}
function moreUpperThanLower($s, $threshold) {
    return preg_match_all("/\p{Lu}/u", $s) / preg_match_all("/\P{Lu}/u", $s) > $threshold;
}
$strings = [
    "THE 15 SMALL BROWN FOXES JUMP INTO THE BURNING barn!",
    "The 15 Small Brown Foxes JUMP Into the Burning Barn!"
];
foreach ($strings as $str) {
    echo $str . " -> 80% mostlyUpperInString: ". (mostlyUpperInString($str, 0.8) ? "true" : "false") . PHP_EOL;
    echo $str . " -> 80% moreUpperThanLower: ". (moreUpperThanLower($str, 0.8) ? "true" : "false") . PHP_EOL;
    echo PHP_EOL;
}
THE 15 SMALL BROWN FOXES JUMP INTO THE BURNING barn! -> 80% mostlyUpperInString: false
THE 15 SMALL BROWN FOXES JUMP INTO THE BURNING barn! -> 80% moreUpperThanLower: true
The 15 Small Brown Foxes JUMP Into the Burning Barn! -> 80% mostlyUpperInString: false
The 15 Small Brown Foxes JUMP Into the Burning Barn! -> 80% moreUpperThanLower: false
ctype_upper
Checks if all of the characters in the provided string, text, are uppercase characters.
Parameters
Note:
If an int between -128 and 255 inclusive is provided, it is interpreted as the ASCII value of a single character (negative values have 256 added in order to allow characters in the Extended ASCII range). Any other integer is interpreted as a string containing the decimal digits of the integer.
As of PHP 8.1.0, passing a non-string argument is deprecated. In the future, the argument will be interpreted as a string instead of an ASCII codepoint. Depending on the intended behavior, the argument should either be cast to string or an explicit call to chr() should be made.
Return Values
Returns true if every character in text is an uppercase letter in the current locale. When called with an empty string the result will always be false.
Examples
Example #1 A ctype_upper() example (using the default locale)
<?php
$strings = array('AKLWC139', 'LMNSDO', 'akwSKWsm');
foreach ($strings as $testcase) {
    if (ctype_upper($testcase)) {
        echo "The string $testcase consists of all uppercase letters.\n";
    } else {
        echo "The string $testcase does not consist of all uppercase letters.\n";
    }
}
?>
The above example will output:
The string AKLWC139 does not consist of all uppercase letters.
The string LMNSDO consists of all uppercase letters.
The string akwSKWsm does not consist of all uppercase letters.
See Also
- ctype_alpha() - Check for alphabetic character(s)
- ctype_lower() - Check for lowercase character(s)
- setlocale() - Set locale information
Check if a String is ALL CAPS in PHP
What's the best way to see if a string contains all capital letters? I still want the function to return true if the string also contains symbols or numbers.
9 Answers
Check whether strtoupper($str) == $str
mb_strtoupper($str, 'utf-8') == $str
Except it doesn't work for "normal but non-English" characters: strtoupper('øæåØÆÅabcABC') gives øæåØÆÅABCABC. Use mb_convert_case or mb_strtoupper instead.
As this answer is getting attention, it is interesting to note that it does not reply correctly to the question, because it does not consider numbers, as requested by the author.
Unfortunately this incorrect answer has too many upvotes to be deleted. Sadly, it will continue to confuse and misinform future researchers... if only it could be deleted somehow.
If you want numbers included (and by "symbols" most everything else), then what you are actually trying to test for is the absence of lowercase letters:
$all_upper = !preg_match("/[a-z]/", $string);
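The pattern above only looks for ASCII a-z. A Unicode-aware variant of the same "no lowercase letters" idea might look like the sketch below, assuming the input is UTF-8 and that \p{Ll} is an acceptable definition of "lowercase letter". The function name is hypothetical.

```php
<?php
// Hypothetical helper: true when the string contains no lowercase letters.
function has_no_lowercase(string $str): bool
{
    // preg_match returns 0 when nothing matches; it returns false on
    // invalid UTF-8, which this strict comparison treats as "not all caps".
    return preg_match('/\p{Ll}/u', $str) === 0;
}

var_dump(has_no_lowercase('ÅNGSTRÖM 42!')); // bool(true)
var_dump(has_no_lowercase('Ångström'));     // bool(false)
```

Like the original one-liner, this returns true for strings with no letters at all, including the empty string.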
You can use preg_match(). The regular expression would be /^[^a-z]+$/.
return preg_match('/^[^a-z]+$/', $string) === 1 ? true : false;
Here is the documentation for preg_match().
$all_uppercase = preg_match('#^[A-Z]+$#', $string);
just make sure you don't use 'i' modifier
if (mb_strtoupper($string) === $string) { /* do the required task */ } else
I think you are looking for this function
$myString = "ABCDE"; if (ctype_upper($myString)) // returns true if is fully uppercase
Should note that Ctype functions are part of Variable and Type Related Extensions. Should also note that builtin support for ctype is available since PHP 4.3.0.
In addition to Alexander Morland's comment on Winston Ewert's answer if you need to deal with utf-8 accented characters you can use the following set of functions:
define('CHARSET', 'utf-8');

// Multibyte-aware substr when mbstring is available, plain substr otherwise.
function custom_substr($content='', $pos_start=0, $num_char=1) {
    $substr = '';
    if (function_exists('mb_substr')) {
        $substr = mb_substr($content, $pos_start, $num_char, CHARSET);
    } else {
        $substr = substr($content, $pos_start, $num_char);
    }
    return $substr;
}

// Converts case with fixed lookup tables; the two arrays must stay index-aligned.
function custom_str_case($string='', $case='lower') {
    $lower = array(
        "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
        "à", "á", "â", "ã", "ä", "å", "æ", "ç", "è", "é", "ê", "ë", "ì", "í", "î", "ï", "ð", "ñ", "ò", "ó", "ô", "õ", "ö", "ø", "ù", "ú", "û", "ü", "ý",
        "а", "б", "в", "г", "д", "е", "ё", "ж", "з", "и", "й", "к", "л", "м", "н", "о", "п", "р", "с", "т", "у", "ф", "х", "ц", "ч", "ш", "щ", "ъ", "ы", "ь", "э", "ю", "я"
    );
    $upper = array(
        "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z",
        "À", "Á", "Â", "Ã", "Ä", "Å", "Æ", "Ç", "È", "É", "Ê", "Ë", "Ì", "Í", "Î", "Ï", "Ð", "Ñ", "Ò", "Ó", "Ô", "Õ", "Ö", "Ø", "Ù", "Ú", "Û", "Ü", "Ý",
        "А", "Б", "В", "Г", "Д", "Е", "Ё", "Ж", "З", "И", "Й", "К", "Л", "М", "Н", "О", "П", "Р", "С", "Т", "У", "Ф", "Х", "Ц", "Ч", "Ш", "Щ", "Ъ", "Ы", "Ь", "Э", "Ю", "Я"
    );
    if ($case == 'lower') {
        $string = str_replace($upper, $lower, $string);
    } else {
        $string = str_replace($lower, $upper, $string);
    }
    return $string;
}

function custom_strtolower($string) {
    return custom_str_case($string, 'lower');
}

function custom_strtoupper($string) {
    return custom_str_case($string, 'upper');
}

function custom_ucfirst($string) {
    $string = custom_strtolower($string);
    $first_char = custom_substr($string, 0, 1);
    $rest_char = custom_substr($string, 1, custom_strlen($string));
    $first_char = custom_strtoupper($first_char);
    return $first_char.$rest_char;
}

// True when upper-casing the string leaves it unchanged.
function is_uppercase($string='') {
    $is_uppercase = false;
    if ($string === custom_strtoupper($string)) {
        $is_uppercase = true;
    }
    return $is_uppercase;
}

function is_ucfirst($string='')