- Parsing Strings with split
- Parsing Strings in Java
- When there is just one character used as a delimiter
- Example 1
- Example 2
- When there are several characters being used as delimiters
- Example 3
- Example 4
- General template for using split
- Parsing Strings in Java
- How to Parse a String in Java
- Parse a String in Java
- Method 1: Parse String by Using Java split() Method
- Method 2: Parse String by Using Java Scanner Class
- Method 3: Parse String by Using StringUtils Class
- Conclusion
- About the author
- Farah Batool
Parsing Strings with split
When we have a situation where strings contain multiple pieces of information (for example, when reading in data from a file on a line-by-line basis), then we will need to parse (i.e., divide up) the string to extract the individual pieces.
Parsing Strings in Java
Issues to consider when parsing a string:
- What are the delimiters (and how many are there)?
- How should consecutive delimiters be treated?
When there is just one character used as a delimiter
Example 1
We want to divide up a phrase into words where spaces are used to separate words. For example
the music made it hard to concentrate
In this case, we have just one delimiter (space) and consecutive delimiters (i.e., several spaces in a row) should be treated as one delimiter. To parse this string in Java, we do
String phrase = "the music made it hard to concentrate";
String delims = "[ ]+";
String[] tokens = phrase.split(delims);
- the general form for specifying the delimiters that we will use is "[delim_characters]+" . (This form is a kind of regular expression. You don’t need to know about regular expressions — just use the template shown here.) The plus sign (+) is used to indicate that consecutive delimiters should be treated as one.
- the split method returns an array containing the tokens (as strings). To see what the tokens are, just use a for loop:
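The loop mentioned in the last bullet could look like the following minimal sketch. The class name PrintTokens and the helper tokenize are illustrative names, not part of the original tutorial.

```java
class PrintTokens {
    // Splits a phrase on runs of one or more spaces (hypothetical helper).
    static String[] tokenize(String phrase) {
        String delims = "[ ]+";
        return phrase.split(delims);
    }

    public static void main(String[] args) {
        String[] tokens = tokenize("the music made it hard to concentrate");
        // Print each token on its own line.
        for (int i = 0; i < tokens.length; i++) {
            System.out.println(tokens[i]);
        }
    }
}
```

Because of the plus sign in the delimiter pattern, several spaces in a row between two words still produce exactly one boundary.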
Example 2
Suppose each string contains an employee’s last name, first name, employee ID#, and the number of hours worked for each day of the week, separated by commas. So
Smith,Katie,3014,,8.25,6.5,,,10.75,8.5
represents an employee named Katie Smith, whose ID was 3014, and who worked 8.25 hours on Monday, 6.5 hours on Tuesday, 10.75 hours on Friday, and 8.5 hours on Saturday. In this case, we have just one delimiter (comma) and consecutive delimiters (i.e., more than one comma in a row) should not be treated as one. To parse this string, we do:
String employee = "Smith,Katie,3014,,8.25,6.5,,,10.75,8.5";
String delims = "[,]";
String[] tokens = employee.split(delims);
After this code executes, the tokens array will contain ten strings (note the empty strings): "Smith", "Katie", "3014", "", "8.25", "6.5", "", "", "10.75", "8.5"
There is one small wrinkle to be aware of (regardless of how consecutive delimiters are handled): if the string starts with one (or more) delimiters, then the first token will be the empty string ("").
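The empty tokens and the leading-delimiter wrinkle can be checked with a short sketch like the one below. The class name EmptyTokens and the helper splitOnCommas are illustrative names; the inputs other than the employee record are made-up samples.

```java
class EmptyTokens {
    // Splits on single commas, so consecutive commas yield empty tokens.
    static String[] splitOnCommas(String s) {
        return s.split("[,]");
    }

    public static void main(String[] args) {
        String[] record = splitOnCommas("Smith,Katie,3014,,8.25,6.5,,,10.75,8.5");
        System.out.println(record.length);        // number of tokens in the record

        String[] leading = splitOnCommas(",Smith,Katie");
        System.out.println(leading[0].isEmpty()); // the first token is the empty string

        String[] trailing = splitOnCommas("Smith,Katie,,");
        System.out.println(trailing.length);      // trailing empty tokens are dropped
    }
}
```

One related detail worth knowing: the one-argument split discards empty strings at the end of the result, so a record whose last fields are empty comes back shorter than the number of fields.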
When there are several characters being used as delimiters
Example 3
Suppose we have a string containing several English sentences that uses only commas, periods, question marks, and exclamation points as punctuation. We wish to extract the individual words in the string (excluding the punctuation). In this situation we have several delimiters (the punctuation marks as well as spaces) and we want to treat consecutive delimiters as one:
String str = "This is a sentence. This is a question, right? Yes! It is.";
String delims = "[ .,?!]+";
String[] tokens = str.split(delims);
All we had to do was list all the delimiter characters inside the square brackets ( [ ] ).
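As a quick check of Example 3, the sketch below splits the sample sentences and prints each word. The class name PunctuationSplit and the helper words are illustrative names.

```java
class PunctuationSplit {
    // Delimiters: space, period, comma, question mark, exclamation point.
    static String[] words(String text) {
        return text.split("[ .,?!]+");
    }

    public static void main(String[] args) {
        String str = "This is a sentence. This is a question, right? Yes! It is.";
        for (String word : words(str)) {
            System.out.println(word);
        }
    }
}
```

The final period does not produce an extra token, because split drops empty strings at the end of the result.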
Example 4
Suppose we are representing arithmetic expressions using strings and wish to parse out the operands (that is, use the arithmetic operators as delimiters). The arithmetic operators that we will allow are addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (^), and we will not allow parentheses (to make it a little simpler). This situation is not as straightforward as it might seem. There are several characters that have a special meaning when they appear inside [ ]. The characters are ^, -, [, and two &s in a row (&&). In order to use one of these characters as a delimiter, we need to put \\ in front of it (in a Java string literal, \\ produces the single backslash that the regular expression needs):
String expr = "2*x^3 - 4/5*y + z^2";
String delims = "[+\\-*/\\^ ]+"; // so the delimiters are: + - * / ^ space
String[] tokens = expr.split(delims);
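To see which operands Example 4 produces, a sketch along these lines can be run. The class name OperandSplit and the helper operands are illustrative names.

```java
class OperandSplit {
    // Delimiters: + - * / ^ and space; - and ^ are escaped inside the brackets.
    static String[] operands(String expr) {
        return expr.split("[+\\-*/\\^ ]+");
    }

    public static void main(String[] args) {
        for (String operand : operands("2*x^3 - 4/5*y + z^2")) {
            System.out.println(operand);
        }
    }
}
```

The operators and the spaces around them are consumed as delimiters, so only the numbers and variable names remain.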
General template for using split
String s = string_to_parse;
String delims = "[delim_characters]+"; // use + to treat consecutive delims as one;
                                       // omit to treat consecutive delims separately
String[] tokens = s.split(delims);
Parsing Strings in Java
Programmers often face tasks whose solution is not always obvious. One such task is parsing strings. It comes up when reading data from the console, from a file, and from other sources. Most data transmitted over the internet is also in string form. Unfortunately, mathematical operations cannot be performed on strings, so every programmer needs to know exactly how to convert a string to a number in Java. Strings can contain values of various numeric types:
- byte;
- short;
- int;
- long;
- float;
- double.
To extract a numeric value of the required type from a string, use that type's wrapper class:
byte a = Byte.parseByte("42");
short b = Short.parseShort("42");
int c = Integer.parseInt("42");
long d = Long.parseLong("42");
float e = Float.parseFloat("42.0");
double f = Double.parseDouble("42.0");
- If the string passed to an integer parsing method is not a valid integer, a java.lang.NumberFormatException is thrown to report that the string could not be parsed as an integer.
- A NumberFormatException is also thrown if the string contains a space.
- parseInt() can handle negative numbers. For this, the string must begin with the "-" character.
- parseInt() cannot parse a string whose numeric value is outside the range of the int type (-2147483648 .. 2147483647).
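The cases in the list above can be handled by catching the exception, as in the following minimal sketch. The class name SafeParse, the helper parseOrDefault, and the choice to trim surrounding white space before parsing are assumptions made for illustration, not something the article prescribes.

```java
class SafeParse {
    // Returns the parsed int, or the fallback when the text is not a valid int.
    // Surrounding white space is trimmed first (an illustrative choice).
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("-42", 0));        // negative numbers are accepted
        System.out.println(parseOrDefault(" 42 ", 0));       // parsed after trimming
        System.out.println(parseOrDefault("4 2", 0));        // inner space: fallback
        System.out.println(parseOrDefault("2147483648", 0)); // out of int range: fallback
    }
}
```

Returning a fallback value is only one option; depending on the program, reporting the error to the user or rethrowing may be more appropriate.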
How to Parse a String in Java
In the Java language, strings are objects that represent a character sequence. String objects are also immutable, so they cannot be updated after their creation. However, you can parse a string by dividing it into parts and extracting the part you need; each such part is called a token.
This blog will explain how to parse a string in Java. Let’s start!
Parse a String in Java
In Java, there exist three main approaches to parse a string:
- Parse string by using Java split() method
- Parse string by using Java Scanner class
- Parse string by using StringUtils class
We will now discuss each of the above-mentioned approaches in detail.
Method 1: Parse String by Using Java split() Method
In Java, the String class provides a split() method that splits the given string around matches of a delimiter and returns an array of substrings. The original string is left unchanged, and matching is case-sensitive.
The split() method has two variations: split(String regex) and split(String regex, int limit).
Have a look at the given examples to know more about the usage of the split() method.
Example 1: Parsing String in Java Using split(regular-expression/delimiter) variant
In this example, we will use the split(regular-expression/delimiter) variant of the split() method. A regular expression is passed as the argument to this method. Wherever the expression matches, the string is divided; if it matches nowhere, the returned array contains the whole original string as its only element.
Here, we have a string named stg. The split() method accepts many kinds of delimiters, such as a letter that occurs in the string, a special character, or a full regular expression. In this example the string is split on white space. The resulting tokens are printed with a for loop, one per line, which shows that split() has divided the string at each white space.
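The steps of this example can be sketched as follows. The original listing is not available, so the class name SplitOnWhitespace, the helper parse, and the sample value of stg are assumptions.

```java
class SplitOnWhitespace {
    // Splits the given string at every single space character.
    static String[] parse(String stg) {
        return stg.split(" ");
    }

    public static void main(String[] args) {
        String stg = "Java string parsing example"; // assumed sample text
        for (String token : parse(stg)) {
            System.out.println(token);
        }
    }
}
```

If the delimiter does not occur in the string at all, the returned array has a single element holding the whole string.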
Example 2: Parsing String in Java Using split(regular-expression/delimiter, limit) variant
This variant works almost the same as the one above. The additional limit argument caps the number of substrings that are returned: with a positive limit n, the delimiter is applied at most n - 1 times, and the last substring contains the remainder of the string.
For instance, take a string named stg and call split() with white space as the delimiter and 3 as the limit. The string is split at the first two spaces only, so the method returns three strings, the last of which holds the rest of the original string.
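A minimal sketch of the limit variant is shown below. The class name SplitWithLimit, the helper parse, and the sample text are assumptions, since the original listing is not available.

```java
class SplitWithLimit {
    // Splits on spaces but returns at most `limit` substrings.
    static String[] parse(String stg, int limit) {
        return stg.split(" ", limit);
    }

    public static void main(String[] args) {
        String stg = "one two three four five"; // assumed sample text
        for (String part : parse(stg, 3)) {
            System.out.println(part);
        }
    }
}
```

With a limit of 3, the third element keeps the unsplit remainder, spaces included.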
Method 2: Parse String by Using Java Scanner Class
The Java Scanner class breaks its input into tokens using a delimiter pattern, which is a regular expression (regex) and matches white space by default. A different pattern can be set with the useDelimiter() method.
Example
First, we have a string stng that needs to be parsed. Create an object of the Scanner class and pass stng to its constructor. The delimiter pattern is set using the useDelimiter() method of the Scanner class; here the colon ":" is passed as the pattern, so the scanner ends a token whenever it finds a colon. To obtain all of the tokens in the string, call the hasNext() method in a while loop and print each value returned by next().
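The Scanner steps above might look like the following sketch. The class name ScannerTokens, the helper tokens, and the sample value of stng are assumptions; Scanner, useDelimiter(), hasNext(), and next() are the standard java.util.Scanner API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerTokens {
    // Collects every token that the scanner finds between delimiter matches.
    static List<String> tokens(String stng, String delimiterPattern) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(stng)) {
            scanner.useDelimiter(delimiterPattern);
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String stng = "Welcome:to:Java"; // assumed sample text
        for (String token : tokens(stng, ":")) {
            System.out.println(token);
        }
    }
}
```

The try-with-resources block closes the scanner when parsing is finished, which matters more when the scanner wraps a file or stream than when it wraps a string.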
Method 3: Parse String by Using StringUtils Class
To parse a string with the StringUtils class, first create a Maven project rather than a plain Java project and add the library as a dependency. StringUtils belongs to Apache Commons Lang, so the commons-lang3 artifact (group org.apache.commons) goes into the project's pom.xml file.
Then create a Java file that uses the StringUtils class to parse the string stng. The substringsBetween() method of StringUtils takes the string and two delimiters, here ":" and "!", and returns every substring that lies between an opening ":" and the next "!". The results can be printed with a for loop, which displays each substring found between the two delimiters.
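With Commons Lang on the classpath, the call described above is StringUtils.substringsBetween(stng, ":", "!"). For readers without the dependency, the sketch below approximates that behavior using only the standard library. It is a stand-in written for illustration, not the library's implementation: the class name BetweenDelimiters and the sample text are assumptions, and unlike the library method, which is documented to return null when nothing matches, this version returns an empty array.

```java
import java.util.ArrayList;
import java.util.List;

class BetweenDelimiters {
    // Collects each substring found between an `open` marker and the next `close` marker.
    static String[] substringsBetween(String text, String open, String close) {
        List<String> found = new ArrayList<>();
        int pos = 0;
        while (true) {
            int start = text.indexOf(open, pos);
            if (start < 0) {
                break; // no further opening marker
            }
            start += open.length();
            int end = text.indexOf(close, start);
            if (end < 0) {
                break; // opening marker without a closing one
            }
            found.add(text.substring(start, end));
            pos = end + close.length();
        }
        return found.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String stng = ":Welcome! to :Java!"; // assumed sample text
        for (String part : substringsBetween(stng, ":", "!")) {
            System.out.println(part);
        }
    }
}
```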
We have provided the information related to parsing a string in Java.
Conclusion
To parse a string in Java, you can use the Java String split() method, the Java Scanner class, or the StringUtils class. All three rely on delimiters to divide the string into parts. The split() method is the most commonly used of the three, as it accepts either a plain delimiter or a regular expression, plus an optional limit. This blog explained the methods to parse a string in Java with examples.
About the author
Farah Batool