- Read UTF-8 Encoded Data in java
- Using Files’s newBufferedReader()
- Using BufferedReader
- Using DataInputStream’s readUTF() method
- Reading and Writing UTF-8 Data into File
- How to Read Files in Java
- Reading Text Files in Java with BufferedReader
- Reading UTF-8 Encoded File in Java with BufferedReader
- Using Java Files Class to Read a File
- Reading Small Files in Java with Files Class
- Reading Large Files in Java with Files Class
- Reading Files with Files.lines()
- Reading Text Files in Java with Scanner
- Reading an Entire File
- Conclusion
- Reading UTF8 data from a file using Java
- Reading UTF data from a file
- Example
- Output
- Example
- Output
Read UTF-8 Encoded Data in Java
In this post, we will see how to read UTF-8 encoded data in Java.
Applications often have to deal with UTF-8 encoded data, whether because of localization or because they process user input.
There are multiple ways to read UTF-8 encoded data in Java.
Using Files’s newBufferedReader()
We can use the newBufferedReader() method of java.nio.file.Files to read UTF-8 data into a String.
Please note that WriteUTF8newBufferWriter.txt was written from this example.
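The approach described above can be sketched as follows. This is a minimal illustration, not the original post's listing: the class name ReadUtf8WithNewBufferedReader and the helper readUtf8 are hypothetical, and the file name is only assumed to sit in the working directory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical class name, used only for this sketch.
class ReadUtf8WithNewBufferedReader {

    // Reads the file as UTF-8 and returns its lines, each terminated by '\n'.
    static String readUtf8(Path path) throws IOException {
        StringBuilder content = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                content.append(line).append('\n');
            }
        }
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed location: the file is expected in the working directory.
        Path path = Paths.get("WriteUTF8newBufferWriter.txt");
        System.out.print(readUtf8(path));
    }
}
```

Passing the charset explicitly keeps the result independent of the platform default encoding.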
Using BufferedReader
We need to pass UTF-8 as the encoding when creating the InputStreamReader.
Please note that UTFDemo.txt was written from this example.
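A possible shape for this approach is sketched below. The class name ReadUtf8WithInputStreamReader and the helper readLines are hypothetical; the helper takes any InputStream so that it can be tried without a file on disk.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class ReadUtf8WithInputStreamReader {

    // Decodes the stream as UTF-8 and returns its lines; closes the stream when done.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumed location: UTFDemo.txt is expected in the working directory.
        for (String line : readLines(new FileInputStream("UTFDemo.txt"))) {
            System.out.println(line);
        }
    }
}
```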
Using DataInputStream’s readUTF() method
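We can use DataInputStream’s readUTF() method to read UTF-8 data from a file. Note that readUTF() expects Java’s modified UTF-8 format, as produced by DataOutputStream’s writeUTF() method, rather than an arbitrary UTF-8 text file.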
Please note that WriteUTFDemo.txt was written from this example.
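Because readUTF() only understands data written by writeUTF(), a round trip is the clearest way to illustrate it. The sketch below is an assumption-laden example: the class name, both helper methods, and the use of a temporary file are hypothetical choices made for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class name, used only for this sketch.
class ReadUtfWithDataInputStream {

    // Writes one string in modified UTF-8, preceded by a 2-byte length.
    static void writeRecord(Path path, String text) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(path))) {
            out.writeUTF(text);
        }
    }

    // Reads back one string previously written with writeUTF().
    static String readRecord(Path path) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(path))) {
            return in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for WriteUTFDemo.txt here.
        Path path = Files.createTempFile("WriteUTFDemo", ".txt");
        writeRecord(path, "UTF-8 Demo");
        System.out.println(readRecord(path));
        Files.delete(path);
    }
}
```

The 2-byte length prefix is why a file produced by a plain text editor generally cannot be read with readUTF().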
That’s all about how to read UTF-8 encoded data in Java.
Reading and Writing UTF-8 Data into File
Many times we need to deal with the UTF-8 encoded file in our application. This may be due to localization needs or simply processing user input out of some requirements.
Even some data sources may provide data in UTF-8 format only. In this Java tutorial, we will learn two very simple examples of reading and writing UTF-8 content from a file.
1. Writing UTF-8 Encoded Data into a File
The following Java example demonstrates how to write UTF-8 encoded data into a file. It passes the UTF-8 character encoding when creating the OutputStreamWriter.
// Requires: import java.io.*; import java.nio.charset.StandardCharsets;
File file = new File("c:\\temp\\test.txt");
try (Writer out = new BufferedWriter(new OutputStreamWriter(
        new FileOutputStream(file), StandardCharsets.UTF_8))) {
    out.append("Howtodoinjava.com")
       .append("\r\n")
       .append("UTF-8 Demo")
       .append("\r\n")
       .append("क्षेत्रफल = लंबाई * चौड़ाई")
       .append("\r\n");
    out.flush();
} catch (Exception e) {
    e.printStackTrace();
}
Before running the example in Eclipse, we need to enable UTF-8 character set support in the IDE, because it is disabled by default. A previous post explains how to do this:
Read: How to compile and run a java program written in another language
2. Reading UTF-8 Encoded Data from a File
We need to pass StandardCharsets.UTF_8 into the InputStreamReader constructor to read data from a UTF-8 encoded file.
// Requires: import java.io.*; import java.nio.charset.StandardCharsets;
File file = new File("c:\\temp\\test.txt");
try (BufferedReader in = new BufferedReader(
        new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
    String str;
    while ((str = in.readLine()) != null) {
        System.out.println(str);
    }
} catch (Exception e) {
    e.printStackTrace();
}
Output
Howtodoinjava.com
UTF-8 Demo
क्षेत्रफल = लंबाई * चौड़ाई
How to Read Files in Java
Throughout the tutorial, we are using a file stored in the src directory where the path to the file is src/file.txt .
Store several lines of text in this file before proceeding.
Note: The examples below simply declare throws IOException to stay short. In real code, handle errors properly to follow good coding practices.
Reading Text Files in Java with BufferedReader
The BufferedReader class reads a character-input stream. It buffers characters in a buffer with a default size of 8192 characters to make the reading process more efficient. If you want to read a file line by line, using BufferedReader is a good choice.
BufferedReader is efficient in reading large files.
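import java.io.*;
public class FileReaderWithBufferedReader {
    public static void main(String[] args) throws IOException {
        String file = "src/file.txt";
        BufferedReader bufferedReader = new BufferedReader(new FileReader(file));
        String curLine;
        while ((curLine = bufferedReader.readLine()) != null) {
            // process the line as you require
            System.out.println(curLine);
        }
        bufferedReader.close();
    }
}
The readLine() method returns null when the end of the file is reached.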
Reading UTF-8 Encoded File in Java with BufferedReader
We can use the BufferedReader class to read a UTF-8 encoded file.
This time, we pass an InputStreamReader object when creating a BufferedReader instance.
import java.io.*;
public class EncodedFileReaderWithBufferedReader {
    public static void main(String[] args) throws IOException {
        String file = "src/fileUtf8.txt";
        BufferedReader bufferedReader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), "UTF-8"));
        String curLine;
        while ((curLine = bufferedReader.readLine()) != null) {
            // process the line as you require
            System.out.println(curLine);
        }
    }
}
Using Java Files Class to Read a File
The Files class, introduced in Java 7 as part of Java NIO, consists entirely of static methods that operate on files.
Using the Files class, you can read the full content of a file into memory at once, which makes it a good choice for smaller files. For larger files, it can hand you a BufferedReader so that you can read line by line.
Let’s see how we can use the Files class in both of these scenarios.
Reading Small Files in Java with Files Class
The readAllLines() method of the Files class reads the whole content of the file and returns the lines as a List of strings.
You can use the Paths class to obtain a Path object for the file, since the methods of the Files class accept a Path.
import java.io.IOException;
import java.nio.file.*;
import java.util.*;
public class SmallFileReaderWithFiles {
    public static void main(String[] args) throws IOException {
        String file = "src/file.txt";
        Path path = Paths.get(file);
        List<String> lines = Files.readAllLines(path);
    }
}
You can use readAllBytes() to retrieve the data stored in the file as a byte array instead of a list of strings.
byte[] bytes = Files.readAllBytes(path);
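When the bytes are known to be UTF-8 text, they can be decoded explicitly. The sketch below is one way to do it, assuming the file is small enough to fit in memory; the class name ReadAllBytesAsUtf8 and the helper readAsUtf8 are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical class name, used only for this sketch.
class ReadAllBytesAsUtf8 {

    // Loads the whole file into memory and decodes the bytes as UTF-8.
    static String readAsUtf8(Path path) throws IOException {
        byte[] bytes = Files.readAllBytes(path);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Same sample path as the surrounding examples.
        System.out.print(readAsUtf8(Paths.get("src/file.txt")));
    }
}
```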
Reading Large Files in Java with Files Class
If you want to read a large file with the Files class, you can use the newBufferedReader() method to obtain an instance of BufferedReader class and read the file line by line using a BufferedReader .
import java.io.*;
import java.nio.file.*;
public class LargeFileReaderWithFiles {
    public static void main(String[] args) throws IOException {
        String file = "src/file.txt";
        Path path = Paths.get(file);
        BufferedReader bufferedReader = Files.newBufferedReader(path);
        String curLine;
        while ((curLine = bufferedReader.readLine()) != null) {
            System.out.println(curLine);
        }
        bufferedReader.close();
    }
}
Reading Files with Files.lines()
Java 8 added the lines() method to the Files class, which exposes the lines of a file as a Stream of strings. The stream is populated lazily as it is consumed, and it must be closed to release the underlying file.
import java.io.IOException;
import java.nio.file.*;
import java.util.stream.Stream;
public class FileReaderWithFilesLines {
    public static void main(String[] args) throws IOException {
        String file = "src/file.txt";
        Path path = Paths.get(file);
        Stream<String> lines = Files.lines(path);
        lines.forEach(s -> System.out.println(s));
        lines.close();
    }
}
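Since the stream returned by Files.lines() holds an open file, one common variation is to wrap it in try-with-resources and pass the charset explicitly. The sketch below counts non-blank lines as an arbitrary illustration; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical class name, used only for this sketch.
class CountLinesWithFilesLines {

    // Streams the file as UTF-8 and counts the lines that contain non-whitespace text.
    static long countNonBlankLines(Path path) throws IOException {
        // try-with-resources closes the stream and the underlying file
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countNonBlankLines(Paths.get("src/file.txt")));
    }
}
```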
Reading Text Files in Java with Scanner
The Scanner class breaks the content of a file into parts using a given delimiter and reads it part by part. This approach is best suited for reading content that is separated by a delimiter.
For example, the Scanner class is ideal for reading a list of integers separated by white spaces or a list of strings separated by commas.
The default delimiter of the Scanner class is whitespace. But you can set the delimiter to another character or a regular expression. It also has various next methods, such as next() , nextInt() , nextLine() , and nextByte() , to convert content into different types.
import java.io.IOException;
import java.util.Scanner;
import java.io.File;
public class FileReaderWithScanner {
    public static void main(String[] args) throws IOException {
        String file = "src/file.txt";
        Scanner scanner = new Scanner(new File(file));
        scanner.useDelimiter(" ");
        while (scanner.hasNext()) {
            String next = scanner.next();
            System.out.println(next);
        }
        scanner.close();
    }
}
In the above example, we set the delimiter to a single space and use the next() method to read the next part of the content separated by that delimiter.
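The comma-separated case mentioned earlier can be sketched in the same way. This example reads from a String rather than a file to stay self-contained; the class name, the parse helper, and the choice of a delimiter pattern that also swallows surrounding whitespace are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical class name, used only for this sketch.
class CommaSeparatedIntegers {

    // Parses integers separated by commas, ignoring whitespace around each comma.
    static List<Integer> parse(String text) {
        List<Integer> values = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useLocale(Locale.ROOT);        // avoid locale-specific number formats
            scanner.useDelimiter("\\s*,\\s*");     // a regular expression, not a plain character
            while (scanner.hasNextInt()) {
                values.add(scanner.nextInt());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("10, 20,30")); // prints [10, 20, 30]
    }
}
```

The loop stops at the first token that is not an integer, so malformed input yields a shorter list instead of an exception.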
Reading an Entire File
You can use the Scanner class to read the entire file at once without running a loop. You have to pass “\\Z” as the delimiter for this.
scanner.useDelimiter("\\Z");
System.out.println(scanner.next());
scanner.close();
Conclusion
As you saw in this tutorial, Java offers many methods that you can choose from according to the nature of the task at your hand to read text files. You can use BufferedReader to read large files line by line.
If you want to read a file that has its content separated by a delimiter, use the Scanner class.
Also you can use Java NIO Files class to read both small and large files.
Reading UTF8 data from a file using Java
In general, data is stored in a computer in the form of bits (1 or 0). Various encoding schemes are available, each specifying the bytes that represent each character.
Unicode (UTF) − UTF stands for Unicode Transformation Format. The standard is developed by The Unicode Consortium. If you want to create documents that use characters from multiple character sets, you can do so with a single Unicode character encoding. It provides three encoding forms.
- UTF-8 − It comes in 8-bit units (bytes); a character in UTF-8 can be from 1 to 4 bytes long, making UTF-8 variable width.
- UTF-16 − It comes in 16-bit units; a character can be 1 or 2 units long, making UTF-16 variable width.
- UTF-32 − It comes in 32-bit units. It is a fixed-width format, so every character is exactly one 32-bit unit long.
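The width differences listed above can be checked with a short sketch. The class and method names are hypothetical. UTF-16BE is used so that no byte order mark is counted, and the UTF-32 size is computed as four bytes per code point instead of relying on a UTF-32 charset being installed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class EncodedLengths {

    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes in UTF-16 (big-endian, so no byte order mark is added).
    static int utf16Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // UTF-32 is fixed width: four bytes per Unicode code point.
    static int utf32Bytes(String text) {
        return text.codePointCount(0, text.length()) * 4;
    }

    public static void main(String[] args) {
        // U+0041, U+00E9, U+0915 and U+1F600, written as escapes to avoid source-encoding issues
        String[] samples = { "A", "\u00e9", "\u0915", "\uD83D\uDE00" };
        for (String s : samples) {
            System.out.println(utf8Bytes(s) + " " + utf16Bytes(s) + " " + utf32Bytes(s));
        }
        // prints:
        // 1 2 4
        // 2 2 4
        // 3 2 4
        // 4 4 4
    }
}
```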
Reading UTF data from a file
The readUTF() method of the java.io.DataInputStream class reads data that is in modified UTF-8 encoding into a String and returns it. Therefore, to read UTF-8 data from a file −
- Instantiate the FileInputStream class by passing a String value representing the path of the required file as a parameter.
- Instantiate the DataInputStream class by passing the FileInputStream object created above as a parameter.
- Read UTF data from the DataInputStream object using the readUTF() method.
Example
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
public class UTF8Example {
    public static void main(String args[]) {
        StringBuffer buffer = new StringBuffer();
        try {
            //Instantiating the FileInputStream class (the backslash must be escaped)
            FileInputStream fileIn = new FileInputStream("D:\\test.txt");
            //Instantiating the DataInputStream class
            DataInputStream inputStream = new DataInputStream(fileIn);
            //Reading UTF data from the DataInputStream
            while (inputStream.available() > 0) {
                buffer.append(inputStream.readUTF());
            }
        } catch (EOFException ex) {
            System.out.println(ex.toString());
        } catch (IOException ex) {
            System.out.println(ex.toString());
        }
        System.out.println("Contents of the file: " + buffer.toString());
    }
}
Output
Contents of the file: టుటోరియల్స్ పాయింట్ కి స్వాగతిం
The newBufferedReader() method of the java.nio.file.Files class accepts a Path object representing the path of the file and a Charset object representing the encoding of the character sequences to be read, and returns a BufferedReader object that can read data in the specified encoding.
The value for the Charset can be StandardCharsets.UTF_8, StandardCharsets.UTF_16LE, StandardCharsets.UTF_16BE, StandardCharsets.UTF_16, StandardCharsets.US_ASCII, or StandardCharsets.ISO_8859_1.
Therefore, to read UTF-8 data from a file −
- Create/get an object of the Path class representing the required path using the get() method of the java.nio.file.Paths class.
- Create/get a BufferedReader object that can read UTF-8 data by passing the Path object created above and StandardCharsets.UTF_8 as parameters.
- Read the contents of the file using the readLine() method of the BufferedReader object.
Example
import java.io.BufferedReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
public class UTF8Example {
    public static void main(String args[]) throws Exception {
        //Getting the Path object (the backslash must be escaped)
        String filePath = "D:\\samplefile.txt";
        Path path = Paths.get(filePath);
        //Creating a BufferedReader object
        BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
        //Reading the UTF-8 data from the file line by line
        StringBuffer buffer = new StringBuffer();
        String line;
        while ((line = reader.readLine()) != null) {
            buffer.append(line);
        }
        reader.close();
        System.out.println("Contents of the file: " + buffer.toString());
    }
}
Output
Contents of the file: టుటోరియల్స్ పాయింట్ కి స్వాగతిం
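The Charset argument described earlier matters because the same bytes decode to different text under different charsets. The following sketch, with a hypothetical class name and helper, decodes the two UTF-8 bytes of the character U+00E9 once correctly and once as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class CharsetMismatchDemo {

    // Decodes the given bytes using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+00E9 is encoded in UTF-8 as the two bytes 0xC3 0xA9
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8, StandardCharsets.UTF_8);       // one character: U+00E9
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);  // two characters: U+00C3 U+00A9

        System.out.println(right.length()); // prints 1
        System.out.println(wrong.length()); // prints 2
    }
}
```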