Java read parse file

Parsing a file with Stream API in Java 8

Streams are everywhere in Java 8. Just look around and you are sure to find them, and java.io.BufferedReader is no exception. Parsing a file in Java 8 with the Stream API is extremely easy. I have a CSV file that I want to read. An example is below:

    username;visited
    jdoe;10
    kolorobot;4

The algorithm for reading the header is as follows:

  • Open a source for reading,
  • Get the first line and parse it,
  • Split line by a separator,
  • Convert the line to list of strings and return.
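
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.Reader;
    import java.io.UncheckedIOException;
    import java.util.Arrays;
    import java.util.List;

    class CsvReader {

        private static final String SEPARATOR = ";";

        private final Reader source;

        CsvReader(Reader source) {
            // store the source; the reader is opened lazily by each read method
            this.source = source;
        }

        List<String> readHeader() {
            try (BufferedReader reader = new BufferedReader(source)) {
                return reader.lines()
                        .findFirst()
                        .map(line -> Arrays.asList(line.split(SEPARATOR)))
                        .get();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }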

Fairly simple. Self-explanatory. Similarly, I created a method to read all records. The algorithm for reading the records is as follows:

  • Open a source for reading,
  • Skip the first line,
  • Split line by a separator,
  • Apply a mapper on each line that maps a line to a list of strings.
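
    import java.util.stream.Collectors;

    class CsvReader {

        // SEPARATOR, source and the constructor are the same as in readHeader's listing

        List<List<String>> readRecords() {
            try (BufferedReader reader = new BufferedReader(source)) {
                return reader.lines()
                        .skip(1) // skip the header line
                        .map(line -> Arrays.asList(line.split(SEPARATOR)))
                        .collect(Collectors.toList());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }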

Nothing fancy here. You may notice that the mapper in both methods is exactly the same. In fact, it can easily be extracted to a variable:

    Function<String, List<String>> mapper = line -> Arrays.asList(line.split(SEPARATOR));
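
As a rough sketch of how the extracted mapper could be shared by both methods, consider the variant below. MappedCsvReader and its static methods are hypothetical names used only for this illustration, and the source is passed in as a parameter instead of being stored in a field.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustration only: a hypothetical variant of the reader with a shared mapper.
class MappedCsvReader {

    private static final String SEPARATOR = ";";

    // The shared mapper: one line of text in, one list of column values out.
    private static final Function<String, List<String>> MAPPER =
            line -> Arrays.asList(line.split(SEPARATOR));

    // Reads only the first line and maps it to a list of column names.
    static List<String> readHeader(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.lines()
                    .findFirst()
                    .map(MAPPER)
                    .orElseThrow(() -> new IllegalStateException("empty source"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Skips the header line and maps every remaining line with the same mapper.
    static List<List<String>> readRecords(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.lines()
                    .skip(1)
                    .map(MAPPER)
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because each method closes the reader it is given, a fresh Reader (for example a new StringReader over the same text) is needed for every call.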

To finish up, I created a simple test.

    public class CsvReaderTest {

        @Test
        public void readsHeader() {
            CsvReader csvReader = createCsvReader();
            List<String> header = csvReader.readHeader();
            assertThat(header)
                    .contains("username")
                    .contains("visited")
                    .hasSize(2);
        }

        @Test
        public void readsRecords() {
            CsvReader csvReader = createCsvReader();
            List<List<String>> records = csvReader.readRecords();
            assertThat(records)
                    .contains(Arrays.asList("jdoe", "10"))
                    .contains(Arrays.asList("kolorobot", "4"))
                    .hasSize(2);
        }

        private CsvReader createCsvReader() {
            try {
                Path path = Paths.get("src/test/resources", "sample.csv");
                Reader reader = Files.newBufferedReader(path, Charset.forName("UTF-8"));
                return new CsvReader(reader);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }


How to read and parse a CSV file in Java

A Comma Separated Values (CSV) file is a simple text file that stores data in a tabular format, where columns are separated by a delimiter (usually a comma or a tab). These files are typically used for importing and exporting data between servers and applications.

In my previous articles, I wrote about reading and writing CSV files using core Java, OpenCSV, Apache Commons CSV, and Spring Boot.

In this article, we shall look at different ways to read and parse a CSV file in Java.

Here is an example of a simple CSV file that uses a comma ( , ) as a delimiter to separate column values and doesn’t contain any double quotes:

    1,John Doe,john@example.com,AE
    2,Alex Jones,alex@example.com,DE
    3,Jovan Lee,jovan@example.com,FR
    4,Greg Hover,greg@example.com,US

To read and parse a simple CSV like the above that doesn’t contain the delimiter inside column values, core Java classes can be used. You can either use the BufferedReader class or the Scanner class to easily read the file in Java.

Since CSV is just a plain-text file, the BufferedReader class can be used to read it line by line. You can then use the String.split() method to split each line by comma to convert it into columns. Here is an example:

    // create a reader
    try (BufferedReader br = Files.newBufferedReader(Paths.get("users.csv"))) {

        // CSV file delimiter
        String DELIMITER = ",";

        // read the file line by line
        String line;
        while ((line = br.readLine()) != null) {

            // convert line into columns
            String[] columns = line.split(DELIMITER);

            // print all columns
            System.out.println("User[" + String.join(", ", columns) + "]");
        }

    } catch (IOException ex) {
        ex.printStackTrace();
    }

Here is the output:

    User[1, John Doe, john@example.com, AE]
    User[2, Alex Jones, alex@example.com, DE]
    User[3, Jovan Lee, jovan@example.com, FR]
    User[4, Greg Hover, greg@example.com, US]

Another way of reading and parsing a CSV file in core Java is using the Scanner class. This class converts its input into tokens using a delimiter pattern. The resulting tokens may then be converted into values of different types using different next() methods. Here is an example that shows how you can use Scanner to read and parse a CSV file:

    // create scanner instance
    try (Scanner scanner = new Scanner(Paths.get("users.csv").toFile())) {

        // CSV file delimiter
        String DELIMITER = ",";

        // set comma as delimiter
        scanner.useDelimiter(DELIMITER);

        // read all fields (line breaks are not delimiters here,
        // so they stay inside the tokens and are printed as-is)
        while (scanner.hasNext()) {
            System.out.print(scanner.next() + " ");
        }

    } catch (IOException ex) {
        ex.printStackTrace();
    }

Here is the output:

    1 John Doe john@example.com AE
    2 Alex Jones alex@example.com DE
    3 Jovan Lee jovan@example.com FR
    4 Greg Hover greg@example.com US
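
To make the point about typed next() methods more concrete, here is a minimal sketch that reads the ID column with nextInt() and the remaining columns with next(). It assumes the four-column layout of the sample file above. TypedScannerSketch and parse are hypothetical names, and the delimiter pattern is widened to also match line breaks so that each value becomes its own token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Illustration only: a hypothetical helper, not part of any library.
class TypedScannerSketch {

    // Parses "id,name,email,country" rows and returns them as "id|name|email|country".
    static List<String> parse(String csv) {
        List<String> rows = new ArrayList<>();
        try (Scanner scanner = new Scanner(csv)) {
            // a comma or a line break (\n or \r\n) ends a token
            scanner.useDelimiter(",|\\r?\\n");
            while (scanner.hasNextInt()) {
                int id = scanner.nextInt();       // token converted to an int
                String name = scanner.next();     // remaining tokens kept as strings
                String email = scanner.next();
                String country = scanner.next();
                rows.add(id + "|" + name + "|" + email + "|" + country);
            }
        }
        return rows;
    }
}
```

The loop stops at the first token that is not an integer where an ID is expected, so a header row or a malformed line would end parsing early. A real reader would need to decide how to report that.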

OpenCSV is a popular library for reading, writing, parsing, serializing, and deserializing CSV files in Java. This library is a good choice for handling different CSV formats, delimiters, and special characters. To add OpenCSV support to your Gradle project, add the following dependency to the build.gradle file:

    implementation 'com.opencsv:opencsv:5.0'

For a Maven project, add the following dependency to the pom.xml file:

    <dependency>
        <groupId>com.opencsv</groupId>
        <artifactId>opencsv</artifactId>
        <version>5.0</version>
    </dependency>

The following example demonstrates how you can read and parse a CSV file named users.csv using OpenCSV:

    // create a csv reader
    try (Reader reader = Files.newBufferedReader(Paths.get("users.csv"));
         CSVReader csvReader = new CSVReader(reader)) {

        // read one record at a time
        String[] record;
        while ((record = csvReader.readNext()) != null) {
            System.out.println("User[" + String.join(", ", record) + "]");
        }

    } catch (IOException | CsvValidationException ex) {
        ex.printStackTrace();
    }

Here is the output:

    User[1, John Doe, john@example.com, AE]
    User[2, Alex Jones, alex@example.com, DE]
    User[3, Jovan Lee, jovan@example.com, FR]
    User[4, Greg Hover, greg@example.com, US]

Apache Commons CSV is another 3rd-party library for reading and parsing CSV files in Java. It provides several ways to read CSV files in different formats. For a Gradle project, add the following dependency to the build.gradle file to import Commons CSV:

    implementation 'org.apache.commons:commons-csv:1.7'

For a Maven project, add the following dependency to the pom.xml file:

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-csv</artifactId>
        <version>1.7</version>
    </dependency>

Here is an example that shows how you can use the Apache Commons CSV library to read and parse the contents of a CSV file in Java:

    // create a reader
    try (Reader reader = Files.newBufferedReader(Paths.get("users.csv"))) {

        // read csv file
        Iterable<CSVRecord> records = CSVFormat.DEFAULT.parse(reader);
        for (CSVRecord record : records) {
            System.out.println("Record #: " + record.getRecordNumber());
            System.out.println("ID: " + record.get(0));
            System.out.println("Name: " + record.get(1));
            System.out.println("Email: " + record.get(2));
            System.out.println("Country: " + record.get(3));
        }

    } catch (IOException ex) {
        ex.printStackTrace();
    }

Here is the output:

    Record #: 1
    ID: 1
    Name: John Doe
    Email: john@example.com
    Country: AE
    Record #: 2
    ID: 2
    Name: Alex Jones
    Email: alex@example.com
    Country: DE
    Record #: 3
    ID: 3
    Name: Jovan Lee
    Email: jovan@example.com
    Country: FR
    Record #: 4
    ID: 4
    Name: Greg Hover
    Email: greg@example.com
    Country: US

Check out the Apache Commons CSV tutorial for a deeper understanding of how it works and how you can use it to read and write different CSV formats.

That’s all for reading and parsing a CSV file in Java. We talked about different ways of reading and parsing a CSV file, including core Java and 3rd-party libraries like OpenCSV and Apache Commons CSV. For simple CSV file formats where column values do not contain the delimiter itself, core Java is a good choice. For more complex CSV file formats, you should rely on a 3rd-party library like OpenCSV or Apache Commons CSV for correctly parsing the data. Personally, I prefer to use OpenCSV because it supports multiple CSV formats, special characters, and more. If you want to create and download a CSV file in a Spring Boot application, check out this excellent tutorial I wrote a while ago.
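
To illustrate why a plain String.split() only suits simple files, the sketch below compares it with a small quote-aware splitter on a line whose name column contains a comma. QuotedFieldSketch and its methods are made-up names for this example. The splitter handles only double-quoted fields on a single line and is no substitute for the libraries discussed above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a hypothetical helper, far from a complete CSV parser.
class QuotedFieldSketch {

    // Naive approach: splits on every comma, including commas inside quotes.
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    // Quote-aware approach: commas inside double quotes are kept, and a
    // doubled quote ("") inside a quoted field becomes a single quote.
    static List<String> quoteAwareSplit(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field
        return fields;
    }
}
```

For the line `1,"Doe, John",john@example.com,AE`, naiveSplit returns five pieces because it cuts the name in two, while quoteAwareSplit returns the four intended columns. Fields that span several lines, custom quote characters, and malformed input are exactly the cases where a dedicated library earns its place.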


Parse and Read a CSV File in Java

A CSV file is used to store tabular data in plain-text form. A comma delimiter is used to identify and separate different data tokens in the CSV file.

  • CSV (Comma Separated Values) files are used by consumers, businesses, and scientific applications. One of their most common uses is moving tabular data between programs that natively operate on incompatible formats.
  • CSV data is popular because so many programs and languages support some variation of CSV at least as an alternative import/export format.

In Java, there are different ways of reading and parsing CSV files. Let us discuss some of the best approaches:

1. Using OpenCSV Library

OpenCSV is a brilliant library for operating on CSV files. It has the following features:

  • Reading arbitrary numbers of values per line
  • Ignoring commas in quoted elements
  • Handling entries that span multiple lines
  • Configurable separator and quote characters
  • Read all the entries at once, or use an Iterator-style model

Import the latest version of OpenCSV into project dependencies.

Example 1: Reading the CSV File line by line into String[]

In the given example, we use the CSVReader class from the OpenCSV library, which wraps a FileReader for reading the actual CSV file. The file uses a comma as the delimiter.

  • Using the reader.readNext() , we read the CSV file line by line.
  • It throws IOException if an error occurs in reading the file.
  • It throws CsvValidationException if the read line is not a valid CSV string.
  • When all the lines are read, readNext() method returns null and the program terminates.

    try (CSVReader reader = new CSVReader(new FileReader("SampleCSVFile.csv"))) {
        String[] nextLine;

        // Read one line at a time
        while ((nextLine = reader.readNext()) != null) {
            // Use the tokens as required
            System.out.println(Arrays.toString(nextLine));
        }
    } catch (IOException | CsvValidationException e) {
        e.printStackTrace();
    }

2. Using Super CSV Library

Super CSV aims to be the foremost, fastest, and most programmer-friendly free CSV package for Java. It supports a very long list of useful features out of the box, such as:

  • Ability to read and write data as POJO classes
  • Automatic encoding and decoding of special characters
  • Custom delimiter, quote character and line separator
  • Support for cell processors to process each token in a specific manner
  • Ability to apply one or more constraints, such as number ranges, string lengths or uniqueness
  • Ability to process CSV data from files, strings, streams and even zip files

Add the latest version of Super CSV to the project.

    <dependency>
        <groupId>net.sf.supercsv</groupId>
        <artifactId>super-csv</artifactId>
        <version>2.4.0</version>
    </dependency>

Example 2: Reading the CSV File into POJO

We will read the following CSV file.

    CustomerId,CustomerName,Country,PinCode,Email
    10001,Lokesh,India,110001,abc@gmail.com
    10002,John,USA,220002,def@gmail.com
    10003,Blue,France,330003,ghi@gmail.com

The corresponding POJO class is:
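
A minimal sketch of such a bean follows. The field types are an assumption based on the cell processors used in the reading example (an Integer for CustomerId, a Long for PinCode), and the sketch also assumes that Super CSV resolves each column to a setter named after it, such as setCustomerId for the CustomerId column. The class in the original project may differ.

```java
// Illustration only: field types and accessor names are assumptions.
// For use with Super CSV, declare the class public in its own Customer.java file.
class Customer {

    private Integer customerId;
    private String customerName;
    private String country;
    private Long pinCode;
    private String email;

    public Integer getCustomerId() { return customerId; }
    public void setCustomerId(Integer customerId) { this.customerId = customerId; }

    public String getCustomerName() { return customerName; }
    public void setCustomerName(String customerName) { this.customerName = customerName; }

    public String getCountry() { return country; }
    public void setCountry(String country) { this.country = country; }

    public Long getPinCode() { return pinCode; }
    public void setPinCode(Long pinCode) { this.pinCode = pinCode; }

    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }

    @Override
    public String toString() {
        return "Customer [customerId=" + customerId
                + ", customerName=" + customerName
                + ", country=" + country
                + ", pinCode=" + pinCode
                + ", email=" + email + "]";
    }
}
```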

Remember that the column names should match up exactly with the bean’s field names, and the bean has the appropriate setters defined for each field.

    import java.io.FileReader;
    import java.io.IOException;

    import org.supercsv.cellprocessor.Optional;
    import org.supercsv.cellprocessor.ParseInt;
    import org.supercsv.cellprocessor.ParseLong;
    import org.supercsv.cellprocessor.constraint.NotNull;
    import org.supercsv.cellprocessor.constraint.StrRegEx;
    import org.supercsv.cellprocessor.ift.CellProcessor;
    import org.supercsv.io.CsvBeanReader;
    import org.supercsv.io.ICsvBeanReader;
    import org.supercsv.prefs.CsvPreference;

    public class ReadCSVFileExample {

        static final String CSV_FILENAME = "data.csv";

        public static void main(String[] args) throws IOException {
            try (ICsvBeanReader beanReader = new CsvBeanReader(
                    new FileReader(CSV_FILENAME), CsvPreference.STANDARD_PREFERENCE)) {
                // the header elements are used to map the values to the bean
                final String[] headers = beanReader.getHeader(true);
                // final String[] headers = new String[] {"CustomerId", "CustomerName", "Country", "PinCode", "Email"};
                final CellProcessor[] processors = getProcessors();

                Customer customer;
                while ((customer = beanReader.read(Customer.class, headers, processors)) != null) {
                    System.out.println(customer);
                }
            }
        }

        /**
         * Sets up the processors used for the examples.
         */
        private static CellProcessor[] getProcessors() {
            final String emailRegex = "[a-z0-9\\._]+@[a-z0-9\\.]+";
            StrRegEx.registerMessage(emailRegex, "must be a valid email address");

            final CellProcessor[] processors = new CellProcessor[] {
                new NotNull(new ParseInt()),    // CustomerId
                new NotNull(),                  // CustomerName
                new NotNull(),                  // Country
                new Optional(new ParseLong()),  // PinCode
                new StrRegEx(emailRegex)        // Email
            };
            return processors;
        }
    }

3. Using Scanner

The Scanner class breaks its input into tokens using a specified delimiter pattern. The default delimiter is whitespace.

  • We can use a separate Scanner to read lines, and another scanner to parse each line into tokens. This approach may not be useful for large files because it is creating one scanner instance per line.
  • We can use the delimiter comma to parse the CSV file.
  • The CSV tokens may then be converted into values of different datatypes using the various next() methods.

Example 3: Parsing a CSV file using Scanner

    try (Scanner scanner = new Scanner(new File("SampleCSVFile.csv"))) {

        // Read line
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();

            // Scan the line for tokens
            try (Scanner rowScanner = new Scanner(line)) {
                rowScanner.useDelimiter(",");
                while (rowScanner.hasNext()) {
                    // read from the row scanner, not the outer file scanner
                    System.out.print(rowScanner.next());
                }
            }
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }

4. Using BufferedReader and String.split()

In this approach, we use BufferedReader to read the file line by line. Then the String.split() function is used to get tokens from the current line based on provided delimiter as the method parameter.

It is useful for small strings or small files.

Example 4: Splitting the CSV String or CSV File

In the given example, we are reading a file line by line. Then each line is split into tokens with a delimiter comma.

    try (BufferedReader fileReader = new BufferedReader(new FileReader("SampleCSVFile.csv"))) {
        String line = "";

        // Read the file line by line
        while ((line = fileReader.readLine()) != null) {
            // Get all tokens available in line
            String[] tokens = line.split(",");

            // Verify tokens
            System.out.println(Arrays.toString(tokens));
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
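
One detail of String.split() is worth keeping in mind with this approach: with the default limit it drops trailing empty strings, so a row whose last columns are blank yields fewer tokens than the header has columns. The sketch below shows the difference a negative limit makes. SplitLimitSketch is a made-up class name for this illustration.

```java
// Illustration only: a hypothetical helper showing the effect of split's limit.
class SplitLimitSketch {

    // Default behaviour: trailing empty columns are silently dropped.
    static String[] splitDroppingTrailing(String line) {
        return line.split(",");
    }

    // A negative limit keeps every column, including trailing empty ones.
    static String[] splitKeepingAll(String line) {
        return line.split(",", -1);
    }
}
```

For a row such as `10003,Blue,France,,` the first method returns three tokens and the second returns five, two of them empty strings. When the number of columns matters, the second form is usually the safer choice.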

Reading a CSV file is possible with many approaches in Java. As Java does not directly have dedicated APIs for CSV handling, we can rely on open-source libraries such as SuperCSV that are very easy to use and highly configurable.
