Java Virtual Thread vs Platform Thread Performance in Big Data Engineering
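Virtual threads, developed under Project Loom and finalized in JDK 21, are lightweight threads that are created and scheduled by the Java runtime rather than by the operating system. Many virtual threads share a small pool of OS threads (called carrier threads), and a virtual thread occupies a carrier only while it is actually running, so they use resources more effectively.
Platform (normal) threads are thin wrappers around native operating system threads. Each platform thread maps one-to-one to an OS thread and is scheduled by the operating system.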
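To make the distinction concrete, here is a minimal sketch (assuming JDK 21 or later) that starts one thread of each kind and asks the running thread whether it is virtual. The class name `ThreadKinds` and the helper `ranOnVirtualThread` are made up for this illustration.

```java
public class ThreadKinds {
    // Starts a thread of the requested kind, waits for it to finish,
    // and reports whether the task ran on a virtual thread.
    public static boolean ranOnVirtualThread(boolean useVirtual) throws InterruptedException {
        boolean[] result = new boolean[1];
        Runnable task = () -> result[0] = Thread.currentThread().isVirtual();
        Thread thread = useVirtual
                ? Thread.ofVirtual().start(task)   // scheduled by the JVM
                : Thread.ofPlatform().start(task); // backed one-to-one by an OS thread
        thread.join(); // join() makes the write to result[0] visible here
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("virtual builder  -> isVirtual = " + ranOnVirtualThread(true));
        System.out.println("platform builder -> isVirtual = " + ranOnVirtualThread(false));
    }
}
```

Both builders return an ordinary `java.lang.Thread`, so existing code that works with `Thread` objects does not need to change.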
1) Performance Overhead (Memory)
Virtual threads have low overhead: they use little memory, and switching between them is cheap because it happens inside the JVM rather than in the kernel. This makes them more effective for high-concurrency, I/O-bound applications.
Platform threads are heavyweight: each one reserves its own stack memory, and switching between them requires a kernel context switch, which is comparatively expensive.
2) Performance Overhead (Scalability)
Virtual threads scale very well. Because they avoid the high memory and context-switching costs of traditional threads, a single JVM can support millions of concurrent tasks.
Platform threads do not scale well for applications with a large number of concurrent activities, because of their significant per-thread overhead.
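As a rough illustration of the scalability point, the sketch below (assuming JDK 21 or later) gives each of 10,000 blocking tasks its own virtual thread. The class name `ManyVirtualThreads`, the task count, and the 100 ms sleep that stands in for blocking I/O are all assumptions chosen for the example.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ManyVirtualThreads {
    // Runs taskCount blocking tasks, one virtual thread each,
    // and returns how many of them finished.
    public static int runBlockingTasks(int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        // try-with-resources calls close(), which waits for all submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(Duration.ofMillis(100)); // stands in for blocking I/O
                        completed.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runBlockingTasks(10_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks completed in " + elapsedMs + " ms");
    }
}
```

While a virtual thread sleeps it releases its carrier thread, which is why a handful of OS threads can serve this many blocked tasks at once.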
Let's take a quick look with some sample code. We have a large data file that needs to be processed quickly, and we will time the processing with Java virtual threads and with conventional platform threads to compare the two.
First, produce a sample big-data CSV file with one million records of 10 columns each:
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

public class CsvGenerator {
    private static final String CSV_FILE_PATH = "./customer_data/csv_file_0135.csv";
    private static final int NUM_RECORDS = 1000000; // 1 million records
    private static final int NUM_COLUMNS = 10;
    // One shared generator instead of a new Random for every line
    private static final Random RANDOM = new Random();

    // Builds one CSV line of numColumns random integers in the range 0..9999
    private static String generateRandomCsvLine(int numColumns) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < numColumns; i++) {
            line.append(RANDOM.nextInt(10000));
            if (i < numColumns - 1) {
                line.append(",");
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        Path csvPath = Path.of(CSV_FILE_PATH);
        try {
            // FileWriter does not create missing directories, so create them first
            Files.createDirectories(csvPath.getParent());
            try (BufferedWriter writer = new BufferedWriter(new FileWriter(csvPath.toFile()))) {
                for (int i = 0; i < NUM_RECORDS; i++) {
                    writer.write(generateRandomCsvLine(NUM_COLUMNS));
                    writer.newLine();
                }
            }
            System.out.println("Customer Data generated successfully.");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

The above program produces a sample data file in which every line holds 10 comma-separated random integers between 0 and 9999.
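Before timing anything, it can be worth confirming that the generated file has the expected shape. The following is a small sketch of such a check. The class name `CsvShapeCheck` and the method `countValidatedRows` are hypothetical and not part of the original program.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CsvShapeCheck {
    // Returns the number of lines in the file, failing fast if any line
    // does not have exactly expectedColumns comma-separated fields.
    public static long countValidatedRows(Path csvFile, int expectedColumns) throws IOException {
        long rows = 0;
        try (BufferedReader reader = Files.newBufferedReader(csvFile)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // limit -1 keeps trailing empty fields so they are counted too
                int columns = line.split(",", -1).length;
                if (columns != expectedColumns) {
                    throw new IllegalStateException(
                            "Line " + (rows + 1) + " has " + columns + " columns, expected " + expectedColumns);
                }
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        Path csvFile = Path.of("./customer_data/csv_file_0135.csv");
        System.out.println("Rows: " + countValidatedRows(csvFile, 10));
    }
}
```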

Now that we have the sample data file, let's write some Java code to process it.
First, we need a performance metrics method. It simply records the start time and the end time of a task and prints how long the task took.
private static void performanceMetrics(Runnable task) throws InterruptedException {
    // nanoTime() is a monotonic clock, which suits measuring elapsed time
    long startTime = System.nanoTime();
    task.run();
    long elapsedMs = (System.nanoTime() - startTime) / 1_000_000;
    System.out.println("Time taken: " + elapsedMs + "ms");
}

Then we need to read the CSV data file produced in the previous step:
private static final String CSV_FILE_PATH = "./customer_data/csv_file_0135.csv";
private static final int NUM_RECORDS = 1000000; // Additional check: upper bound on lines read

// Reads the file line by line, stopping at end of file or after NUM_RECORDS lines
private static void readCsv() {
    try (BufferedReader reader = new BufferedReader(new FileReader(CSV_FILE_PATH))) {
        String line;
        int count = 0;
        while ((line = reader.readLine()) != null && count < NUM_RECORDS) {
            count++;
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

The final step is to submit these tasks to virtual threads and to platform threads.
Virtual Thread
private static void readWithVirtualThreads() throws InterruptedException {
    // Creates a new virtual thread for every submitted task
    var executor = Executors.newVirtualThreadPerTaskExecutor();
    for (int i = 0; i < 10; i++) {
        executor.submit(ThreadCompare::readCsv);
    }
    executor.shutdown();
    executor.awaitTermination(1, TimeUnit.HOURS);
}

Platform Thread
private static void readWithPlatformThreads() throws InterruptedException {
    // Fixed pool of 10 platform (OS-backed) threads
    ExecutorService executor = Executors.newFixedThreadPool(10);
    for (int i = 0; i < 10; i++) {
        executor.submit(ThreadCompare::readCsv);
    }
    executor.shutdown();
    executor.awaitTermination(1, TimeUnit.HOURS);
}

Here is the complete source code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Requires JDK 21 or later for Executors.newVirtualThreadPerTaskExecutor()
public class ThreadCompare {
    private static final String CSV_FILE_PATH = "./customer_data/csv_file_0135.csv";
    private static final int NUM_RECORDS = 1000000;

    // Runs the task once and prints the elapsed wall-clock time
    private static void performanceMetrics(Runnable task) throws InterruptedException {
        // nanoTime() is a monotonic clock, which suits measuring elapsed time
        long startTime = System.nanoTime();
        task.run();
        long elapsedMs = (System.nanoTime() - startTime) / 1_000_000;
        System.out.println("Time taken: " + elapsedMs + "ms");
    }

    // Reads the file 10 times in parallel on a fixed pool of 10 platform threads
    private static void readWithPlatformThreads() throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(10);
        for (int i = 0; i < 10; i++) {
            executor.submit(ThreadCompare::readCsv);
        }
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.HOURS);
    }

    // Reads the file 10 times in parallel, one virtual thread per task
    private static void readWithVirtualThreads() throws InterruptedException {
        var executor = Executors.newVirtualThreadPerTaskExecutor();
        for (int i = 0; i < 10; i++) {
            executor.submit(ThreadCompare::readCsv);
        }
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.HOURS);
    }

    // Reads the file line by line, stopping at end of file or after NUM_RECORDS lines
    private static void readCsv() {
        try (BufferedReader reader = new BufferedReader(new FileReader(CSV_FILE_PATH))) {
            String line;
            int count = 0;
            while ((line = reader.readLine()) != null && count < NUM_RECORDS) {
                count++;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Normal Thread Performance:");
        performanceMetrics(() -> {
            try {
                readWithPlatformThreads();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
                e.printStackTrace();
            }
        });
        System.out.println("\nVirtual Thread Performance:");
        performanceMetrics(() -> {
            try {
                readWithVirtualThreads();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
                e.printStackTrace();
            }
        });
    }
}

After executing the above program, it prints the time taken by each approach.



In this run, the normal platform threads took far more time than the Project Loom virtual threads. Keep in mind that the platform-thread test runs first, so its time also includes JVM warm-up and the first, uncached read of the file.
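To make a comparison like this less sensitive to run order, one option is to repeat each variant several times and keep the best time. The sketch below assumes JDK 21 or later. The names `RepeatedTiming`, `runAll`, and `bestOfMillis` are hypothetical, and the 50 ms sleep is only a stand-in for the article's `readCsv()` workload.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class RepeatedTiming {
    // Submits `tasks` copies of `work` to a fresh executor and waits for all of them.
    public static void runAll(Supplier<ExecutorService> executorFactory, int tasks, Runnable work) {
        try (ExecutorService executor = executorFactory.get()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(work);
            }
        } // close() blocks until every submitted task has finished
    }

    // Runs `action` `rounds` times and returns the fastest elapsed time in milliseconds.
    public static long bestOfMillis(int rounds, Runnable action) {
        long best = Long.MAX_VALUE;
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            action.run();
            best = Math.min(best, (System.nanoTime() - start) / 1_000_000);
        }
        return best;
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption); the real readCsv() call would go here.
        Runnable work = () -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int tasks = 10;
        int rounds = 5;
        long platformMs = bestOfMillis(rounds,
                () -> runAll(() -> Executors.newFixedThreadPool(tasks), tasks, work));
        long virtualMs = bestOfMillis(rounds,
                () -> runAll(Executors::newVirtualThreadPerTaskExecutor, tasks, work));
        System.out.println("Platform threads, best of " + rounds + ": " + platformMs + " ms");
        System.out.println("Virtual threads, best of " + rounds + ": " + virtualMs + " ms");
    }
}
```

Taking the best of several rounds means the first, cold round no longer decides the outcome for whichever variant happens to run first.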
In summary, conventional threads are better suited to applications with a moderate number of concurrent tasks, where they keep maintenance and complexity low. Virtual threads, however, are ideal for high-concurrency applications that need low overhead and efficient resource utilization.
Thanks for reading this article and stay tuned :-)