Spring Batch Skip and Retry (Fault-Tolerant Batch Processing Guide)

In real-world batch jobs, failures are inevitable. External APIs time out, records are malformed, and downstream systems reject requests. A production-ready Spring Batch job must handle these failures gracefully — without stopping the entire job.

This guide explains Skip and Retry in Spring Batch using a real, runnable example that simulates API timeouts and bad requests. We’ll go deep into how Spring Batch actually behaves at runtime, not just configuration snippets.

Video walkthrough:
Spring Batch Skip & Retry — Real-Time Execution Explained

TL;DR
✔ Retry transient failures (timeouts, network issues)
✔ Skip permanent failures (bad input, validation errors)
✔ Combine retry + backoff + skip limits for safe batch execution

Why Skip & Retry are critical in production

  • External APIs are unreliable
  • Input data is rarely 100% clean
  • Batch jobs must complete even with partial failures
  • Manual restarts are expensive and error-prone

Without skip and retry, a single bad record can fail an entire batch job — which is unacceptable in enterprise systems.


High-level execution flow

Reader → Processor (retry happens here) → Writer
               ↓
            Skip (after retry exhausted)

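For a retryable exception, Spring Batch always attempts the retries first. Only when the retry attempts are exhausted does it evaluate the skip rules: the item is skipped if the exception is also skippable, and the step fails otherwise. An exception that is skippable but not retryable is skipped without any retry.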
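The ordering described above can be sketched in plain Java. This is only a model of the decision order, not Spring Batch's implementation: the class FaultToleranceSketch, the Outcome enum, and the two predicates are hypothetical names introduced for illustration.

```java
import java.util.function.Predicate;

// Plain-Java model of the decision order only; all names here are hypothetical.
class FaultToleranceSketch {

  enum Outcome { SUCCESS, SKIPPED, FAILED }

  static Outcome handle(Runnable work,
                        int maxAttempts,
                        Predicate<RuntimeException> retryable,
                        Predicate<RuntimeException> skippable) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        work.run();
        return Outcome.SUCCESS;
      } catch (RuntimeException e) {
        last = e;
        if (!retryable.test(e)) {
          break; // not retryable: go straight to the skip decision
        }
      }
    }
    // Retries are exhausted (or were never allowed): now consult the skip rule.
    return skippable.test(last) ? Outcome.SKIPPED : Outcome.FAILED;
  }
}
```

In this model, a failure that is retryable but not skippable ends as FAILED once the attempts run out, which is why the retry list and the skip list need to be designed together.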


Sample input CSV (persons.csv)

id,name,email
1,John Doe,john@yopmail.com
2,Alice,retry@yopmail.com
3,BadRequest User,badrequest1@yopmail.com
4,Test,test@yopmail.com
5,BadRequest User,badrequest2@yopmail.com
6,BadRequest User,badrequest3@yopmail.com
7,BadRequest User,badrequest4@yopmail.com
8,BadRequest User,badrequest5@yopmail.com

Custom exceptions — retryable vs skippable

The most important design decision is which failures are retryable and which are skippable.

public class ApiTimeoutException extends RuntimeException {
  public ApiTimeoutException(String message) {
    super(message);
  }
}
public class BadRequestException extends RuntimeException {
  public BadRequestException(String message) {
    super(message);
  }
}

Exception             Type        Behavior
ApiTimeoutException   Transient   Retry with backoff
BadRequestException   Permanent   Skip immediately
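One way to apply this split at the point where the API response is inspected is to translate the response status into one of the two exception types. The sketch below is an assumption, not part of the original example: the ResponseClassifier class and the particular status codes chosen are hypothetical, and the two exception classes are repeated only so the snippet compiles on its own.

```java
// Repeated here only so the sketch is self-contained.
class ApiTimeoutException extends RuntimeException {
  ApiTimeoutException(String message) { super(message); }
}

class BadRequestException extends RuntimeException {
  BadRequestException(String message) { super(message); }
}

// Hypothetical helper: maps an HTTP status to a retryable or skippable exception.
class ResponseClassifier {

  static void throwIfFailed(int status) {
    switch (status) {
      case 408: // request timeout
      case 503: // service unavailable
      case 504: // gateway timeout
        throw new ApiTimeoutException("transient failure, HTTP " + status);
      case 400: // bad request
      case 422: // unprocessable entity
        throw new BadRequestException("permanent failure, HTTP " + status);
      default:
        break; // success and unclassified codes pass through in this sketch
    }
  }
}
```

Which codes belong in which group depends on the downstream API, so the lists above should be treated as a starting point rather than a rule.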

Processor — where retry really happens

Retry logic should live where failures occur. In this example, the processor calls a simulated external API.

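import org.springframework.batch.item.ItemProcessor;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component
public class PersonProcessor implements ItemProcessor<Person, Person> {

  @Autowired
  private PersonRegistrationProcessor registrationProcessor;

  // Any exception thrown here is handled by the step's retry and skip rules.
  @Override
  public Person process(Person person) {
    registrationProcessor.registerPerson(person.getEmail());
    return person;
  }
}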

If the processor throws an exception, Spring Batch:

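  • Retries the item (if the exception is configured as retryable)
  • Applies the backoff policy between attempts
  • Skips the item once the retry limit is exhausted, provided the exception is also skippable; otherwise the step fails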

Simulating real API failures

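import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.stereotype.Component;

@Component
public class PersonRegistrationProcessor {

  // Number of calls seen so far, keyed by email.
  private final Map<String, Integer> attempts = new ConcurrentHashMap<>();

  public void registerPerson(String email) {

    // Transient failure: the first two calls time out, the third succeeds.
    if ("retry@yopmail.com".equalsIgnoreCase(email)) {
      int count = attempts.merge(email, 1, Integer::sum);
      if (count < 3) {
        throw new ApiTimeoutException("API timeout");
      }
    }

    // Permanent failure: retrying would never help.
    if (email.contains("badrequest")) {
      throw new BadRequestException("400 Bad Request");
    }
  }
}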

This simulates:

  • A transient failure that succeeds on the 3rd attempt
  • A permanent failure (bad input) that is skipped immediately
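To see the two behaviours without starting a job, the simulator can be called directly. The sketch below is a self-contained copy of the logic above with the Spring annotation removed; the nested exception classes and the outcomeOfCall helper are hypothetical additions for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Self-contained copy of the simulator, with nested exceptions so it compiles alone.
class SimulatedRegistration {

  static class ApiTimeoutException extends RuntimeException {
    ApiTimeoutException(String message) { super(message); }
  }

  static class BadRequestException extends RuntimeException {
    BadRequestException(String message) { super(message); }
  }

  private final Map<String, Integer> attempts = new ConcurrentHashMap<>();

  void registerPerson(String email) {
    if ("retry@yopmail.com".equalsIgnoreCase(email)) {
      int count = attempts.merge(email, 1, Integer::sum);
      if (count < 3) {
        throw new ApiTimeoutException("API timeout");
      }
    }
    if (email.contains("badrequest")) {
      throw new BadRequestException("400 Bad Request");
    }
  }

  // Hypothetical helper: reports what a single call did instead of throwing.
  String outcomeOfCall(String email) {
    try {
      registerPerson(email);
      return "ok";
    } catch (ApiTimeoutException e) {
      return "timeout";
    } catch (BadRequestException e) {
      return "bad request";
    }
  }

  public static void main(String[] args) {
    SimulatedRegistration api = new SimulatedRegistration();
    for (int call = 1; call <= 3; call++) {
      System.out.println("retry@yopmail.com call " + call + ": "
          + api.outcomeOfCall("retry@yopmail.com"));
    }
    System.out.println("badrequest1@yopmail.com: "
        + api.outcomeOfCall("badrequest1@yopmail.com"));
  }
}
```

Because the attempt counter is kept per email, the first two calls for retry@yopmail.com time out and the third passes, which lines up with a retry limit of three attempts.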

Step configuration — fault tolerance in action

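import org.springframework.batch.core.Step;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.retry.backoff.FixedBackOffPolicy;
import org.springframework.transaction.PlatformTransactionManager;

@Bean
public Step processingStep(JobRepository jobRepository,
                           PlatformTransactionManager txManager,
                           ItemReader<Person> reader,
                           RegistrationWriter writer,
                           PersonProcessor processor) {

  // Wait 2 seconds between retry attempts.
  FixedBackOffPolicy backOff = new FixedBackOffPolicy();
  backOff.setBackOffPeriod(2000);

  return new StepBuilder("learn-skip-and-retry", jobRepository)
    .<Person, Person>chunk(1, txManager)
    .reader(reader)
    .processor(processor)
    .writer(writer)
    .faultTolerant()
      .retry(ApiTimeoutException.class)   // transient: retry
      .retryLimit(3)                      // up to 3 attempts per item
      .backOffPolicy(backOff)
      .skip(BadRequestException.class)    // permanent: skip
      .skipLimit(4)                       // a 5th skip fails the step
    .build();
}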

Why chunk size = 1?
Each item is processed in its own transaction, so a retry or skip rolls back and affects only that one record.

SkipListener — auditing skipped records

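import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.SkipListener;
import org.springframework.stereotype.Component;

@Component
public class PersonSkipListener implements SkipListener<Person, Person> {

  private static final Logger log = LoggerFactory.getLogger(PersonSkipListener.class);

  // Called once for each item skipped during processing.
  @Override
  public void onSkipInProcess(Person item, Throwable t) {
    log.error("Skipping {} due to {}", item.getEmail(), t.getMessage());
  }
}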

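The listener is only invoked if it is registered on the step, for example with .listener(...) after .faultTolerant(). In production, you can:

  • Persist skipped records to an error table
  • Export them to a failed CSV
  • Trigger alerts when the skip count approaches the skip limit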
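As a rough illustration of the last two ideas, the listener could delegate to a small helper that formats each skipped record as a CSV line and reports when the skip count is getting close to the limit. Everything in this sketch is hypothetical: the SkipAudit class, its method names, and the "one skip away from the limit" threshold are assumptions, and a real job would write the lines to a file or table instead of keeping them in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper a SkipListener could call; not part of Spring Batch.
class SkipAudit {

  private final int skipLimit;
  private final List<String> failedLines = new ArrayList<>();

  SkipAudit(int skipLimit) {
    this.skipLimit = skipLimit;
  }

  // Records one skipped item as "id,email,"reason"" for a failed-records CSV.
  void record(long id, String email, Throwable cause) {
    failedLines.add(id + "," + email + "," + quote(cause.getMessage()));
  }

  // True once the next skip would reach the limit (threshold chosen arbitrarily).
  boolean nearLimit() {
    return failedLines.size() >= skipLimit - 1;
  }

  List<String> failedLines() {
    return List.copyOf(failedLines);
  }

  // Minimal CSV quoting: wrap in double quotes and double any embedded quotes.
  private static String quote(String s) {
    String value = (s == null) ? "" : s;
    return "\"" + value.replace("\"", "\"\"") + "\"";
  }
}
```

Calling record(...) from onSkipInProcess and checking nearLimit() afterwards would give operations a warning before the step fails outright.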

What happens when skipLimit is exceeded?

  • The step fails immediately (Spring Batch raises a SkipLimitExceededException)
  • The job is marked FAILED
  • Remaining records are not processed

This prevents silently ignoring large data quality issues.
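The limit check can be pictured as a simple counter. The sketch below is a plain-Java model under stated assumptions, not Spring Batch code: SkipLimitSketch and firstFailingRecord are hypothetical names, and the email list mirrors the sample CSV with record ids assumed to run from 1 in file order.

```java
import java.util.List;

// Plain-Java model of a skip counter; names are hypothetical.
class SkipLimitSketch {

  // Emails from the sample CSV, in file order (record ids 1 to 8).
  static final List<String> SAMPLE = List.of(
      "john@yopmail.com",
      "retry@yopmail.com",
      "badrequest1@yopmail.com",
      "test@yopmail.com",
      "badrequest2@yopmail.com",
      "badrequest3@yopmail.com",
      "badrequest4@yopmail.com",
      "badrequest5@yopmail.com");

  // Returns the 1-based id of the record that exceeds the limit, or -1 if none does.
  static int firstFailingRecord(List<String> emails, int skipLimit) {
    int skipCount = 0;
    for (int i = 0; i < emails.size(); i++) {
      if (emails.get(i).contains("badrequest")) {
        if (skipCount >= skipLimit) {
          return i + 1; // this skip would exceed the limit: the step fails here
        }
        skipCount++;
      }
    }
    return -1;
  }
}
```

With the sample file, five records contain "badrequest", so under this model a skip limit of 4 is exceeded at record 8, while a limit of 5 lets the run complete. That makes the sample a convenient way to observe the failure case.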


Common mistakes (and how to avoid them)

Mistake                      Why it’s bad                   Better approach
Retrying validation errors   Wastes time                    Skip immediately
Large chunk size             Whole chunk rolls back         Use chunk=1 for APIs
No backoff                   Overloads downstream systems   Add Fixed or Exponential backoff
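To make the difference between fixed and exponential backoff concrete, the wait times for an exponential policy can be computed by hand. This is a sketch of the arithmetic only, loosely modelled on the initial interval, multiplier, and maximum interval settings of an exponential backoff policy; the ExponentialDelays class and its method are hypothetical.

```java
// Hypothetical helper that lists the waits an exponential backoff would produce.
class ExponentialDelays {

  // Each wait is the previous one times the multiplier, capped at maxMillis.
  static long[] delays(long initialMillis, double multiplier, long maxMillis, int retries) {
    long[] out = new long[retries];
    double next = initialMillis;
    for (int i = 0; i < retries; i++) {
      out[i] = (long) Math.min(next, maxMillis);
      next *= multiplier;
    }
    return out;
  }
}
```

For example, an initial wait of 1000 ms, a multiplier of 2.0, and a cap of 5000 ms give waits of 1000, 2000, 4000, and 5000 ms over four retries, whereas a fixed policy would wait the same period every time.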

Conclusion

Spring Batch Skip and Retry allow you to build resilient batch jobs that can tolerate real-world failures without human intervention. When designed correctly, they transform fragile batch pipelines into robust, self-healing systems.

Watch the complete execution walkthrough:
Spring Batch Skip & Retry — Full Video Tutorial
