Spring Batch Skip and Retry — Fault-Tolerant Batch Processing (Complete Guide)
In real-world batch jobs, failures are inevitable. External APIs time out, records are malformed, and downstream systems reject requests. A production-ready Spring Batch job must handle these failures gracefully — without stopping the entire job.
This guide explains Skip and Retry in Spring Batch using a real, runnable example that simulates API timeouts and bad requests. We'll look closely at how Spring Batch actually behaves at runtime, not just at configuration snippets.
Video Walkthrough:
Spring Batch Skip & Retry — Real-Time Execution Explained
✔ Retry transient failures (timeouts, network issues)
✔ Skip permanent failures (bad input, validation errors)
✔ Combine retry + backoff + skip limits for safe batch execution
Why Skip & Retry are critical in production
- External APIs are unreliable
- Input data is rarely 100% clean
- Batch jobs must complete even with partial failures
- Manual restarts are expensive and error-prone
Without skip and retry, a single bad record can fail an entire batch job — which is unacceptable in enterprise systems.
High-level execution flow
Reader → Processor (retry happens here) → Writer
↓
Skip (after retry exhausted)
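For a retryable exception, Spring Batch always attempts retry first. Only when the retry attempts are exhausted does it evaluate the skip rules. An exception that is skippable but not retryable goes straight to the skip check without any retry.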
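The ordering above can be sketched in plain Java. This is only an illustration of the decision order, not Spring Batch's internal code: the class `RetryThenSkipSketch`, its `handle` method, and the two exception types are hypothetical names, exceptions are matched by exact class for simplicity, and the retry limit is assumed to count total attempts including the first.

```java
import java.util.Set;

/** Illustrative only: mimics the retry-then-skip order, not Spring Batch internals. */
final class RetryThenSkipSketch {

    enum Outcome { PROCESSED, SKIPPED, STEP_FAILED }

    // Hypothetical stand-ins for a transient and a permanent failure
    static class TransientFailure extends RuntimeException {}
    static class PermanentFailure extends RuntimeException {}

    /** Assumes retryLimit >= 1 and counts it as total attempts, including the first. */
    static Outcome handle(Runnable work,
                          Set<Class<? extends Exception>> retryable,
                          Set<Class<? extends Exception>> skippable,
                          int retryLimit) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= retryLimit; attempt++) {
            try {
                work.run();
                return Outcome.PROCESSED;
            } catch (RuntimeException e) {
                last = e;
                if (!retryable.contains(e.getClass())) {
                    break; // not retryable: go straight to the skip check
                }
            }
        }
        // Skip rules are consulted only after retry is no longer an option
        return skippable.contains(last.getClass()) ? Outcome.SKIPPED : Outcome.STEP_FAILED;
    }

    public static void main(String[] args) {
        Set<Class<? extends Exception>> retryable = Set.of(TransientFailure.class);
        Set<Class<? extends Exception>> skippable = Set.of(PermanentFailure.class);

        int[] calls = {0};
        // Fails twice, then succeeds on the third attempt
        System.out.println(handle(() -> {
            if (++calls[0] < 3) throw new TransientFailure();
        }, retryable, skippable, 3));

        // Permanent failure: no retry, skipped after one attempt
        System.out.println(handle(() -> { throw new PermanentFailure(); },
                retryable, skippable, 3));

        // Transient failure that never recovers and is not skippable
        System.out.println(handle(() -> { throw new TransientFailure(); },
                retryable, skippable, 3));
    }
}
```

Running `main` prints `PROCESSED`, `SKIPPED`, and `STEP_FAILED` in that order. The last case is worth noting: an exception that is retryable but not skippable fails the step once its retries run out.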
Sample input CSV (persons.csv)
id,name,email
1,John Doe,john@yopmail.com
2,Alice,retry@yopmail.com
3,BadRequest User,badrequest1@yopmail.com
4,Test,test@yopmail.com
5,BadRequest User,badrequest2@yopmail.com
6,BadRequest User,badrequest3@yopmail.com
7,BadRequest User,badrequest4@yopmail.com
8,BadRequest User,badrequest5@yopmail.com
Custom exceptions — retryable vs skippable
The most important design decision is which failures are retryable and which are skippable.
// Transient failure: worth retrying
public class ApiTimeoutException extends RuntimeException {
    public ApiTimeoutException(String message) {
        super(message);
    }
}

// Permanent failure: retrying will not help
public class BadRequestException extends RuntimeException {
    public BadRequestException(String message) {
        super(message);
    }
}
| Exception | Type | Behavior |
|---|---|---|
| ApiTimeoutException | Transient | Retry with backoff |
| BadRequestException | Permanent | Skip immediately |
Processor — where retry really happens
Retry logic should live where failures occur. In this example, the processor calls an external API simulation.
@Component
public class PersonProcessor implements ItemProcessor<Person, Person> {

    @Autowired
    private PersonRegistrationProcessor registrationProcessor;

    @Override
    public Person process(Person person) {
        // Throws ApiTimeoutException or BadRequestException on simulated failures
        registrationProcessor.registerPerson(person.getEmail());
        return person;
    }
}
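If the processor throws an exception, Spring Batch:
- Retries the item if the exception is configured as retryable
- Waits for the backoff period between attempts
- Once the retry limit is reached, skips the item only if the exception is also configured as skippable; otherwise the step fails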
Simulating real API failures
@Component
public class PersonRegistrationProcessor {

    // Number of calls seen so far, keyed by email
    private final Map<String, Integer> attempts = new ConcurrentHashMap<>();

    public void registerPerson(String email) {
        if ("retry@yopmail.com".equalsIgnoreCase(email)) {
            int count = attempts.merge(email, 1, Integer::sum);
            // Time out on the first two calls, succeed on the third
            if (count < 3) {
                throw new ApiTimeoutException("API timeout");
            }
        }
        // Any email containing "badrequest" is rejected as a permanent failure
        if (email.contains("badrequest")) {
            throw new BadRequestException("400 Bad Request");
        }
    }
}
This simulates:
- Retry succeeds on 3rd attempt
- Bad input skipped immediately
Step configuration — fault tolerance in action
@Bean
public Step processingStep(JobRepository jobRepository,
                           PlatformTransactionManager txManager,
                           ItemReader<Person> reader,
                           RegistrationWriter writer,
                           PersonProcessor processor) {
    // Wait 2 seconds between retry attempts
    FixedBackOffPolicy backOff = new FixedBackOffPolicy();
    backOff.setBackOffPeriod(2000);

    return new StepBuilder("learn-skip-and-retry", jobRepository)
            .<Person, Person>chunk(1, txManager)   // one item per transaction
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .faultTolerant()
            .retry(ApiTimeoutException.class)      // transient: retry
            .retryLimit(3)
            .backOffPolicy(backOff)
            .skip(BadRequestException.class)       // permanent: skip
            .skipLimit(4)
            .build();
}
With a chunk size of 1, each item is processed in its own transaction, so a retry or skip affects only that one record.
SkipListener — auditing skipped records
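@Component
public class PersonSkipListener implements SkipListener<Person, Person> {

    // SLF4J logger (assumed; the original snippet leaves `log` undeclared)
    private static final Logger log = LoggerFactory.getLogger(PersonSkipListener.class);

    @Override
    public void onSkipInProcess(Person item, Throwable t) {
        log.error("Skipping {} due to {}", item.getEmail(), t.getMessage());
    }
}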
In production, you can:
- Persist skipped records to an error table
- Export them to a failed CSV
- Trigger alerts when the skip count approaches the skip limit
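As one way to approach the "failed CSV" option, here is a minimal plain-Java sketch of formatting and appending a skipped record. The class `SkippedRecordCsvSketch`, its method names, and the column layout (id, email, exception type, message) are all assumptions made for illustration; in a real job this logic would be called from the skip listener.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical helper: formats skipped records as CSV lines and appends them to a file. */
final class SkippedRecordCsvSketch {

    /** Wraps a field in double quotes and doubles any embedded quotes. */
    static String csvField(String value) {
        String v = value == null ? "" : value;
        return "\"" + v.replace("\"", "\"\"") + "\"";
    }

    /** Assumed column layout: id, email, exception type, exception message. */
    static String toCsvLine(String id, String email, Throwable cause) {
        return String.join(",",
                csvField(id),
                csvField(email),
                csvField(cause.getClass().getSimpleName()),
                csvField(cause.getMessage()));
    }

    /** Appends one line, creating the file on first use. */
    static void append(Path file, String line) throws IOException {
        Files.write(file, List.of(line), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("skipped-", ".csv");
        append(file, toCsvLine("3", "badrequest1@yopmail.com",
                new IllegalStateException("400 Bad Request")));
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

Quoting every field keeps the output parseable even when an exception message contains commas or quotes, which is common for messages returned by remote APIs.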
What happens when skipLimit is exceeded?
- The step fails immediately
- The job is marked FAILED
- Remaining records are not processed
This prevents silently ignoring large data quality issues.
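To see what this means for the sample input, the plain-Java sketch below walks eight records, five of which are bad requests (ids 3, 5, 6, 7 and 8), with a skip limit of 4. It is a simulation, not Spring Batch code: `SkipLimitSketch` and its `run` method are hypothetical, and it assumes a chunk size of 1, so that records written before the failure stay committed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative simulation of skip counting against a skip limit. */
final class SkipLimitSketch {

    static final class Result {
        final List<Integer> written;
        final List<Integer> skipped;
        final int failedAtId; // -1 when every record was handled

        Result(List<Integer> written, List<Integer> skipped, int failedAtId) {
            this.written = written;
            this.skipped = skipped;
            this.failedAtId = failedAtId;
        }

        @Override
        public String toString() {
            return "written=" + written + ", skipped=" + skipped + ", failedAtId=" + failedAtId;
        }
    }

    /** Processes ids 1..recordCount in order; ids in badIds raise a skippable failure. */
    static Result run(int recordCount, Set<Integer> badIds, int skipLimit) {
        List<Integer> written = new ArrayList<>();
        List<Integer> skipped = new ArrayList<>();
        for (int id = 1; id <= recordCount; id++) {
            if (badIds.contains(id)) {
                if (skipped.size() >= skipLimit) {
                    // Limit already used up: the step fails and later records are not processed
                    return new Result(written, skipped, id);
                }
                skipped.add(id);
            } else {
                written.add(id);
            }
        }
        return new Result(written, skipped, -1);
    }

    public static void main(String[] args) {
        System.out.println(run(8, Set.of(3, 5, 6, 7, 8), 4));
    }
}
```

Under these assumptions, records 1, 2 and 4 are written, records 3, 5, 6 and 7 are skipped, and the fifth bad request (record 8) pushes the step over the limit. Raising the limit to 5 would let the same input complete.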
Common mistakes (and how to avoid them)
| Mistake | Why it’s bad | Better approach |
|---|---|---|
| Retrying validation errors | Wastes time | Skip immediately |
| Large chunk size | Whole chunk rolls back | Use chunk=1 for APIs |
| No backoff | Overloads downstream systems | Add Fixed or Exponential backoff |
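To make the backoff row concrete, the sketch below computes the waits that a fixed and an exponential schedule would insert between attempts. It is plain Java arithmetic rather than the Spring Retry policy classes: `BackoffDelaySketch` and its parameters are hypothetical, and it assumes the attempt limit counts the first call, so three attempts mean two waits.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative arithmetic for backoff schedules; not the Spring Retry implementation. */
final class BackoffDelaySketch {

    /** Fixed schedule: the same wait before every retry. */
    static List<Long> fixedDelays(long periodMs, int maxAttempts) {
        List<Long> delays = new ArrayList<>();
        for (int retry = 1; retry < maxAttempts; retry++) {
            delays.add(periodMs);
        }
        return delays;
    }

    /** Exponential schedule: each wait is the previous one times the multiplier, capped at maxMs. */
    static List<Long> exponentialDelays(long initialMs, double multiplier, long maxMs, int maxAttempts) {
        List<Long> delays = new ArrayList<>();
        double next = initialMs;
        for (int retry = 1; retry < maxAttempts; retry++) {
            delays.add(Math.min((long) next, maxMs));
            next *= multiplier;
        }
        return delays;
    }

    public static void main(String[] args) {
        // Matches the 2000 ms fixed backoff with 3 attempts used in this guide
        System.out.println(fixedDelays(2000, 3));
        // Hypothetical exponential settings: 1 s initial, doubling, capped at 5 s
        System.out.println(exponentialDelays(1000, 2.0, 5000, 5));
    }
}
```

With the fixed settings the record that recovers on its third attempt waits roughly 4 seconds in total. An exponential schedule starts faster and backs off harder, which tends to be kinder to a downstream system that is already struggling.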
Conclusion
Spring Batch Skip and Retry allow you to build resilient batch jobs that can tolerate real-world failures without human intervention. When designed correctly, they transform fragile batch pipelines into robust, self-healing systems.