Mastering Skipping Faulty Records in Spring Batch – SkipPolicy Explained

Skipping Faulty Records in Spring Batch – A Complete, Practical, and Modern Guide

Real-world batch systems deal with **imperfect data**: missing fields, malformed CSV rows, incorrect formats, and more. Failing the whole job because of a few bad records is rarely acceptable. This is where Spring Batch’s skip feature becomes invaluable.

In this guide, you’ll learn:

  • Why skipping is needed
  • How SkipPolicy works internally
  • Difference between skip and retry
  • How to skip in reader, processor, and writer
  • How to log and audit skipped items
  • SkipListener explained
  • Real-world production patterns
  • Full working code

📌 Why Skip Faulty Records?

In enterprise batch processing, your input might contain:

  • Invalid emails
  • Missing mandatory fields
  • Unparseable numbers or dates
  • Rows that violate business rules

Instead of failing the entire batch, you want the job to continue with good records while collecting details of the bad ones. This is exactly what Spring Batch is designed to do.


📘 How Spring Batch Skip Mechanism Works Internally


// High-level skip flow (ASCII diagram)

Item Reader ----> Item Processor ----> Item Writer
      |                  |                   |
      |        ❌ Exception occurs?          |
      |--------------------------------------|
      |      Should we skip this item? (SkipPolicy)
      |                 |
      |      Yes → Skip + continue
      |      No  → Fail the step

The job step continues processing if the exception is skippable. If not, the step fails immediately.
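
The flow above can be modeled in a few lines of plain Java. This is only an illustrative sketch, not how Spring Batch is actually implemented (the real framework also deals with chunk transactions and rollback). `SkipFlowSketch` and `SkipDecider` are hypothetical names; `SkipDecider` simply mirrors the shape of `SkipPolicy.shouldSkip`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical model of the skip decision; NOT Spring Batch's real implementation.
final class SkipFlowSketch {

  // Mirrors the shape of SkipPolicy.shouldSkip(Throwable, skipCount) for illustration.
  interface SkipDecider {
    boolean shouldSkip(Throwable t, long skipCount);
  }

  // Holds the items that were "written" and a description of each skipped item.
  record Result<O>(List<O> written, List<String> skipped) {}

  static <I, O> Result<O> run(List<I> items, Function<I, O> processor, SkipDecider decider) {
    List<O> written = new ArrayList<>();
    List<String> skipped = new ArrayList<>();
    for (I item : items) {
      try {
        written.add(processor.apply(item));
      } catch (RuntimeException e) {
        // Ask the decider; a "no" fails the whole run, like a failed step.
        if (!decider.shouldSkip(e, skipped.size())) {
          throw new IllegalStateException("Step failed on item: " + item, e);
        }
        skipped.add(item + " (" + e.getMessage() + ")");
      }
    }
    return new Result<>(written, skipped);
  }

  public static void main(String[] args) {
    List<String> emails = List.of("john@example.com", "invalid-email", "jane@example.com");
    Result<String> result = run(
        emails,
        email -> {
          if (!email.contains("@")) {
            throw new IllegalArgumentException("Invalid email");
          }
          return email;
        },
        (t, count) -> t instanceof IllegalArgumentException && count < 10);
    System.out.println("written=" + result.written() + ", skipped=" + result.skipped());
  }
}
```

With the three sample addresses, the sketch keeps the two valid ones and records `invalid-email` as skipped. A decider that returns `false` makes the run fail on the first bad item instead.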


🔧 Maven Dependencies

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-batch</artifactId>
</dependency>

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>

<dependency>
  <groupId>com.h2database</groupId>
  <artifactId>h2</artifactId>
  <scope>runtime</scope>
</dependency>

⚙️ application.properties

spring.datasource.url=jdbc:h2:mem:testdb
spring.datasource.driver-class-name=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=

spring.h2.console.enabled=true
spring.batch.job.enabled=false

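🧱 Employee Entity

import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.GenerationType;
import jakarta.persistence.Id;

@Entity
public class Employee {
  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;
  private String name;
  private String email;

  // Accessors used by the reader's bean mapping, the processor, and the skip listener.
  public String getName() { return name; }
  public void setName(String name) { this.name = name; }
  public String getEmail() { return email; }
  public void setEmail(String email) { this.email = email; }
}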

📂 Sample CSV File (employees.csv)

name,email
John,john@example.com
InvalidUser,invalid-email
Jane,jane@example.com

🧠 Custom SkipPolicy (Recommended Approach)

Use SkipPolicy when you want reusable, clean, pluggable skip logic.

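import org.springframework.batch.core.step.skip.SkipPolicy;

public class EmailValidationSkipPolicy implements SkipPolicy {

  // Spring Batch 5 (used by the configuration in this guide) passes skipCount as a long.
  @Override
  public boolean shouldSkip(Throwable t, long skipCount) {
    // Skip validation failures only; once 10 records have been skipped, the step fails.
    return t instanceof IllegalArgumentException
           && skipCount < 10;
  }
}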

Processor With Business Validation

The processor is the most common place to validate data. Throw exceptions for invalid rows, and let SkipPolicy decide.

import org.springframework.batch.item.ItemProcessor;

public class EmployeeItemProcessor implements ItemProcessor<Employee, Employee> {

  @Override
  public Employee process(Employee employee) {
    // Reject rows without a plausible email; the SkipPolicy decides whether to skip them.
    if (employee.getEmail() == null || !employee.getEmail().contains("@")) {
      throw new IllegalArgumentException("Invalid email: " + employee.getEmail());
    }
    return employee;
  }
}
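
The `contains("@")` check in the processor above also accepts values such as `@` or `a@`. As a minimal sketch of a slightly stricter rule, the helper below uses a deliberately simple regular expression. `EmailRules` and its pattern are assumptions made for illustration, not part of Spring Batch and not a full RFC 5322 validator.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the pattern is a deliberately simple assumption, not full RFC 5322.
final class EmailRules {

  // Exactly one "@", no whitespace, and at least one dot in the domain part.
  private static final Pattern SIMPLE_EMAIL =
      Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");

  private EmailRules() {}

  static boolean isValid(String email) {
    return email != null && SIMPLE_EMAIL.matcher(email).matches();
  }

  // Throws the same exception type as the processor, so the same skip policy still applies.
  static String requireValid(String email) {
    if (!isValid(email)) {
      throw new IllegalArgumentException("Invalid email: " + email);
    }
    return email;
  }
}
```

A processor could call `EmailRules.requireValid(employee.getEmail())` in place of the inline check. Because the exception type stays `IllegalArgumentException`, the skip behavior would not change.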

📌 SkipListener – Log & Audit Skipped Records

SkipListener is extremely useful for logging bad rows, writing to a separate file, or inserting into an error table.

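import org.springframework.batch.core.SkipListener;

public class EmployeeSkipListener implements SkipListener<Employee, Employee> {

  // Called for each item skipped because the processor threw a skippable exception.
  @Override
  public void onSkipInProcess(Employee item, Throwable t) {
    System.out.println("Skipped Employee: " + item.getName() + ", Reason: " + t.getMessage());
  }
}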
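
For auditing, printing to the console is rarely enough. The sketch below shows one possible shape for an audit helper that a listener callback could delegate to. `SkippedRecordAudit` and its `sink` are hypothetical: the sink is any consumer of one CSV line, such as a logger, a file appender, or a DAO that inserts into an error table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical audit helper; a SkipListener callback could delegate to record(...).
final class SkippedRecordAudit {

  // Destination for one formatted line (logger, file appender, error-table DAO, ...).
  private final Consumer<String> sink;

  SkippedRecordAudit(Consumer<String> sink) {
    this.sink = sink;
  }

  // Emits one CSV line: phase,name,email,reason
  void record(String phase, String name, String email, Throwable cause) {
    sink.accept(String.join(",",
        quote(phase), quote(name), quote(email), quote(cause.getMessage())));
  }

  // Wraps a value in double quotes and doubles embedded quotes; null becomes an empty field.
  static String quote(String value) {
    if (value == null) {
      return "\"\"";
    }
    return "\"" + value.replace("\"", "\"\"") + "\"";
  }

  public static void main(String[] args) {
    List<String> lines = new ArrayList<>();
    SkippedRecordAudit audit = new SkippedRecordAudit(lines::add);
    audit.record("process", "InvalidUser", "invalid-email",
        new IllegalArgumentException("Invalid email: invalid-email"));
    lines.forEach(System.out::println);
  }
}
```

Keeping the destination behind a `Consumer<String>` makes the formatting easy to test without touching files or a database. In a multithreaded step the sink would need to be thread-safe.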

⚙️ Spring Batch Configuration

@Configuration
public class BatchConfig {

  @Autowired private JobRepository jobRepository;
  @Autowired private PlatformTransactionManager transactionManager;
  @Autowired private EntityManagerFactory entityManagerFactory;

  @Bean
  public Job employeeJob() {
    return new JobBuilder("employee-import-job", jobRepository)
      .start(step())
      .incrementer(new RunIdIncrementer())
      .build();
  }

  @Bean
  public Step step() {
    return new StepBuilder("employee-import-step", jobRepository)
      .<Employee, Employee>chunk(5, transactionManager)
      .reader(reader())
      .processor(new EmployeeItemProcessor())
      .writer(writer())
      .faultTolerant()
      .skipPolicy(new EmailValidationSkipPolicy())
      .listener(new EmployeeSkipListener())
      .build();
  }

  @Bean
  public FlatFileItemReader<Employee> reader() {
    return new FlatFileItemReaderBuilder<Employee>()
      .name("employee-reader")
      .resource(new ClassPathResource("employees.csv"))
      .delimited()
      .names("name", "email")
      .targetType(Employee.class)
      .linesToSkip(1)
      .build();
  }

  @Bean
  public JpaItemWriter<Employee> writer() {
    JpaItemWriter<Employee> writer = new JpaItemWriter<>();
    writer.setEntityManagerFactory(entityManagerFactory);
    return writer;
  }
}

🚀 REST Controller to Trigger Job

@RestController
@RequestMapping("/jobs")
public class JobLauncherController {

  @Autowired private JobLauncher jobLauncher;
  @Autowired private Job job;

  @GetMapping("/run-skip-job")
  public String runJob() throws Exception {
    JobParameters params = new JobParametersBuilder()
      .addLong("time", System.currentTimeMillis())
      .toJobParameters();

    jobLauncher.run(job, params);
    return "Job launched!";
  }
}

Retry vs Skip (Most Interviewed Topic)

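| Feature  | Retry                        | Skip                   |
|----------|------------------------------|------------------------|
| Purpose  | Try again if temporary issue | Ignore faulty record   |
| Good for | Network, DB issues           | Data validation errors |
| Config   | .retry(Exception.class)      | .skip(Exception.class) |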
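
The difference can be sketched in plain Java. This is a simplified model, not Spring Batch's retry implementation; `RetryVsSkipSketch` and `TransientFailure` are hypothetical names. A transient failure is attempted again, while a data error propagates at once so that a skip decision can handle it.

```java
import java.util.function.Supplier;

// Hypothetical model contrasting retry with skip; not Spring Batch's implementation.
final class RetryVsSkipSketch {

  // Hypothetical marker for temporary problems, e.g. a dropped connection.
  static final class TransientFailure extends RuntimeException {
    TransientFailure(String message) {
      super(message);
    }
  }

  // Retry: run the same operation again while the failure is transient.
  // Any other exception (such as a validation error) is not retried and propagates.
  static <T> T withRetry(Supplier<T> operation, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    TransientFailure last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return operation.get();
      } catch (TransientFailure e) {
        last = e; // temporary problem: try again
      }
    }
    throw last; // attempts exhausted
  }

  public static void main(String[] args) {
    int[] calls = {0};
    String result = withRetry(() -> {
      // Fails twice, then succeeds, like a briefly unavailable database.
      if (++calls[0] < 3) {
        throw new TransientFailure("connection reset");
      }
      return "saved";
    }, 5);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```

Retrying a row with a malformed email would fail the same way every time, which is why data errors belong with skip rather than retry.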

✔️ Summary

  • SkipPolicy helps continue processing even when some records are invalid.
  • SkipListener is essential for logging, auditing, and debugging.
  • Use skip for **data errors**, retry for **transient system errors**.
  • Validation is best placed inside the Processor.

🚫 Related Spring Batch Error Handling Guides

Strengthen your understanding of fault-tolerant Spring Batch jobs by exploring related concepts like retries, conditional flows, listeners, and execution models.

🧱 Spring Batch Core Components

Learn how Job, Step, ItemReader, ItemProcessor, and ItemWriter interact with skip and fault-tolerant configurations.

Spring Batch Retry Mechanism

Understand when to use retry versus skip strategies to handle transient and recoverable failures.

🔀 Conditional Flow in Spring Batch Jobs

Control job execution paths based on step exit status after skip or failure scenarios.

🧵 Multithreaded Step in Spring Batch

Combine skip policies with parallel processing while ensuring thread safety and data consistency.

⚙️ Spring Batch Tasklet

Use Tasklet-based steps for custom error handling logic beyond chunk-oriented processing.

👂 JobExecutionListener

Monitor skipped items, failed records, and execution outcomes using job and step listeners.