Spring Batch Download Files from SFTP Server Using Tasklet (Complete Guide)

In enterprise applications, files are often exchanged through SFTP servers. Banks, insurance companies, logistics systems, payroll applications, and reporting platforms frequently place files on an SFTP server for downstream systems to consume. Instead of manually downloading files every day, we can automate the process using Spring Batch.

What you'll learn
✔ Create a Tasklet-based Spring Batch job
✔ Connect to SFTP using JSch
✔ Download multiple files automatically
✔ Delete files from remote server after successful download
✔ Trigger jobs using REST APIs
✔ Understand internal Spring Batch execution flow

Why Tasklet Instead of Chunk Processing?

Chunk processing is excellent when records need to be read, processed, and written. Downloading files from an SFTP server, however, is a single unit of work with no per-record read/process/write cycle. For such use cases, a Tasklet is the cleaner and simpler solution.

Tasklet               | Chunk Processing
Single operation      | Processes records in chunks
File download/upload  | CSV to DB, DB to CSV
Simple implementation | Reader/Processor/Writer required

Application Flow

Client
  |
GET /api/jobs/trigger-download
  |
JobController
  |
Spring Batch Job
  |
Tasklet
  |
Connect to SFTP
  |
Download Files
  |
Delete Remote Files
  |
Store Metadata in Oracle

Maven Dependencies

spring-boot-starter-batch
spring-boot-starter-webmvc
ojdbc11
jsch
lombok

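Spring Batch provides the Job and Step infrastructure, Spring Web MVC exposes the REST endpoint that triggers the job, and JSch handles SFTP communication. The Oracle JDBC driver (ojdbc11) connects to the Oracle database that stores batch execution metadata, and Lombok only reduces boilerplate code.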

application.properties

spring.batch.job.enabled=false

sftp.host=YOUR_HOST
sftp.port=22
sftp.username=YOUR_USERNAME
sftp.password=YOUR_PASSWORD

sftp.remote.dir.path=/home/reports
sftp.local.dir.path=F:/reports

The most important property here is spring.batch.job.enabled=false. Without it, Spring Boot would launch the job automatically during application startup instead of waiting for the REST call.

Batch Configuration

@Bean
public Job sftpJob(JobRepository jobRepository,
                   Step sftpDownloadStep) {
    return new JobBuilder("sftpDownloadJob", jobRepository)
            .start(sftpDownloadStep)
            .build();
}

@Bean
public Step sftpDownloadStep(
        JobRepository jobRepository,
        PlatformTransactionManager transactionManager,
        SftpDownloadTasklet tasklet) {

    return new StepBuilder("sftpDownloadStep", jobRepository)
            .tasklet(tasklet, transactionManager)
            .build();
}

The Job represents the entire workflow, while the Step represents a single unit of work within it. Since downloading files is a single activity, the Step invokes the Tasklet only once, provided the Tasklet returns RepeatStatus.FINISHED when it is done.
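To illustrate why a Tasklet normally runs only once, here is a minimal sketch in plain Java. It is a toy model, not Spring Batch code: `ToyRepeatStatus`, `ToyTasklet`, and `ToyStep` are hypothetical stand-ins that only mirror the idea that a step keeps calling its tasklet until the tasklet reports that it has finished.

```java
// Hypothetical stand-ins that mirror the shape of RepeatStatus and Tasklet.
enum ToyRepeatStatus { CONTINUABLE, FINISHED }

interface ToyTasklet {
    ToyRepeatStatus execute();
}

final class ToyStep {
    private ToyStep() { }

    /** Calls the tasklet until it reports FINISHED; returns the number of invocations. */
    static int run(ToyTasklet tasklet) {
        int invocations = 0;
        ToyRepeatStatus status;
        do {
            status = tasklet.execute();
            invocations++;
        } while (status == ToyRepeatStatus.CONTINUABLE);
        return invocations;
    }
}
```

A tasklet that downloads everything in one pass and returns FINISHED is therefore executed a single time, which matches the SFTP download use case.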

REST API Trigger

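@GetMapping("/trigger-download")
public ResponseEntity<String> triggerSftpDownload() throws Exception {

    // A fresh timestamp makes every launch a new Job Instance.
    JobParameters jobParameters =
            new JobParametersBuilder()
                    .addLong("executionTime",
                            System.currentTimeMillis())
                    .toJobParameters();

    // Launching can fail with checked exceptions, hence "throws Exception" above.
    jobOperator.start(sftpJob, jobParameters);

    return ResponseEntity.ok("Triggered");
}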

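A timestamp parameter is added to every execution. Spring Batch identifies a Job Instance by Job Name + identifying Job Parameters. A second launch with identical parameters maps to the same Job Instance, and if that instance has already completed, Spring Batch rejects the launch with a JobInstanceAlreadyCompleteException.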
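The identity rule can be sketched in plain Java. This is a toy model under stated assumptions, not Spring Batch's implementation: the `JobInstanceKey` record and `ToyJobRepository` class are hypothetical names that only mimic the "job name + parameters" comparison.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a Job Instance identity: job name + parameters.
record JobInstanceKey(String jobName, Map<String, Object> parameters) {
    JobInstanceKey {
        parameters = Map.copyOf(parameters); // defensive, immutable copy
    }
}

// Toy repository that refuses to run an already-completed instance again.
final class ToyJobRepository {
    private final Set<JobInstanceKey> completed = new HashSet<>();

    /** Returns true if the launch is accepted, false if it duplicates a completed instance. */
    boolean launch(String jobName, Map<String, Object> parameters) {
        // Set.add returns false when an equal key is already present.
        return completed.add(new JobInstanceKey(jobName, parameters));
    }
}
```

Launching "sftpDownloadJob" twice with the same executionTime value is accepted once and refused the second time, while a new timestamp yields a new key. That is why the controller passes System.currentTimeMillis().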

SFTP Download Tasklet

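// Open an SSH session to the SFTP host.
session = jSch.getSession(username, host, port);
session.setPassword(password);
// Disables host key verification; for learning only (see Security Discussion).
session.setConfig("StrictHostKeyChecking", "no");
session.connect();

// Open an SFTP channel on top of the SSH session.
channelSftp =
   (ChannelSftp) session.openChannel("sftp");

channelSftp.connect();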

The Tasklet first creates an SSH session. Since SFTP operates over SSH, a secure connection must be established before file transfer begins. After that, an SFTP channel is opened.

Reading Remote Files

Vector<ChannelSftp.LsEntry> files =
        channelSftp.ls(remoteDirPath);

This retrieves every entry (files and subdirectories) in the configured remote directory. The code skips the '.' and '..' entries because they refer to the current and parent directories, not to downloadable files.
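The skip logic can be sketched as a small filter. This is an assumption about how the check might look, since the article does not show the loop. The `RemoteEntryFilter` class is a hypothetical helper that works on plain file names so that it runs without JSch. In the Tasklet, the names would come from `LsEntry.getFilename()`, and subdirectories could additionally be skipped with `entry.getAttrs().isDir()`.

```java
import java.util.List;

final class RemoteEntryFilter {
    private RemoteEntryFilter() { }

    /** True for names that denote a real entry rather than "." or "..". */
    static boolean isDownloadable(String fileName) {
        return !".".equals(fileName) && !"..".equals(fileName);
    }

    /** Keeps only the names the tasklet should attempt to download. */
    static List<String> downloadable(List<String> fileNames) {
        return fileNames.stream()
                .filter(RemoteEntryFilter::isDownloadable)
                .toList();
    }
}
```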

Download and Delete Logic

channelSftp.get(remoteFilePath, localFilePath);
channelSftp.rm(remoteFilePath);

This is the heart of the implementation. The file is downloaded to the local machine and then removed from the remote server. This ensures the same file is not downloaded again during the next execution.

Important:
This implementation performs a MOVE operation rather than a COPY operation because the remote file is deleted after download. If the download fails, get(...) throws before rm(...) runs, so the remote file is kept.
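The ordering of the two calls is what makes the move safe, and it can be sketched per file as follows. The `RemoteFiles` interface and `FileMover` class are hypothetical: the interface only abstracts the two ChannelSftp calls shown above so the sketch runs without an SFTP server, and joining directory and file name with "/" is an assumption about how the paths are built.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction over the two ChannelSftp calls used above.
interface RemoteFiles {
    void get(String remotePath, String localPath) throws Exception;
    void rm(String remotePath) throws Exception;
}

final class FileMover {
    private FileMover() { }

    /**
     * Downloads each file and deletes the remote copy only after its download succeeded.
     * Returns the names that were moved; the first failure propagates to the caller.
     */
    static List<String> moveAll(RemoteFiles remote, String remoteDir,
                                String localDir, List<String> fileNames) throws Exception {
        List<String> moved = new ArrayList<>();
        for (String name : fileNames) {
            String remotePath = remoteDir + "/" + name;
            String localPath = localDir + "/" + name;
            remote.get(remotePath, localPath); // throws if the download fails
            remote.rm(remotePath);             // reached only after a successful get
            moved.add(name);
        }
        return moved;
    }
}
```

Because the exception leaves the loop before rm is called, a file whose download failed stays on the server and is picked up by the next execution.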

Resource Cleanup

finally {
    // Close the channel first, then the session it runs on.
    if (channelSftp != null) {
        channelSftp.disconnect();
    }

    if (session != null) {
        session.disconnect();
    }
}

The finally block guarantees that resources are released even when exceptions occur. Failing to close SFTP sessions can eventually exhaust server resources.

What Happens During Runtime?

reports.csv
customers.csv
employees.csv

Assume these files exist on the SFTP server. When the endpoint is called, Spring Batch creates a Job Execution and invokes the Tasklet. The Tasklet downloads each file one by one and removes it from the remote directory. Once all files are processed, the Step is marked COMPLETED and the Job is marked COMPLETED.

Spring Batch Metadata Tables

Table                      | Purpose
BATCH_JOB_INSTANCE         | Stores Job Instances
BATCH_JOB_EXECUTION        | Stores Execution History
BATCH_STEP_EXECUTION       | Stores Step Details
BATCH_JOB_EXECUTION_PARAMS | Stores Parameters

Even though downloaded files are not stored in Oracle, Spring Batch still records execution history in these tables.

Security Discussion

session.setConfig("StrictHostKeyChecking", "no");

This is acceptable for learning purposes but should be avoided in production. Host key validation protects against man-in-the-middle attacks by verifying the identity of the remote server.
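For production, a possible alternative is to keep verification switched on. The sketch below only builds the session options. `SshConfig` is a hypothetical helper, and the result is meant to be passed to JSch's `Session.setConfig(Properties)`.

```java
import java.util.Properties;

final class SshConfig {
    private SshConfig() { }

    /**
     * Builds session options that keep host key verification enabled.
     * Intended for JSch's Session.setConfig(Properties).
     */
    static Properties strictHostKeyConfig() {
        Properties config = new Properties();
        config.setProperty("StrictHostKeyChecking", "yes");
        return config;
    }
}
```

With this setting the server's key must already be known to the client. With JSch that typically means calling `jSch.setKnownHosts(...)` with the path of a known_hosts file (the file location is deployment-specific and not covered here) before creating the session, then calling `session.setConfig(SshConfig.strictHostKeyConfig())`.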

Common Mistakes

Mistake               | Recommendation
Hardcoded credentials | Use Vault or environment variables
No logging            | Log every downloaded file
No retry mechanism    | Add retry for network issues
Ignoring cleanup      | Always disconnect resources
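As one way to act on the retry recommendation, here is a minimal plain-Java sketch. The `Retry` class is a hypothetical helper, not a Spring Batch or JSch feature, and the fixed delay between attempts is an assumption. A real job might prefer a backoff strategy or a dedicated retry library.

```java
import java.util.concurrent.Callable;

final class Retry {
    private Retry() { }

    /**
     * Runs the action up to maxAttempts times, sleeping delayMillis between attempts.
     * Rethrows the last failure if every attempt fails.
     */
    static <T> T withRetry(int maxAttempts, long delayMillis, Callable<T> action) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again if attempts remain
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }
}
```

In the Tasklet, the connection setup or a single file download could be wrapped in such a call, so that a brief network interruption does not fail the whole job.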

Interview Questions

  1. When should Tasklet be preferred over chunk processing?
  2. Why are unique JobParameters required?
  3. How does JSch communicate with SFTP servers?
  4. What metadata tables does Spring Batch create?
  5. Why should host key verification be enabled in production?

Conclusion

Spring Batch Tasklets provide a simple and efficient solution for automating SFTP downloads. By combining Spring Batch with JSch, we can securely download files, maintain execution history, and integrate seamlessly with enterprise workflows.
