I'm currently using NestJS v9, fast-csv v4, and BigQuery in my project.
- Controller (CSV Upload):
@Post('upload')
@ApiOperation({ description: 'Upload CSV File' })
@ApiConsumes('multipart/form-data')
...
// Code shortened for brevity
return await this.filesService.uploadCsv(file, uploadCsvDto);
}
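For context, here is a minimal sketch of what the full endpoint presumably looks like (the real controller is shortened above; the FileInterceptor wiring, the UploadCsvDto type, and the import paths are assumptions filled in around the snippet):

```typescript
import {
  Body,
  Controller,
  Post,
  UploadedFile,
  UseInterceptors,
} from '@nestjs/common';
import { FileInterceptor } from '@nestjs/platform-express';
import { ApiConsumes, ApiOperation } from '@nestjs/swagger';
import { FilesService } from './files.service'; // assumed path
import { UploadCsvDto } from './dto/upload-csv.dto'; // assumed name and path

@Controller('files')
export class FilesController {
  constructor(private readonly filesService: FilesService) {}

  @Post('upload')
  @ApiOperation({ description: 'Upload CSV File' })
  @ApiConsumes('multipart/form-data')
  // Without a storage option, Multer keeps the whole upload in memory as file.buffer.
  @UseInterceptors(FileInterceptor('file'))
  async uploadCsv(
    @UploadedFile() file: Express.Multer.File,
    @Body() uploadCsvDto: UploadCsvDto,
  ) {
    return await this.filesService.uploadCsv(file, uploadCsvDto);
  }
}
```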
- Service Logic:
async uploadCsv(
...
// Code shortened for brevity
console.log(`Inserted ${rows.length} rows`);
...
}
// More service functions not shown here
...
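One of those functions is the batch insert into BigQuery. A minimal sketch of what such a helper could look like with @google-cloud/bigquery (insertRowsToBigQuery is a hypothetical name, and the dataset/table names are placeholders, not the actual code):

```typescript
import { BigQuery } from '@google-cloud/bigquery';

const bigquery = new BigQuery();

// Hypothetical batch-insert helper; 'my_dataset' and 'my_table' are placeholders.
export async function insertRowsToBigQuery(
  rows: Record<string, string>[],
): Promise<void> {
  if (rows.length === 0) {
    return;
  }
  // Streaming insert of one batch of rows.
  await bigquery.dataset('my_dataset').table('my_table').insert(rows);
}
```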
- Explanation of Service Process:
- Initial check in BigQuery for duplicates from previous CSV uploads.
- A stream (fast-csv) is used to avoid loading all CSV data into memory at once; rows are processed in batches of 200, and each batch is saved to BigQuery (see the sketch below).
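To make that flow concrete, here is an assumed shape of the streaming/batching loop (the real service code is shortened above; insertRowsToBigQuery is the hypothetical helper sketched earlier, and streamCsvToBigQuery is a name chosen for illustration):

```typescript
import { Readable } from 'stream';
import * as csv from 'fast-csv';
// Hypothetical helper sketched above; the path is a placeholder.
import { insertRowsToBigQuery } from './bigquery.helper';

// Assumed shape of the streaming + batching flow described above.
export async function streamCsvToBigQuery(
  file: Express.Multer.File,
): Promise<number> {
  const BATCH_SIZE = 200;
  let batch: Record<string, string>[] = [];
  let inserted = 0;

  await new Promise<void>((resolve, reject) => {
    const parser = Readable.from(file.buffer).pipe(csv.parse({ headers: true }));

    parser
      .on('error', reject)
      .on('data', (row: Record<string, string>) => {
        batch.push(row);
        if (batch.length >= BATCH_SIZE) {
          parser.pause(); // stop emitting rows while the batch is written
          insertRowsToBigQuery(batch)
            .then(() => {
              inserted += batch.length;
              batch = []; // drop the reference so the processed rows can be collected
              parser.resume();
            })
            .catch(reject);
        }
      })
      .on('end', (rowCount: number) => {
        // Flush the last, partially filled batch.
        const flush =
          batch.length > 0 ? insertRowsToBigQuery(batch) : Promise.resolve();
        flush
          .then(() => {
            inserted += batch.length;
            console.log(`Parsed ${rowCount} rows, inserted ${inserted}`);
            resolve();
          })
          .catch(reject);
      });
  });

  return inserted;
}
```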
Issue: I run into memory problems when handling larger files of more than 50,000 rows:
50,000 row CSV example: https://i.sstatic.net/jkaL1.png
250,000 row CSV example: https://i.sstatic.net/PMcUg.png
I can't figure out why the memory issues persist even after clearing unnecessary variables.
The Node server currently has a maximum memory size of 512 MB.
Memory Checking Function Used:
export function memoryUsage(): void {
return console.log(
`APP is using ${
Math.round((process.memoryUsage().rss / 1024 / 1024) * 100) / 100
} MB of memory.`,
);
}
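Since rss lumps together the V8 heap, Buffers, and native allocations, a variant of this helper that breaks those numbers apart may help narrow down where the growth comes from (memoryUsageDetailed is a hypothetical name; the arrayBuffers field requires Node >= 13.9):

```typescript
// Variant of the helper above that also reports V8 heap and Buffer-backed memory,
// which helps distinguish heap growth from Buffers held outside the heap.
export function memoryUsageDetailed(): void {
  const { rss, heapTotal, heapUsed, external, arrayBuffers } =
    process.memoryUsage();
  const toMb = (bytes: number): number =>
    Math.round((bytes / 1024 / 1024) * 100) / 100;

  console.log(
    `rss=${toMb(rss)} MB, heapTotal=${toMb(heapTotal)} MB, ` +
      `heapUsed=${toMb(heapUsed)} MB, external=${toMb(external)} MB, ` +
      `arrayBuffers=${toMb(arrayBuffers)} MB`,
  );
}
```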