Renaming Spark Part-NNNN Files on S3
We keep running into one problem with Spark jobs: because of Spark's distributed execution, output files are written with part-nnnn names. It's not possible to set the file names directly before writing, and modifying the underlying write functions is not easy.
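One common workaround is to rename the files after the job has finished. S3 has no native rename operation, so a "rename" is a copy to the new key followed by a delete of the old one. Below is a minimal sketch of that idea, not a definitive implementation. The helper names (`plan_renames`, `rename_part_files`), the `base_name` parameter, and the target naming scheme are all hypothetical, and it assumes the output sits in a single flat "directory" under one prefix and that `s3_client` is a boto3 S3 client.

```python
import re

# Matches Spark part files such as "part-00000" or
# "part-00000-<uuid>-c000.snappy.parquet".
# Group 1 is the partition index, group 2 the (possibly empty) extension.
PART_FILE = re.compile(r"^part-(\d+)(?:-[^.]*)?((?:\.[A-Za-z0-9]+)*)$")


def plan_renames(keys, prefix, base_name):
    """Map each part-file key to a readable target key under `prefix`.

    Hypothetical naming scheme: <prefix>/<base_name>-<index><extension>.
    Keys that are not part files (e.g. _SUCCESS) are skipped.
    """
    plan = {}
    for key in sorted(keys):
        filename = key.rsplit("/", 1)[-1]
        match = PART_FILE.match(filename)
        if match is None:
            continue
        index, extension = match.groups()
        plan[key] = f"{prefix.rstrip('/')}/{base_name}-{index}{extension}"
    return plan


def rename_part_files(s3_client, bucket, prefix, base_name):
    """Copy each part file to its new key, then delete the original.

    `s3_client` is assumed to be a boto3 S3 client,
    e.g. boto3.client("s3").
    """
    listing_prefix = prefix.rstrip("/") + "/"
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = [
        obj["Key"]
        for page in paginator.paginate(Bucket=bucket, Prefix=listing_prefix)
        for obj in page.get("Contents", [])
    ]
    plan = plan_renames(keys, prefix, base_name)
    for source_key, target_key in plan.items():
        # Managed copy; delete only after the copy has returned.
        s3_client.copy({"Bucket": bucket, "Key": source_key}, bucket, target_key)
        s3_client.delete_object(Bucket=bucket, Key=source_key)
    return plan
```

Keeping the name mapping in a separate pure function makes it easy to check the plan before anything is touched on S3. Because copy and delete are two separate requests, a failure in between can leave both the old and the new object in place, so rerunning the function should be treated as part of the design.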