Hello
We use Bitbucket as a repository for certain data files that our users edit. When they finish their work, they push the changed files back into Bitbucket. Currently we have a pipeline defined that uploads all of the data files in the repository to S3, which in turn triggers processing of all of them. This is inefficient: if a user modifies only one file, processing is still triggered for all, say, 30 data files in the repo.
I am looking for an improvement where the pipeline would act _only_ on the files that were pushed/merged/changed on the branch. I do not want to upload all files to S3, only those that were modified and pushed by the users. I guess some changes would have to be made to our yml file to get this, but I lack the skills to figure this out. Can I please ask if what I am after is possible? If so, how should I approach it? How do I operate only on the 'delta' files? Is information about the commit that triggered the pipeline available within the pipeline?
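For context, one common way to compute the 'delta' inside a pipeline step is to ask git which files the triggering commit changed, by comparing it against its parent. The sketch below demonstrates the git side of that idea in a throwaway repository (the file names are made up, and the actual S3 upload of each changed file is out of scope here):

```shell
#!/bin/sh
# Sketch: find the files changed by the most recent commit.
# A pipeline checks out the commit that triggered it, so diffing
# HEAD against HEAD~1 lists only the files from that push.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git -c user.email=a@b -c user.name=t commit -q --allow-empty -m init
echo one > file1.txt
echo two > file2.txt
git add . && git -c user.email=a@b -c user.name=t commit -q -m "add files"
echo changed > file1.txt
git add . && git -c user.email=a@b -c user.name=t commit -q -m "edit one file"
# Only file1.txt was touched by the last commit:
CHANGED=$(git diff --name-only HEAD~1 HEAD)
echo "$CHANGED"
```

In a real pipeline step you would loop over `$CHANGED` and upload each listed file individually (e.g. with `aws s3 cp`), instead of uploading the whole repository. Note that for a push containing several commits, `HEAD~1` only covers the last one; a more robust variant diffs against the branch's previous tip.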
Regards
What command are you using to upload to S3?
aws s3 sync
should handle uploading only the files that have changed. According to the AWS docs:
A local file will require uploading if the size of the local file is different than the size of the s3 object, the last modified time of the local file is newer than the last modified time of the s3 object, or the local file does not exist under the specified bucket and prefix
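In a bitbucket-pipelines.yml this could look something like the following sketch (the branch name, local directory, and bucket path are placeholders, and it assumes the AWS CLI and credentials are available in the build image):

```yaml
pipelines:
  branches:
    main:
      - step:
          name: Sync data files to S3
          script:
            # sync compares each local file against the matching S3 object
            # and uploads only the ones that differ
            - aws s3 sync data/ s3://example-bucket/data/
```

One caveat worth testing: each pipeline run starts from a fresh clone, so every local file's modified time is the checkout time, which may be newer than the S3 objects and cause sync to re-upload everything. The `--size-only` flag makes sync compare file sizes only, which may behave better in CI, at the cost of missing edits that leave a file the same size.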
Hope this helps!