Get the most recent object from S3 with a script

Here is the code

# This script just echoes the most recent object in a bucket.
# You can do what you want with it from here.

# Note: grep runs after tail -n 1, so the result is empty unless the newest key contains $TARGET_FILENAME.
S3_OBJECT="$(aws s3 ls --profile "$AWS_CLI_PROFILE" "$S3_BUCKET" --recursive | sort | tail -n 1 | awk '{print $4}' | grep "$TARGET_FILENAME")"

echo "s3://$S3_BUCKET/$S3_OBJECT"

How it works

  1. Use AWS CLI to list the objects
  2. Sort the list by last modified time
  3. Grab the last item in the list
  4. Extract the 4th column

First, list all the S3 objects in the bucket with the AWS CLI:

$> aws s3 ls $BUCKET --recursive
    2017-06-04 00:46:13     234665 albatros_object_628374
    2016-07-04 04:46:47   16782594 elephant_object_238
    2014-08-04 04:04:39     872657 zebra_object_234283746

By default, the list is sorted alphabetically by key. Note the first column is the last modified time.

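Piping the list into sort reorders it by the leading columns (last modified date and time). Because the timestamps are written as YYYY-MM-DD HH:MM:SS, plain text order is also chronological order.

After that, use tail -n 1 to select the last row of the now time-sorted list.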

$> aws s3 ls $BUCKET --recursive | sort
    2019-05-12 01:37:19     65455
    2019-06-02 04:30:47     98754 elephant_object_5788
    2019-06-17 01:30:25    567894 platapus_object_45
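
To see the sort-then-tail step in isolation, here is a minimal sketch that runs it on made-up listing lines instead of a real bucket. The function name newest_line and the sample data are illustrative assumptions, not part of the original script; LC_ALL=C is added only so the ordering does not depend on the locale.

```shell
# newest_line: read `aws s3 ls --recursive` style lines on stdin and
# print the line with the latest timestamp.
newest_line() {
  LC_ALL=C sort | tail -n 1
}

# Sample listing (hypothetical data), deliberately not in time order.
printf '%s\n' \
  '2017-06-04 00:46:13     234665 albatros_object_628374' \
  '2016-07-04 04:46:47   16782594 elephant_object_238' \
  '2014-08-04 04:04:39     872657 zebra_object_234283746' |
  newest_line
```

With this sample, the line dated 2017-06-04 is printed, since it sorts last.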

TIP: tail -n 1 gives only the very latest object.
You may instead want a group of the latest objects (say 3) from which to pick a specific name.

tail -n 3 selects the last 3 rows.

In the past, I have used tail -n 6 to get the latest 6 objects and then looked for a specific filename among them.
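
If the goal is the newest object whose key contains a given name, one alternative to widening the tail window is to filter first and keep the last match. This is only a sketch under assumptions: latest_matching is a hypothetical helper, the listing is expected on stdin, the name is matched as a plain substring of the key, and keys are assumed to contain no spaces.

```shell
# latest_matching NAME: from `aws s3 ls --recursive` style lines on stdin,
# print the key of the newest object whose key contains NAME.
# Prints nothing if no key matches.
latest_matching() {
  LC_ALL=C sort |
    PAT="$1" awk 'index($4, ENVIRON["PAT"]) > 0 { key = $4 }
                  END { if (key != "") print key }'
}

# Possible usage (bucket and filename variables as in the script above):
#   aws s3 ls "$S3_BUCKET" --recursive | latest_matching "$TARGET_FILENAME"
```

Because the filter runs over the whole sorted listing, it does not matter how far back the matching object sits.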

Since the name is what is wanted, use awk '{print $4}' to extract the fourth column.

$> aws s3 ls $BUCKET --recursive | sort | tail -n 1 | awk '{print $4}'
platapus_object_45 <-- object name returned
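
One caveat: awk splits fields on whitespace, so '{print $4}' would cut a key that contains spaces at the first space. A possible workaround, sketched here with a hypothetical helper named key_of_line and a made-up key, is to strip the date, time and size columns and keep everything after them.

```shell
# key_of_line: for each `aws s3 ls` style line on stdin, remove the date,
# time and size columns (plus the single space after the size) and print
# the remainder as the key.
key_of_line() {
  awk '{ sub(/^ *[^ ]+ +[^ ]+ +[^ ]+ /, ""); print }'
}

# Sample line (hypothetical data) with a space in the key.
printf '%s\n' '2019-06-17 01:30:25     567894 reports/june summary.csv' | key_of_line
```

For keys without spaces this gives the same result as awk '{print $4}'.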

Then you can copy it to some directory if you like:

$> S3_OBJECT="$(aws s3 ls "$BUCKET" --recursive | sort | tail -n 1 | awk '{print $4}')"

$> aws s3 cp "s3://$BUCKET/$S3_OBJECT" path/to/some/directory
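
If the bucket is empty, S3_OBJECT ends up empty and the copy would be attempted with only the bucket prefix. A minimal sketch of a guard follows, assuming a hypothetical wrapper named copy_latest that takes the bucket and the destination as arguments; it uses only the aws s3 ls and aws s3 cp calls already shown.

```shell
# copy_latest BUCKET DEST: copy the most recently modified object in BUCKET
# to DEST, or return 1 with a message if the listing yields no key.
copy_latest() {
  local bucket dest key
  bucket=$1
  dest=$2
  key="$(aws s3 ls "$bucket" --recursive | LC_ALL=C sort | tail -n 1 | awk '{print $4}')"
  if [ -z "$key" ]; then
    echo "no objects found in $bucket" >&2
    return 1
  fi
  aws s3 cp "s3://$bucket/$key" "$dest"
}

# Possible usage:
#   copy_latest "$BUCKET" path/to/some/directory
```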
