Updating object timestamps in an S3 bucket with boto3

aws
s3
python

(Full Snack Developer) #1

I’m going to keep putting these tutorials in here until @oaktree makes an “ops” category just for me :wink: And I realize pretty much all of us here are in infosec in some way or another and not really “ops” people, but bear with me. All these little things are going to come in useful someday somehow. I promise.

Now, onto the post…

So tonight, because of some vendor’s stupidity, I had to update the timestamps on all the objects in an S3 bucket to something less than 24h old, i.e., “now”. How do I do that? No idea. But I knew I had to start with my trusty Swiss Army Knife for AWS: python3 + boto3.

Firing up python and boto, I sketched out my short script. I knew I needed to do the following:

  1. Create a boto “client” object
  2. Enumerate all the objects in the given bucket
  3. For each of those objects, bump the timestamp to “now” somehow.
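
Steps 1 and 2 can be sketched roughly as below. This is a minimal sketch, not the script from this thread: the helper name `iter_keys` and the bucket name in the usage comment are made up for illustration. It uses boto3's paginator for `list_objects_v2`, since a single call returns at most 1000 keys.

```python
def iter_keys(client, bucket):
    """Yield every key in `bucket`, following pagination.

    `client` is expected to be a boto3 S3 client; `iter_keys` is a
    hypothetical helper name, not part of boto3.
    """
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # An empty bucket (or an empty page) has no "Contents" entry.
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Usage (assumes boto3 is installed and AWS credentials are configured;
# "my-example-bucket" is a placeholder):
#   import boto3
#   client = boto3.client("s3")
#   for key in iter_keys(client, "my-example-bucket"):
#       print(key)
```

Passing the client in as an argument keeps the helper easy to try out against a stub before pointing it at a real bucket.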

Steps 1 and 2 were easy. Step 3 was less clear: boto has no method to modify an object’s timestamp directly. At least, none that I could find. What to do? A bit of Googling turned up a GitHub issue describing the use of copy_object() with the same source and destination key. It’s the boto version of touch, sorta. It’s not really obvious, and I couldn’t find it documented anywhere as a way to do this.

So there you have it: If you want to touch an object in an S3 bucket, the best way to do it is to use client.copy_object() with the same source and destination.


(Full Snack Developer) #2

Small caveat: S3 rejects a copy onto the same key unless something about the object changes (its metadata, storage class, encryption settings, or website redirect location), so you’ll probably want to change the storage class.
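
If changing the storage class is not what you want, one alternative is to tell S3 to replace the metadata with a copy of itself. The sketch below assumes that approach; `touch_in_place` is a hypothetical helper name, and it only carries over user metadata, content type, and storage class. Other headers (cache control, content encoding, and so on) would need the same treatment, and `copy_object` only works for objects up to 5 GB.

```python
def touch_in_place(client, bucket, key):
    """Bump an object's LastModified by copying it onto itself.

    `client` is expected to be a boto3 S3 client. The name
    `touch_in_place` is hypothetical, chosen for this example.
    """
    # Read the current metadata so the copy can write it straight back.
    head = client.head_object(Bucket=bucket, Key=key)
    client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        # REPLACE counts as a metadata change, so S3 accepts the self-copy.
        MetadataDirective="REPLACE",
        Metadata=head.get("Metadata", {}),
        ContentType=head.get("ContentType", "binary/octet-stream"),
        # head_object omits StorageClass for STANDARD objects.
        StorageClass=head.get("StorageClass", "STANDARD"),
    )
```

The extra `head_object` call doubles the number of requests per object, which is the price of leaving the storage class alone.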


(Full Snack Developer) #3

'''
Touches every object in a given S3 bucket to bump its LastModified timestamp

Written because Logz is stupid and won't import "old" logs. wtf, guys
'''

import logging
import sys

import boto3

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
ch = logging.StreamHandler()
ch.setLevel(logging.INFO)
formatter = logging.Formatter(
    '%(asctime)s - %(name)s - %(levelname)s - %(message)s')
ch.setFormatter(formatter)
logger.addHandler(ch)

def crawl(client, bucket):
    '''Return a list of every key in the bucket.'''
    returnval = []
    # list_objects_v2 returns at most 1000 keys per call, so page through the results
    paginator = client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, FetchOwner=False):
        # 'Contents' is missing when a page (or the whole bucket) is empty
        for k in page.get('Contents', []):
            returnval.append(k['Key'])

    return returnval

def touch(client, bucket, key):
    '''Copy the object onto itself so S3 resets its LastModified timestamp.'''
    # The dict form of CopySource lets boto3 handle keys that need URL encoding
    source = {'Bucket': bucket, 'Key': key}
    logger.info("Setting source key to %s/%s", bucket, key)
    try:
        # Note: this moves the object into the REDUCED_REDUNDANCY storage class
        client.copy_object(Bucket=bucket, CopySource=source, Key=key, StorageClass='REDUCED_REDUNDANCY')
        logger.info("Updated %s...", key)
    except Exception as e:
        logger.error("--- Unable to modify key %s in bucket %s", key, bucket)
        logger.error(str(e))

def main():
    # sys.argv[1] raises IndexError when the argument is missing, so check the length
    if len(sys.argv) < 2:
        logger.critical('You must provide a bucket name!')
        sys.exit(1)
    bucket = sys.argv[1]

    try:
        client = boto3.client('s3')
    except Exception as e:
        logger.error("Unable to initialize boto s3 client")
        logger.error(str(e))
        # Without a client there is nothing left to do
        sys.exit(1)

    keys = crawl(client, bucket)
    for k in keys:
        touch(client, bucket, k)

if __name__ == "__main__":
    main()

(Command-Line Ninja) #4

This is cool. But would you mind doing something of a crash course on AWS and/or buckets?

Fun fact: this site uses AWS as a CDN.


(Full Snack Developer) #5

Yeah, I can write up a few things on AWS.


(Full Snack Developer) #6

I’ve updated this script a few times since the original post. Bonus points to anyone who spots the handful of bugs in it.

