AWS Glacier is Amazon’s secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup. Its durability is 99.999999999%, and with a price starting at $0.004 per GB per month, it’s well suited to storing large amounts of data for long periods.
Most of the time you will access Glacier through the API, as implemented in an existing backup/archiving tool. Glacier management options in the AWS Console/GUI are pretty limited: you can create and delete vaults, and set permissions, notifications and tags.
When I tried to delete a vault that was no longer in use, I ran into the following error: “Error deleting vault. Vault not empty or recently written to: arn:aws:glacier:eu-central-1:XYZ:VAULT01”.
Apparently this vault was not empty: some archives (files) still existed in it. Obviously I had to delete these archives first, but if the original backup/archive solution is no longer connected to the vault, that becomes a challenge.
You have four options if you want to delete (an) archive(s) from a vault:
- Deleting an Archive in Amazon Glacier Using the AWS SDK for Java
- Deleting an Archive in Amazon Glacier Using the AWS SDK for .NET
- Deleting an Archive Using the REST API
- Deleting an Archive in Amazon Glacier Using the AWS Glacier Command Line Interface (CLI).
Options 1, 2 and 3 are what a backup/archiving application would use. Alternatively, you can use Freeze as a Glacier browser to delete archives from Glacier.
In this article I will show you how to delete archives through the CLI (which is not very intuitive).
Installing the AWS CLI
Before you can use the AWS CLI you have to install it. The steps depend a bit on which OS you’re running; for macOS they are:
- Download and install Python 3.6
- Download and install pip
$ curl -O https://bootstrap.pypa.io/get-pip.py
$ python3 get-pip.py --user
- Install awscli
$ pip3 install awscli --upgrade --user
- Add the AWS command to your path
You can now use the AWS CLI by simply typing aws in the terminal/command line. The first step is to configure the CLI’s connection to AWS. Create a new user with programmatic access for this, and make a note of its access key ID and secret access key. Now run:
aws configure
Using the Glacier AWS CLI
After you’ve completed the previous steps, let’s start with Glacier itself. First, initiate an inventory-retrieval job to find out what’s in the vault:
aws glacier initiate-job --account-id - --vault-name VAULT01 --job-parameters '{"Type": "inventory-retrieval"}'
This command returns a location and a jobId. It can take a couple of hours before the job completes; this has to do with how Glacier works. Check the progress of the job with:
aws glacier list-jobs --account-id - --vault-name VAULT01
You will get info on the progress in JSON format:
{
    "JobList": [
        {
            "JobId": "akjhihgkh89MBNKHJKW92aLKJYXtNQCQRBI8nERD-7QR4FhVlMYgyqlsczhnXJC",
            "Action": "InventoryRetrieval",
            "VaultARN": "arn:aws:glacier:eu-central-1:123467890:vaults/VAULT01",
            "CreationDate": "2018-02-18T14:44:26.339Z",
            "Completed": false,
            "StatusCode": "InProgress",
            "InventoryRetrievalParameters": {
                "Format": "JSON"
            }
        }
    ]
}
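If you would rather not read the JSON by eye, the check can be sketched in a few lines of Python. This is only an illustration: the sample job IDs are made up, the function names are my own, and it assumes a finished job is reported with Completed set to true and a StatusCode of "Succeeded", which is how the Glacier API describes a successful job.

```python
import json

# Hypothetical sample in the shape printed by `aws glacier list-jobs`;
# the job IDs are made up for illustration.
SAMPLE_LIST_JOBS = """
{
  "JobList": [
    {"JobId": "JOB-A", "Action": "InventoryRetrieval",
     "Completed": false, "StatusCode": "InProgress"},
    {"JobId": "JOB-B", "Action": "InventoryRetrieval",
     "Completed": true, "StatusCode": "Succeeded"}
  ]
}
"""


def summarize_jobs(list_jobs_json):
    """Return (job_id, status_code, completed) for every job in the JSON."""
    data = json.loads(list_jobs_json)
    return [
        (job["JobId"], job["StatusCode"], job["Completed"])
        for job in data.get("JobList", [])
    ]


def finished_job_ids(list_jobs_json):
    """Return the IDs of jobs whose output is ready to download."""
    return [
        job_id
        for job_id, status, completed in summarize_jobs(list_jobs_json)
        if completed and status == "Succeeded"
    ]


# Print one line per job, then the IDs that are ready.
for job_id, status, completed in summarize_jobs(SAMPLE_LIST_JOBS):
    print(job_id, status, completed)
print("ready:", finished_job_ids(SAMPLE_LIST_JOBS))
```

To use it on real data, replace SAMPLE_LIST_JOBS with the text printed by the list-jobs command.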
Wait until Completed changes to true and the StatusCode changes to “Succeeded”. Then it’s time to retrieve the output of the job:
aws glacier get-job-output --account-id - --vault-name VAULT01 --job-id XYZ123 output.json
The output.json file contains a list of all the archives (files) that are in your Glacier vault. You can use your favourite text editor to view and analyse the file, for example TextWrangler in combination with a JSON text filter.
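As an alternative to a text editor, a small Python script can summarise the inventory. The sketch below assumes the usual inventory layout: a top-level ArchiveList whose entries carry ArchiveId, ArchiveDescription, CreationDate, Size and SHA256TreeHash. The sample values and the helper names (parse_inventory, load_inventory, total_size) are made up for illustration.

```python
import json

# Hypothetical inventory in the shape Glacier returns for an
# inventory-retrieval job; IDs, descriptions and sizes are made up.
SAMPLE_INVENTORY = """
{
  "VaultARN": "arn:aws:glacier:eu-central-1:123467890:vaults/VAULT01",
  "InventoryDate": "2018-02-18T20:00:00Z",
  "ArchiveList": [
    {"ArchiveId": "ARCHIVE-ID-1", "ArchiveDescription": "backup-1",
     "CreationDate": "2017-05-01T10:00:00Z", "Size": 1024,
     "SHA256TreeHash": "hash-1"},
    {"ArchiveId": "ARCHIVE-ID-2", "ArchiveDescription": "backup-2",
     "CreationDate": "2017-06-01T10:00:00Z", "Size": 2048,
     "SHA256TreeHash": "hash-2"}
  ]
}
"""


def parse_inventory(text):
    """Return the list of archive records from inventory JSON text."""
    return json.loads(text).get("ArchiveList", [])


def load_inventory(path):
    """Read an inventory file such as output.json from disk."""
    with open(path, encoding="utf-8") as handle:
        return parse_inventory(handle.read())


def total_size(archives):
    """Sum the Size field (in bytes) over all archive records."""
    return sum(archive.get("Size", 0) for archive in archives)


# Print one line per archive, followed by a short summary.
archives = parse_inventory(SAMPLE_INVENTORY)
for archive in archives:
    print(archive["ArchiveId"], archive["Size"], archive["ArchiveDescription"])
print(len(archives), "archives,", total_size(archives), "bytes")
```

For a real vault, call load_inventory("output.json") instead of parsing the sample string.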
So, now it’s time to delete a few archives. The command for this is:
aws glacier delete-archive --account-id - --vault-name VAULT01 --archive-id "n7XzcNhdl0QMBK3G-Y3F8pLst9oqIlwafsdfsdfsdfsdfGhSL2CpNL2_3yYlgu1Cc-riLJIHVFkRLpnx0WTakjflndsflnaMpgDj0kw"
That’s it :).
But wait... what if I have a lot of files to delete? Well, I ran into the glacier-vault-remove script, which will help you. The script leverages boto3, the Python SDK for AWS.
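If you prefer to stay with the CLI, one possible approach is to generate a delete-archive command for every entry in the inventory. The sketch below is not the glacier-vault-remove script, just a minimal illustration under a few assumptions: the inventory has the usual ArchiveList/ArchiveId layout, the aws command is on your path, and the function names are my own. As a precaution it passes the ID as --archive-id=ID, since archive IDs can begin with a hyphen and might otherwise be mistaken for an option.

```python
import subprocess

# Hypothetical inventory; in practice, load output.json with the json module.
SAMPLE_INVENTORY = {
    "ArchiveList": [
        {"ArchiveId": "-ARCHIVE-ID-1"},
        {"ArchiveId": "ARCHIVE-ID-2"},
    ]
}


def build_delete_commands(inventory, vault_name, account_id="-"):
    """Build one `aws glacier delete-archive` argument list per archive."""
    commands = []
    for archive in inventory.get("ArchiveList", []):
        commands.append([
            "aws", "glacier", "delete-archive",
            "--account-id", account_id,
            "--vault-name", vault_name,
            # "=" form keeps IDs with a leading hyphen from looking like options.
            "--archive-id=" + archive["ArchiveId"],
        ])
    return commands


def delete_all(inventory, vault_name, dry_run=True):
    """Print the commands (dry run) or execute them one by one."""
    for command in build_delete_commands(inventory, vault_name):
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first command that fails.
            subprocess.run(command, check=True)


# Dry run by default: only prints what would be executed.
delete_all(SAMPLE_INVENTORY, "VAULT01")
```

Run it as a dry run first, and only pass dry_run=False once the printed commands look right. Keep in mind that Glacier refreshes the vault inventory roughly once a day, so the vault itself may only become deletable some time after the last archive is gone.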
That’s it, hope this was helpful!
6 Comments
Brian
It doesn’t work, and here’s why: you have a misspelling of account-id. See below, I copied and pasted your CLI command.
Wait until the StatusCode changes to “Completed”, now it’s time to retrieve the output of the command:
aws glacier get-job-output --acount-id - --vault-name VAULT01 --job-id XYZ123 output.json
viktorious
Thanks Brian. I’ve updated the article and removed the typo.
Dheeraj Kumawat
Thanks Man. You Have Been a Great Help.
viktorious
You’re welcome!
TimoK
Actually what only worked for me with AWS CLI on Windows is below:
aws glacier initiate-job --account-id - --vault-name VAULT01 --job-parameters "Type"="inventory-retrieval"
PETER C SMITH
Just used FastGlacier to erase some vaults successfully. https://fastglacier.com