Technologies Database
AWS has many databases, too many for me to list at this time; read about their comparison here. The two we care about are Aurora RDS and DynamoDB.
Pricing for most RDS options is based on the EC2 instance type you choose to launch it on, plus the drive type (provisioned IOPS is more expensive), plus data egress (data leaving the AWS system for the broader internet): (EC2-Type + (drive-type * GB)) * Hours + data-egress.
AWS RDS is a hosted SQL database system. In most cases it lets you set up a medium or large server specifically geared to be the best virtual SQL server you can muster with the least effort. RDS offers MySQL, PostgreSQL, and a host of other SQL-based options.
Aurora RDS is designed to be a drop-in replacement for MySQL, meaning it conforms to all the normal things you would expect from MySQL. Aurora is a superset because it adds, among other things, serverless functionality.
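The formula above can be sketched as a small function. This is only an illustration of the arithmetic: the function name and every rate below are made-up placeholders, not real AWS prices, and the storage rate is treated as a per-GB-hour figure to match the formula as written.

```python
def rds_cost(instance_rate_per_hour, storage_rate_per_gb_hour, storage_gb,
             hours, egress_cost):
    """(EC2-Type + (drive-type * GB)) * Hours + data-egress."""
    hourly = instance_rate_per_hour + storage_rate_per_gb_hour * storage_gb
    return hourly * hours + egress_cost


# Placeholder numbers only; look up real rates on the RDS pricing page.
estimate = rds_cost(
    instance_rate_per_hour=0.10,
    storage_rate_per_gb_hour=0.0002,
    storage_gb=100,
    hours=720,
    egress_cost=5.00,
)
print(f"Estimated monthly cost: ${estimate:.2f}")
```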
As a simplified mental model (not a literal description of AWS internals), serverless Aurora keeps all the data in durable storage, much like an S3 bucket holding the database in a partially loaded state. Once a request is made, usually via the API Gateway, something like a Lambda function launches the RDS instance, loads the partially loaded data from storage, and serves the request.
Most Lambda functions only need to stay alive for the duration of a request, but this instance stands up and stays up for a fixed amount of quiet time. With a 5 minute delay, the first request is processed (with some extra delay for spinning up) and the instance stays active for 5 minutes. If a second request arrives within that 5 minute window, it is processed without the spin-up delay, and the 5 minute timer starts over. As long as requests keep coming through, the 5 minute delay will not expire. Once the delay does expire, RDS saves all changed data back to storage and spins down the instance.
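The idle timer described above can be simulated in a few lines. This is a toy model of the behavior as described here, not of how AWS implements it; the function name and the use of timestamps in seconds are assumptions made for the example.

```python
def simulate_pauses(request_times, idle_timeout=300):
    """Return (cold_starts, pause_times) for request timestamps in seconds.

    A request that arrives while the instance is up resets the idle timer.
    A request that arrives after the timer expired triggers a cold start.
    """
    cold_starts = []
    pause_times = []
    running_until = None  # when the instance pauses if nothing else arrives

    for t in sorted(request_times):
        if running_until is None or t >= running_until:
            if running_until is not None:
                pause_times.append(running_until)  # it paused before this request
            cold_starts.append(t)
        running_until = t + idle_timeout  # every request restarts the timer

    if running_until is not None:
        pause_times.append(running_until)  # final pause after the last request
    return cold_starts, pause_times


# Requests at 0 s, 2 min, 6 min 40 s, and 16 min 40 s with a 5 minute timeout.
starts, pauses = simulate_pauses([0, 120, 400, 1000])
print("cold starts:", starts)
print("pauses:", pauses)
```

In this run the second and third requests each arrive before the timer expires, so only the first and last requests pay the spin-up delay.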
Pricing for a serverless instance is different than that of other RDS. It is based on time and throughput.
Database capacity is measured in Aurora Capacity Units (ACUs). 1 ACU has approximately 2 GB of memory with corresponding CPU and networking, similar to what is used in Aurora user-provisioned instances.
From Aurora RDS Pricing
I can't elaborate any better than this pricing article in the serverless Aurora RDS description. The pricing examples are good.
One thing to keep in mind with this service is how infrequently we plan to use it. We will most likely use our program 1-3 hours a day, 5 or fewer days a week, and not even every week. This makes it hard to calculate our expense.
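One way to get a rough range is to bracket the usage pattern above with a low and a high scenario. The sketch below assumes cost scales with ACUs multiplied by active hours; the rate per ACU-hour is a placeholder rather than a quoted price. It ignores storage, I/O, and the extra time the instance stays up while the pause timer runs down.

```python
def monthly_serverless_cost(hours_per_day, days_per_week, weeks_per_month,
                            acus, rate_per_acu_hour):
    """Compute-only estimate: active hours * ACUs * rate."""
    active_hours = hours_per_day * days_per_week * weeks_per_month
    return active_hours * acus * rate_per_acu_hour


RATE = 0.06  # placeholder $/ACU-hour; check the Aurora pricing page

# Light month: 1 hour/day, 5 days/week, 2 weeks, 1 ACU.
low = monthly_serverless_cost(1, 5, 2, acus=1, rate_per_acu_hour=RATE)
# Heavy month: 3 hours/day, 5 days/week, 4 weeks, 4 ACUs.
high = monthly_serverless_cost(3, 5, 4, acus=4, rate_per_acu_hour=RATE)

print(f"Estimated compute cost: ${low:.2f} to ${high:.2f} per month")
```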
DynamoDB is not a SQL database; it is a NoSQL database (which actually stands for "Not Only SQL"). Specifically, it is a serverless, hosted, multi-key-value store. Each table stands alone, so there are no joins. It allows nested objects and arrays but is difficult to query deeper than 2 layers.
There are two types of keys, HASH and RANGE. Every table must define a HASH key, and every row must have a value for it. The HASH key determines where the row is stored in the larger data store. The RANGE key further distinguishes rows that should be stored together (because of your querying needs). For example, the edu-person database is based on a single HASH key of ucinetid because it only ever searches for one person at a time, and that person is never grouped with anyone else. The sessions database, on the other hand, has both a HASH and a RANGE key for two reasons: 1) to fetch a row you need both the auth token and the service name, and 2) if I want to see which websites a single session is accessing (possibly simultaneously), I can search by token alone, and all rows with that token are stored together for a faster search.
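To make the two key types concrete, here is a toy in-memory model. It is not the DynamoDB API. The class name `KeyedTable`, the attribute names `token` and `service`, and the sample values are all invented for illustration; only `ucinetid` comes from the description above.

```python
from collections import defaultdict


class KeyedTable:
    """Toy table: rows are grouped by HASH value, then keyed by RANGE value."""

    def __init__(self, hash_key, range_key=None):
        self.hash_key = hash_key
        self.range_key = range_key
        self._partitions = defaultdict(dict)  # hash value -> {range value: row}

    def put(self, row):
        h = row[self.hash_key]
        r = row[self.range_key] if self.range_key else None
        self._partitions[h][r] = row

    def get(self, hash_value, range_value=None):
        """Fetch exactly one row; needs the full key."""
        return self._partitions.get(hash_value, {}).get(range_value)

    def query(self, hash_value):
        """Fetch every row sharing a HASH value, ordered by RANGE value."""
        partition = self._partitions.get(hash_value, {})
        return [partition[k] for k in sorted(partition)]


# HASH only: one row per person.
people = KeyedTable("ucinetid")
people.put({"ucinetid": "panteater", "name": "Peter"})

# HASH + RANGE: many rows per token, one per service.
sessions = KeyedTable("token", "service")
sessions.put({"token": "abc", "service": "grades"})
sessions.put({"token": "abc", "service": "billing"})
sessions.put({"token": "xyz", "service": "grades"})

print(people.get("panteater"))
print(sessions.get("abc", "grades"))
print(sessions.query("abc"))
```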
As described above, the find method (a lookup by key) is the most common action and the fastest. The other common search method, scan, is slower because it ignores indexes entirely: it starts from the assumption that it will return all rows and uses a filter expression to exclude some. This is the opposite of most SQL queries, which prefer to return nothing unless it is requested in the where clause.
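The difference between a key lookup and a scan shows up in how many rows each has to examine. The sketch below is an in-memory stand-in, not DynamoDB itself; the row contents and the function names `find_by_token` and `scan` are hypothetical.

```python
rows = [
    {"token": "abc", "service": "grades", "active": True},
    {"token": "abc", "service": "billing", "active": False},
    {"token": "xyz", "service": "grades", "active": True},
    {"token": "qrs", "service": "library", "active": True},
]

# Build an index on the key once, the way a keyed store would.
index = {}
for row in rows:
    index.setdefault(row["token"], []).append(row)


def find_by_token(token):
    """Key lookup: only the rows stored under that key are examined."""
    matches = index.get(token, [])
    return matches, len(matches)


def scan(keep):
    """Scan: every row is examined, then the filter drops the unwanted ones."""
    return [row for row in rows if keep(row)], len(rows)


found, examined_by_find = find_by_token("abc")
filtered, examined_by_scan = scan(lambda row: row["token"] == "abc")

print(f"find examined {examined_by_find} rows, scan examined {examined_by_scan}")
```

Both calls return the same rows here, but the scan's cost grows with the size of the whole table rather than with the size of the result.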
Other than the key(s) of a row, the row may have as many top-level properties as required. They may be Strings, Ints, Floats, Arrays, Objects, etc. Arrays and Objects can be as deep as needed. Each read request unit covers at most 4 KB, and larger rows consume more units, so it's a good idea to store only the data you need. Also, since searching deeply nested data is hard, if you plan on doing scans, put the data that controls your scan at or near the top of the row.
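As a rough illustration of why row size matters, the sketch below applies DynamoDB's documented sizing rule: one read request unit covers a strongly consistent read of up to 4 KB, and an eventually consistent read costs half as much. The function name is made up, and 4 KB is taken as 4096 bytes, which is an assumption.

```python
import math

READ_UNIT_BYTES = 4096  # assumption: 4 KB interpreted as 4096 bytes


def read_request_units(item_size_bytes, strongly_consistent=True):
    """Units consumed by reading one row of the given size."""
    units = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    return units if strongly_consistent else units / 2


for size in (1_000, 4_096, 4_097, 10_000):
    print(size, "bytes ->", read_request_units(size), "unit(s)")
```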
There's a long version, but here's the short description, from the pricing page:
| Action | Cost/Million Requests |
|---|---|
| Write | $1.25 |
| Read | $0.25 |
So each read request costs $0.00000025.
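The same arithmetic extends to any mix of reads and writes. This sketch uses only the two rates from the table above; the function name and the sample request counts are invented, and `Decimal` is used so that very small amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

WRITE_PER_MILLION = Decimal("1.25")
READ_PER_MILLION = Decimal("0.25")
MILLION = Decimal(1_000_000)


def request_cost(reads, writes):
    """Dollar cost of the given number of read and write requests."""
    return (READ_PER_MILLION * reads + WRITE_PER_MILLION * writes) / MILLION


print("one read: ", request_cost(reads=1, writes=0))
print("one write:", request_cost(reads=0, writes=1))
print("10,000 reads + 2,000 writes:", request_cost(reads=10_000, writes=2_000))
```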
Please read the Overview and Features of RDS, Aurora RDS, and DynamoDB.
- Go to the RDS Web Console
- Click "Create Database"
- Select "Standard Create", "Amazon Aurora", and "Serverless"
- Set a unique identifier in "DB cluster identifier". I set "pokemon-db". This is your identifier in the AWS console and endpoints.
- The username may stay "admin" if you wish, but I recommend setting your own password (I don't know how to retrieve the password after the fact).
- Min capacity is fine at 1. I recommend setting max capacity between 1 and 4. This changes the maximum cost per minute.
- Expand "Additional Scaling Configuration", check "Pause compute capacity after ...", and set the time to something short-ish. I set mine to 5 minutes or fewer. This shuts off the server after that period of non-use.
- Expand "Additional Connectivity Configuration" and enable the Data API.
- (Optional) Under VPC Security Group, select "Create new" and enter a name like "pokemon-db-provider". We will use this later when providing access via the paired security groups described in this section.
- Click "Create Database" at the bottom of the screen.
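For reference, the console steps above map roughly onto a single `create_db_cluster` call in boto3. Treat this as a sketch under assumptions rather than a tested recipe: the engine value `aurora-mysql`, the capacity numbers, and the helper names are my choices, and whether serverless mode is available depends on the engine version and region. The parameter builder is kept separate so it can be inspected without AWS credentials.

```python
def build_cluster_params(identifier, username, password,
                         min_acu=1, max_acu=2, pause_after_seconds=300):
    """Collect the settings chosen in the console steps into one dict."""
    return {
        "DBClusterIdentifier": identifier,      # "DB cluster identifier"
        "Engine": "aurora-mysql",               # assumption: MySQL-compatible Aurora
        "EngineMode": "serverless",             # the "Serverless" choice
        "MasterUsername": username,
        "MasterUserPassword": password,
        "ScalingConfiguration": {
            "MinCapacity": min_acu,
            "MaxCapacity": max_acu,
            "AutoPause": True,                  # "Pause compute capacity after ..."
            "SecondsUntilAutoPause": pause_after_seconds,
        },
        "EnableHttpEndpoint": True,             # the Data API checkbox
    }


def create_cluster(params):
    """Send the request; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the builder above works without it

    return boto3.client("rds").create_db_cluster(**params)


params = build_cluster_params("pokemon-db", "admin", "choose-your-own-password")
print(params["ScalingConfiguration"])
```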
This method of loading is rather pedestrian, but we can't easily connect using MySQL Workbench or DataGrip, so we will use the built-in Web Console Query tool and paste the data in manually.
- Go here and copy the SQL statements
- Go to the RDS Console and open pokemon-db
- In the Actions dropdown, select Query
- Add new database credentials, add username (admin) and password from step 5 above, and leave database name blank for now. Click connect to database.
- Try running `SHOW DATABASES;` first just to see what it says. You will notice the standard meta tables, including one named `mysql`. This one is not your database; it is MySQL's own system schema.
- Paste the Pokemon SQL statements into the Query.
- Scan the results and look for the green ✔️ icons. If you see any red 'x', fix the issue or let me know.
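Because the Data API was enabled when the cluster was created, the same statements could in principle be sent from code instead of the web query tool. The sketch below uses boto3's `rds-data` client and its `execute_statement` call. It assumes the database credentials are stored in AWS Secrets Manager, which the steps above do not set up, and both ARNs are placeholders.

```python
def build_statement_request(cluster_arn, secret_arn, sql, database=None):
    """Assemble the arguments for one Data API statement."""
    request = {
        "resourceArn": cluster_arn,  # ARN of the Aurora cluster
        "secretArn": secret_arn,     # ARN of the Secrets Manager secret (assumed to exist)
        "sql": sql,
    }
    if database:
        request["database"] = database
    return request


def run_statement(request):
    """Send the statement; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the builder above works without it

    return boto3.client("rds-data").execute_statement(**request)


request = build_statement_request(
    cluster_arn="arn:aws:rds:REGION:ACCOUNT:cluster:pokemon-db",   # placeholder
    secret_arn="arn:aws:secretsmanager:REGION:ACCOUNT:secret:NAME",  # placeholder
    sql="SHOW DATABASES;",
)
print(request["sql"])
```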