This article explains databases for people who are unfamiliar with them. It may also be helpful for anyone who understands the basic concept of databases, but wants to make sure they have a full understanding of the fundamentals.
What is a Database?
In short, a database is any structured system that holds electronic data. Think of a database as a series of buckets arranged in a grid. Each bucket can hold anything you choose to put in it. You could choose to fill up each bucket with IRS forms, screws, pictures of your family or anything you want. You could even choose to put a mixture all of those items in each bucket.
In order to make the organization structure usable, however, you would likely only put one type of item in each bucket and label buckets in a way that would help you most easily find what you're looking for.
While there are many types of databases, let's take a look at an example using our bucket grid that is a common way to organize data as well. Let's say that you are assembling wooden toys and you are organizing their parts in the buckets. One way you might do this is to add the parts of each single toy into a single row of buckets in your grid.
Let's say that each toy is made up of some or all of the following parts: screws, wooden pieces, metal gears, plastic pieces, and springs. If we choose to keep only the parts of a single toy in a single row of buckets and only one type of item in each bucket, it will be clear where to look for which pieces of a single toy.
Translating this into database terms, a bucket is referred to as a database location. A row of buckets is referred to as a record, the labels for each column of buckets are called database fields, and the whole grid is a table.
The type of item a database location (bucket) will hold is referred to as its data type. Data types get very specific, but they are basically things like text, numbers, dates, etc. Databases are "smart" and have built-in checks, called validations, that will make sure you aren't putting the wrong kind of data in a bucket. For example, if a bucket is set up to hold dates, but you try to put in a number, the number will be rejected.
The other concern when creating a database is the size of the database locations. Looking at our example, if you had 1 small screw, but the bucket for screws could hold 55 gallons, it would make the organization system harder to uses than if the bucket was small.
Because each row of buckets holds the parts for a single toy, when choosing the size of the buckets, it's important to anticipate the largest items the bucket may need to hold. If you are choosing bucket sizes based on a 9-inch-long toy airplane, for example, you might be tempted to choose a small bucket for wooden pieces. But if you know that other toys are larger, a full-sized rocking horse for example, you will choose a larger bucket.
The example given for a database structure is relatively simple. But what if you want to do something more complex. Let's say that you aren't just keeping track of the parts for a wooden toy, but you are, ultimately, assembling a full gift basket that also includes packaging, candy, baked goods, and a card. Let's also say that each basket was custom ordered and the recipient expects a specific toy, packaging, candy, etc. In this case, each of the other items could have a grid similar to the one for toys. Each row of buckets, or record, in these grids would represent contents like the candy or baked goods for a specific basket. This is what a database with more than one table looks like.
The problem is that we need some way to relate the items that will go in the baskets (i.e. which toy goes with which baked good, candy, etc.). In order to do this, we need something that identifies each row in a grid of buckets (table) and how it relates to a row in another grid of buckets. These identifiers are referred to as keys.
Tables usually have a field (column) that holds an ID, which identifies each row. These IDs are most often numeric, and don't have any meaning other than as the identifier of a record. These IDs must be unique so that two rows in the same table never have the same ID.
The ID that uniquely identifies a row or record in a table is called the primary key. This key can be used in other related tables so it is easy to identify how rows in the tables relate to each other.
In our example, if we have a grid of buckets for toys and one for baked goods, a row of buckets for baked goods may have the ID of 14. The row for the toy that will go in the same basket may have the ID of 7. In this case, 14 is the primary key for the row of baked goods, while 7 is the primary key for row that holds the parts of the toy.
The row of baked goods may have a column called something like Toy ID that will hold the ID of 7. This would identify that row 7 in the toy table holds the parts for the associated toy. When a row holds an ID from another table it is referred to as a foreign key.
Rows in tables are often related in a database, but the relationship is not always one to one. In our example, if you could order as many toys as you wanted in a single basket, then packaging and toys would have a one-to-many relationship. This means that one set of packaging would hold many toys. The same thing can happen in a table. For example, if you wanted to organize homes in a database table and family members in another, there would often be one home that was related to many family members.
The options for relationships between tables include:
- One-to-one: One specific gift basket is connected to a specific card.
- One-to-many: One specific gift basket is connected to several different toys and candies.
- Many-to-one: Many different gift baskets all have the same baked good.
- Many-to-many: Many different gift baskets have many different candies.
Structured Query Language
Structured Query Language or SQL is a very common language that is used to set up databases and create, read, update, or delete database data. Many popular databases, including most of the ones used in LoanPro, are set up to use SQL. While queries can be complex, simple queries to read data will look something like this:
SELECT * FROM toys WHERE screws = 'grabber screws'
This statement simply says, "Select everything from the table called toys in rows where the screws bucket contains grabber screws". This article won't get into writing queries, but it's important to note that to pull data from a database, a textual query is most often used.
SQL statements can be complicated and can pull data from multiple tables and relate the data using table keys. The results of a query will most often look similar to a .CSV or spreadsheet file.
The developers of a program or application determine the databases used and the way data is stored and split across tables. The tables, fields (columns), data types, and field size (bucket size) for databases is often given to external users of a database. This information is called the database schema and information for a single table usually looks like this:
The column on the left shows field names, which are the names of the columns in the table. These names identify the data the field (column) holds. The names are not very descriptive, which is why our database documentation gives descriptions of each field.
The column on the right shows the data type (e.g. varchar, timestamp, int) and the size of field (e.g. 100). The size is the number of characters the field will hold. It also displays a PK or FK, which is shorthand for primary keys and foreign keys.
SQL Data Types
A FIXED length string (can contain letters, numbers, and special characters). The size parameter specifies the column length in characters - can be from 0 to 255. Default is 1
A VARIABLE length string (can contain letters, numbers, and special characters). The size parameter specifies the maximum column length in characters - can be from 0 to 65535
Equal to CHAR(), but stores binary byte strings. The size parameter specifies the column length in bytes. Default is 1
Equal to VARCHAR(), but stores binary byte strings. The size parameter specifies the maximum column length in bytes.
For BLOBs (Binary Large OBjects). Max length: 255 bytes
Holds a string with a maximum length of 255 characters
Holds a string with a maximum length of 65,535 bytes
For BLOBs (Binary Large OBjects). Holds up to 65,535 bytes of data
Holds a string with a maximum length of 16,777,215 characters
For BLOBs (Binary Large OBjects). Holds up to 16,777,215 bytes of data
Holds a string with a maximum length of 4,294,967,295 characters
For BLOBs (Binary Large OBjects). Holds up to 4,294,967,295 bytes of data
ENUM(val1, val2, val3, ...)
A string object that can have only one value, chosen from a list of possible values. You can list up to 65535 values in an ENUM list. If a value is inserted that is not in the list, a blank value will be inserted. The values are sorted in the order you enter them
SET(val1, val2, val3, ...)
A string object that can have 0 or more values, chosen from a list of possible values. You can list up to 64 values in a SET list
Other Database Structures
Databases that use SQL are generally referred to as SQL Databases. Databases that don't use SQL are referred to as No-SQL Databases. Databases aren't required to be structured in the way illustrated in our example, as long as they organize data in a structured way.
Here is a short list of popular databases in each category: