This article describes the design of the database management system (DBMS) in the Suneido integrated application platform.
Suneido has an integrated client-server relational database. The database is accessed via a language which includes administration requests and update operations as well as actual queries. The query language is based on the relational algebra language in An Introduction to Database Systems by C.J.Date.
The database can store the following types of values:
- booleans (true or false)
- strings (including binary e.g. image data)
- number
- dates
- Suneido objects (i.e. arrays or records)
Suneido’s DBMS, like its language, is dynamically typed i.e. database columns (fields) do not have fixed types – they can hold any type of value. Validating data is left up to the application. All fields and records are variable length.
Suneido stores the entire database as a single operating system file. This includes the schema (table layout) information, data records, indexes, and concurrency & recovery information. The database is accessed as a memory mapped file.
The database can operate in single-user local mode, or in multi-user client-server mode. TCP/IP is used to communicate between the clients and the server.
System Tables
Information about the database (the schema) is stored in the following system tables:
tables – lists the tables that exist
columns – lists the columns for each table
indexes – lists the keys and indexes for each table
triggers – lists the summarize triggers for each table
views – lists the view definitions
These tables may only be altered by the system. However, they can be read from just like any other table.
Administration
The following administrative requests allow you to create, alter, rename, and destroy database tables.
create
Every table must specify at least one key – one or more columns that uniquely identify records. You are not required to identify a “primary key”. You cannot add a record with a key that already exists in the table.
create customers (name, phone, fax, email) key(name)
If a table should only have at most one record you can specify an empty key:
create configuration (option1, option2, option3) key()
Tables can also have “indexes”. These are not part of the “logical” design of the database; they serve simply to speed up queries. Suneido will automatically use indexes to execute queries faster. For example, if you often searched for customers by phone number:
create customers (name, phone, fax, email) key(name) index(phone)
ensure
Ensure has the same syntax as create, with “create” replaced by “ensure”. If the table doesn’t exist, it is the same as create. Otherwise. it compares the specification with the existing table and adds any columns, keys, indexes, or summarizes that don’t exist. It does not remove any extra columns, keys, indexes, or summarizes. In other words, it ensures that the table has at least the specified schema. For example:
ensure customers (name, phone, fax, email) key(name)
alter
Alter can be used to add or remove columns, keys, indexes, or summarizes, with a syntax similar to create and ensure.
alter customers create (city) index(city)
alter customers delete (fax)
rename
Rename tables, for example:
rename calls to phone_calls
view
Define a view, for example:
view local_customers = customers where city = ‘Saskatoon’
or:
view calls = phone_calls union fax_calls
Views permit an application’s “logical” view of the database to differ from the “physical” design. For example, the application can refer to a view as if it is a simple table, regardless of whether it is actually implemented as a single table or a union, or a where.
Currently only some views are update-able, depending on their query. Update-able queries include WHERE, PROJECT that includes a key, JOIN that is one to one, RENAME, and EXTEND. Theoretically, according to C.J.Date, all queries are update-able.
destroy
Destroy a table. For example:
destroy customers
Destroy can also be used to un-define a view. This does not affect the actual data.
destroy local_customers
Foreign Keys
A foreign key consists of one or more columns in one table that uniquely identify a row in another table, i.e. are a key in the other table. Foreign keys place constraints both on the source table – foreign key fields must contain values that exist in the other table, and on the target table – you can’t delete or update records if doing so would make source records invalid.
create calls (date, customer, comments, minutes) key(date)
index(customer) in customers(name)
You can also specify cascading deletes and updates. For example, if you delete an invoice header record, you can have the invoice line records automatically deleted. Or if you update the invoice number on the header, you can have the invoice line items automatically updated. For example:
create invoice_lines … index (invoice_num) in invoices cascade
summarize
Summarize creates and maintains stored summaries of a table.
For example:
alter calls create summarize(customer, count, total
minutes)
would create a table called _summarize_calls_customer which would store the count of calls and total minutes of calls for each customer. This table would be automatically updated as records are added, updated, or deleted from the calls table.
See also: Queries – summarize.
Queries
where (select/restrict)
Produces only those records that match certain criteria.
For example:
customers where city = “Saskatoon”
would produce only the customer records where the city was Saskatoon.
Where expressions are a subset of the expressions allowed in the Suneido language, including arithmetic and string operations, and also allow calling user defined functions.
project
Extract specified columns and eliminate any resulting duplicates.
For example:
customers project city
would produce a list of cities where you have customers.
rename
Rename columns – useful for join, leftjoin, union, minus, and intersect.
For example:
(sales rename salesman to employee) join staff
extend
Add calculated columns. For example:
sales extend amount = quantity * price
join (natural eq-join)
The result of join is a table with all the columns from the input queries (without duplicates) and with the set of rows formed by combining each pair of rows with equal common columns. The input queries must have at least one column in common.
For example:
customers join sales
Join excludes rows from the first table that do not have a matching row in the second table.
leftjoin (left outer natural eq-join)
Similar to join except leftjoin includes rows from the first table that do not have a matching row in the second table. These rows will have empty (“”) values for the columns of the second table.
For example:
customers leftjoin sales
will include all customers, whether they had sales or not.
times (product)
The result of times is a table with all the columns from both queries and with the set of rows formed by combining each possible pair of rows from the queries.
For example, to get all possible combinations of city and items:
city times items
summarize
Summarize a table by specified columns, with added columns for accumulated functions (count, total, max, min, average).
For example, to get the total sales for each city:
summarize sales city, total price
See also: Administration – summarize
union
minus (difference)
intersect
These are the standard set operations. Union eliminates any duplicates.
These operations require that their inputs have the same set of columns – project, extend, and rename are often useful to achieve this.
For example, to get a combined list of cities for your customers and suppliers:
(customers project city) union (suppliers project city)
Note: Although many of the example above show the query operations being applied to tables, the real power of the query language comes from being able to apply operations to any other query, not just tables.
Query Optimization
Query optimization has two main phases. The first phase applies some standard tranformations to the query that are almost always advantageous. For example, moving where’s towards tables and combining adjacent operations. In the second phase, operations choose appropriate strategies, indexes, and temporary indexes based on estimated costs. Data sizes are estimated using the indexes. Some operations have multiple strategies (e.g. project) they can use, other operations (e.g. rename) have only a single strategy.
Updating
Database update requests can be used to modify the database.
For example, to insert a record into a table:
insert { name: ‘Fred’, salary: 37500 } into employees
or to insert records from a query into a table:
insert sales where city = “Fargo” into fargo_sales
or to update all the records from a query:
update parts where category = 2 set price = price * 1.1
or to delete all the records from a query:
delete sales where price < .10
Rules
Unlike many systems, which limit business rules to constraints, Suneido’s business rules support a variety of uses including supplying default values to fields, performing calculations, and summarizing other data. Business rules have many advantages. They keep your business logic separate from your user interface and reports, enable code re-use, and allow your code to be written in smaller modules that are easier to test and maintain.
You can define rules for fields by defining functions called Rule_fieldname. When you access a field that the record does not contain, if there is a rule it will be called. If the rule returns a value, it will be stored in that field of the record. When rules are executed, Suneido automatically tracks their dependencies on other fields they access. If a dependency is changed, then the rule field is invalidated. This means that the next time the field is accessed, the rule will be executed again. Dependencies can be stored in the database (by creating a field called fieldname_deps) so that when old records are manipulated, rules will be triggered just as on new records. Invalidations also trigger record.Observer – this is used to update the user interface when records change. Invalidations do not affect non-rule values. i.e. if the user has overridden a derived value, then the rule on that field will no longer be triggered. Rules can be used without actually storing the values, or calculated columns can be stored in the database. Rules can also be used to adjust user interface controls.
Triggers
Whenever records are output, updated, or deleted from a table, the system checks for a user defined trigger named “Trigger_” followed by the table name. Triggers are only called after the operation succeeds. i.e. If the output, update, or delete fails, the trigger will not be called. Triggers can be used to maintain secondary tables such as summaries.
History
The “history” for a table can be accessed with history(table). Any records removed from a table by deleting or updating will automatically appear in the history table. The history has two additional columns, _date is the timestamp when that version of the record was deleted (or updated), and _action is whether the record was created or deleted.
Note: Since Suneido needs old versions of records for concurrency and recovery, this history facility is virtually “free”.
Concurrency
Suneido’s DBMS can operate in one of two modes – single-user local mode, or multi-user client-server mode. In either case, the database file itself is only ever accessed by a single program exclusively, so no operating system file locking is required.
All access to the database must be done within transactions. Transactions can be either read-only or update. Transactions see the database as of their start time, as if they were viewing a “snapshot” of the database. Suneido uses a multi-version, optimistic, timestamp-based approach to concurrency, which provides full transaction isolation, i.e. is serializable. Because of this, read-only transactions (e.g. reports) always succeed – they will never conflict with other transactions. Update transactions check for conflicts with other transactions when they complete, and fail (rollback) if conflicts are found.
On-line backups are done using a single read-only transaction to get a “snap-shot” of the entire database without interfering with use of the database.
Recovery
Suneido uses the database itself as its “log” i.e. to keep track of updates, both to support concurrency (e.g. multi-version and rollback) and to support recovery. This is possible because data records are only ever added at the end of the database file, never updated in place. This means that recovery (e.g. from a crash) is largely a matter of determining what portion of the end of the file must be discarded in order to rollback to a consistent state. Since indexes are updated in-place for efficiency, they are ignored by the recovery process and rebuilt from the data. Checksums are used to determine the integrity of the database.
References
An Introduction to Database Systems, C.J.Date