Database

A database is an organized collection of data stored and accessed electronically. I studied this formally in CS348.

Two main types of database:

  1. Relational Database
  2. Non-Relational Database

Database

A database is a large and persistent collection of metadata and data, organized in a way that facilitates efficient retrieval and revision.

Database Management System (DBMS)

A DBMS is a set of programs that implements a data model to manage a database.

A DBMS manages two kinds of information: metadata and data.

  • metadata tells you how the data is organized
    • Ex: There are employee entities that have a name and a salary.
  • data is the actual data that you want to store
    • Ex: Mary Smith is an employee who earns $92,000 per year.
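
The metadata/data split above can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the `employee` table and its column types are assumptions chosen to match the example, not something from the course material. The `CREATE TABLE` statement records metadata, the `INSERT` records data, and both can be read back separately.

```python
import sqlite3

# In-memory database, used here only for illustration.
conn = sqlite3.connect(":memory:")

# Metadata: there are employee entities that have a name and a salary.
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")

# Data: Mary Smith is an employee who earns $92,000 per year.
conn.execute("INSERT INTO employee VALUES (?, ?)", ("Mary Smith", 92000))

# Read the metadata back: PRAGMA table_info returns one row per column,
# with the column name in position 1.
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

# Read the data back.
rows = conn.execute("SELECT name, salary FROM employee").fetchall()

print(columns)
print(rows)
```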

Data Model

A data model determines the nature of the metadata and how retrieval and revision are expressed. See Data Model.

Benefits that Database Systems bring (slides 10-11 of chapter 1)

  • Reliability
  • Concurrency
  • Data Integrity (Integrity Constraint)
  • Security: Restrictions on who can access and update
  • Productivity
  • Data is Structured

General properties of a DBMS

  1. A DBMS adopts some data model for managing structured data via an interface with two sub-languages: a data definition language (DDL) and a data manipulation language (DML).
  2. A DBMS supports physical and logical Data Independence.
  3. A DBMS supports concurrent data manipulation (through Transactions).
  4. A DBMS guarantees that data is reliably recorded and can be recovered in case of hardware or software failure (through Transactions).
  5. A DBMS provides access control to information via data access permissions relating to users and roles.
  6. A DBMS provides utilities for database monitoring and maintenance.
  7. A DBMS supports a variety of users.

Fundamentally

The three big ideas underlying a DBMS are:

  1. physical data independence,
  2. data manipulation that is declarative, and
  3. interaction via transactions.
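
A minimal sketch of the second and third ideas, using Python's `sqlite3` module. The `account` table and the `transfer` helper are hypothetical names introduced for illustration. The `UPDATE` statements are declarative: they say which rows to change, not how to find them. Wrapping them in a transaction means that either both updates take effect or neither does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "  owner TEXT PRIMARY KEY,"
    "  balance INTEGER CHECK (balance >= 0)"
    ")"
)
with conn:
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE owner = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE owner = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative, so the credit
        # applied just before it is rolled back as well.
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT owner, balance FROM account"))
print(ok, failed, balances)
```

The credit is deliberately issued before the debit so that the failed transfer has something to undo; after the rollback, the balances are the same as they were before the failed call.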

Why are databases hard?

  • Data redundancy and inconsistency
  • Concurrent-access anomalies
  • Security issues

Other misc. learnings

From SE464

Why not just use filesystems as database?

Filesystems don’t give us analysis, integrity, or deduplication properties. They also have these shortcomings:

  • Indexing – efficient access in just one dimension: the path/filename.
  • Concurrency – multiple apps can read and write, but there are no transactions.
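
To illustrate the indexing point, here is a sketch (with a hypothetical `employee` table and index name) of a database offering efficient access along a second dimension. A filesystem could find a record quickly only by its path, whereas here an index on `salary` lets the DBMS look rows up by salary without scanning everything. The query plan text differs between SQLite versions, so the sketch only checks that the index name appears in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Mary Smith", 92000), ("John Doe", 71000), ("Ann Lee", 92000)],
)

# A secondary index: efficient access by salary, not just by name.
conn.execute("CREATE INDEX idx_employee_salary ON employee (salary)")

# Ask SQLite how it would run a lookup by salary; the human-readable
# description is the last column of each plan row.
plan = [
    row[3]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT name FROM employee WHERE salary = ?",
        (92000,),
    )
]
uses_index = any("idx_employee_salary" in detail for detail in plan)

names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM employee WHERE salary = ?", (92000,)
    )
)
print(uses_index, names)
```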