The Databases Course

Exercise No. 10

Datalog and Concurrency Control

Due on February 1st by midnight

This is the only exercise about the two important subjects in the title. In order to encourage you to do it, and to help you improve your final exercises grade, anyone who submits this exercise will get an automatic bonus of four points.

  1. Write datalog programs without negation to express the following queries:
  1. Given a binary relation child-of(x,y) meaning that x is a child of y, express same-gen(x,y) which is true when x and y are of the same generation (i.e. x and y have a common ancestor which is the same number of generations away from both of them).
  2. Given a directed graph G represented as a binary relation G(x,y), compute the vertices of G odd-vertex(v) which have a cycle of odd length passing through them.
  3. Given a directed graph G together with colors for its edges represented as a relation G(x,y,col) which means that the edge from vertex x to vertex y has color col, compute all paths 2-col-paths(x,y) such that there is a path from x to y which involves edges from exactly two colors.
  1. Let Q be a query in relational algebra, whose result relation has n attributes. An equivalent datalog program to Q is a program (IDB) that computes, among other relations, a relation Q(x1, ..., xn) such that for any EDB, Q(a1, ..., an) is true if and only if Q's result (in algebra) includes the tuple (a1, ..., an) for that same EDB (input relations).
  1. Prove that any query in relational algebra can be expressed as a datalog program (possibly with negation). You may assume that comparison relations such as =(x,y) and >(x,y) are given.
  2. Describe the set of relational algebra queries that can be expressed by datalog programs without negation. Prove your answer.
  1. A query Q is domain independent if its result does not depend on the domain, i.e. the result is the same for any two supersets of the active domain. In class, we have seen a safety criterion for datalog programs with negation. Prove that safety entails domain independency.
  1. Assume that LOCK A, UNLOCK A means that A is read and also written. Give a schedule for two transactions and two items, which is unserializable and involves the least number of LOCK, UNLOCK steps. Illustrate that the schedule is unserializable using an example database. Prove that it uses the least number of steps.
  1. Consider the following schedule:

T1:LOCK(A), T2:LOCK(B), T1:LOCK(C), T1:UNLOCK(A), T3:LOCK(A), T1:UNLOCK(C), T2:UNLOCK(B), T3:UNLOCK(A), T4:LOCK(A), T4:LOCK(C), T4:UNLOCK(A), T2:LOCK(A), T2:UNLOCK(A), T4: UNLOCK(C).

  1. Build the schedule graph of this schedule.
  2. Give an equivalent serial schedule.
  3. Add a single LOCK X, UNLOCK X pair to this schedule that will cause it to be unserializable.
  1. In practice, there are read locks which allow only reading an item, and write locks which allow both reading and writing. Many transactions can hold a read lock on an item at once, but write locks are exclusive - if a transaction holds a write lock on an item, then no other transaction can hold any kind of lock on that item.
  1. Why might this be preferable over allowing only write locks? Illustrate your answer with an example.
  2. Does a transaction which only reads a certain item and doesn't write it really need to take a read lock on the item? Illustrate your answer with an example.