UoL CS Notes

Fragmentation, Replication & Transparency

COMP207 Lectures

This lecture covers transparency:

  • Things that the DDBMS keeps hidden from people accessing the database.
    • Fragmentation and replication are examples of the things that are handled transparently.

Fragmentation

This is where the database is split into different fragments that can then be stored at different nodes.

Horizontal Fragmentation (Sharding)

[Figure: a table divided into four fragments, Fragment 1 to Fragment 4.]

The fragments may overlap.

  • Each tuple could be stored at a different site.
  • The way to get the original table back is to take the UNION of the fragments.
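
As a rough illustration of the points above, here is a minimal Python sketch of horizontal fragmentation and reconstruction by union. The `employees` relation, its tuple layout and the branch-based predicates are all hypothetical and chosen only for the example; a real DDBMS would do this inside its storage and query layers.

```python
# Hypothetical relation: (employee_id, name, branch) tuples.
employees = {
    (1, "Ada", "Liverpool"),
    (2, "Grace", "London"),
    (3, "Alan", "Liverpool"),
    (4, "Edsger", "Leeds"),
}


def fragment_horizontally(relation, predicates):
    """Return one fragment (a set of tuples) per selection predicate."""
    return [{row for row in relation if pred(row)} for pred in predicates]


# One hypothetical predicate per site; together they cover every tuple.
predicates = [
    lambda row: row[2] == "Liverpool",
    lambda row: row[2] == "London",
    lambda row: row[2] == "Leeds",
]

fragments = fragment_horizontally(employees, predicates)

# Reconstruction: the union of all fragments. Set union also collapses any
# tuple that appears in more than one (overlapping) fragment.
reconstructed = set().union(*fragments)
```

Because the fragments are sets of whole tuples, overlapping fragments do no harm to the reconstruction: a tuple stored twice still appears once in the union.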

Vertical Fragmentation

[Figure: a table divided into four fragments, numbered 1 to 4.]

The fragments typically overlap, usually on the key attributes, so that rows from different fragments can be matched up again.

  • Certain attributes are stored at different sites.
  • You can get the original table back by taking the natural join of the fragments.
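
A minimal Python sketch of the same idea for vertical fragmentation: each fragment is a projection onto some attributes, and the natural join puts the rows back together. The relation, the attribute names and the helper functions `project` and `natural_join` are hypothetical, written here only for illustration.

```python
# Hypothetical relation with attributes id, name, salary and branch.
employees = [
    {"id": 1, "name": "Ada", "salary": 50000, "branch": "Liverpool"},
    {"id": 2, "name": "Grace", "salary": 60000, "branch": "London"},
]


def project(relation, attributes):
    """Keep only the named attributes of every row (a projection)."""
    return [{a: row[a] for a in attributes} for row in relation]


def natural_join(left, right):
    """Combine rows that agree on every attribute the two sides share."""
    joined = []
    for left_row in left:
        for right_row in right:
            shared = left_row.keys() & right_row.keys()
            if all(left_row[a] == right_row[a] for a in shared):
                joined.append({**left_row, **right_row})
    return joined


# Both fragments keep the key "id"; that shared attribute is the overlap.
personal = project(employees, ("id", "name"))
payroll = project(employees, ("id", "salary", "branch"))

reconstructed = natural_join(personal, payroll)
```

If the two fragments shared no attribute at all, the join above would pair every row with every other row, which is why keeping the key in each fragment matters.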

Users don’t see the fragments, just the full relations.

Replication

Different sites can store copies of data that belongs to other sites:

  • A central office could hold copies of all the data from the branches.
    • This improves resilience.
  • A branch could hold a copy of the supplier data from the central office.
    • This improves efficiency.

Full Replication

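This means that every fragment is stored at every site:

  • Each site holds a complete copy of the database, so fragmentation no longer plays a role.
  • Queries are answered faster, since they can be served locally.
  • Updates are very slow, since every copy has to be updated.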

No Replication

This is where each fragment is stored at exactly one site:

  • Crashes are a big problem, since a fragment is unavailable while its site is down.

Partial Replication

This is where a combination of the above is used:

  • Limit the number of copies of each fragment.
  • Replicate only the fragments that are useful.
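
The trade-off between the three schemes can be sketched in Python as a placement map from fragments to the sites that hold a copy. The fragment names, site names and both helper functions are hypothetical; this is a toy model, not how any particular DDBMS represents placement.

```python
# Hypothetical placement: which sites hold a copy of each fragment.
placement = {
    "suppliers": ["central", "liverpool", "london"],  # read often: many copies
    "liverpool_sales": ["liverpool", "central"],      # one backup copy
    "audit_log": ["central"],                         # not replicated
}


def sites_to_update(fragment):
    """A write must reach every copy, so its cost grows with the copy count."""
    return list(placement[fragment])


def site_to_read(fragment, failed_sites=frozenset()):
    """A read needs only one live copy; return None if every copy is down."""
    for site in placement[fragment]:
        if site not in failed_sites:
            return site
    return None
```

With this map, `site_to_read("liverpool_sales", {"liverpool"})` falls back to `"central"`, whereas the unreplicated `audit_log` becomes unreadable as soon as `central` is down. Listing every site for every fragment would model full replication.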

Transparency

DDBMSs ensure that users do not need to know certain facts when creating queries. Transparency can be implemented at different levels.

  • Fragmentation Transparency
    • Fragmentation is transparent to users.
    • Users pose queries against the entire database.
    • The distributed DBMS translates this into a query plan that fetches the required information from the appropriate nodes.
  • Replication Transparency
    • Ability to store copies of data items and fragments at different sites.
    • Replication is transparent to the user.
  • Location Transparency
    • The location where data is stored is transparent to the user.
  • Naming Transparency
    • A given name (of a relation) has the same meaning everywhere in the system.
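
To make fragmentation and location transparency more concrete, here is a minimal Python sketch in which the caller names only a logical relation and never a node or a fragment. The `NODES` and `CATALOG` structures and the `query` function are hypothetical stand-ins for the DDBMS internals, assumed purely for illustration.

```python
# Hypothetical nodes, each holding one horizontal fragment of "employees".
NODES = {
    "node_a": [{"id": 1, "name": "Ada", "branch": "Liverpool"}],
    "node_b": [{"id": 2, "name": "Grace", "branch": "London"}],
}

# Hypothetical catalog: logical relation name -> nodes holding its fragments.
CATALOG = {"employees": ["node_a", "node_b"]}


def query(relation, predicate):
    """Answer a query posed against the whole relation.

    The caller names only the logical relation; the catalog lookup and the
    per-node fetches happen inside this function.
    """
    rows = []
    for node in CATALOG[relation]:
        rows.extend(row for row in NODES[node] if predicate(row))
    return rows


londoners = query("employees", lambda row: row["branch"] == "London")
```

Moving a fragment to another node would change only `CATALOG` and `NODES`; the call to `query` would stay exactly the same, which is the point of transparency.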

There are also many more types of transparency.