To study Google's PageRank algortihm later, we introduce some terminology regarding graphs. There are variations, included directed and undirected graphs, weighteds graphs, graphs with loops and multiple edges, or simple graphs where these are disallowed.
Here's an example of a graph:
Definition
A (directed) graph is a pair where is a set of vertices and assigns to each pair of vertices a set of edges with source and target
In the example, the vertex set is and we have
Note that an equivalent definition is to define a graph as a quadruple where is a set of vertices, is a set of edges, and are the source and target functions. This is equivalent to the first definition because we can define the set of edges as the disjoint union of the sets for all pairs of vertices Note also the definition allow multiple edges between two vertices, or even loops. These graphs are furthermore directed, because there doesn't have to be an edge from to if there is an edge from to To be fully precise, this definition encompasses directed multigraphs.
To make sense of these possible restrictions, we need to define some more terms.
Definition
An undirected graph is a pair where is a set of vertices and assigns to each unordered pair of vertices a set of edges connecting and
A (directed or undirected) graph is simple if it has no loops and no multiple edges between any (ordered or unordered) pair of vertices.
A simple directed graph can be represented as a pair where is a set of vertices and is a binary irreflexive relation on the edges. (Irreflexivity means that for all vertices thus ensuring the absense of loops.)
Then a simple undirected graph corresponds to the case where the edge relation is symmetric (as well as irreflexive). Sometimes the word “graph” is taken to mean undirected graph, and then digraph is used to mean directed graph. So be careful to check the definitions when reading about graphs!
Let's look at some examples of the various types of graphs. Above we saw general (directed multi-)graph; if we remove the edge and the loop we get a simple directed graph:
If we ignore the direction of the edges, we get a (non-simple!) undirected graph:
To get a simple undirected graph, we can remove the edge and the edge
Because we can never see too many examples of graphs (well, maybe!), let's look at some more famous ones. First up, complete graphs: The the complete graph is the simple graph on vertices, with all pairs of vertices connected by an edge, e.g.,
Next, the line graph is the simple graph on ordered vertices, with edges between adjacent vertices, e.g.,
Finally, the cycle graph is the simple graph on ordered vertices, with edges between adjacent vertices and an edge between the first and last vertex, e.g.,
Subsection7.1.2Graph Terminology
Definition
The in-/out-degree of a vertex in a directed graph is the number of edges entering/leaving
Definition
The degree of a vertex in an undirected graph is the number of edges incident with
For example, in the complete graph the degree of each vertex is
Definition
A walk in a graph is a sequence of vertices and edges such that for
The length of a walk is the number of edges in it, i.e., We allow in which case the walk consists of a single vertex and no edges. A walk is closed if the first and last vertices are the same, i.e.,
Definition
A trail in a graph is a walk in which all edges are distinct.
For example, the walk is a trail in the first graph above.
Definition
A path in a graph is a walk in which all vertices (and hence all the edges) are distinct.
Definition
A cycle in a graph is a closed walk in which all vertices are distinct, except the first and the last, which are equal.
For example, the walk is a cycle in the first graph above.
There are also famous special type of trails and cycles:
A circuit is a closed trail.
A Eulerian circuit is a closed trail that visits every edge exactly once. A graph is called Eulerian if it has an Eulerian circuit.
A Hamiltonian cycle is a closed path that visits every vertex exactly once. A graph is called Hamiltonian if it has a Hamiltonian cycle.
The name “Eulerian” comes from the mathematician Leonhard Euler, who studied the famous problem of the Seven Bridges of Königsberg in 1736. It boils down to the question of whether the graph below has an Eulerian circuit:
Here, the vertices are the land masses (top and bottom the two banks of the Pregel river, left and right the islands Kneiphof and Lomse), and the edges are the bridges. It turns out that there is no Eulerian circuit in this graph, because of the following theorem:
Theorem
An undirected connected graph has has an Eulerian circuit if and only if every vertex has even degree.
A directed strongly connected graph has an Eulerian circuit if and only if every vertex has equal in-degree and out-degree.
An undirected graph is connected if it is nonempty and there is a path between any two vertices, and a directed graph is strongly connected if it is nonempty and there is a (directed) path between any two vertices.
We don't give the proof of the Eulerian circuit theorem, but it boils down to the correctness of a greedy algorithm: Keep finding cycles (where all vertices have even degree, viz. 2, until no edges remain; then glue the cycles together into a circuit). In contrast to Eulerian circuits, Hamiltonian cycles are much harder to find. In fact, it is NP-complete to determine whether a given graph has a Hamiltonian cycle.
Subsection7.1.3Graph isomorphism
The same graph can be represented in many equivalent (isomorphic) ways. For example, it doesn't matter whether we use the set or any other five element to represent the complete graph on five vertices. More formally, we introduce the notion of graph isomorphism:
Definition
An isomorphism between directed graphs is a bijection between the vertex sets together with, for each pair of vertices in a bijection
For simple graphs, the condition on the edges can be simplified to saying that preserves the edge relation, i.e., maps connected vertices to connected vertices in the sense that implies
Graphs provide a rich supply of difficult algorithmic problems in computer science. For example, the graph isomorphism problem is to determine whether two graphs are isomorphic. It is not known whether this problem is solvable in polynomial time (feasibly computable with a bounded amount of resources), or whether it is NP-complete (intractable). The subgraph isomorphism problem is to determine whether a simple graph contains another graph as a subgraph, i.e., whether there is an injection from the vertex set of to the vertex set of such that there is an edge between two vertices in if and only if there is an edge between the corresponding vertices in This is NP-complete.
Subsection7.1.4Adjacency matrices
The main connection between graph theory and linear algebra is that we can represent graphs without multiple edges as adjacency matrices. This works by ordering the vertices as and forming the matrix whose is one or zero, depending on whether there is an edge from to For example, for the graph that began the section, if we relabel the vertices we get the graph,
whose adjacency matrix is:
We can also accommodate graphs with multiple (yet finitely many) edges and loops by using a weighted adjacency matrix, where the entry is the number of edges from vertex to vertex For example, the adjacency matrix of the graph we started this section with is:
Note, however, that we loose some information, by replacing the finite sets with their size
Not only are adjacency matrices a useful data structure for storing graphs, we can even link matrix multiplication to graph theoretic features, e.g.:
Theorem
If is the adjacency matrix of a (directed) graph the entry of the power matrix is the number of walks of length from vertex to vertex for all
We prove this by induction on The base case is trivial, because the only walk of length zero is the empty walk, which has length zero and starts and ends at the same vertex. This matches being the identity matrix.
For the inductive step, we can use the fact that a walk of length from to can be obtained by taking a walk of length from to some vertex and then taking an edge from to
We can also represent undirected graphs as adjacency matrices, by using symmetric matrices. These have real eigenvalues, and we can learn a lot about the graph from the eigenvalues and eigenvectors. That is, however, is a topic for another day.
Adjacency matrices take on a special form when the graph is bipartite, i.e., the vertex set can be partitioned into two sets and such that all edges connect a vertex in to a vertex in In this case, the adjacency matrix can be written as a block matrix:
where is the adjacency matrix of edges going from and while is that of edges going from to For example, the complete bipartite graph on vertices, i.e.,
has the symmetric adjacency matrix:
This is the smallest bipartite graph that is not planar, i.e., drawable in the plane without intersecting edges. But that is a topic for another day.