# Introduction to Graph Theory

January 2, 2017 | Views: 4805

A company hires you to cut costs on their infrastructure. This requires you to know what devices are part of their infrastructure. From there, you determine where the new fiber optic cable should be installed to maximize the benefit to the organization. Finally, you discover that a router is underutilized, and recommend reconfiguring some of the users through that router. A job done well due to graph theory.

A graph G is a set of vertices V along with a set of edges E, which is the cross product of V with itself. The set of vertices may represent users, roles, devices, states, files, etc. The set of edges represents a relation between two vertices, such as coworkers, read access, a connection between two devices, a transition from one state to another, etc.

A vertex v is said to be incident to an edge e if the exists some vertex x such that e = (v, x). In this scenario, we would say that nodes v and x are adjacent. Note that if there existed some edge f and some node y such that f = (y, v) or f = (x, y) then we would say that edges e and f are incidents

A graph may be represented in many ways. Two of the most common ways to represent a graph is using an adjacency list or an adjacency matrix. Let V = {A, B, C, D}, E = {(A, B), (B, A), (B, D), (D, D)}. The corresponding adjacency list is E = {A : <B>, B : <A, D>, C : <>, D : <D>}. The ith row and jth column of the adjacency matrix for a directed graph are 0 if there is no edge from the ith vertex to the jth vertex, and 1 if the edge from the ith vertex to the jth vertex exists. For our graph G, the adjacency matrix is

 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 1

A graph Is said to be undirected if for every edge e of E and for all nodes u and v of V, if e = (u,v), there exists an edge f in E such that f = (v, u). This is equivalent to saying that the transpose of our adjacency matrix is the adjacency matrix. Otherwise, the graph is said to be directed.

A walk is a tuple of edges (e1, e2, …, en) such that for all 0 < i < n, ei = (u. v) implies e(i+1) = (v, w) (The destination of the ith edge becomes the source of the next edge). A cycle is a walk in which e1’s source and en’s destination are the same nodes. A graph is cyclic if the graph contains at least one cycle, and acyclic if it does not. Walks may or may not contain cycles. A path is a walk which does not contain any cycles. Two nodes are said to be connected if there exists a path between them. To find all the walks of length 2, I can multiply the adjacency matrix by itself. The resulting matrix is listed bellow.

 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 1

I can multiply the adjacency matrix by itself twice to final all walks of length 3.  The resulting matrix is listed bellow.

 0 1 0 1 1 0 0 2 0 0 0 0 0 0 0 1

Note that there are two walks of length 3 from B to D. They are ((B, D), (D, D), (D, D)) and ((B, A), (A, B), (B, D)). You can repeat this process of matrix multiplications to find walks of any length.

To find walks of length at most 3, we need only add the walks of length 1, 2, and 3. The resulting matrix is listed bellow.

 1 2 0 2 2 1 0 4 0 0 0 0 0 0 0 3

In evaluating the various storage schemes and algorithms, we use two different metrics. Asymptotic time complexity and asymptotic space complexity. We represent this using ‘Big-Oh’ notation, meaning how much would it take in the worst case. Storing the graph in an adjacency list takes O(|V| + |E|) space. (|V| is the number of nodes, |E| is the number of edges). Storing the graph in an adjacency matrix takes O(|V|^2) space. Determining whether there is an edge from u to v takes O(|V|) time in an adjacency list, where it would take O(1) time in an adjacency matrix.

Although we use commands or graphical user interfaces to search, We use searching all the time for computer security. Most windows machines use Discretionary Access Control (DAC). If you are explicitly told to remove all confidential files from a user account, you would need to traverse the file system looking for files marked “confidential”. Likewise, looking for a file or files containing passwords and/or secrets requires searching. Finding one or all the devices on a network requires searching. In layman’s terms, starting from a node s, we want to know about all nodes we can reach by following 1 or more edges. There are two famous algorithms for searching a graph. Breadth First Search (BFS) uses a queue, which allows it to search near the node before trying to search far from the node. Contrastingly, Depth First Search (DFS) allows you to search away from the node before searching near it. It’s better demonstrated in a YouTube video: https://www.youtube.com/watch?v=bIA8HEEUxZI Both BFS and DFS take O(|V| + |E|) time complexity, but BFS takes O(|V| + |E|) space complexity for an adjacency list and O(|V|^2) space complexity for an adjacency matrix while DFS uses O(|V|) space complexity. However, running BFS will give us the shortest path from our starting node s to all nodes, making it an extremely efficient searching algorithm.

A Tree is an acyclic graph or a graph with no cycles. A spanning tree is a tree which contains every vertex in our graph. If we have a weighted undirected graph, we can have a minimum spanning tree, that is a tree which spans all the vertices where the edge weights of our tree are less than or equal to any other spanning tree. Finding a minimum spanning tree is important especially in networking. Broadcasts to all computers should be using a minimum spanning tree. Designing our network so that we connect all our devices with the least amount of cable requires a minimum spanning tree.  We solve these problems with two algorithms. Prim’s algorithm considers two sets and moves all the vertices from one set to the other. Kruskal’s algorithm uses the disjoint set data structure and iterates through edges in increasing weight. Both algorithms are demonstrated in this YouTube video: https://www.youtube.com/watch?v=3fU0w9XZjAA and take O(|V|^2) time using an adjacency matrix and O(|E|log|V|) time using an adjacency list, and both algorithms have a space complexity of O(|V|). Prim’s algorithm works better in graphs with lots of edges. Such graphs are called dense graphs. Kruskal’s algorithm works better in graphs with few edges. Such graphs are called sparse graphs.

Suppose you want to download a file from a server. Many packets will be sent from the server to your computer. Each wire carries a certain capacity c, and you want to maximize the flow between the server and your computer. Capacity has the same domain and codomain as our weights; however, unlike weights, our Capacity comes with another function, called the flow which also shares the same domain and codomain as the weight function. We solve this problem with the Ford-Fulkerson Algorithm but in practice, we use the Edmond-Karp Algorithm as it relies only on the number of vertices and the number of edges. The corresponding YouTube video can be found here: https://www.youtube.com/watch?v=SqGeM3FYkfo .The running time of this algorithm is O(|V||E|^2) and the space complexity of the algorithm is O(|V|^2) time.

All the algorithm’s we have looked at so far are polynomial in size with respect to its input. There are several algorithms which may or may not be polynomial with respect to the size of its inputs. This is the famous P = NP problem in computer science. Examples of graph algorithms which seem to become exponentially harder with the size of the graph include finding a Hamilton path, where a Hamilton path visits each vertex exactly once, starting and ending at specific vertices. Follow this link: https://www.youtube.com/watch?v=oTN53mXXayw for more information. Finding the minimum vertex cover, or a subset of our nodes such that every node is incident to a node in our subset is NP-hard. Find out more here: https://www.youtube.com/watch?v=m_dOtat56vY Finding cliques or a subset of vertices where there exists an edge between all the nodes in the subset in a graph is NP-complete. Find more information here: https://www.youtube.com/watch?v=_ZKPP5PrHOk

I hope I have convinced you that graph theory is an important topic in networking and computer security. There are so many other important topics in graph theory, such as Euler paths, Graph coloring, Approximation algorithms, etc. that would take way too long to cover in such a small amount of time. To learn in detail everything, I’ve said here and a lot more, check out this book used in the third year math course on graph theory: “Graph Theory”, third edition, by A. Bondy and U. S. R. Murty: Springer. There are several ways to study graph theory. The reason that I put the focus on running time is because you are not likely going to be implementing these algorithms, and even if you do you can read the algorithm at that time and adapt it to your needs. However, when you call these functions in your computer, I believe it is extremely important to have an approximation of how long it will take and how much you have to store.

If you like my content, check out this similar article where I introduced mathematical logic: https://www.cybrary.it/0p3n/mathematical-introduction-logic-logic-provides-foundation-hacking/

Regards,

ProgrammerE

Share with Friends
Use Cybytes and
Tip the Author!
Share with Friends
1. Good one

2. 2500+ views. I’m honoured (It’s how we spell in Canada). 10 tips too. I’m so happy you like this type of content. Thank you all for your generous support and I will continue to write articles like this on Cybrary. Additionally, I am in the planning stage of starting a podcast/blog where I go in to depth on these sorts of topics (I have a couple of cryptography books checked out from my university library that go into detail about the cryptographic algorithms, both classic (Substitution, Linear Feedback Shift Register Sequences), and modern (AES, RSA, ElGamal, SHA-1, Shor). Any advice is appreciated.

Regards,

ProgrammerE

PS: Ferlom’s links seem good. I do not speak spanish but they look like a very good resource. Here’s a good site for dipping your toe into graph theory: https://www.hackerearth.com/practice/notes/graph-theory-part-i/ and it’s continuation https://www.hackerearth.com/practice/notes/graph-theory-part-ii/

3. Really cool and interesting article.
I think so…Graphs Theoty is a good way to work about security networks and commonly misundertanding or forget it. Spanish and LATAM people could find great content about in EdX portal by Universidad Politecnica de Valencia UPV from Spain.

Artículo realmente util e interesante.
Creo que sí … Teoria de Gráfos ( parte de la Matematica Aplicada ) es una buena manera de trabajar sobre las redes de seguridad y que comúnmente malentendido o olvidado. El español y el LATAM pudieron encontrar gran contenido en el portal EdX de la Universidad Politécnica de Valencia UPV de España.

### Our Revolution

We believe Cyber Security training should be free, for everyone, FOREVER. Everyone, everywhere, deserves the OPPORTUNITY to learn, begin and grow a career in this fascinating field. Therefore, Cybrary is a free community where people, companies and training come together to give everyone the ability to collaborate in an open source way that is revolutionizing the cyber security educational experience.

### Support Cybrary

Donate Here to Get This Month's Donor Badge

### Cybrary|0P3N

Cloud G Suite to Office 365 Migration – Complete Tutorial
Views: 2086 / January 17, 2020
Watch LIVE Privileged Account Hack
Views: 5786 / January 7, 2020
How to Backup G Suite Cloud Data?
Views: 5440 / January 7, 2020
6 Information Security Predictions for the Year 2020
Views: 6298 / January 5, 2020

We recommend always using caution when following any link

Are you sure you want to continue?

Continue
Cancel