21. Knowledge Organization

The “nice” mathematical way: a tree, a forest. Universal and more realistic (but still mathematical): a graph. Also mathematical, but useless: Venn diagrams.

21.1. Trees

21.1.1. Definition of a Tree

The most useful definition of a tree, comes from graph theory, where

a tree is an undirected graph in which any two vertices are connected by exactly one path. Every acyclic connected graph is a tree, and vice versa. A forest is a disjoint union of trees, or equivalently an acyclic graph that is not necessarily connected.

Strangely enough, many examples on Wikipedia use numbered nodes, which implies some kind of order. That is an invalid generalization from the set of “unfounded brain-dead assumptions

graph "tree" { /* rankdir=LR; */ subgraph "cluster T1" { { rank = min; T1N1[label="node"]; T1N2[label="green"]; T1N3[label="car"]; } { rank = max; T1N5[label="autumn"]; T1N6[label="news"]; } T1N4[label="sky"]; T1N1 -- T1N4 -- T1N2; T1N4 -- T1N3; T1N4 -- T1N5 -- T1N6; } }

Mathematically, graph theory identifies all kinds of different trees with fancy names and classifications. Then there is a whole lot of more fancy properties for trees in computer science. Most of these differences are members of the set of “distinctions that the world don’t need”.

E.g., a rooted tree is defined as “a tree in which one vertex has been designated the root”. It is easy to see, that in a tree graph, any node can be designated as root. So, there is really nothing special about a rooted tree or a root node.

graph "tree" { // |:here:| subgraph "cluster T1" { { rank = min; T1N1[label="node"]; } T1N2[label="green"]; T1N3[label="car"]; T1N4[label="sky"]; T1N5[label="autumn"]; T1N6[label="news"]; T1N1 -- T1N4 -- T1N2; T1N4 -- T1N3; T1N4 -- T1N5 -- T1N6; } subgraph "cluster T2" { { rank = min; T2N2[label="green"]; } { rank = same; T2N1[label="node"]; T2N3[label="car"]; } T2N4[label="sky"]; T2N5[label="autumn"]; T2N6[label="news"]; T2N1 -- T2N4 -- T2N2; T2N4 -- T2N3; T2N4 -- T2N5 -- T2N6; } subgraph "cluster T3" { { rank = min; T3N3[label="car"]; } { rank = same; T3N1[label="node"]; T3N2[label="green"]; } T3N4[label="sky"]; T3N5[label="autumn"]; T3N6[label="news"]; T3N1 -- T3N4 -- T3N2; T3N4 -- T3N3; T3N4 -- T3N5 -- T3N6; } // |:here:| subgraph "cluster T4" { { rank = min; T4N4[label="sky"]; } { rank = same; } T4N1[label="node"]; T4N2[label="green"]; T4N3[label="car"]; T4N5[label="autumn"]; T4N6[label="news"]; T4N4 -- T4N1; T4N4 -- T4N2; T4N4 -- T4N3; T4N4 -- T4N5 -- T4N6; } subgraph "cluster T5" { { rank = min; T5N5[label="autumn"]; } T5N1[label="node"]; T5N2[label="green"]; T5N3[label="car"]; T5N4[label="sky"]; T5N6[label="news"]; T5N5 -- T5N4; T5N5 -- T5N6; T5N4 -- T5N1; T5N4 -- T5N2; T5N4 -- T5N3; } subgraph "cluster T6" { { rank = min; T6N6[label="news"]; } T6N1[label="node"]; T6N2[label="green"]; T6N3[label="car"]; T6N4[label="sky"]; T6N5[label="autumn"]; T6N5 -- T6N4; T6N5 -- T6N6; T6N4 -- T6N1; T6N4 -- T6N2; T6N4 -- T6N3; } }

The distcinction between undirected and directed graphs is pretty much useless and therefore in the set of “distinctions that the world don’t need”.

21.1.2. Walking Around a Tree

Using a rooted tree, there are two basic ways for tree traversal:

There are many variations and combinations of these two basic tree walks, which are also in the set of “distinctions that the world don’t need”.

21.1.2.3. Illustrative Video

There is a YouTube video, which illustrates Depth First Search vs Breadth First Search using a stack and a queue for backtracking. The local copy (subtitles) is copyright 2017 by ONeillCode on YouTube:

Connect with me on LinkedIn! https://www.linkedin.com/in/stephenaoneill/

This video will go over Depth First Search vs Breadth First Search of a graph. It will explore how stacks and queues are used in the two algorithms and walk through it.

If you like the video, give it a thumbs up! If you want to see more videos from me hit the subscribe button! All code can be found here: https://github.com/oneillcode/Youtube-Tutorials

ONeill Code’s purpose is to provide in-depth tutorials on a wide range of programming material. It can focus on interview questions or simple algorithms depending on the user requests. If you have any video requests please let me!

21.1.3. Books

The organization of knowledge in books is equivalent to an ordered depth-first search.

digraph "knowledge" { /* rankdir=LR; */ "topic" [shape=box]; "topic" -> "1."; "1." -> "1.1."; "1." -> "1.2."; "1." -> "1.3."; "1.3." -> "1.3.1."; "1.3." -> "1.3.2."; "1." -> "1.4."; "topic" -> "2."; "2." -> "2.1."; "2." -> "2.2."; "topic" -> "3."; "3." -> "3.1."; "3." -> "3.2."; }

which results in a document outline of:

1.
1.1.
1.2.
1.3.
1.3.1.
1.3.2.
1.4.
2.
2.1.
2.2.
3.
3.1.
3.2.

Additional pathes, which violate the tree conditions can be introduced with cross-references:

digraph "knowledge" { /* rankdir=LR; */ "topic" [shape=box]; "topic" -> "1."; "1." -> "1.1."; "1." -> "1.2."; "1." -> "1.3."; "1.3." -> "1.3.1."; "1.3." -> "1.3.2."; "1." -> "1.4."; "topic" -> "2."; "2." -> "2.1."; "2." -> "2.2."; "topic" -> "3."; "3." -> "3.1."; "3." -> "3.2."; "1.2." -> "3.1." [weight=0,style=dashed]; "1.3.1." -> "2.2." [weight=0,style=dashed]; "1.3.2." -> "3." [weight=0,style=dashed]; }

21.1.4. Mindmaps

The popular mindmap also follows a tree structure. For a quick introduction see the online services MindMup (can store on Google Drive), Mindmap erstellen kostenlos - ohne Anmeldung. The application freemind(1) (/srv/install/freemind, java application) is quite popular with mindmap fans. Mindmaps also allow to connect nodes directly, just like books with their cross-references.

The fact, that a mindmap is nothing but a simple tree, presented in a different manner can be visualized by using a circular layout engine for the book example:

digraph "knowledge" { /* rankdir=LR; */ scale = 0.5; layout = circo; "topic" [shape=box]; "topic" -> "1."; "1." -> "1.1."; "1." -> "1.2."; "1." -> "1.3."; "1.3." -> "1.3.1."; "1.3." -> "1.3.2."; "1." -> "1.4."; "topic" -> "2."; "2." -> "2.1."; "2." -> "2.2."; "topic" -> "3."; "3." -> "3.1."; "3." -> "3.2."; // "1.2." -> "3.1." [weight=0,style=dashed]; // "1.3.1." -> "2.2." [weight=0,style=dashed]; // "1.3.2." -> "3." [weight=0,style=dashed]; }

21.2. Generalization

Given an example with 6 sections, where each section shares a subset of information with at least 2 other sections and at most 3 other sections:

A -> B;
A -> F;
A -> D;
B -> C;
B -> F;
C -> D;
C -> E;
D -> F;
E -> F;

graphviz dot(1) draws an acceptable graph, which is quite tree like:

digraph "knowledge" { /* rankdir=LR; */ A -> B; A -> F; A -> D; B -> C; B -> F; C -> D; C -> E; D -> F; E -> F; }

When some restrictions are introduced, which request certain sections to be of equal importance:

subgraph { rank = same; A; F; }
subgraph { rank = same; B; D; }
subgraph { rank = same; C; E; }

the graph looks less like a tree and more like a graph:

digraph "knowledge" { /* rankdir=LR; */ subgraph { rank = same; A; F; } subgraph { rank = same; B; D; } subgraph { rank = same; C; E; } A -> B; A -> F; A -> D; B -> C; B -> F; C -> D; C -> E; D -> F; E -> F; }

With one additonal restriction:

subgraph { rank = same; A; F; }
subgraph { rank = same; B; D; }
subgraph { rank = same; C; E; }
subgraph { rank = same; B; F; }

the faint resemblance of a tree is gone:

digraph "knowledge" { /* rankdir=LR; */ subgraph { rank = same; A; F; } subgraph { rank = same; B; D; } subgraph { rank = same; C; E; } subgraph { rank = same; B; F; } A -> B; A -> F; A -> D; B -> C; B -> F; C -> D; C -> E; D -> F; E -> F; }

21.2.1. Circular Graphs

Giving up on trees, the most obvious layout to accomodate arbitrary graphs with any structure is a circular one:

digraph "knowledge" { /* rankdir=LR; */ layout = circo; A -> B; A -> F; A -> D; B -> C; B -> F; C -> D; C -> E; D -> F; E -> F; }

The graph for the very simple example might still convey some interesting structural information. The general case, however, really does not help much in visualizing structural relations, although it does visualize the exponentiality of all possible pathes quite nicely (see Total number of Spanning Trees in a Graph):

graph "knowledge" { layout = circo; N0 -- N1; N0 -- N2; N0 -- N3; N0 -- N4; N0 -- N5; N0 -- N6; N0 -- N7; N0 -- N8; N0 -- N9; N0 -- N10; N0 -- N11; N0 -- N12; N0 -- N13; N0 -- N14; N0 -- N15; N0 -- N16; N0 -- N17; N0 -- N18; N0 -- N19; N1 -- N2; N1 -- N3; N1 -- N4; N1 -- N5; N1 -- N6; N1 -- N7; N1 -- N8; N1 -- N9; N1 -- N10; N1 -- N11; N1 -- N12; N1 -- N13; N1 -- N14; N1 -- N15; N1 -- N16; N1 -- N17; N1 -- N18; N1 -- N19; N2 -- N3; N2 -- N4; N2 -- N5; N2 -- N6; N2 -- N7; N2 -- N8; N2 -- N9; N2 -- N10; N2 -- N11; N2 -- N12; N2 -- N13; N2 -- N14; N2 -- N15; N2 -- N16; N2 -- N17; N2 -- N18; N2 -- N19; N3 -- N4; N3 -- N5; N3 -- N6; N3 -- N7; N3 -- N8; N3 -- N9; N3 -- N10; N3 -- N11; N3 -- N12; N3 -- N13; N3 -- N14; N3 -- N15; N3 -- N16; N3 -- N17; N3 -- N18; N3 -- N19; N4 -- N5; N4 -- N6; N4 -- N7; N4 -- N8; N4 -- N9; N4 -- N10; N4 -- N11; N4 -- N12; N4 -- N13; N4 -- N14; N4 -- N15; N4 -- N16; N4 -- N17; N4 -- N18; N4 -- N19; N5 -- N6; N5 -- N7; N5 -- N8; N5 -- N9; N5 -- N10; N5 -- N11; N5 -- N12; N5 -- N13; N5 -- N14; N5 -- N15; N5 -- N16; N5 -- N17; N5 -- N18; N5 -- N19; N6 -- N7; N6 -- N8; N6 -- N9; N6 -- N10; N6 -- N11; N6 -- N12; N6 -- N13; N6 -- N14; N6 -- N15; N6 -- N16; N6 -- N17; N6 -- N18; N6 -- N19; N7 -- N8; N7 -- N9; N7 -- N10; N7 -- N11; N7 -- N12; N7 -- N13; N7 -- N14; N7 -- N15; N7 -- N16; N7 -- N17; N7 -- N18; N7 -- N19; N8 -- N9; N8 -- N10; N8 -- N11; N8 -- N12; N8 -- N13; N8 -- N14; N8 -- N15; N8 -- N16; N8 -- N17; N8 -- N18; N8 -- N19; N9 -- N10; N9 -- N11; N9 -- N12; N9 -- N13; N9 -- N14; N9 -- N15; N9 -- N16; N9 -- N17; N9 -- N18; N9 -- N19; N10 -- N11; N10 -- N12; N10 -- N13; N10 -- N14; N10 -- N15; N10 -- N16; N10 -- N17; N10 -- N18; N10 -- N19; N11 -- N12; N11 -- N13; N11 -- N14; N11 -- N15; N11 -- N16; N11 -- N17; N11 -- N18; N11 -- N19; N12 -- N13; N12 -- N14; N12 -- N15; N12 -- N16; N12 -- N17; N12 -- N18; N12 -- N19; N13 -- N14; N13 -- N15; N13 -- N16; N13 -- N17; N13 -- N18; N13 -- N19; N14 -- N15; N14 -- N16; N14 -- N17; N14 -- N18; N14 -- N19; N15 -- N16; N15 -- N17; N15 -- N18; N15 -- N19; N16 -- N17; N16 -- N18; N16 -- N19; N17 -- N18; N17 -- N19; N18 -- N19; }

21.2.2. Venn Diagrams

Represent the simple example as a Venn diagram:

<script src="https://d3js.org/d3.v4.min.js"></script>
<script src="http://benfred.github.io/venn.js/examples/../venn.js"></script>
<script>
// define sets and set set intersections
var sets = [
      {sets: ['A'], size: 12},
      {sets: ['B'], size: 12},
      {sets: ['C'], size: 12},
      {sets: ['D'], size: 12},
      {sets: ['E'], size: 12},
      {sets: ['F'], size: 12},
      {sets: ['A','B'], size: 2},
      {sets: ['A','F'], size: 2},
      {sets: ['A','D'], size: 2},
      {sets: ['B','C'], size: 2},
      {sets: ['B','F'], size: 2},
      {sets: ['C','D'], size: 2},
      {sets: ['C','E'], size: 2},
      {sets: ['D','F'], size: 2},
      {sets: ['E','F'], size: 2}
      ];
var chart = venn.VennDiagram();
d3.select("#venn").datum(sets).call(chart);
</script>

the algorithm for Venn diagrams implemented in benfred/venn.js: Area proportional Venn and Euler diagrams in JavaScript renders nice graphics:

_images/knowlegde-org-venn.png

However, it cannot draw all of the common subsections:

WARNING: area C,D not represented on screen venn.js:1737:17
WARNING: area E,F not represented on screen venn.js:1737:17

This makes Venn diagrams practically useless for representing complex matters.

21.3. Conclusion

The fallacy of trees is the assumption that knowledge does indeed have some magical structure. This assumptions does of course belong to the set of unfounded brain-dead assumptions.

The example of a language reference, which is essentially a dictionary, shows that there is only a superficially useful categorization possible. Beyond that, the possiblities of constructing programs (sentences) are far too numerous to be captured in a book.

The same is true for any attempt of describing good practices or tips and tricks. This includes the current chapter as well as the entire documentation treatise.

Therefore, most mapping efforts trying to impose a rigorous tree structure are usually futile.

As for programming, this means that “object oriented” is not the ultimate answer. Especially things like Java, C#, strong typing in general are far too inflexible for interconnected designs. This is where Lisp, Python, Javascript are exponentially better suited for the task.

Summarily, the collection of knowledge in books is better than nothing, but with the possibilities of digital information storage it appears very antiquated. It is indeed really hard to grep(1) through a paper book.