Wednesday, March 19, 2014

Inductive Reasoning Visualized

Ever wondered what inductive reasoning (technically, using inductive logic programming) would look like if you could draw a picture?
Here's how:

I'll explain what you're looking at. Let us say you know the following facts:
Gaia is a parent of Cronus.
Cronus is a parent of Zeus.
Zeus is a parent of Athena.

You also know that:
Gaia is a grandparent of Zeus.
Cronus is a grandparent of Athena.
Gaia is not a grandparent of herself.
Gaia is not a grandparent of Cronus.
Cronus is not a grandparent of Gaia.
Athena is not a grandparent of Cronus.

Now you ask the computer to induce a definition of grandparenthood based on these facts.

To do this, the machine needs to try different possibilities, and these are what you see in the graph.

On top, you see:
grandparent(_,_)
The sentence "X is a grandparent of Y" is what the machine writes as "grandparent(X,Y)", and an underscore simply means "anybody". So this is the hypothesis that "anybody is a grandparent of anybody".
The machine knows this to be false because of what you see in the red square: 4. Four is the number of "problems" found with this hypothesis: it predicts every instance of "is not a grandparent of" above, all 4 of them, for example that "Gaia is a grandparent of herself", which we know to be false. Thus this hypothesis is not consistent with the observations (its predictions are falsified).
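To make the counting concrete, here is a small Python sketch of this check (my own illustration, not the Atom system's code): the hypothesis `grandparent(_,_)` predicts every pair, so every negative example counts as a problem.

```python
# Facts from the post: parent(X, Y) pairs.
parents = {("Gaia", "Cronus"), ("Cronus", "Zeus"), ("Zeus", "Athena")}

# Negative examples: pairs known NOT to be grandparent pairs.
negatives = [("Gaia", "Gaia"), ("Gaia", "Cronus"),
             ("Cronus", "Gaia"), ("Athena", "Cronus")]

# Hypothesis grandparent(_, _): anybody is a grandparent of anybody.
def hypothesis(a, b):
    return True

# A "problem" is a negative example the hypothesis wrongly predicts.
problems = [n for n in negatives if hypothesis(*n)]
print(len(problems))  # 4, the number in the red box
```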

Next we have
grandparent(A,_) :- parent(A,_)
which in English reads "A is a grandparent of anyone if A is a parent of anyone". As you can see, the red box says 3, because it has 3 problems: it predicts that Gaia is a grandparent of herself, since Gaia is a parent (of whom does not matter, but it happens to be Cronus), which we know to be false. For the same reason, it predicts that Gaia is a grandparent of Cronus, which is also false. Finally, it predicts that Cronus is a grandparent of Gaia, since Cronus is a parent (again, of whom does not matter). The negative example "Athena is not a grandparent of Cronus" is not (incorrectly) predicted to be true, since Athena is not a parent.
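The same kind of check can be sketched in Python (again my own illustration, not Atom's code): the body `parent(A,_)` only asks whether A is a parent of someone, so exactly the negative examples whose first argument is a parent get wrongly predicted.

```python
# Facts from the post: parent(X, Y) pairs.
parents = {("Gaia", "Cronus"), ("Cronus", "Zeus"), ("Zeus", "Athena")}

# Negative examples: pairs known NOT to be grandparent pairs.
negatives = [("Gaia", "Gaia"), ("Gaia", "Cronus"),
             ("Cronus", "Gaia"), ("Athena", "Cronus")]

# Hypothesis grandparent(A, _) :- parent(A, _):
# A is a grandparent of anyone as long as A is a parent of someone.
def hypothesis(a, b):
    return any(x == a for (x, _) in parents)

# Count the negative examples the hypothesis wrongly predicts.
problems = [n for n in negatives if hypothesis(*n)]
print(len(problems))  # 3: Gaia and Cronus are parents, Athena is not
```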

This is the basic idea in Inductive Logic Programming: we construct lots of hypotheses and test them against the examples we have. There are two solutions that look promising (green boxes):
grandparent(A,B) :- parent(A,C), parent(C,B)
which states that A is a grandparent of B if A is a parent of some C and that C is a parent of B. This is indeed the correct definition: it covers both positive examples ("Gaia is a grandparent of Zeus", "Cronus is a grandparent of Athena"), and does not cover any of the 4 negative examples.
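This coverage claim can be verified with a short Python sketch (my own illustration; the positive grandparent pairs here are the ones implied by the parent facts):

```python
# Facts from the post: parent(X, Y) pairs.
parents = {("Gaia", "Cronus"), ("Cronus", "Zeus"), ("Zeus", "Athena")}
people = {p for pair in parents for p in pair}

# Positive examples (grandparent pairs implied by the parent facts)
# and the four negative examples.
positives = [("Gaia", "Zeus"), ("Cronus", "Athena")]
negatives = [("Gaia", "Gaia"), ("Gaia", "Cronus"),
             ("Cronus", "Gaia"), ("Athena", "Cronus")]

# grandparent(A, B) :- parent(A, C), parent(C, B)
def grandparent(a, b):
    return any((a, c) in parents and (c, b) in parents for c in people)

covered_pos = [p for p in positives if grandparent(*p)]
covered_neg = [n for n in negatives if grandparent(*n)]
print(len(covered_pos), len(covered_neg))  # 2 0: all positives, no negatives
```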

The other promising solution is
grandparent(A,B) :- parent(A,C), parent(C,B), parent(B,_)
which gets the grandparent relation slightly wrong: on top of requiring the correct conditions (A is a parent of some C, which is a parent of B), it also requires the grandchild B to be a parent. So according to this definition, Athena is not the grandchild of Cronus because she does not (yet) have any children, but when she does, she'll satisfy the conditions. The machine knows this definition is worse than the right one because it cannot explain the fact that Cronus is a grandparent of Athena. Hence it only explains one of the two facts (that's the 1 in the green box).
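A quick Python sketch (my own, not Atom's code) shows that this stricter clause explains only one of the two grandparent pairs implied by the parent facts:

```python
# Facts from the post: parent(X, Y) pairs.
parents = {("Gaia", "Cronus"), ("Cronus", "Zeus"), ("Zeus", "Athena")}
people = {p for pair in parents for p in pair}

# The two grandparent pairs implied by the parent facts.
positives = [("Gaia", "Zeus"), ("Cronus", "Athena")]

def is_parent(x):
    return any(a == x for (a, _) in parents)

# grandparent(A, B) :- parent(A, C), parent(C, B), parent(B, _)
# The extra condition requires the grandchild B to be a parent too.
def hypothesis(a, b):
    return is_parent(b) and any((a, c) in parents and (c, b) in parents
                                for c in people)

covered = [p for p in positives if hypothesis(*p)]
print(covered)  # [('Gaia', 'Zeus')]: Athena has no children, so her pair fails
```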

I'll leave it as an exercise to interpret all the other hypotheses in the graph.

The picture was produced using my ILP system Atom.


  1. "She lay with Heaven and bore deep-swirling Oceanus, Coeus and Crius and Hyperion and Iapetus, Theia and Rhea, Themis and Mnemosyne and gold-crowned Phoebe and lovely Tethys. After them was born Cronos the wily, youngest and most terrible of her children, and he hated his lusty sire." Hesiod, Theogony, excerpt on Gaia.

  2. Good article John, interesting. So basically you are feeding the machine facts, and leaving it to the machine to induce theories. If you compare this with how humans construct theories to explain the world, what would you say the similarities and differences are?

    1. In a naive view of the scientific method, we induce a hypothesis by cleverly guessing, just like above. Once we've guessed a hypothesis, predictions are derived, which are then tested against the examples. That's exactly what we're doing in my example above.

      The difference between real use of the scientific method and my example here is that in real life we don't start with an empty theory; we build upon the best theory found so far. When Einstein developed special relativity he did so by modifying Newton's laws, rather than starting from scratch. So we are talking about theory revision: start with a known (approximate) theory, make modifications to accommodate the new examples (observations) that don't fit the old theory, and check by testing predictions. Theory revision can also be done in inductive logic programming, but don't expect relativity theory to come out just yet; today's machines are no match for Einstein :)

    2. Great answer, another interesting analogy that is worth comparing is how babies coming into the world continuously form theories in their mind to explain the world to themselves, and keep learning until adulthood and beyond in different areas of life.

      We of course come pre-programmed with certain biases and instincts, but beyond that, how would you compare your above machine learning with how a single human learns about different areas of his/her own life and world experience? I would say that the human brain learns in a very similar fashion to your machine learning model above.

      When I, for example, learn about sales (the area I'm working in currently), I have a theory about "how we should probably be able to sell better" in my organization. From that theory (which might come from my own experience, intuition, analogies from other experiences, and books/other people's learnings) I formulate a hypothesis that is designed to test this theory. I run this experiment (not in a formal way, but informally in my mind by acting in a different way than I used to), and gather feedback. Based on the feedback I reformulate the theory in my mind and find another hypothesis to test in the real world. That way, I iteratively understand "selling" and "forming the best sales organization" better and better.

      This can of course also be formalized, but I think most of "human brain learning" is non-formalized, intuitive but in this way very similar (albeit unconscious) to your machine learning model above.

      What are your thoughts on how human brains learn, similarities and differences to your machine learning model above?

      Furthermore, based on these similarities, do you think you could essentially build a "machine brain" that does this automatically by being given a "goal", deriving "pleasure" and "pain" from that goal, and based on this feedback strengthening/weakening neurological connections? Do you think something similar can be simulated easily in a computer or machine?

    3. Interesting, thanks for sharing your own experiences.

      The machine learning model I've presented above is very basic. For instance, we can add probabilities to all hypotheses, which would be more suitable in some sciences (social sciences, everyday learning) but not in others (physics, mathematical conjecturing). We humans are both able to use true/false theories ("E = mc^2") and probabilistic versions ("dogs are usually friendly"). We also invent new concepts, such as "energy", "atoms", "elliptical", which we then use in our learning (in machine learning and inductive logic programming, it's called Predicate Invention).

      Furthermore, the examples we're talking about are purely abductive/inductive. Much of learning, such as in school, is simply about being told what's true by the teacher. So instead of being given examples of parents and grandparents and having to guess a definition of grandparenthood, we are simply given the correct definition immediately. Learning by imitation is a similar form of copying information rather than inducing it. You only want to induce when there is no well-established (and effective) theory for something; in that case you either start from scratch or take an existing theory and modify it according to your context/environment.

      There are also issues of how information is obtained. We obtain information through our senses, and so evolution has already chosen some representations over others for us (we can't help but see animals and humans as separate objects, rather than just a screen of colorful pixels everywhere). This, and the "built-in" learning algorithms we have, bias our perception of reality towards certain features and away from others. This is actually a very good thing: searching through every hypothesis is not feasible, even in principle (more hypotheses than the number of atoms in the universe, and all that). It is highly non-trivial to bias machine learning towards "good" perspectives of reality (i.e. towards promising hypotheses). In short, we have "intuition", although we don't understand this algorithm.

      Your last question, about conditioning, is an interesting one. Neural networks represent information in a way that is more similar to brains. Learning can be done with feedback to neurological connections: the neurons' firing thresholds are updated, and connections can be added/removed. So parts of it can be simulated, and most of the difficulty in simulating the human brain lies in our poor understanding of the brain's knowledge representation (memory) and its deductive/inductive processes. I'd like to emphasize that although it is certainly interesting to try to simulate the human brain, it may be neither the most efficient nor the most useful way to do machine learning. Not the most efficient, because modern computer architectures are very different from the brain's "hardware" architecture (think of a brain as billions of mini-computers linked together and running in parallel, whereas computers are much more serial). Not the most useful, because you can't read off the neural network, so even after it correctly induces something, you can't get a deeper understanding of it (compare that to the graph above, where all formulas are directly readable in plain English). Every established machine learning technique deserves its place in science; what I'm pointing out here is merely that one should not assume that simulating the human brain is necessarily the best way to solve any problem.

    4. Perhaps the most important thing to note: the scientific method is "interactive". We are not provided all examples of what's true and false prior to our search for a theory. Instead, we ask questions (make predictions), that is, we probe reality for the examples we want ("is it true that all things accelerate toward the ground equally fast?").

      The induction algorithm I've depicted above takes all examples in one sweep and tries to generalize them (empirical ILP), as opposed to asking for examples, which must then be provided to it (if it were a robot, it could in principle set up an experiment to determine whether the example is true or false). This latter setting is called interactive ILP, and is much closer to how humans conduct the scientific method.