Tuesday, October 16, 2012

The Big Questions, Answered


For as long as we've existed, mankind has been asking the difficult questions about our place in this world: How did it all begin? When? How big is the universe? How will it end? When?

Still today, people ask themselves the same questions. What many may not know is that most of these questions now have meaningful scientific answers.

How did it all begin?

Everything in the universe (including spacetime itself) was compressed into an extremely tiny pea, which "exploded" out into our existence. I write "exploded" in quotes because space itself expanded; there wasn't (necessarily) anything outside of the expansion for it to explode into. This "explosion" is known as the Big Bang.

When?

13.7 billion years ago.

How big is the Universe?

The exact size is not known, but it is at the very least 90 billion light years in diameter (from one end to the other). Put differently, it extends at least 45 billion light years in every direction from us (the radius). A light year is the distance light travels in one year, which is exactly 9,460,730,472,580,800 meters. Multiply that by 45 billion to get the radius: roughly 4.3 × 10^26 meters.

(If you expected the answer to be 13.7 billion light years in radius, the reason it's not is due to the expansion of the universe. See the answer to the next question.)

Perhaps the most promising current theory, eternal inflation, suggests that the universe is infinite when viewed from the inside (as we see it), but is in fact only a "bubble universe" that looks finite to an outside observer. This discrepancy is due to Einstein's general relativity. (The infinity of things we can see from the inside, which an outside observer does not see, is explained by the fact that those things are spread across different times for the outside observer. Roughly, given infinite time, the outsider sees the infinity of things we insiders see.)

How will it end?

The universe has been expanding ever since the Big Bang. Not only that: the expansion is getting faster and faster (accelerating). The expansion doesn't "move out the edge of the universe". Rather, it stretches out space itself (though not within galaxies, as their gravitational pull overcomes the stretch).

Expansion of space means that the universe is cooling down (temperature-wise), implying that the universe will freeze to death. More specifically, disorder (entropy) increases due to the second law of thermodynamics, and will ultimately reach its stable state of complete disorder. Except for some minor quantum fluctuations, the universe will then quite literally have come to a halt. The universal clock stops.

When?

Unknown, but the existence of intelligence is most likely impossible after 10^100 years (a 1 followed by 100 zeros; for comparison, a million is a 1 followed by 6 zeros, and a billion a 1 followed by 9 zeros). After that time, the last black holes will have evaporated (no suns or other shining objects will be left either), and the universe enters the dark era.

Monday, October 1, 2012

The Logic of Showing up Late

I've been pondering the logic of showing up on time or late to casual meetings. In my own experience, people tag others as "shows up late" vs "shows up on time" and try to mirror this behavior. This might seem a little odd, since it would perhaps be both simpler and better for everybody to just show up on time. But is it odd?

Consider the following situation (in the sense of game theory, it is a "game"):

We have two people that have decided to meet at a certain time. Let us call them A and B. They can each choose to arrive on time, or come late.

If both come on time, nobody has to wait for the other, so there is no "suffering" involved. Hence they both receive 0 "happiness points" each. This is usually called "utility" in more formal contexts.

If A shows up on time but B is late, A will have to wait for B, which is less pleasant. A will also be in a worse mood, which makes B enjoy their time together less. So A, because he has to wait, suffers a penalty of -5, and B, because A is in a worse mood, receives a utility of -2.
The exact numbers don't really matter for now: what's relevant is their relative order. What we are effectively stating is that a person is better off not having to wait (0) than having to wait (-5). Being in worse company, having dragged down the other's mood by making them wait, falls in between (-2). Because the situation (game) is symmetric, the roles are reversed when A is late and B is on time.

If A and B are both late, nobody has to wait for the other, so nobody suffers. They both receive a utility of 0. (Note that this is simplified: it's more realistic to assume they will not be late by equal amounts, so one may suffer more than the other. What we're saying is that when both show up late, nobody has to wait a "significant" amount of time, making it equivalent, for all practical purposes, to both showing up on time.)

To summarize:
Strategy  | B on time     | B late
A on time | A: 0,  B: 0   | A: -5, B: -2
A late    | A: -2, B: -5  | A: 0,  B: 0

The assumption about how they choose strategies is as follows: both are aware of all the information contained in the table above (the utilities that follow from each combination of choices). Furthermore, each tries to maximize his own utility, each knows the other does, each knows that the other knows this, etc. In other words, the assumptions are the obvious ones: if I come late and you don't, you're not gonna be happy. I know this. You know this. I know you know it. You know I know it. And so on. (This is called "common knowledge".)

What we want to know is the following: which strategy do we select (on time or late) to maximize our own happiness? Looking at the table, it should be clear that if I come late, you will want to come late too. If you come on time, you'll have to wait for me, receiving -5. If you come late too, you'll receive 0, which is better. On the other hand, if I come on time, you will want to come on time too, so as not to upset me by making me wait. In other words, this is a coordination game: all we need to do is make sure we both do the same thing.

Indeed, if we are both used to coming on time when we meet, there is no benefit for either of us in deviating from this behavior (without informing the other). If you show up late, we'll both be worse off, since you'll ruin my mood and that's going to bite you back. The same goes for me. If we're both used to showing up late, nobody benefits from coming on time (again, without informing the other), since that just means waiting for the other one to show up. This means that both coming on time and both coming late are Nash equilibria: each is the best either of us can do given the circumstances. Note that the same cannot be said of the other two situations: if I tend to come late and you tend to come on time, we can both be better off by either you starting to come late too, or me starting to show up on time. Thus the asymmetric options are not Nash equilibria.
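To make this concrete, here is a minimal C++ sketch (hypothetical code, just for illustration) that brute-forces the four strategy profiles from the table above and reports which ones are Nash equilibria:


#include <iostream>

int main()
{
  // Payoffs from the table above: payoff[a][b] = { A's utility, B's utility }
  // when A plays a and B plays b (0 = on time, 1 = late)
  const int payoff[2][2][2] = {
    { {  0,  0 }, { -5, -2 } },  // A on time
    { { -2, -5 }, {  0,  0 } }   // A late
  };
  const char* name[2] = { "on time", "late" };

  for (int a = 0; a < 2; ++a)
    for (int b = 0; b < 2; ++b) {
      // (a,b) is a Nash equilibrium if neither player gains by
      // unilaterally switching to the other strategy
      bool a_stays = payoff[a][b][0] >= payoff[1-a][b][0];
      bool b_stays = payoff[a][b][1] >= payoff[a][1-b][1];
      if (a_stays && b_stays)
        std::cout << "Nash equilibrium: A " << name[a]
                  << ", B " << name[b] << "\n";
    }
}


Running it prints exactly the two coordinated profiles: both on time, and both late.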

In fact, both equilibria are strict, meaning that unilaterally switching always makes you strictly worse off. Hence they are not only equilibria in the sense of requiring rational foresight: any insect could "figure it out" by virtue of having the behavior (strategy) hardwired into its nervous system. (Game theorists call such a strategy an Evolutionarily Stable Strategy, or ESS.)

This answers the question I posed at the beginning of this post: is it odd that people "remember" which friends tend to come on time and which don't, only to mirror the behavior? The answer is no. The reason: it doesn't matter which it is - on time or late - as long as both do the same thing, both are better off.

You might think that this is an artifact of my having chosen the same utility for both coming on time and both coming late. It is not. We could change the utility of both showing up late to, say, -1 for both, reflecting that there is less time to enjoy the meeting compared to both arriving on time. We could even assign a utility of 1 to both arriving late, reflecting less stress in getting ready (or whatever other reason you might find). The point is that the exact utility of both coming late or on time does not matter to our conclusion: it is the same as long as both being on time or both being late is better than the asymmetric cases.

In fact, these coordination games (of two players) always have another solution: a randomized strategy ("mixed strategy") of sometimes being on time and sometimes being late. Not surprisingly, however, such a strategy is not a very good idea, because the parties then fail to coordinate their arrivals some of the time (if they coordinated perfectly, it would not be a randomized strategy, but one of the cases we just looked at). In game theory jargon, there is a mixed Nash equilibrium, but it is not an ESS. The latter is obvious, since those who randomize are worse off than those who always manage to coordinate their arrivals.
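To see just how much worse, we can work out the mixed equilibrium from the table above. Suppose I arrive on time with probability p. Your utility for arriving on time is then p*0 + (1-p)*(-5) = -5(1-p), and for arriving late it is p*(-2) + (1-p)*0 = -2p. Randomizing is only rational when you are indifferent between the two, i.e. when -5(1-p) = -2p, which gives p = 5/7. So in the mixed equilibrium we each arrive on time with probability 5/7, for an expected utility of -10/7 (about -1.4) each: strictly worse than the 0 we get in either coordinated equilibrium.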

Finally, there's the question of which solution is better: for both to come late, or both to come on time? This depends on the utilities of the two equilibria. In my table above, both receive 0 when on time and 0 when late, so the two solutions are equally good. However, if both would receive 1 when on time (and still 0 when both are late), it would be better for both to arrive on time. If both were arriving late, realizing they would both benefit from switching to arriving on time (which is the Pareto optimal solution), they could simply agree on showing up on time, knowing the other will follow through for the simple reason that it benefits him/her too. For example, when two people dislike stress, it may be preferable for both to arrive late. For people with scarce time, it may be preferable to arrive on time.

However, if the game is not symmetric, there need not be such a Pareto optimal solution. Say that when both are on time, A receives 1 and B 0, whereas when both are late it is the opposite: A receives 0 and B 1. Then A prefers the solution where both arrive on time, and B the one where both arrive late. Note that both are still worse off by not coordinating, so the Nash equilibria are the same as before: both on time, or both late. All that has changed is that it is no longer clear which of the two is the "better one", simply because what's better for A is worse for B, and vice versa. An asymmetric situation also means there is a fundamental difference between A and B; each player now knows whether he is "A" or "B" (they can't just be swapped arbitrarily), so it does not represent a situation where everything is the same for each person. For example, A may prefer to arrive late because she is sensitive to stress, while B may prefer to arrive on time because he has very little time for procrastination. What Nash equilibrium analysis can tell us is that they will (well, should) coordinate so that both arrive on time or both arrive late. Which of the two they pick in this asymmetric situation is, however, a complicated issue not answered by game theory (as far as I know). Nobody said agreeing would always be easy.

UPDATE: My friend Mads discussed some (psychological) possibilities for an equilibrium in the asymmetric cases (one is on time, the other late). We can analyze such situations (of psychological preferences) simply by changing the players' utilities. Let's say that B really hates being on time. The utilities are as before, except that B receives an additional punishment of -3 for being on time:

Strategy  | B on time     | B late
A on time | A: 0,  B: -3  | A: -5, B: -2
A late    | A: -2, B: -8  | A: 0,  B: 0

One can immediately note that B's strategy of being late is better than being on time no matter what A does: if A is on time, B receives -2, which is better than -3. If A is late, B receives 0 instead of -8 (here the extra punishment for being on time isn't even needed: having to wait, i.e. receiving -5, is already worse than 0).

This means that B's strategy of being late strictly dominates being on time (it is always strictly better), and hence we can reduce the problem to:

Strategy  | B late
A on time | A: -5, B: -2
A late    | A: 0,  B: 0

Now, by the same logic, A's strategy of being late strictly dominates being on time, so the only Nash equilibrium is that both show up late. Intuitively, this is because B hates being on time, and A doesn't mind accommodating B's need to show up late.

It is presumably harder to be on time than to show up late, so as soon as one person is a time optimist (meaning he shows up late), both will settle for "an implicit 10 more minutes on top of the agreed time".

Nevertheless, let's assume that, in addition to B's dislike of being on time, A actually dislikes being late. We then have a conflict of interest, which we represent by punishing A with an additional -X for being late:

Strategy  | B on time       | B late
A on time | A: 0,    B: -3  | A: -5, B: -2
A late    | A: -2-X, B: -8  | A: -X, B: 0

As before, B's strategy of being late dominates being on time:

Strategy  | B late
A on time | A: -5, B: -2
A late    | A: -X, B: 0

We now see that it all boils down to just how much A dislikes being late (quantified by X). If he dislikes being late more than being on time and waiting for B (that is, if X > 5), he will (obviously) actually go there and wait for B. We then arrive at the asymmetric solution where A arrives on time (knowing B will be late) and B shows up late.

Friday, June 22, 2012

Tree Implementation with Iterators

There is no STL tree class, so you'll have to make your own. There are many choices to be made, with different trade-offs. I'll share my own experience here (for clarity, I will not write it as a template).

Let's say you want to represent mathematical expressions using syntax trees. The most obvious way is probably to start with:


class Node {
  Node* _parent; // one prefix underscore (not followed by an uppercase letter) is not reserved here
  std::vector<Node> _children;
};


However, mathematical expressions consist of variables (x, y, z), constants (0, 1, 42), and functions/operators (+, -, *, /). Let's keep things simple and assume we are only dealing with basic arithmetic, using integers with addition, subtraction, and multiplication. (But we allow for custom functions to be added later, so we make no assumption of having a binary tree. Hence we keep the vector representation for the children.)

The problem then is that the nodes will have different types (variables, constants, operators). So we need polymorphic types:


class Node {
  Node* _parent;
  std::vector<Node*> _children; // use vector of pointers
};


We add some typedefs and easy members:


#include <algorithm> // std::for_each
#include <ostream>
#include <string>
#include <vector>

class Node {
  public:
  typedef std::vector<Node*>::iterator arg_iterator;
  typedef std::vector<Node*>::const_iterator const_arg_iterator;

  typedef std::vector<Node*>::size_type arity_type;


  // Default Constructor
  Node() : _parent(nullptr) { }
  // Construct node with one child
  Node(Node* n) : _parent(nullptr) { add_child(n); }

  // Construct node with two children
  Node(Node* n1, Node* n2) : _parent(nullptr) { add_child(n1); add_child(n2); }
  // Destructor (virtual, since children are deleted through Node*)
  virtual ~Node() { std::for_each(arg_begin(), arg_end(), [](Node* n){ delete n; }); }


  // Evaluate expression
  virtual int evaluate() const = 0; // must be overridden
  // Get arity
  arity_type arity() const { return _children.size(); }
  bool is_leaf() const { return _children.empty(); }
  // Set/Get parent
  Node*& parent() { return _parent; }
  const Node* parent() const { return _parent; }
  // Iterator access to arguments
  arg_iterator arg_begin() { return _children.begin(); }
  const_arg_iterator arg_begin() const { return _children.begin(); }
  arg_iterator arg_end() { return _children.end(); }

  const_arg_iterator arg_end() const { return _children.end(); }
  // Erase argument (use const_arg_iterator if your compiler properly supports C++11)
  arg_iterator  arg_erase(arg_iterator i) { return _children.erase(i); }
  // Get argument
  Node* arg_first() { return _children.front(); }
  Node* arg_last() { return _children.back(); }
  Node* arg(arity_type i) { return _children[i]; }


  // Print subtree
  virtual void print(std::ostream&) const = 0;

  // Add a child to the right (defined below)
  void add_child(Node* n);

private:
  // Disconnect this node from its current parent, if any (defined below)
  void remove_parent_ptr();

  Node* _parent;
  std::vector<Node*> _children;
};



We define a member function to add a child to the right:


void Node::add_child(Node* n)
{
  n->remove_parent_ptr(); // if n has a parent, disconnect it
  n->parent() = this;
  _children.push_back(n);
}



void Node::remove_parent_ptr()
{
  if (parent()) {
    auto i = parent()->arg_begin();
    while (*i != this) ++i; // Must be found, don't check arg_end()
    parent()->arg_erase(i);
  }
}


With this we can define our derived classes:


class Constant : public Node {
  public:
    Constant(int v) : val(v) {}
    int evaluate() const { return val; }
    void print(std::ostream& os) const { os << val; } // print is pure virtual, so it must be overridden here too
  protected:
    int val;
};

// Exception class thrown when evaluating non-ground expressions
class non_ground {};


class Variable : public Node {
  public:
    Variable(const std::string& s) : name(s) {}
    int evaluate() const { throw non_ground(); }
    void print(std::ostream& os) const { os << name; }
protected:
    std::string name;
};


The purpose of the Variable class is to represent expressions containing unassigned variables, like "3*x+1". We can then implement a member function evaluate(const std::map<std::string,int>&), identical to evaluate() except in the Variable class, where it looks up the variable-to-value mapping and returns the value (or throws non_ground() if no such mapping is found).
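As a sketch (assuming we add such a virtual overload to Node and forward it through the operator classes), the Variable case could look like this:


#include <map>

// Hypothetical overload, not part of the class definitions above:
// evaluate a variable under a variable-to-value mapping
int Variable::evaluate(const std::map<std::string,int>& env) const
{
  auto i = env.find(name);
  if (i == env.end()) throw non_ground(); // unmapped variable
  return i->second;
}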


class Add : public Node {
  public:
    Add(Node* n1, Node* n2) : Node(n1,n2) { }
    int evaluate() const { return arg_first()->evaluate() + arg_last()->evaluate(); }
    void print(std::ostream& os) const {
      os << "(";
      arg_first()->print(os); os << "+"; arg_last()->print(os);
      os << ")";
    }
};



class Sub : public Node {
  public:
    Sub(Node* n) : Node(n) { }                 // unary minus
    Sub(Node* n1, Node* n2) : Node(n1,n2) { }  // binary subtraction
    int evaluate() const {
      // note: is_leaf() would be wrong here, as a unary Sub has one child
      if (arity() == 1) {
        return -arg_first()->evaluate();
      } else {
        return arg_first()->evaluate() - arg_last()->evaluate();
      }
    }
    void print(std::ostream& os) const {
      if (arity() == 1) {
        os << "-"; arg_first()->print(os);
      } else {
        os << "(";
        arg_first()->print(os); os << "-"; arg_last()->print(os);
        os << ")";
      }
    }
};



class Mul : public Node {
  public:
    Mul(Node* n1, Node* n2) : Node(n1,n2) { }
    int evaluate() const { return arg_first()->evaluate() * arg_last()->evaluate(); }
    void print(std::ostream& os) const {
      arg_first()->print(os); os << "*"; arg_last()->print(os);
    }
};
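
One piece is still missing before we can test anything: the test program below prints trees with cout << root, so we need a stream insertion operator that forwards to the virtual print():


inline std::ostream& operator<<(std::ostream& os, const Node& n)
{
  n.print(os); // delegate to the virtual print
  return os;
}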


We can test it:



#include <iostream>
#include "tree.h"

int main()
{
  using namespace std;

  Add root(
    new Mul(new Constant(3), new Constant(12)),
    new Constant(6));
  cout << "Evaluating " << root << " => " << root.evaluate() << "\n";

  Mul root2(
    new Constant(3),
    new Add(new Constant(12), new Constant(6)));
  cout << "Evaluating " << root2 << " => " << root2.evaluate() << "\n";
}


Output is:

Evaluating (3*12+6) => 42
Evaluating 3*(12+6) => 54

Finally, we add iterators to traverse the tree depth-first (similar techniques can be used to create breadth-first iterators, or even best-first by supplying a lambda).

There are two ways we can implement the iterators: either we use a stack containing the expanded nodes, or we "manually" backtrack the tree. The first method leads to somewhat faster traversal in theory, since getting to the next node is O(1). On the other hand, we need to keep all expanded nodes on a stack, and whenever we copy an iterator, we need to copy the entire stack.

STL algorithms are designed around the assumption that iterators are cheap to copy, so we pick the lighter iterator approach (manual backtracking).
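For contrast, here is roughly what the stack-based alternative could look like (a hypothetical sketch; we will not use it). Note how advancing is cheap but copying the iterator drags the whole stack along:


#include <stack>

class node_stack_citerator {
        std::stack<const Node*> pending; // expanded but not yet visited nodes
public:
        explicit node_stack_citerator(const Node* root) { if (root) pending.push(root); }
        const Node& operator*() const { return *pending.top(); }
        node_stack_citerator& operator++() {
                const Node* n = pending.top();
                pending.pop();
                // push children right-to-left so the leftmost is visited next
                for (auto i = n->arg_end(); i != n->arg_begin(); )
                        pending.push(*--i);
                return *this;
        }
        // end() is simply an iterator with an empty stack
        bool operator==(const node_stack_citerator& o) const { return pending == o.pending; }
        bool operator!=(const node_stack_citerator& o) const { return !(*this == o); }
};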

We may start with the necessary typedefs:



#include <cstddef>  // std::ptrdiff_t
#include <iterator> // std::bidirectional_iterator_tag

class node_citerator {
        const Node* nptr; // pointer to current node
public:
        typedef std::bidirectional_iterator_tag iterator_category;
        typedef Node value_type;
        typedef const Node* pointer;   // const: this is a const iterator
        typedef const Node& reference;
        typedef std::ptrdiff_t difference_type;

        // Construct iterator from pointer to node
        node_citerator(const Node* fp) : nptr(fp) { }



        // Compare two iterators
        bool operator==(const node_citerator& di) const { return nptr == di.nptr; }
        bool operator!=(const node_citerator& di) const { return !(*this == di); }

        // Dereferencing
        const Node& operator*() const { return *nptr; }
        const Node* operator->() const { return nptr; }


Here I have chosen to dereference directly to Node, rather than to a pointer to node. This makes the iterator interface easier to use with STL algorithms (without the need to invoke lambdas to dereference all the time).


Our begin() and end() would then look as follows (assuming Node declares the begin()/end() members and a typedef node_citerator const_iterator):


inline Node::const_iterator Node::begin() const
{
 return const_iterator(this);
}

inline Node::const_iterator Node::end() const
{
 return const_iterator(nullptr);
}


There is, however, a potential problem with this approach: it will not work if we wish to traverse a proper subtree, as opposed to the entire tree. This is because we are not storing the next nodes to visit, so when backtracking we essentially climb as if the starting node were the root (i.e. the node which has a nullptr parent pointer). As long as we only call begin() and end() on the root node, no surprises emerge. But consider:

Add root(new Mul(new Constant(3), new Constant(12)), new Constant(6));
std::for_each(root.arg_first()->begin(), root.arg_first()->end(), my_lambda);

This will not work as expected, since there is no clean way to implement the iterator's increment operator so that it knows where the traversal started (i.e. that begin() and end() were called on root.arg_first(), and not on root).

So we need to store the root node pointer too:


class node_citerator {
        const Node* nptr; // pointer to current node
        const Node* root; // pointer to subtree root
public:
        // Same typedefs as before

        // Construct iterator from pointer to node, remembering the subtree root
        node_citerator(const Node* fp, const Node* r) : nptr(fp), root(r) { }

        // Comparisons may be defined as before; comparing two iterators with
        // different subtree roots can simply be considered undefined behavior,
        // as it doesn't make any sense


Now our begin() and end() tell the iterator which node the call came from, using the second argument:



inline Node::const_iterator Node::begin() const
{
 return const_iterator(this,this);
}

inline Node::const_iterator Node::end() const
{
 return const_iterator(nullptr,this);
}



Prefix increment would then look like this:



// Prefix increment
node_citerator& operator++() {
        if (nptr->is_leaf()) {
                // This is a leaf node, so we need to climb up
                for (;;) {
                        if (nptr == root) {
                                nptr = nullptr; // reached the end of the subtree
                                break;
                        }
                        // note: if nptr is not root, it must have a parent
                        const Node* par = nptr->parent();
                        // Determine which child we are
                        auto next = par->arg_begin();
                        for ( ; *next != nptr; ++next); // no bounds check: nptr is among the children
                        ++next; // bring to next sibling
                        if (next != par->arg_end()) {
                                // Branch down to next child
                                nptr = *next;
                                break;
                        } else {
                                // We were the last child of the parent node, so continue up
                                nptr = par;
                        }
                }
        } else {
                // Not a leaf node, so move down one step to the left
                nptr = nptr->arg_first();
        }
        return *this;
}



Prefix decrement may be defined similarly:



// Prefix decrement
node_citerator& operator--() {
        if (nptr) {
                // note: -- on the first element (the subtree root) is undefined,
                // so we may assume nptr has a predecessor
                if (nptr == nptr->parent()->arg_first()) {
                        // nptr is the first child => predecessor is the parent
                        nptr = nptr->parent();
                } else {
                        // find nptr among the parent's children
                        auto prev = nptr->parent()->arg_end();
                        do { --prev; } while (*prev != nptr);
                        --prev; // previous sibling (exists, since nptr is not the first child)
                        nptr = *prev;
                        // predecessor is the rightmost descendant of that sibling
                        while (!nptr->is_leaf()) nptr = nptr->arg_last();
                }
        } else {
                // nptr is at end(), so use root to get to the last element
                for (nptr = root; !nptr->is_leaf(); )
                        nptr = nptr->arg_last();
        }
        return *this;
}
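
Putting it all together (assuming the pieces above live in tree.h, with Node declaring begin() and end() as shown), the iterators can be used like any STL bidirectional iterator:


#include <iostream>
#include <iterator>
#include "tree.h"

int main()
{
  Add root(new Mul(new Constant(3), new Constant(12)), new Constant(6));

  // Depth-first (pre-order) visit: prints each subexpression in turn
  for (auto i = root.begin(); i != root.end(); ++i) {
    i->print(std::cout);
    std::cout << "\n";
  }

  // Bidirectional iterators also work with std::distance: 5 nodes in total
  std::cout << std::distance(root.begin(), root.end()) << " nodes\n";
}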

Tuesday, June 5, 2012

map insertion with value_type and make_pair

Here's a peculiarity with MSVC 10 (not tested with other compilers):

First, create a class that keeps track of copies/moves:


#include <iostream>
#include <map>
#include <vector>

class Tracker {
public:
 Tracker() { std::cout << "Default Constructor\n"; }
 ~Tracker() { std::cout << "Destructor\n"; }
 Tracker(const Tracker&) { std::cout << "Copy Constructor\n"; }
 Tracker(Tracker&&) { std::cout << "Move Constructor\n"; }
 Tracker& operator=(const Tracker&) { std::cout << "Copy Assignment\n"; return *this; }
 Tracker& operator=(Tracker&&) { std::cout << "Move Assignment\n"; return *this; }
 bool operator<(const Tracker&) const { return false; }
};


Next, let's try to insert into a map using both value_type and make_pair. Note that C++11 insert has an overload to handle pair types convertible to value_type (the only difference here being that in value_type, the first element is of type const Key rather than just Key).


int main()
{
 typedef std::map<Tracker,int> counter;
 counter m;

 std::cout << "Insert with value_type:\n";
 m.insert(counter::value_type(Tracker(),1));
 std::cout << "Done\n";
 m.clear();

 std::cout << "Insert with make_pair:\n";
 m.insert(std::make_pair(Tracker(),1));
 std::cout << "Done\n";
 m.clear();
}

You'd expect both these operations to do the same thing: move the pairs into the map. However, the output is as follows:

Insert with value_type:
Default Constructor <------- Tracker()
Move Constructor <-------- counter::value_type(Tracker(),1)
Copy Constructor <-------- inside insert, copies value_type!
Destructor
Destructor
Done
Destructor
Insert with make_pair:
Default Constructor <------- Tracker()
Move Constructor <-------- std::make_pair(Tracker(),1)
Move Constructor <-------- inside insert, moves pair!
Destructor
Destructor
Done
Destructor

In each case, the default constructor is called for Tracker(), and the object is then moved into the pair.
However, insertion with value_type copies the pair, whereas make_pair moves it!
This is quite unexpected, since value_type is the intended type to use, and should hence be the optimal choice.

Looking at the STL headers, it seems the rvalue reference is lost during forwarding at some point (but I'm not sure). An MSVC 10 bug?
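
For completeness: on compilers with fuller C++11 support (not MSVC 10, if I recall correctly, as it lacks variadic templates), std::map::emplace sidesteps the issue by constructing the element in place:


m.emplace(Tracker(), 1); // forwards the arguments; the pair is constructed in place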