tag:blogger.com,1999:blog-67678983513259182782017-07-18T22:46:44.524+08:00John AhlgrenA blog mostly about mathematics, computer science, and programming.John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.comBlogger39125tag:blogger.com,1999:blog-6767898351325918278.post-25699348654671630082014-04-18T16:33:00.002+08:002014-04-18T16:34:58.594+08:00A Soft Introduction to Constraints in Inductive Logic Programming<div dir="ltr" style="text-align: left;" trbidi="on">I've written <a href="http://www.ahlgren.info/research/cie" target="_blank">a gentle introduction to my research in constraints used in inductive logic programming.</a><br /><br /></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0tag:blogger.com,1999:blog-6767898351325918278.post-65564219418760194972014-03-26T16:00:00.001+08:002014-04-11T12:44:24.757+08:00std::vector with constructors that throw exceptions<div dir="ltr" style="text-align: left;" trbidi="on">When using std::vector, reallocating may happen for several reasons:<br /><br /><pre><span style="color: maroon; font-weight: bold;">class</span> A <span style="color: purple;">{</span> <span style="color: #808030;">.</span><span style="color: #808030;">.</span><span style="color: #808030;">.</span> <span style="color: purple;">}</span><span style="color: purple;">;</span><br /><span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">vector</span><span style="color: purple;"><</span>A<span style="color: purple;">></span> v<span style="color: purple;">;</span><br />v<span style="color: #808030;">.</span>push_back<span style="color: #808030;">(</span>A<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br />v<span style="color: #808030;">.</span>push_back<span style="color: #808030;">(</span>A<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: dimgrey;">// may reallocate</span><br />v<span style="color: #808030;">.</span>resize<span style="color: #808030;">(</span><span style="color: #008c00;">100</span><span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: dimgrey;">// may reallocate</span></pre><br />When reallocation happens, all elements of the vector need to be copied or moved to a new memory location. There two things to wonder about then: will std::vector reallocate by copying, or moving the objects? And what if one of A's constructors (A(A&&)) throws and the other (A(const A&)) doesn't throw?<br /><br />The table below shows the situation, as guaranteed by the C++ standard. (This article assumes that A's destructor doesn't throw any exceptions.)<br /><br /><table border="1" style="border-collapse: collapse; border: 1px solid black; margin: 10px; padding: 10px;"> <tbody><tr> <th>Constructors</th> <th>No copy</th> <th>Copy throws</th> <th>Copy no throw</th> </tr><tr> <th>No move</th> <td>-</td> <td>Throws (copies)</td> <td>No throw (copies)</td> </tr><tr> <th>Move throws</th> <td>Throws (moves)</td> <td>Throws (moves)</td> <td>No throw (copies)</td> </tr><tr> <th>Move no throw</th> <td>No throw (moves)</td> <td>No throw (moves)</td> <td>No throw (moves)</td> </tr></tbody></table><br />In other words: if one of copy/move is not available, std::vector picks the other one (if none are available, you can't use std::vector). If both are available and one throws but not the other, it pick the one that doesn't throw. If both throw, it uses move (since we can't do better anyway). If none throw, it uses move.<br /><br />The most non-obvious part is this: when std::vector has to chose between the slower copy constructor that does not throw, and the faster move constructor that may throw, it takes the slower, but safer, copy constructor.<br /><br /><h3 style="text-align: left;">What can I assume when an exception is thrown (from copy/move constructor)?</h3>What are you allowed to assume about the state of the objects inside the vector after an exception is thrown? With respect to exceptions thrown by the copy and move constructors, you only get the basic exception guarantee: no memory leaks have occurred, and all objects (A) are in a well defined state. They may not have the same value as prior to the exception, but you can at least reassign them, e.g. v.back() = A();.<br /><br /><h3 style="text-align: left;">How does the compiler know whether a copy/move constructor throws?</h3><div>It doesn't usually. If it can <b>prove</b> that it won't, it will take advantage of that to offer the best possible alternative (that's what the table above shows). If it can't prove, it will assume the worst: that the copy/move constructor may throw (in the table above). You can "force" it to assume it won't throw by using <b>noexcept</b>:</div><div><br /></div><div>class A {</div><div> A(A&&) noexcept; // force compiler to always move</div><div>};</div><div><br /></div><div>Formally, the compiler encodes information about "provably not throwable" in <a href="http://www.cplusplus.com/reference/type_traits/is_nothrow_move_constructible/" target="_blank">std::is_nothrow_move_constructible</a> (which, if pedantic, should perhaps be called is_proven_nothrow_move_constructible). The logic of the table above is then encoded in the conditional move operator <a href="http://www.cplusplus.com/reference/utility/move_if_noexcept/" target="_blank">std::move_if_noexcept</a>. Indeed, this means different compilers may produce different results in terms of moving vs copying (but if a compiler moves, it's always because that was provably the better thing to do).<br /><br /></div><h3 style="text-align: left;">Do compilers actually do this correctly?</h3><div>Microsoft Visual Studio 2013 with CTP 1 does not. Here's code to test it with your compiler:</div><div><br /></div><div><pre><span style="color: maroon; font-weight: bold;">struct</span> A<br /><span style="color: purple;">{</span><br /> A<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">=</span> <span style="color: maroon; font-weight: bold;">default</span><span style="color: purple;">;</span><br /> A<span style="color: #808030;">(</span>A<span style="color: #808030;">&</span><span style="color: #808030;">&</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span> <span style="color: maroon; font-weight: bold;">throw</span> <span style="color: #008c00;">1</span><span style="color: purple;">;</span> <span style="color: purple;">}</span><br /> A<span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">const</span> A<span style="color: #808030;">&</span><span style="color: #808030;">)</span> <span style="color: #808030;">=</span> <span style="color: maroon; font-weight: bold;">default</span><span style="color: purple;">;</span><br /><span style="color: purple;">}</span><span style="color: purple;">;</span><br /><br /><br /><span style="color: maroon; font-weight: bold;">int</span> <span style="color: #400000;">main</span><span style="color: #808030;">(</span><span style="color: #808030;">)</span><br /><span style="color: purple;">{</span><br /> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">vector</span><span style="color: purple;"><</span>A<span style="color: purple;">></span> v<span style="color: purple;">;</span><br /><br /> <span style="color: maroon; font-weight: bold;">try</span> <span style="color: purple;">{</span><br /> v<span style="color: #808030;">.</span>resize<span style="color: #808030;">(</span><span style="color: #008c00;">5</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> v<span style="color: #808030;">.</span>resize<span style="color: #808030;">(</span><span style="color: #008c00;">10000</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> <span style="color: purple;">}</span> <span style="color: maroon; font-weight: bold;">catch</span> <span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">int</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span><br /> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">cerr</span> <span style="color: #808030;"><</span><span style="color: #808030;"><</span> <span style="color: maroon;">"</span><span style="color: #0000e6;">BUG: should have used copy constructor instead</span><span style="color: #0f69ff;">\n</span><span style="color: maroon;">"</span><span style="color: purple;">;</span><br /> <span style="color: purple;">}</span><br /><br /><span style="color: purple;">}</span></pre>With MSVS 2013 CTP 1, this prints "BUG: should have used copy constructor instead". Using <b>noexcept</b> does not help either.<br /><br /></div><h3 style="text-align: left;">Is there a way to do avoid these issues?</h3><div>Yes, only use std::vector when at least one of the copy/move constructor does not throw (and use noexcept to make sure the compiler sees that). If both constructors throw, use std::list. Since it never reallocates, you will never have these issues. std::deque can also be used if you only push_front and push_back elements, as it doesn't reallocate then.</div><div><br /><h3>What about exceptions thrown by other member functions (not copy/move constructor)?</h3><div>(Off topic, only here to clarify the difference between the basic exception guarantee above and the more commonly known strong exception guarantee). Then you get the strong exception guarantee, which basically says that everything has been rollbacked to the state just prior to the thrown exception. All objects have the values you'd expect. This is just like when performing database transactions.</div></div><div><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com2tag:blogger.com,1999:blog-6767898351325918278.post-61603346758314348462014-03-19T23:44:00.001+08:002015-02-20T01:56:43.832+08:00Inductive Reasoning Visualized<div dir="ltr" style="text-align: left;" trbidi="on">Ever wondered what inductive reasoning (technically, using inductive logic programming) would look like if you could draw a picture?<br /><div>Here's how:</div><div><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/-XOIk62MtqxI/VOYjmEFgRXI/AAAAAAAA4-o/WRZ0wfjdI9g/s1600/grandparent.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://4.bp.blogspot.com/-XOIk62MtqxI/VOYjmEFgRXI/AAAAAAAA4-o/WRZ0wfjdI9g/s1600/grandparent.png" height="162" width="400" /></a></div><br /></div><div><br />I'll explain what you're looking at. Let us say you know the following facts:<br />Gaia is a parent of Cronus.<br />Cronus is a parent of Zeus.<br />Zeus is a parent of Athena.<br /><br />You also know that:<br />Gaia is a grandparent of Zeus.<br />Zeus is a grandparent of Athena.<br />Gaia is not a grandparent of herself.<br />Gaia is not a grandparent of Cronus.<br />Cronus is not a grandparent of Gaia.<br />Athena is not a grandparent of Cronus.<br /><br />Now you ask the computer to induce a definition of grandparenthood based on these facts.<br /><br />To do this, the machine needs to try different possibilities, and these are what you see in the graph.<br /><br />On top, you see:<br />grandparent(_,_)<br />Now the sentence "X is a grandparent of Y" is what the machine writes as "grandparent(X,Y)", and an underscore simply means "anybody". So this is the hypothesis that "anybody is a grandparent of anybody".<br />The machine knows this to be false because of what you see in the red square: 4. Four is the number of "problems" found with this hypothesis: namely, it predicts that "Gaia is a grandparent of herself", which we know to be false. It predicts every instance of "is not a grandparent of" above, and there are 4 of them. Thus this hypothesis is not consistent with observations (its predictions are falsified).<br /><br />Next we have<br />grandparent(A,_) :- parent(A,_)<br />which in English reads "A is a grandparent of anyone if A is a parent of anyone". As you can see, the red box says 3, because it has 3 problems: it predicts that Gaia is a grandparent of herself, since Gaia is a parent (of whom does not matter, but it happens to be Zeus), which we know to be false. For the same reason, it predicts that Gaia is a grandparent of Zeus, which is also false. Finally, it predicts that Cronus is a grandparent of Gaia, since Cronus is a parent (again, of whom does not matter). The negative example "Athena is not a grandparent of Cronus" is not (incorrectly) predicted to be true, since Athena is not a parent.<br /><br />This is the basic idea in Inductive Logic Programming: we construct lots of hypotheses and test them against the examples we have. There are two solutions that look promising (green boxes):<br />grandparent(A,B) :- parent(A,C), parent(C,B)<br />which states that A is a grandparent of B if A is a parent of some C and that C is a parent of B. This is indeed the correct definition, and it does cover both positive examples ("Gaia is a grandparent of Zeus", "Zeus is a grandparent of Athena"), and does not cover any of the 4 negative examples.<br /><br />The other promising solution is<br />grandparent(A,B) :- parent(A,C), parent(C,B), parent(B,_)<br />which gets the grandparent relation slightly wrong: on top of requiring the correct conditions (A is a parent of some C, which is a parent of B), it also requires the grandchild B to be a parent. So according to this definition, Athena is not the grandchild of Gaia because she does not (yet) have any children, but when she does, she'll satisfy the conditions. The machine knows this definition is worse than the right one because it cannot explain the fact that Zeus is a grandparent of Athena. Hence it only explains one of the two facts (that's the 1 in the green gox).<br /><br />I'll leave it as an exercise to interpret all the other hypotheses in the graph.<br /><br /></div><div>The picture was produced using my <a href="http://www.ahlgren.info/research/atom" target="_blank">ILP system Atom</a>.<br /><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com10tag:blogger.com,1999:blog-6767898351325918278.post-19740006445024985042014-02-26T18:17:00.002+08:002014-02-26T18:28:58.107+08:00How similar are two words? Or two formulas?<div dir="ltr" style="text-align: left;" trbidi="on"><h2 style="text-align: left;">The problem</h2>Consider the word "John". How similar is it to "Jon", "Johan", and "on"?<br /><br />This of course depends on precisely what we mean by similar. But we have an intuition that "John" and "Jon" should be more similar than say "John" and "Table". In particular, "similarity" should have something to do with the occurrence of same characters, as well the order in which they occur. But can we make this idea precise?<br /><h2 style="text-align: left;">The Solution</h2>Yes, we can make this idea precise. The key is to consider how many <b>edits</b> would be necessary to transform one word into another:<br />To go from "John" to "Jon", simply remove the "h". (Or conversely, start with "Jon" and add an "h".)<br />To go from "John" to "Johan", add an "a". (Or start with "Johan" and remove "a".)<br />To go from "John" to "on", remove the "J" and the "h". (Or start with "on" and add "J" and "h".)<br /><br />Thus it requires one edit between "John" and "Jon", and likewise for "John" to "Johan", but two edits between "John" and "on". Clearly the fewer edits the more similar to words are. In particular, a word has zero edit distance to itself, and only to itself, so it is most similar to itself.<br /><br />Here we have assumed that an edit is a removal or addition. Given two words of length N and M, in the worst case we have to remove all N letters and then add all M, giving an edit distance of N+M. This happens precisely when all letters are distinct, so nothing can be reused: "John" and "Strawberry" has edit distance 4+10=14 since they share no letters in common.<br /><br />We could also consider other operations, such as swapping a letter for another. Thus the edit distance from "John" to "Joan" is one edit (swap "h" for "a") instead of two (remove "h", add "a"). Then the maximum edit distance between two words of length N and M is max(N,M), since we can always swap the first min(N,M) letters and then add the remaining max(N,M)-min(N,M). For example, from "John" to "Strawberry", start by swapping "J" to "S", "o" to "t", "h" to "r", and "n" to "a", giving "John" -> "Stra", then add the remainder of the word, "wberry", for a total of 10 edits.<br /><br /><div>When the edits are specifically addition, removal, and swapping of characters, it is known as the<a href="http://en.wikipedia.org/wiki/Edit_distance" target="_blank"> Levenshtein distance</a>. For the remainder of this post, this is implicitly what we'll mean by "edit distance".</div><div><br /></div><div>Now we know the minimium (0) and maximum (max(N,M)) edit distances. But how do we compute the distance in general? From a logical point of view, we could "try everything":</div><br />Let \( s_i \) be the substring obtaining by keeping the first i characters of a string s and cutting off the rest. For example, if \( s \) = "John", then \( s_0 \) = "" and \( s_2 \) = "Jo" . Note that \( s = s_4 \).<br /><br />The edit distance between two strings \( s \) and \( t \) of length N and M, respectively, can be defined using two base cases, namely when the strings run out:<br />\[ d(s_0, t_j) = j \]<br />\[ d(s_i, t_0) = i \]<br /><div>and a recursive case when \( i > 0 \) and \( j > 0 \):</div><div>\[ d(s_i, t_j) = \min \{ d(s_{i-1},t_j) + 1, d(s_i,t_{j-1}) + 1, d(s_{i-1},t_{j-1}) + [s[i] \neq t[j]] \} \]<br />where \( [s[i] \neq t[j]] \) is 1 when character at position i in s is the same as character at position j in t, and 0 otherwise. The computation is started using \( d(s,t) = d(s_N, t_M) \).<br /><br />The edit distance is a metric (as long as all edit costs are positive real numbers), so they do define a "distance" in a fairly intuitive sense of the word. In particular, the triangle inequality holds.</div><h2 style="text-align: left;">Efficient Implementation</h2><div>The definition above is triply recursive, and hence direct implementation has exponential complexity.</div><div><br /></div><div>A much more efficient implementation can be obtained by observing that for the recursive case, \( d(s_i, t_j) \) only needs to know the values of \( d(s_{i-1},t_j) \), \( d(s_i,t_{j-1}) \), and \( d(s_{i-1},t_{j-1}) \).</div><div>Thus, if we start by computing all possible base cases \( d(s_0, t_j) = j \) for all \( 0 \leq j \leq M \) and \( d(s_i, t_M) = i \) for all \( 0 \leq i \leq N \), we can then "unfold" the recursive values all the way back until we get to \( d(s_0, t_0) \).</div><div><br /></div><div>For example, consider the case s = "John" and t = "on".<br /><br />By using base cases (or trivial observation), d("","on") = 2, d("","o") = 1, and d("","") = 0. Likewise, d("John","") = 4, d("Joh","") = 3, d("Jo","") = 2, d("J","") = 1.<br /><br />Then, using the recursive formula:</div><div><ol style="text-align: left;"><li>d("J","o") = 1+d("","") = 1</li><li>d("Jo",o") = min{ d("Jo","")+1, d("J","o")+1, d("J","")+0 } = min { 3, 2, 1 } = 1</li><li>d("Joh","o") = min { d("Joh","")+1, d("Jo","o")+1, d("Jo","")+1 } = min { 4, 2, 3 } = 2</li><li>d("John","o") = min { d("John","")+1, d("Joh","o")+1, d("Joh","")+1 } = min { 4, 3, 4 } = 3</li></ol><div>And using this, we can compute:</div></div><div><ol style="text-align: left;"><li>d("J","on") = min { d("J","o")+1, d("","on")+1, d("","o")+1 } = min { 2, 3, 2 } = 2</li><li>d("Jo","on") = min { d("Jo","o")+1, d("J","on")+1, d("J","o")+1 } = min { 2, 2, 2 } = 2</li><li>d("Joh","on") = min { d("Joh","o")+1, d("Jo","on")+1, d("Jo","o")+1 } = min { 3, 3, 2 } = 2</li><li>d("John","on") = min { d("John","o")+1, d("Joh","on")+1, d("Joh","o")+0 } = min { 4, 3, 2 } = 2</li></ol></div><div>Thus d("John","on") = 2, and if we trace back a path, d("John","on") = d("Joh","o") = d("Jo","o")+1 = d("J","")+1 = 1+1 = 2. This corresponds to adding the "J" and "h" to "on".</div><div><br /></div><div>We can <a href="http://en.wikipedia.org/wiki/Wagner%E2%80%93Fischer_algorithm" target="_blank">visualize this algorithm</a> as constructing a matrix row by row, thus the time complexity is \( O(NM) \). Only the previous row is needed in memory at all times, therefore having memory usage \( O(min(N,M)) \). However, if we wish to obtain not only the distance but also the operations needed to go from one string to another, we need to store the entire matrix, that is, \( O(NM) \). <a href="http://en.wikipedia.org/wiki/Edit_distance#Improved_algorithm" target="_blank">Improvements to this algorithm also exist.</a></div><h2 style="text-align: left;">Generalization to formulas (trees)</h2><div>Similarity between strings/sequences can be generalized to trees (and forests) by again considering edit distances. One can then add, remove, and swap nodes instead. <a href="http://dl.acm.org/citation.cfm?id=1644017" target="_blank">Surprisingly, an algorithm with time complexity \( O(N^3) \) is known, which also reduces to \( O(N^2) \) for strings.</a> Thus we can also (quite) efficiently compare mathematical formulas and logical expressions, although it's worth noting that these comparisons are only at a syntactic level: they carry no semantic information (for that, use a generalization order, such as subsumption). For example, the formulas "mortal(X)", "mortal(socrates)", "mortal(plato)" are all at a tree edit distance of one (simply swap X for socrates and plato), but the first formula implies the two others, whereas neither of the two latter imply each other.</div><div><br /></div><div><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0tag:blogger.com,1999:blog-6767898351325918278.post-75711272803223356862014-02-05T19:02:00.000+08:002014-02-06T01:17:06.383+08:00Telescoping Sums<div dir="ltr" style="text-align: left;" trbidi="on">Consider generalizations to the closed form formula for arithmetic sums:<br />\[<br />\sum_{k=1}^n k = 1+2+\ldots + n = \frac{n(n+1)}{2}<br />\]<br />Here's the formula for the square sums:<br />\[<br />\sum_{k=1}^n k^2 = 1^2+2^2+\ldots + n^2 = \frac{n(n+1)(2n+1)}{6}<br />\]<br />And cube sums: \[<br />\sum_{k=1}^n k^3 = 1^3+2^3+\ldots + n^3 = \left( \frac{n(n+1)}{2} \right)^2 = \left(\sum_{k=1}^n k \right)^2<br />\] <br />These formulas, among many other closed forms (e.g. for recurrence relations and binomial/hyperbolic sums) can be obtained using a deceptively simple technique: telescope sums. In its general form, it amounts to the following observation. For any function \( f \):<br />\[<br />\sum_{k=1}^n f(k)-f(k-1) = f(n) - f(0)<br />\]<br />since all intermediate terms cancel out.<br /><div><div><br />Applying this simple principle to the function \( f(k)=k^2 \), we get:</div></div><div><div>\[<br />\sum_{k=1}^n (k^2-(k-1)^2) = n^2<br />\]<br />But we can also expand the square inside the sum:</div></div><div>\[<br />\sum_{k=1}^n (k^2-(k-1)^2) = \sum_{k=1}^n (2k - 1) = 2 \sum_{k=1}^n k - n<br />\]<br />Equating these two results, we obtain the closed form formula for arithmetic sums:<br />\[<br />n^2 = 2 \sum_{k=1}^n k - n \iff \sum_{k=1}^n k = \frac{n(n+1)}{2}<br />\]</div><div><br /></div><div>We can take the same approach to obtain sums of exponents. For example, the square sum is obtained using the telescoping principle on \( f(k) = k^3 \):</div><div><div>\[<br />\sum_{k=1}^n (k^3-(k-1)^3) = n^3<br />\]<br />Expanding the cube inside the sum:</div><div>\[<br />\sum_{k=1}^n (k^3-(k-1)^3) = \sum_{k=1}^n (3k^2-3k+1) = 3 \sum_{k=1}^n k^2 - 3\frac{n(n+1)}{2} + n<br />\]<br />Equating these two results, we obtain the desired closed form formula:<br />\[<br />n^3 = 3 \sum_{k=1}^n k^2 - 3\frac{n(n+1)}{2} + n \iff 6 \sum_{k=1}^n k^2 = 2n^3 + 3n(n+1) - 2n = n(n+1)(2n+1)<br />\]</div></div><div><br /></div><div>This gives a recursive method of computing \( \sum_{k=1}^n k^p \) for any positive integer \( p \).</div><div><br /></div><div>To obtain a non-recursive method (that doesn't require computing all sums with lower exponents), note that in general, the sum with exponent \( p \) will be a polynomial of degree \( p+1 \), so another way to solve the sums is to performs an ansatz and compute the polynomial coefficients from particular values. </div><div><br />Let's try the non-recursive algorithm to find the sum with \( p=4 \):<br />\[<br />\sum_{k=1}^n k^4 = 1^4+2^4+\ldots + n^4<br />\]<br />without first solving the sum for \( p=3 \).<br />For \( p=4 \), we know the closed form is a polynomial of degree \( p+1=5 \). So we set:<br />\[<br />\sum_{k=1}^n k^4 = 1^4+2^4+\ldots + n^4 = a_5 n^5 + a_4 n^4 + a_3 n^3 + a_2 n^2 + a_1 n + a_0<br />\]<br />We have 6 unknown coefficients, so we need (at least) 6 equations. These can be obtained by computing the quartic sum for \( n=1,2,\ldots,6 \):<br />\[<br />n=1 \implies a_5 + a_4 + a_3 + a_2 + a_1 + a_0 = 1<br />\]<br />\[<br />n=2 \implies 2^5 a_5 + 2^4 a_4 + 2^3 a_3 + 2^2 a_2 + 2 a_1 + a_0 = 17<br />\]<br /><div>\[<br />n=3 \implies 3^5 a_5 + 3^4 a_4 + 3^3 a_3 + 3^2 a_2 + 3^1 a_1 + a_0 = 98<br />\]</div><div>and so on. </div><div><br /></div><div>Solving the linear system, we obtain: \( a_5=1/5, a_4=1/2, a_3=1/3,a_2=a_1=0, a_0=-1/30 \). You can either do this by hand (time consuming), or using a computer algebra system, or <a href="http://wims.unice.fr/wims/wims.cgi" target="_blank">using this online calculator</a> by copy-pasting the following:</div><div><div>a + b + c + d + t + f = 1</div><div>32*a + 16*b + 8*c + 4*d + 2*t + f = 17</div><div>243*a + 81*b + 27*c + 9*d + 3*t + f = 98</div><div>1024*a + 256*b + 64*c + 16*d + 4*t + f = 354</div><div>3125*a + 625*b + 125*c + 25*d + 5*t + f = 979</div><div>7776*a + 1296*b + 216*c + 36*d + 6*t + f = 2275</div></div><div><br /></div><div>The identity is thus:</div><div>\[<br />\sum_{k=1}^n k^4 = 1^4+2^4+\ldots + n^4 = \frac{n^5}{5} + \frac{n^4}{2} + \frac{n^3}{3} - \frac{1}{30}<br />\]</div><div><br /></div><div><br /></div>This approach is taken far beyond sums of powers in the computer algebra development of <a href="http://www.math.upenn.edu/~wilf/AeqB.html" target="_blank">Wilf-Zeilberger pairs</a>.<br /><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0tag:blogger.com,1999:blog-6767898351325918278.post-15876383191054702092013-12-09T23:48:00.003+08:002013-12-09T23:48:52.159+08:00Solving Logic Riddles using AI Programming<div dir="ltr" style="text-align: left;" trbidi="on"><div>Consider three persons, Caesar, Brutus, and Cassius. One is a king, another is a bureaucrat, and the third is a spy. The king never lies, whereas the bureaucrat always lies. The spy either lies or tells the truth. Caesar claims that Cassius is a bureaucrat, Brutus claims that Caesar is a king, Cassius claims that he is a spy.</div><br />Who is who?<br /><div><br /></div><h2>The solution, by hand</h2><div>Let's start by solving this without computer assistance. Cassius cannot be a king: he is asserting that he is a spy, so if he were king, he would be lying - which is impossible for the king to do. Likewise, Brutus cannot be king, since he claims Caesar is, and so would be lying. So Caesar is the king. Brutus is then telling the truth, so he cannot be a bureaucrat (the Bureaucrat always lies). So Brutus must be a spy, and, by exclusion, Cassius is a bureaucrat (which is consistent, since Cassius is lying when asserting he is a spy). To summarize, there is a unique solution, which is:</div><div><br /></div><div>Caesar: king.</div><div>Brutus: spy.</div><div>Cassius: bureaucrat.</div><div><br /></div><h2>The solution, AI programming</h2><div>How would you encode information about this problem in such a way as to let the computer solve it? Sure, in this case the solution was easy to obtain, but if you had 100+ rules and a lot more people it would not be easy to solve the problem by hand (think of the problem as an equation with values true/false). Logic programming, as the name suggest, provides an intuitive and natural way to express logical riddles (among other things). The following in Prolog:</div><div><br /></div><div>First, tell the computer Caesar, Brutus, and Cassius are persons:</div><div><br /></div><div><div>person(caesar).</div><div>person(brutus).</div><div>person(cassius).</div><div><br /></div></div><div>Next, we would like to assign King, Liar (Bureaucrat is too long to type), and spy to these persons:</div><div><div><br /></div><div>assign(K,L,S) :-</div><div><span class="Apple-tab-span" style="white-space: pre;"> </span>person(K), % king is a person</div><div><span class="Apple-tab-span" style="white-space: pre;"> </span>person(L), % liar is a person</div><div><span class="Apple-tab-span" style="white-space: pre;"> </span>person(S), % spy is a person</div><div><span class="Apple-tab-span" style="white-space: pre;"> </span>king(K,L,S), % constraints for king (to be defined later)</div><div><span class="Apple-tab-span" style="white-space: pre;"> </span>liar(K,L,S), % constraints for liar</div><div><span class="Apple-tab-span" style="white-space: pre;"> </span>spy(K,L,S), % constraints for spy</div><div><span class="Apple-tab-span" style="white-space: pre;"> </span>different(K,L,S). % king, liar, and spy must be different people</div></div><div><br /></div><div><div>We must thus defined the constraints imposed by virtue of being King, Liar, and Spy:</div><div><br /></div><div><div>king(K,L,S) :- says(K,K,L,S). % Whatever King says is true</div><div>liar(K,L,S) :- \+ says(L,K,L,S). % Whatever Liar says is false (think of "\+" as "not true")</div></div><div>spy(_,_,_). % Anybody can be a spy</div><div style="-webkit-text-stroke-width: 0px; color: black; font-family: 'Times New Roman'; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: left; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px;"><div style="margin: 0px;"><br /></div></div>Now onto the claims that each of them make:</div><div><br /></div><div><div>says(caesar,K,L,S) :- L = cassius. % Caesar says Cassius is Liar</div><div>says(brutus,K,L,S) :- K = caesar. % Brutus says Caesar is King</div><div>says(cassius,K,L,S) :- S = cassius. % Cassius says he himself is Spy</div></div><div><br /></div><div>Finally, we define what "different" means:<br /><br />different(X,Y,Z) :- X \= Y, X \= Z, Y \= Z. % Not the same: X,Y,Z </div><div><br /></div><div>Now we may query Prolog to obtain all solutions:</div><div><br /></div><div><div>?- assign(K,L,S).</div><div>K = caesar, <--- Caesar is King</div><div>L = cassius, <--- Liar is Cassius</div><div>S = brutus ; <--- Spy is Brutus</div><div>false. <---- This means there are no more solutions</div></div><div><br /></div><h2 style="text-align: left;">An interesting change</h2><div>To see that this really works, consider what would happen if we don't require that there must be a unique King, Liar, and Spy. For example, one person could then be both King and Spy. This is effectively done by removing the requirement that they are all different: either remove different(K,L,S) from assign(K,L,S), or define it trivially as different(_,_,_), meaning there are no constraints).</div><div><br /></div><div>Before executing the query, let's make some observations: a person cannot both the king and liar, assuming that person makes at least one claim. Since everybody makes a claim here, nobody can be both king and liar. So at least two people must be assigned the properties {king, liar, spy}. Clearly anybody can in principle be both king and spy, or liar and spy, since spy adds no new constraints. Furthermore, our original solution should still be part of the new solution set, since we've only removed one constraint.</div><div><br /></div><div>How many solutions do you count? This problem is clearly more difficult than the previous one. Our query will reveal all solutions:</div><div><div><br /></div><div>?- assign(K,L,S).</div><div>Solution 1:</div><div>K = S, S = caesar, <--- Caesar is both king and spy</div><div>L = cassius <--- Cassius is liar</div></div><div><br /></div><div><div>Solution 2 (This is the original solution):</div></div><div><div>K = caesar,</div><div>L = cassius,</div><div>S = brutus </div></div><div><br /></div><div>Solution 3:</div><div><div>K = S, S = cassius, <--- Cassius is king and spy</div><div>L = caesar <--- Caesar is Liar</div></div><div><br /></div><div>Solution 4:</div><div><div>K = S, S = cassius, <--- As above, Cassius is king and spy</div><div>L = brutus <--- This time Brutus is the liar (instead of Caesar)</div></div><div><br /></div><div>false. <--- No more solutions.</div><div><br /></div><div>There is an asymmetry of solutions: when Cassius is both king and spy (solution 3 and 4), the liar can be Caesar or Brutus. But when Caesar is king and spy (solution 1), only Cassius can be a liar, not Brutus. This is because Brutus asserts that Caesar is king, and liars are never allowed to tell the truth.</div><div><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0tag:blogger.com,1999:blog-6767898351325918278.post-52785295253060689762013-12-01T20:24:00.000+08:002013-12-02T18:35:27.959+08:00Which shape gives the largest area?<div dir="ltr" style="text-align: left;" trbidi="on"><h2 style="text-align: left;">The Isoperimetric Inequality</h2>Given a rope of length L, say 100 meters, what is the largest closed area you can make?<br />If we try a square, each side has to be L/4 (25 meters), and we'll end up with an area of (L/4)^2 = 625 square meters. If we however try a circle, the circumference is L=2<span style="font-family: 'Lucida Sans Unicode', 'Lucida Grande', sans-serif; font-size: 18px; line-height: 23.390625px;">π</span>R, where R is the radius. Thus R = L/(2<span style="font-family: 'Lucida Sans Unicode', 'Lucida Grande', sans-serif; font-size: 18px; line-height: 23.390625px;">π</span>), and the area is <span style="background-color: white; color: #444444; font-family: 'Lucida Sans Unicode', 'Lucida Grande', sans-serif; font-size: 18px; line-height: 23.390625px;">π</span>(L/2<span style="font-family: 'Lucida Sans Unicode', 'Lucida Grande', sans-serif; font-size: 18px; line-height: 23.390625px;">π</span>)^2 = L^2/(4<span style="font-family: 'Lucida Sans Unicode', 'Lucida Grande', sans-serif; font-size: 18px; line-height: 23.390625px;">π</span>) = 796 square meters, which is better than the square.<br /><br />But there's plenty more shapes to try, in fact, infinitely many. So which should we pick to maximize the area?<br />The ancients knew the answer but could not prove it: it's the circle. The proof of this is surprisingly simple using calculus. Let L be the length of the rope, and let A be the enclosed area.<br /><br />Note that for a circle, the relationship between A and L is L=2<span style="font-family: 'Lucida Sans Unicode', 'Lucida Grande', sans-serif; font-size: 18px; line-height: 23.390625px;">π</span>R and A=<span style="font-family: 'Lucida Sans Unicode', 'Lucida Grande', sans-serif; font-size: 18px; line-height: 23.390625px;">π</span>R^2, where R is the radius, so that:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=A&space;=&space;\pi&space;R^2&space;=&space;\pi&space;\left(\frac{L}{2\pi}\right)^2&space;=&space;\frac{L^2}{4\pi}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?A&space;=&space;\pi&space;R^2&space;=&space;\pi&space;\left(\frac{L}{2\pi}\right)^2&space;=&space;\frac{L^2}{4\pi}" title="A = \pi R^2 = \pi \left(\frac{L}{2\pi}\right)^2 = \frac{L^2}{4\pi}" /></a><br /><br />Given the belief that the circle gives the largest area, we expect that for any shape,<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=4\pi&space;A&space;\leq&space;L^2" target="_blank"><img src="http://latex.codecogs.com/gif.latex?4\pi&space;A&space;\leq&space;L^2" title="4\pi A \leq L^2" /></a><br /><br />This is known as the isoperimetric inequality.<br /><br /><h2 style="text-align: left;">Hurwitz's Proof</h2>Assume that the curve (x(t),y(t)) is parametrized so that it does one full revolution when t goes from 0 to <span style="background-color: white; color: #444444; font-family: 'Lucida Sans Unicode', 'Lucida Grande', sans-serif; font-size: 18px; line-height: 23.390625px;">π</span>. The arc length is defined by s = Lt/(2<span style="background-color: white; color: #444444; font-family: 'Lucida Sans Unicode', 'Lucida Grande', sans-serif; font-size: 18px; line-height: 23.390625px;">π</span>). Note that when t=0, s=0, and when t=2<span style="background-color: white; color: #444444; font-family: 'Lucida Sans Unicode', 'Lucida Grande', sans-serif; font-size: 18px; line-height: 23.390625px;">π</span>, s=L.<br /><div><br /></div>We then have:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=\int_{0}^{2\pi}&space;\left(\frac{dx}{dt}\right)^2&space;+&space;\left(\frac{dy}{dt}\right)^2&space;dt&space;=&space;\int_{0}^{2\pi}&space;\left(\frac{ds}{dt}\right)^2&space;dt&space;=&space;\int_{0}^{2\pi}&space;\left(\frac{L}{2\pi}\right)^2&space;dt&space;=&space;\frac{L^2}{2\pi}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\int_{0}^{2\pi}&space;\left(\frac{dx}{dt}\right)^2&space;+&space;\left(\frac{dy}{dt}\right)^2&space;dt&space;=&space;\int_{0}^{2\pi}&space;\left(\frac{ds}{dt}\right)^2&space;dt&space;=&space;\int_{0}^{2\pi}&space;\left(\frac{L}{2\pi}\right)^2&space;dt&space;=&space;\frac{L^2}{2\pi}" title="\int_{0}^{2\pi} \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 dt = \int_{0}^{2\pi} \left(\frac{ds}{dt}\right)^2 dt = \int_{0}^{2\pi} \left(\frac{L}{2\pi}\right)^2 dt = \frac{L^2}{2\pi}" /></a><br /><br />As for the area, we use Green's theorem:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=A&space;=&space;\int_{A}&space;dx&space;dy&space;=&space;\int_{\partial&space;A}&space;(-y)&space;dx&space;=&space;\int_{0}^{2\pi}&space;(-y)&space;\frac{dx}{dt}&space;dt" target="_blank"><img src="http://latex.codecogs.com/gif.latex?A&space;=&space;\int_{A}&space;dx&space;dy&space;=&space;\int_{\partial&space;A}&space;(-y)&space;dx&space;=&space;\int_{0}^{2\pi}&space;(-y)&space;\frac{dx}{dt}&space;dt" title="A = \int_{A} dx dy = \int_{\partial A} (-y) dx = \int_{0}^{2\pi} (-y) \frac{dx}{dt} dt" /></a><br /><br />Now, simply take the difference we want to show is always non-negative:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=L^2&space;-&space;4\pi&space;A&space;\\&space;=&space;2\pi&space;\int_0^{2\pi}&space;\left(\frac{dx}{dt}\right)^2&space;+&space;\left(\frac{dy}{dt}\right)^2&space;dt&space;-&space;2\int_{0}^{2\pi}&space;(-y)&space;dx&space;\\&space;=&space;2\pi&space;\int_0^{2\pi}&space;\left(\frac{dx}{dt}\right)^2&space;+&space;\left(\frac{dy}{dt}\right)^2&space;+&space;2y\frac{dx}{dt}&space;dt&space;\\&space;=&space;2\pi&space;\int_0^{2\pi}&space;\left(\frac{dx}{dt}&space;+&space;y\right)^2&space;-&space;y^2&space;+&space;\left(\frac{dy}{dt}\right)^2&space;dt&space;\\&space;=&space;2\pi&space;\int_0^{2\pi}&space;\left(\frac{dx}{dt}&space;+&space;y\right)^2&space;+&space;2\pi&space;\int_0^{2\pi}&space;\left(\frac{dy}{dt}\right)^2&space;-&space;y^2&space;dt" target="_blank"><img src="http://latex.codecogs.com/gif.latex?L^2&space;-&space;4\pi&space;A&space;\\&space;=&space;2\pi&space;\int_0^{2\pi}&space;\left(\frac{dx}{dt}\right)^2&space;+&space;\left(\frac{dy}{dt}\right)^2&space;dt&space;-&space;2\int_{0}^{2\pi}&space;(-y)&space;dx&space;\\&space;=&space;2\pi&space;\int_0^{2\pi}&space;\left(\frac{dx}{dt}\right)^2&space;+&space;\left(\frac{dy}{dt}\right)^2&space;+&space;2y\frac{dx}{dt}&space;dt&space;\\&space;=&space;2\pi&space;\int_0^{2\pi}&space;\left(\frac{dx}{dt}&space;+&space;y\right)^2&space;-&space;y^2&space;+&space;\left(\frac{dy}{dt}\right)^2&space;dt&space;\\&space;=&space;2\pi&space;\int_0^{2\pi}&space;\left(\frac{dx}{dt}&space;+&space;y\right)^2&space;+&space;2\pi&space;\int_0^{2\pi}&space;\left(\frac{dy}{dt}\right)^2&space;-&space;y^2&space;dt" title="L^2 - 4\pi A \\ = 2\pi \int_0^{2\pi} \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 dt - 2\int_{0}^{2\pi} (-y) dx \\ = 2\pi \int_0^{2\pi} \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + 2y\frac{dx}{dt} dt \\ = 2\pi \int_0^{2\pi} \left(\frac{dx}{dt} + y\right)^2 - y^2 + \left(\frac{dy}{dt}\right)^2 dt \\ = 2\pi \int_0^{2\pi} \left(\frac{dx}{dt} + y\right)^2 + 2\pi \int_0^{2\pi} \left(\frac{dy}{dt}\right)^2 - y^2 dt" /></a><br /><br />We want to conclude that this expression is non-negative for any curve. The first integral is clearly non-negative, since its integrand is a square. As for the second integral, the result is not obvious until one consider its Fourier series (this is called Wirtinger's inequality):<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=y(t)&space;=&space;\frac{a_0}{2}&space;+&space;\sum&space;(a_n&space;\sin(nt)&space;+&space;b_n&space;\cos(nt))" target="_blank"><img src="http://latex.codecogs.com/gif.latex?y(t)&space;=&space;\frac{a_0}{2}&space;+&space;\sum&space;(a_n&space;\sin(nt)&space;+&space;b_n&space;\cos(nt))" title="y(t) = \frac{a_0}{2} + \sum (a_n \sin(nt) + b_n \cos(nt))" /></a><br /><br />As we will see soon, for this to work, we must place the curve in coordinates such that:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=a_0&space;=&space;\frac{1}{\pi}&space;\int_0^{2\pi}&space;y(t)&space;dt&space;=&space;0" target="_blank"><img src="http://latex.codecogs.com/gif.latex?a_0&space;=&space;\frac{1}{\pi}&space;\int_0^{2\pi}&space;y(t)&space;dt&space;=&space;0" title="a_0 = \frac{1}{\pi} \int_0^{2\pi} y(t) dt = 0" /></a><br /><br />This can always be done, as the placement of the curve is arbitrary: if the integral of y(t) is V, then the translated function y(t)-V/(2<span style="background-color: white; color: #444444; font-family: 'Lucida Sans Unicode', 'Lucida Grande', sans-serif; font-size: 18px; line-height: 23.390625px;">π</span>) will have integral 0.<br /><br />Next, differentiate y(t) to obtain:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=\frac{dy}{dt}&space;=&space;\sum&space;(na_n&space;\cos(nt)&space;-&space;nb_n&space;\sin(nt))" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\frac{dy}{dt}&space;=&space;\sum&space;(na_n&space;\cos(nt)&space;-&space;nb_n&space;\sin(nt))" title="\frac{dy}{dt} = \sum (na_n \cos(nt) - nb_n \sin(nt))" /></a><br /><br />Then, by Parseval's identity:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=\int_0^{2\pi}&space;\left(\frac{dy}{dt}\right)^2&space;dt&space;=&space;\sum&space;n^2(a_n^2+b_n^2)&space;\geq&space;\sum&space;(a_n^2&space;+&space;b_n^2)&space;=&space;\int_0^{2\pi}&space;y(t)^2&space;dt" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\int_0^{2\pi}&space;\left(\frac{dy}{dt}\right)^2&space;dt&space;=&space;\sum&space;n^2(a_n^2+b_n^2)&space;\geq&space;\sum&space;(a_n^2&space;+&space;b_n^2)&space;=&space;\int_0^{2\pi}&space;y(t)^2&space;dt" title="\int_0^{2\pi} \left(\frac{dy}{dt}\right)^2 dt = \sum n^2(a_n^2+b_n^2) \geq \sum (a_n^2 + b_n^2) = \int_0^{2\pi} y(t)^2 dt" /></a><br /><br />which concludes the proof. (The requirement a0=0 is to ensure that the last inequality holds for n=0).<br />The inequality thus obtained is known as the isoperimetric inequality:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=4\pi&space;A&space;\leq&space;L^2" target="_blank"><img src="http://latex.codecogs.com/gif.latex?4\pi&space;A&space;\leq&space;L^2" title="4\pi A \leq L^2" /></a><br /><br /><h2 style="text-align: left;">Only the Circle Yields Maximum Area</h2><div>As we saw in the beginning, the circle turns the inequality into an equality, thus yielding the maximum possible area. But this still leaves open the possibility that other shapes may be equally good. We can show that only the circle yields max area by considering when the inequality becomes an equality: precisely when the two integrals are zero.<br /><br />Starting with the second integral, the inequality was obtained from comparing the square sums after applying Parseval's identity. To have equality, the terms in those sums must be 0 except when n=1, which means that<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=y(t)&space;=&space;a_1&space;\sin(t)&space;+&space;b_1&space;\cos(t)&space;=&space;C&space;\sin(t+\alpha)" target="_blank"><img src="http://latex.codecogs.com/gif.latex?y(t)&space;=&space;a_1&space;\sin(t)&space;+&space;b_1&space;\cos(t)&space;=&space;C&space;\sin(t+\alpha)" title="y(t) = a_1 \sin(t) + b_1 \cos(t) = C \sin(t+\alpha)" /></a><br /><br />The first integral must also be zero, which means that<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=\frac{dx}{dt}&space;=&space;-y(t)" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\frac{dx}{dt}&space;=&space;-y(t)" title="\frac{dx}{dt} = -y(t)" /></a><br /><br /></div>Integrating, we get<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=x(t)&space;=&space;C&space;\cos(t+\alpha)&space;+&space;D" target="_blank"><img src="http://latex.codecogs.com/gif.latex?x(t)&space;=&space;C&space;\cos(t+\alpha)&space;+&space;D" title="x(t) = C \cos(t+\alpha) + D" /></a><br /><br />So indeed we must have a circle centered at (D,0).<br /><br />The fact that the circle must be centered around y=0 is due to our previous choice of fixing the integral of y(t) over one period. The center of mass must be such that equally much is positive and negative, which for a circle clearly means fixing it around y=0.<br /><br /><h2 style="text-align: left;">How general is this proof?</h2><div>The use of Green's theorem requires that we deal with a curve (x(t),y(t)) that has a continuous derivative except in finitely many points (piecewise smooth). Piecewise smoothness is also enough to ensure that the Fourier series can be differentiated term by term. Any area you can physically construct will certainly satisfy these conditions, as will anything you draw with a pen, since it is impossible to create or draw infinitely many edged points.<br /><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0tag:blogger.com,1999:blog-6767898351325918278.post-17388436916507616652013-11-12T00:12:00.000+08:002013-11-12T00:12:07.820+08:00Mathematical Realism and Anti-realism: all the way down to the Nature of the Universe itself<div dir="ltr" style="text-align: left;" trbidi="on">There's this age old question of where mathematics come from: is it a product of the human mind, or does it exist independently of it? Here's the labels:<br /><br />Realism: A realist believes that math exists outside of our minds. We discover mathematics, we don't invent it. Mathematics is real.<br /><br />Anti-realism: An anti-realist believes that mathematics exists within our minds. We invent mathematics, we don't discover it. Mathematics is not physically real.<br /><br />Realists argue that mathematicians make discoveries long before they become useful in physics, and that can't be a coincidence. When Einstein wanted to develop general relativity, non-Euclidean geometry had already been developed. Quantum mechanics is based on Hilbert spaces (functional analysis), which had also already been developed by mathematicians.<br /><br />Anti-realists, on the other hand, argue that the truth of mathematical statements depend on assumptions, i.e. mathematics does not have any unconditional truths (apart from logical tautologies, such as "A implies A"). For example, let's reconsider geometry: there are various kinds of them, the Euclidean, and many non-Euclidean. For a long time the ancients saw Euclidean geometry as self evident, although we today know that this "self evident fact" is incorrect.<br /><br />But let's take one question very seriously: why are the properties of the universe so mathematical?<br /><br />It's almost as if the universe is made of mathematics, given that the laws of physics are mathematical relations (and moreover, they may all be computable). This hints towards a possible physical answer to the great debate between these two camps.<br /><br />The universe may appear to be mathematical because it IS mathematical. The universe is a formal axiomatic system, and we exist as part of those axioms. The assertion may in fact be physically provable: if we discover a theory of everything (i.e. the ultimate laws of physics), then the universe is obviously mathematical, and realism turns out to be true: pi is not in the sky, but in the formalization of the laws of nature. As a special case, we it turns out that all physical processes of nature are computable, then the universe is not only mathematical, but a Turing-equivalent computer.<br /><br />On the other hand, it may turn out that the universe is not mathematical, and we "discover" mathematical properties of the universe only because our mind creates them. This may sound crazy at first, but one may note that many concept are in fact purely man made (although indispensable to everyday life): concepts such as human, car, alive/dead, and temperature all rely on human interpretation of a huge amount of matter ("a macroscopic phenomena"). It is not fundamental to the universe: a car is just an arrangement of particles/energy ("a state"), no more special than a pile of junk. Put differently, the universe may be "blind" to our high level concepts, and that may actually include most of physics, possibly with the exception of quantum mechanics (although that is not certain either). From this perspective, the universe is not mathematical: we interpret it as having mathematical properties, and that approximate some things very well, but will never grasp the full extent of it (that is the case right now). In that case, anti-realists win: mathematics is only in the mind (yet very useful to our existence).<br /><br /></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com5tag:blogger.com,1999:blog-6767898351325918278.post-91956932545690508582013-10-21T17:41:00.000+08:002015-05-27T19:50:50.440+08:00How to iterate over the values of a std::map<div dir="ltr" style="text-align: left;" trbidi="on">EDIT: Changed example to better reflect the generality of the problem = title of this post. END EDIT.<br /><br />Now that C++ has lambdas, the C++ STL algorithms are practically useful, but how do you iterate over the values of a std::map?<br /><br />Say you have a function template:<br /><br /><pre><span style="color: maroon; font-weight: bold;">template</span> <span style="color: purple;"><</span><span style="color: maroon; font-weight: bold;">typename</span> Iter<span style="color: purple;">></span> <span style="color: maroon; font-weight: bold;">void</span> f<span style="color: #808030;">(</span>Iter curr<span style="color: #808030;">,</span> Iter end<span style="color: #808030;">)</span><span style="color: purple;">;</span></pre><br />f assumes the iterators point to std::strings:<br /><br /><pre><span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">vector</span><span style="color: purple;"><</span><span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">string</span><span style="color: purple;">></span> v<span style="color: purple;">;</span><br /></pre><pre><span style="color: purple;">f(v.begin(),v.end()); // ok</span></pre><pre><span style="color: #666616;"><br /></span></pre><pre><span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">map</span><span style="color: purple;"><</span><span style="color: maroon; font-weight: bold;">int</span><span style="color: #808030;">,</span><span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">string</span><span style="color: purple;">></span> m<span style="color: purple;">;</span></pre><pre>f<span style="color: #808030;">(</span>m<span style="color: #808030;">.</span>begin<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">,</span>m<span style="color: #808030;">.</span>end<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: dimgrey;">// Won't work: *m.begin() is a pair<Key,Value></span></pre><pre></pre>What we thus need is a way of converting the map::iterator (and map::const_iterator) to an iterator to always dereferences the Value type of the pair.<br /><pre></pre><h2>Solution using Boost</h2>The solution is easy using boost:<br /><br /><pre><span style="color: #004a43;">#</span><span style="color: #004a43;">define</span><span style="color: #004a43;"> BOOST_RESULT_OF_USE_DECLTYPE</span><br /><span style="color: #004a43;">#</span><span style="color: #004a43;">include </span><span style="color: maroon;"><</span><span style="color: #40015a;">boost/iterator/transform_iterator.hpp</span><span style="color: maroon;">></span><br /><br /><span style="color: dimgrey;">// First, define a lambda to get the second element of a pair:</span><br /><span style="color: maroon; font-weight: bold;">auto</span> get_second <span style="color: #808030;">=</span> <span style="color: #808030;">[</span><span style="color: #808030;">]</span><span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">const</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">pair</span><span style="color: purple;"><</span><span style="color: maroon; font-weight: bold;">const</span> <span style="color: maroon; font-weight: bold;">int</span><span style="color: #808030;">,</span><span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">string</span><span style="color: purple;">></span><span style="color: #808030;">&</span> p<span style="color: #808030;">)</span><span style="color: purple;">{</span> <span style="color: maroon; font-weight: bold;">return</span> p<span style="color: #808030;">.</span>second<span style="color: purple;">;</span> <span style="color: purple;">}</span><span style="color: purple;">;</span><br /><br /><span style="color: dimgrey;">// Then, we can convert a map iterator into an iterator that automatically dereferences the second element</span><br /><span style="color: maroon; font-weight: bold;">auto</span> beg <span style="color: #808030;">=</span> boost<span style="color: purple;">::</span>make_transform_iterator<span style="color: #808030;">(</span>m<span style="color: #808030;">.</span>begin<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">,</span>get_second<span style="color: #808030;">)</span><span style="color: purple;">;</span><br /><span style="color: maroon; font-weight: bold;">auto</span> end <span style="color: #808030;">=</span> boost<span style="color: purple;">::</span>make_transform_iterator<span style="color: #808030;">(</span>m<span style="color: #808030;">.</span>end<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">,</span>get_second<span style="color: #808030;">)</span><span style="color: purple;">;</span></pre><pre></pre><pre>f(beg,end); // ok, works!<br /><br /></pre><div>The line</div><div>#define BOOST_RESULT_OF_USE_DECLTYPE</div><div>is needed so inform the boost libraries that the result type (Value in this case) should be inferred using decltype(), rather than by requiring a result_of typedef (prior to C++11, decltype did not exist).</div><div><br /></div><h2 style="text-align: left;">Solution without Boost</h2><div>If you can't use boost, you'll need to implement the dereferencing by hand. Here's the code:</div><div><br /></div><div><pre><span style="color: #004a43;">#</span><span style="color: #004a43;">include </span><span style="color: maroon;"><</span><span style="color: #40015a;">map</span><span style="color: maroon;">></span><br /><span style="color: #004a43;">#</span><span style="color: #004a43;">include </span><span style="color: maroon;"><</span><span style="color: #40015a;">iterator</span><span style="color: maroon;">></span><br /><br /><span style="color: maroon; font-weight: bold;">template</span> <span style="color: purple;"><</span><span style="color: maroon; font-weight: bold;">typename</span> Iter<span style="color: purple;">></span><br /><span style="color: maroon; font-weight: bold;">class</span> map_iterator <span style="color: purple;">:</span> <span style="color: maroon; font-weight: bold;">public</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">iterator</span><span style="color: purple;"><</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>bidirectional_iterator_tag<span style="color: #808030;">,</span><span style="color: maroon; font-weight: bold;">typename</span> Iter<span style="color: purple;">::</span>value_type<span style="color: purple;">::</span>second_type<span style="color: purple;">></span> <span style="color: purple;">{</span><br /><span style="color: maroon; font-weight: bold;">public</span><span style="color: #e34adc;">:</span><br /> map_iterator<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span><span style="color: purple;">}</span><br /> map_iterator<span style="color: #808030;">(</span>Iter j<span style="color: #808030;">)</span> <span style="color: purple;">:</span> i<span style="color: #808030;">(</span>j<span style="color: #808030;">)</span> <span style="color: purple;">{</span><span style="color: purple;">}</span><br /> map_iterator<span style="color: #808030;">&</span> <span style="color: maroon; font-weight: bold;">operator</span><span style="color: #808030;">+</span><span style="color: #808030;">+</span><span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span> <span style="color: #808030;">+</span><span style="color: #808030;">+</span>i<span style="color: purple;">;</span> <span style="color: maroon; font-weight: bold;">return</span> <span style="color: #808030;">*</span><span style="color: maroon; font-weight: bold;">this</span><span style="color: purple;">;</span> <span style="color: purple;">}</span><br /> map_iterator <span style="color: maroon; font-weight: bold;">operator</span><span style="color: #808030;">+</span><span style="color: #808030;">+</span><span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">int</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span> <span style="color: maroon; font-weight: bold;">auto</span> tmp <span style="color: #808030;">=</span> <span style="color: #808030;">*</span><span style="color: maroon; font-weight: bold;">this</span><span style="color: purple;">;</span> <span style="color: #808030;">+</span><span style="color: #808030;">+</span><span style="color: #808030;">(</span><span style="color: #808030;">*</span><span style="color: maroon; font-weight: bold;">this</span><span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: maroon; font-weight: bold;">return</span> tmp<span style="color: purple;">;</span> <span style="color: purple;">}</span><br /> map_iterator<span style="color: #808030;">&</span> <span style="color: maroon; font-weight: bold;">operator</span><span style="color: #808030;">-</span><span style="color: #808030;">-</span><span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span> <span style="color: #808030;">-</span><span style="color: #808030;">-</span>i<span style="color: purple;">;</span> <span style="color: maroon; font-weight: bold;">return</span> <span style="color: #808030;">*</span><span style="color: maroon; font-weight: bold;">this</span><span style="color: purple;">;</span> <span style="color: purple;">}</span><br /> map_iterator <span style="color: maroon; font-weight: bold;">operator</span><span style="color: #808030;">-</span><span style="color: #808030;">-</span><span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">int</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span> <span style="color: maroon; font-weight: bold;">auto</span> tmp <span style="color: #808030;">=</span> <span style="color: #808030;">*</span><span style="color: maroon; font-weight: bold;">this</span><span style="color: purple;">;</span> <span style="color: #808030;">-</span><span style="color: #808030;">-</span><span style="color: #808030;">(</span><span style="color: #808030;">*</span><span style="color: maroon; font-weight: bold;">this</span><span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: maroon; font-weight: bold;">return</span> tmp<span style="color: purple;">;</span> <span style="color: purple;">}</span><br /> <span style="color: maroon; font-weight: bold;">bool</span> <span style="color: maroon; font-weight: bold;">operator</span><span style="color: #808030;">=</span><span style="color: #808030;">=</span><span style="color: #808030;">(</span>map_iterator j<span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">const</span> <span style="color: purple;">{</span> <span style="color: maroon; font-weight: bold;">return</span> i <span style="color: #808030;">=</span><span style="color: #808030;">=</span> j<span style="color: #808030;">.</span>i<span style="color: purple;">;</span> <span style="color: purple;">}</span><br /> <span style="color: maroon; font-weight: bold;">bool</span> <span style="color: maroon; font-weight: bold;">operator</span><span style="color: #808030;">!</span><span style="color: #808030;">=</span><span style="color: #808030;">(</span>map_iterator j<span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">const</span> <span style="color: purple;">{</span> <span style="color: maroon; font-weight: bold;">return</span> <span style="color: #808030;">!</span><span style="color: #808030;">(</span><span style="color: #808030;">*</span><span style="color: maroon; font-weight: bold;">this</span> <span style="color: #808030;">=</span><span style="color: #808030;">=</span> j<span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: purple;">}</span><br /> reference <span style="color: maroon; font-weight: bold;">operator</span><span style="color: #808030;">*</span><span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span> <span style="color: maroon; font-weight: bold;">return</span> i<span style="color: #808030;">-</span><span style="color: #808030;">></span>second<span style="color: purple;">;</span> <span style="color: purple;">}</span><br /> <span style="color: #603000;">pointer</span> <span style="color: maroon; font-weight: bold;">operator</span><span style="color: #808030;">-</span><span style="color: #808030;">></span><span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span> <span style="color: maroon; font-weight: bold;">return</span> <span style="color: #808030;">&</span>i<span style="color: #808030;">-</span><span style="color: #808030;">></span>second<span style="color: purple;">;</span> <span style="color: purple;">}</span><br /><span style="color: maroon; font-weight: bold;">protected</span><span style="color: #e34adc;">:</span><br /> Iter i<span style="color: purple;">;</span><br /><span style="color: purple;">}</span><span style="color: purple;">;</span><br /><br /><span style="color: maroon; font-weight: bold;">template</span> <span style="color: purple;"><</span><span style="color: maroon; font-weight: bold;">typename</span> Iter<span style="color: purple;">></span><br /><span style="color: maroon; font-weight: bold;">inline</span> map_iterator<span style="color: purple;"><</span>Iter<span style="color: purple;">></span> make_map_iterator<span style="color: #808030;">(</span>Iter j<span style="color: #808030;">)</span> <span style="color: purple;">{</span> <span style="color: maroon; font-weight: bold;">return</span> map_iterator<span style="color: purple;"><</span>Iter<span style="color: purple;">></span><span style="color: #808030;">(</span>j<span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: purple;">}</span><br /><br /><br /><span style="color: dimgrey;">// We can now do:</span></pre><pre></pre><pre> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">map</span><span style="color: purple;"><int,</span><span style="color: maroon; font-weight: bold;">std::string</span><span style="color: purple;">></span> m<span style="color: purple;">;</span></pre><pre> // ...<br /> <span style="color: #666616;">f</span><span style="color: #808030;">(</span>make_map_iterator<span style="color: #808030;">(</span>m<span style="color: #808030;">.</span>begin<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #808030;">,</span>make_map_iterator<span style="color: #808030;">(</span>m<span style="color: #808030;">.</span>end<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /></pre><pre><span style="color: purple;"><br /></span></pre><pre><span style="color: purple;"><br /></span></pre></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com2tag:blogger.com,1999:blog-6767898351325918278.post-74735270929601029052013-10-13T14:35:00.001+08:002017-06-06T02:53:24.586+08:00STL Container Performance<div dir="ltr" style="text-align: left;" trbidi="on"><br /><h2 style="text-align: left;">STL Container Performance Table</h2>There's already a good <a href="http://stackoverflow.com/questions/181693/what-are-the-complexity-guarantees-of-the-standard-containers" target="_blank">table at Stack Overflow</a> listing time complexity (in Big O notation) of common operations with C++ STL containers for comparison, although it's structured in a more abstract way and a little hard to read because it's not a HTML table. Here's my version, which also includes priority_queue (technically an adaptor).<br /><br />Persistent iterators means that the iterators are not invalidated by insertions and erase (except when erasing the element referred to by the iterator, which is necessarily invalidated).<br /><br /><table style="border-collapse: collapse; border: 1px solid black;"><tbody><tr><th style="border: 1px solid black; padding: 3px;">Container</th><th style="border: 1px solid black; padding: 3px;">Insertion</th><th style="border: 1px solid black; padding: 3px;">Access</th><th style="border: 1px solid black; padding: 3px;">Erase</th><th style="border: 1px solid black; padding: 3px;">Find</th><th style="border: 1px solid black; padding: 3px;">Persistent Iterators</th></tr><tr><td style="border: 1px solid black; padding: 3px;">vector / string</td><td style="border: 1px solid black; padding: 3px;"><span style="white-space: nowrap;">Back: O(1) or O(n)</span><br /><span style="white-space: nowrap;">Other: O(n)</span></td><td style="border: 1px solid black; padding: 3px;">O(1)</td><td style="border: 1px solid black; padding: 3px;"><span style="white-space: nowrap;">Back: O(1)</span><br /><span style="white-space: nowrap;">Other: O(n)</span></td><td style="border: 1px solid black; padding: 3px;"><span style="white-space: nowrap;">Sorted: O(log n)</span><br />Other: O(n)</td><td style="border: 1px solid black; padding: 3px;">No</td></tr><tr><td style="border: 1px solid black; padding: 3px;">deque</td><td style="border: 1px solid black; padding: 3px;">Back/Front: O(1)<br />Other: O(n)</td><td style="border: 1px solid black; padding: 3px;">O(1)</td><td style="border: 1px solid black; padding: 3px;">Back/Front: O(1)<br />Other: O(n)</td><td style="border: 1px solid black; padding: 3px;"><span style="white-space: nowrap;">Sorted: O(log n)</span><br />Other: O(n)</td><td style="border: 1px solid black; padding: 3px;"><span style="white-space: nowrap;">Pointers only</span></td></tr><tr><td style="border: 1px solid black; padding: 3px;">list / forward_list</td><td style="border: 1px solid black; padding: 3px;">Back/Front: O(1)<br />With iterator: O(1)<br />Index: O(n)</td><td style="border: 1px solid black; padding: 3px;"><span style="white-space: nowrap;">Back/Front: O(1)</span><br /><span style="white-space: nowrap;">With iterator: O(1)</span><br /><span style="white-space: nowrap;">Index: O(n)</span></td><td style="border: 1px solid black; padding: 3px;"><span style="white-space: nowrap;">Back/Front: O(1)</span><br /><span style="white-space: nowrap;">With iterator: O(1)</span><br /><span style="white-space: nowrap;">Index: O(n)</span></td><td style="border: 1px solid black; padding: 3px;">O(n)</td><td style="border: 1px solid black; padding: 3px;">Yes</td></tr><tr><td style="border: 1px solid black; padding: 3px;">set / map</td><td style="border: 1px solid black; padding: 3px;">O(log n)</td><td style="border: 1px solid black; padding: 3px;">-</td><td style="border: 1px solid black; padding: 3px;">O(log n)</td><td style="border: 1px solid black; padding: 3px;">O(log n)</td><td style="border: 1px solid black; padding: 3px;">Yes</td></tr><tr><td style="border: 1px solid black; padding: 3px;">unordered_set / unordered_map</td><td style="border: 1px solid black; padding: 3px;"><span style="white-space: nowrap;">O(1) or O(n)</span></td><td style="border: 1px solid black; padding: 3px;"><span style="white-space: nowrap;">O(1) or O(n)</span></td><td style="border: 1px solid black; padding: 3px;"><span style="white-space: nowrap;">O(1) or O(n)</span></td><td style="border: 1px solid black; padding: 3px;">O(1) or O(n)</td><td style="border: 1px solid black; padding: 3px;">Pointers only</td></tr><tr><td style="border: 1px solid black; padding: 3px;">priority_queue</td><td style="border: 1px solid black; padding: 3px;">O(log n)</td><td style="border: 1px solid black; padding: 3px;">O(1)</td><td style="border: 1px solid black; padding: 3px;">O(log n)</td><td style="border: 1px solid black; padding: 3px;">-</td><td style="border: 1px solid black; padding: 3px;">-</td></tr></tbody></table><br /><h2 style="text-align: left;">Always O(1): begin(), end(), empty(), size(), push_back()</h2><div>The following operations are always O(1) when they exist:</div><div><ol style="text-align: left;"><li>begin(), end()</li><li>empty()</li><li>size() (note that list::size() was not necessarily O(1) prior to C++11)</li><li>push_front() (note that std::vector does not have push_front(), as it would not be O(1))</li><li>push_back()</li></ol><div><br /></div></div><h2 style="text-align: left;">Some Additional Notes</h2><h3 style="text-align: left;"></h3><h3 style="text-align: left;">Adaptors: queue and stack</h3>For std::queue and std::stack, complexity depends on the underlying container used (by default std::deque).<br /><br /><h3 style="text-align: left;">vector</h3>std::vector has constant time (O(1)) back insertion provided no reallocation needs to take place (use reserve/capacity to allocate/check). When reallocation is necessary, all elements are copied (<a href="http://john-ahlgren.blogspot.hk/2014/03/stdvector-with-constructors-that-throw.html">or moved, if possible</a>) to a need memory location. It is guaranteed that back insertion is amortized constant, meaning: "if we perform a large amount of back insertions, the average time for back insertion is constant".<br /><br />Insertion does not invalidate iterators as long as no reallocation is performed (when reallocating, all iterators become invalid). Deletion invalidates all iterators after the deleted element, iterators to elements before are still valid.<br /><br /><h3 style="text-align: left;">deque</h3>Insertion and deletion of elements in a std::deque may invalidate all its iterators. Pointers are however persistent. In practice accessing / iterating over a std::vector is faster than std::deque.<br /><br />All iterators may become invalid after an insertion or deletion, but pointers/references are always valid.<br /><br /><h3 style="text-align: left;">list</h3><div>If you have an iterator to an element, you can insert right after (or before) that element in constant time (O(1)). Of course, you can also erase it or access it directly (O(1)) using the iterator (or any adjacent element, as ++iterator / --iterator are constant time operations).</div><div><br /></div><div>If you only know the index, e.g. that you wish to insert/retrieve/erase the 4th element, you'll need to iterate the list until you reach that element. Put differently: std::list does not provide random access.<br /><br /></div><h3 style="text-align: left;">sorted vector and deque</h3><div>To search for an element in a sorted std::vector or std::deque, use std::equal_range. If only the first element is needed, there is std::lower_bound. If you only want to know whether an element exists or not, there is std::binary_search.<br /><br /></div><h3 style="text-align: left;">set and map</h3><div>Requires a less-than comparison function. Complexities also apply to multiset and multimap.<br /><br /></div><h3 style="text-align: left;">unordered_set and unordered_map (hash tables)</h3>unordered_set and unordered_map has constant time performance on all operations provided no collisions occur. When collisions occur, traversal of a linked list containing all elements of the same bucket (those that hash to the same value) is necessary, and in the worst case, there is only one bucket; hence O(n).<br /><br />Requires a hash function and equality comparison function. Complexities also apply to unordered_multiset and unordered_multimap.<br /><br />Deletion does not invalidate any iterators (other than erased element). Insertion keeps all iterators valid as long as no rehashing is done. When rehashing is performed, all iterators become invalid. Pointers/references to elements always remain valid.<br /><br /><h3 style="text-align: left;">multiset, multimap, unordered_multiset, unordered_multimap</h3>std::multiset and std::multimap follow the same rules as std::set and std::map.<br />std::unordered_multiset and std::unordered_multimap follow the same rules as std::unordered_set and std::unordered_map.<br /><div><br /></div><div>The only reason they are not listed in the table/throughout this document is to save space.</div><br /><h3>basic_string</h3>Strictly speaking std::string and std::wstring are typedefs for basic_string, which is a container. What applies to string above applies more generally to basic_string (and hence, to std::wstring too).<br /><br />Note that prior to C++11 basic_string was not required to store its elements (characters) contiguously. Now it acts as std::vector, except its only valid for POD types and some tricks that don't violate the constraints may be employed (in practice: <a href="http://john-ahlgren.blogspot.hk/2012/03/small-string-optimization-and-move.html" target="_blank">small string optimization</a>).<br /><br /><h3 style="text-align: left;">priority_queue</h3>Requires a less-than comparison function. Always gives the greatest element (according to comparison function) when top() is called. top() is constant time, but pop() requires O(log n) as the queue needs to be rearranged (to ensure next top() correctly gives greatest element).<br /><br /><br /></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com7tag:blogger.com,1999:blog-6767898351325918278.post-73108904878806696072013-10-06T20:36:00.000+08:002015-05-02T04:50:59.620+08:00How to Crack Door Codes and Suitcases Fast<div dir="ltr" style="text-align: left;" trbidi="on"><h3 style="text-align: left;">Cracking Code Locks</h3><br />When I was still a student in Linköping University, my friends were staying in student apartments where the main entrance used a code lock. The lock required a four digit passcode, but there was something unusual about it: it lacked the green "OK"/"Enter" button to confirm once choice. Instead, the code was automatically verified as the digits were inserted. This may not seem like such a big deal, but, as we will see, it is actually.<br /><br />Consider that the code is 1234. Let us now say that you are returning home on a Saturday morning at 3.30 AM and wish to enter your building. Since you are not at your best, you happen to start by pressing a "2" instead of the "1". It's not such a big deal, you think, and you now press "1234", upon which the door opens. It works as you expected, but there is one crucial detail here: you actually inserted the string "21234", so the first four digits are in fact "2123", which is not the right code. When you then inserted the "4", the verification mechanism clearly knew that it should check the last three digits "123", and append your newly inserted "4".<br /><br />Clearly then, you have just tried two code: "2123" and "1234", and you have in fact done so with only 5 key presses instead of the 4*2=8 you would normally expect. This is due to the missing "OK" button (and perhaps a "Cancel"/"Restart" button).<br /><br />This convenience thus comes at a price: if someone wishes to crack the lock (by guessing the code that is, not physically breaking it), that cracker could reuse previously inserted digits as part of the new code. For instance, consider the following. First, we insert "0000", which tries precisely that code. Then, we insert a "1", upon which we are trying the code "0001". If we insert a "0" again now, we are trying "0010" (as the code lock is constantly remembering the last three digits inserted). We can clearly take shortcuts, but how many? If we could constantly insert new codes, without ever having to return to a previous code, we would effectively try all 10^4 = 10000 codes by pressing only 10003 digits (the 10000 codes + 3 digits we need to start the process).<br /><br />At first glance, it's not even clear if it's possible to find such a sequence. Consider a code that does not deal with digits from 0 to 9, but only the binary 0 and 1. If the code is two digits long, then there are four different possible codes: 00, 01, 10, and 11 (not a very useful code lock for practical purposes, but it serves as an easy example for us to understand the problem). Normally, trying each code means we need to press 2*4=8 buttons (excluding the "OK" in between each). But if we don't have the "OK" button, so that the code lock uses a memory, we can in fact try all four codes by pressing "00110" (five presses instead of eight). To see why, consider each two consecutive digits in the sequence: 00, 01, 11, 10.<br /><br />Now consider using binary digits again (only two buttons are available: 0 and 1), but with a code of length 3. We now have 8 possible codes: 000, 001, 010, 011, 100, 101, 110, 111. If we start with 000, we need to then append a 1 (otherwise we retry 000), so we begin with 0001. We could then start with 0 again, as the sequence 00010 would try 000, 001, 010 (just look at three consecutive digits in the sequence). Let's add another 0 and see what happens. The sequence is now 000100, which tries 000, 001, 010, 100. These are unique codes, which is great. However, we now run into problems, as the last 2 digits are 00, exactly what we began with. We have already tried both 000 and 001, so we are now forced into retrying a code!<br /><br />The point is that some sequences will repeat previously tried codes, which is a waste of time. The sequences that do <b>not</b> repeat previous codes are known as De Bruijn sequences. So with binary digits and a length of 2, the sequence 00110 is a De Bruijn sequence, because it tries the codes: 00, 01, 11, 10, that is, all possible combinations exactly once each. The sequence 00011 is not a De Bruijn sequence, as it tries 00 two times (in the first two trials) and does not try the code 10.<br /><br />Put differently, De Bruijn sequences are the shortest possible sequence of button presses needed to try all codes on doors with code memory (without "OK" button).<br /><br />Do De Bruijn sequences exist for all number of digits and lengths (k and n)? We succeeded in finding such for binary digits with length 2 (k=2, n=2), but not for binary digits with length 3 (k=2,n=3). And what about the real world door codes, with has 10 digits and typically length 4 (k=10, n=4)?<br /><br />It turns out that De Bruijn sequences exist for all possible digits k and length n. This means that a real world door code can be cracked in 10003 key presses instead of the expected 40000 (4 per code, 10000 codes). That is, it can be cracked four times faster!<br /><br />To see that De Bruijn sequences always exist (and how to find them), we first make some observations:<br /><ol style="text-align: left;"><li>At any point, we have a <b>state</b>, which are the n-1 digits previously inserted (the "memory").</li><li>We then make k different choices, each giving us a code. Put differently: at each point (after initializing with n-1 digits), we have n-1 digits in state and we choose one more digit (k choices) to obtain a new code of length n.</li><li>To get a code, say 1234, there is in fact only one way to reach it: from the state 123, and adding the digit 4.</li></ol><div>We can picture this process in a graph, where the <b>nodes represent states</b>, and the <b>links represent codes</b>. Thus, note that attempts to try a code are on the links (edges), not nodes (vertices). Below you can see such a graph for k=2 (only binary digits) and n=4 (all codes are composed for four binary digits). The correct code might thus be e.g. 1001 or 0111.</div><div><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://3.bp.blogspot.com/-CwFn-sk5zcA/VUPnJOkOr3I/AAAAAAAA5Ck/RF2zTh8haE0/s1600/De_bruijn_graph-for_binary_sequence_of_order_4.svg.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://3.bp.blogspot.com/-CwFn-sk5zcA/VUPnJOkOr3I/AAAAAAAA5Ck/RF2zTh8haE0/s1600/De_bruijn_graph-for_binary_sequence_of_order_4.svg.png" height="320" width="228" /></a></div><br /></div><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div>Clearly, the question is now: can we traverse this graph in a way that passes each link once and only once? Note that we may be in the same state multiple times. In fact this is always necessary: to try both 0000 and 0001, we need to be in the state 000 first. However, we don't want to try any code more than once, so each link should only be visited once. This is known as a Eulerian path (or cycle, if we finish at the same node as we started). Now, each node in the graph will have exactly k links going out and k links going in (k=2 in the depicted graph), since there are k digits to choose from in each state, and k ways to throw away a digit to obtain the state. For example, in the graph above, there are two ways out of the state 000: by adding another 0, or by adding a 1. This corresponds to trying code 0000 and 0001. There are also two ways to reach 000: from 0000, and from 1000, as we throw away the left most digit (oldest digit).</div><div><br /></div><div>Since there are equally many links pointing in and out (namely, k), we can be sure there is an Eulerian cycle, so our problem is indeed always solvable! In fact, all solutions are given precisely by all Eulerian cycles, so any algorithm to find cycles will do (<a href="http://en.wikipedia.org/wiki/Eulerian_path#Hierholzer.27s_algorithm" target="_blank">Hierholzer's algorithm</a> is not only an efficient and intuitive way of finding such cycles, but also provides a proof of existence).</div><div><br /></div><h3 style="text-align: left;">Cracking Suitcases</h3><div><br /></div><div>Cracking suitcases is similar (k=10, n=3), except that we can now rotate each wheel independently. Thus, in one "click" we can go from 000 to 001, 010, 100, 009, 090, and 900. These six "next codes" corresponding to rotate one of the three wheels one step "up" or "down" (to their adjacent values). If we were to depict the solutions as a graph, where each node again corresponds to state and links to solutions, we have an important difference: the state is now the previous solution (we don't throw away any digits). From each state there are 6 links going out, and in fact, all these links are also going in, since we can always turn the wheel back one step. Thus we may consider the undirected graph where each node has 6 links, and, since every node has an even degree, it is guaranteed that an Eulerian cycle exists.</div><div><br /></div><div>Note that the "obvious" solution of enumerating does not work: 000, 001, 002, 003, ... 008, 009 is fine, but going from 009 to 010 requires two switches: turning the least significant 9 to 0, and the middle 0 to 1.</div><div><br /></div><div>In this manner, we only need to rotate the "wheels" a total of 1000 times, as each rotation tries a new code (and there are precisely 1000 codes, namely 000 to 999).</div><div><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com7tag:blogger.com,1999:blog-6767898351325918278.post-64437385092699908502013-09-26T21:41:00.003+08:002013-09-27T00:40:54.451+08:00Measuring Execution Time Accurately and Setting Deadlines<div dir="ltr" style="text-align: left;" trbidi="on">If you do benchmarking, a common task is to measure the execution time of a particular block of code.<br /><br />The canonical and portable way to do this in C++ is as follows:<br /><br /><pre><span style="color: #004a43;">#</span><span style="color: #004a43;">include </span><span style="color: maroon;"><</span><span style="color: #40015a;">chrono</span><span style="color: maroon;">></span></pre><pre><span style="color: dimgrey;">// Starting timepoint</span><br /><span style="color: maroon; font-weight: bold;">const</span> <span style="color: maroon; font-weight: bold;">auto</span> start_time <span style="color: #808030;">=</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>steady_clock<span style="color: purple;">::</span>now<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /><span style="color: dimgrey;">// ... Code here ...</span><br /><span style="color: maroon; font-weight: bold;">double</span> time_in_seconds <span style="color: #808030;">=</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>duration_cast<span style="color: purple;"><</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>milliseconds<span style="color: purple;">></span><br /> <span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>steady_clock<span style="color: purple;">::</span>now<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">-</span> start_time<span style="color: #808030;">)</span><span style="color: #808030;">.</span><span style="color: #603000;">count</span><span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">/</span> <span style="color: green;">1000.0</span><span style="color: purple;">;</span></pre><pre></pre>This will measure execution time in seconds, with millisecond precision. If you want better precision, you can of course go for say microsecond resolution:<br /><div><br /></div><div><pre><span style="color: dimgrey;">// Starting timepoint</span><br /><span style="color: maroon; font-weight: bold;">const</span> <span style="color: maroon; font-weight: bold;">auto</span> start_time <span style="color: #808030;">=</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>steady_clock<span style="color: purple;">::</span>now<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /><span style="color: dimgrey;">// ... Code here ...</span><br /><span style="color: maroon; font-weight: bold;">double</span> time_in_seconds <span style="color: #808030;">=</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>duration_cast<span style="color: purple;"><</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>microseconds<span style="color: purple;">></span><br /> <span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>steady_clock<span style="color: purple;">::</span>now<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">-</span> start_time<span style="color: #808030;">)</span><span style="color: #808030;">.</span><span style="color: #603000;">count</span><span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">/</span> <span style="color: green;">1000000.0</span><span style="color: purple;">;</span></pre></div><div><span style="color: purple;"><br /></span></div>If your code block may throw exceptions, and the execution time is actually updated to some variable that is outside of this scope, you may want to catch the exception:<br /><div><br /></div><div><pre><span style="color: dimgrey;">// time_in_seconds reference to variable declared outside this scope</span><br /><span style="color: dimgrey;">// Starting timepoint</span><br /><span style="color: maroon; font-weight: bold;">const</span> <span style="color: maroon; font-weight: bold;">auto</span> start_time <span style="color: #808030;">=</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>steady_clock<span style="color: purple;">::</span>now<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /><span style="color: maroon; font-weight: bold;">try</span> <span style="color: purple;">{</span><br /> <span style="color: dimgrey;">// ... Code here ...</span><br /><span style="color: purple;">}</span> <span style="color: maroon; font-weight: bold;">catch</span> <span style="color: #808030;">(</span><span style="color: #808030;">.</span><span style="color: #808030;">.</span><span style="color: #808030;">.</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span><br /> <span style="color: dimgrey;">// Update execution time and rethrow</span><br /> time_in_seconds <span style="color: #808030;">=</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>duration_cast<span style="color: purple;"><</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>milliseconds<span style="color: purple;">></span><br /> <span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>steady_clock<span style="color: purple;">::</span>now<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">-</span> start_time<span style="color: #808030;">)</span><span style="color: #808030;">.</span><span style="color: #603000;">count</span><span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">/</span> <span style="color: green;">1000.0</span><span style="color: purple;">;</span><br /> <span style="color: maroon; font-weight: bold;">throw</span><span style="color: purple;">;</span><br /><span style="color: purple;">}</span><br /><span style="color: dimgrey;">// Normal execution ended, update time</span><br />time_in_seconds <span style="color: #808030;">=</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>duration_cast<span style="color: purple;"><</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>milliseconds<span style="color: purple;">></span><br /> <span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>steady_clock<span style="color: purple;">::</span>now<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">-</span> start_time<span style="color: #808030;">)</span><span style="color: #808030;">.</span><span style="color: #603000;">count</span><span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">/</span> <span style="color: green;">1000.0</span><span style="color: purple;">;</span></pre><br />steady_clock is a monotonic clock: its value never decreases. This can be compared to std::chrono::system_clock, which can in fact decrease if the user changes the value. There's also std::chrono::high_resolution_clock which may or may not be monotonic (as the name suggests, it is intended to primarily be used as a high resolution clock, i.e. ideally with nanosecond precision).</div><div><br /></div><div>The start_time is a timepoint, and taking the difference of two timepoints (the two now()s) yields a duration. We then convert the duration into millisecond "ticks", which are obtained using count(). Alternatively, we can convert the duration into microsecond "ticks" (as shown above), or even nanosecond ticks. Finally, we refactor the numerical value from milliseconds to seconds (this step is obviously not needed, but is used because it is many times easier to use SI units).</div><div><br /></div><div><b>Deadlines </b>can be created in a similar way:</div><div><br /></div><div><pre><span style="color: dimgrey;">// Define type used for deadline</span><br /><span style="color: maroon; font-weight: bold;">typedef</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>steady_clock<span style="color: purple;">::</span>time_point deadline_t<span style="color: purple;">;</span><br /><br />deadline_t soon <span style="color: #808030;">=</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>steady_clock<span style="color: purple;">::</span>now<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">+</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>minutes<span style="color: #808030;">(</span><span style="color: #008c00;">2</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /><span style="color: dimgrey;">// or, equivalently: now() + std::chrono::seconds(2*60);</span><br /><br /><span style="color: maroon; font-weight: bold;">for</span> <span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">const</span> <span style="color: maroon; font-weight: bold;">auto</span><span style="color: #808030;">&</span> e <span style="color: purple;">:</span> vec<span style="color: #808030;">)</span> <span style="color: purple;">{</span><br /> <span style="color: maroon; font-weight: bold;">if</span> <span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>chrono<span style="color: purple;">::</span>steady_clock<span style="color: purple;">::</span>now<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">></span> soon<span style="color: #808030;">)</span> <span style="color: maroon; font-weight: bold;">throw</span> out_of_time<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> <span style="color: dimgrey;">// ... process vector elements ...</span><br /><span style="color: purple;">}</span></pre><pre><span style="color: purple;"><br /></span></pre><pre><span style="color: purple;"><br /></span></pre></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com3tag:blogger.com,1999:blog-6767898351325918278.post-62489827484040161602013-08-19T18:53:00.000+08:002013-08-19T18:53:15.753+08:00Fast (and pretty easy) Currency Conversion without Calculators<div dir="ltr" style="text-align: left;" trbidi="on"><div dir="ltr" style="text-align: left;" trbidi="on"><div dir="ltr" style="text-align: left;" trbidi="on"><div dir="ltr" style="text-align: left;" trbidi="on"><div dir="ltr" style="text-align: left;" trbidi="on"><div dir="ltr" style="text-align: left;" trbidi="on">With the prevalence of mobile phones, it's easy to convert between currencies by simply using the phone's calculator. If you're like most people, however, you won't be doing this while shopping on the street markets of Thailand or anywhere else abroad, provided the sum is not too large (say anything below 100 USD).<br /><br />Here's a practical way to convert between currencies fast and easy, all done in your head.<br /><br />Consider some practical examples.<br /><br /><b>USD to EUR conversion</b><br />We want to convert from USD to Euros (EUR).<br />Google tells me that today, 1 USD = 0.75 EUR.<br />How much is 67 USD in EUR?<br /><br />The answer is 67*0.75 EUR, but how do you actually do that calculation in your head?<br />Note that 0.75 = 3/4, so we have 67 * 3 / 4. From a mathematical point of view, it doesn't matter if we multiply 67 by 3 first, and then divide it by 4, or if we divide 67 by 4 first, then multiply by 3. It is however much easier from humans (and computers!) to deal with small numbers, so we start by dividing to get a smaller number.<br /><br />How do you divide 67 by 4? The answer is that you half it two times: 67 / 4 = 67 / (2*2) = 67 / 2 / 2.<br />How do you halve 67? 67 is 60+7, and if we take half of each and then add them, we'll have the correct answer. Half of 60 is 30, and half of 7 is approximately 4 (we stick to integers). Thus the answer is 30+4 = 34. Repeating once more (remember, we halve two times), 34 = 30+4, and halving each gives 15+2 = 17.<br /><br />Now, we multiply 17 by 3 using the same technique of splitting 17 into 10+7 and then multiplying (instead of dividing) by 3: 3*10 = 30, and 3*7 = 21, so you get 30+21 = 51.<br /><br />In one go:<br /><br /><br /></div><a href="http://www.codecogs.com/eqnedit.php?latex=\frac{67}{2\cdot&space;2}\cdot&space;3&space;=&space;\frac{60+7}{2\cdot&space;2}&space;\cdot&space;3&space;\approx&space;\frac{30+4}{2}\cdot&space;3&space;=&space;(15+2)\cdot&space;3&space;=&space;17&space;\cdot&space;3&space;=&space;(10+7)\cdot&space;3&space;=&space;30+21&space;=&space;51" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\frac{67}{2\cdot&space;2}\cdot&space;3&space;=&space;\frac{60+7}{2\cdot&space;2}&space;\cdot&space;3&space;\approx&space;\frac{30+4}{2}\cdot&space;3&space;=&space;(15+2)\cdot&space;3&space;=&space;17&space;\cdot&space;3&space;=&space;(10+7)\cdot&space;3&space;=&space;30+21&space;=&space;51" title="\frac{67}{2\cdot 2}\cdot 3 = \frac{60+7}{2\cdot 2} \cdot 3 \approx \frac{30+4}{2}\cdot 3 = (15+2)\cdot 3 = 17 \cdot 3 = (10+7)\cdot 3 = 30+21 = 51" /></a> <br /><br />67 USD is approximately 51 EUR. The exact answer is 50.25 EUR.<br /><br />What you need to remember is thus:<br /><ol style="text-align: left;"><li>The general operations: halve the number two times, then multiply by 3</li><li>How to halve and multiply by 3 using the "splitting" trick.</li></ol><div>Let's we want to do the opposite now: convert euros into USD. 1/0.75 = 1.33, so 1 EUR = 1.33 USD.</div><div><br /></div><div>Let's say we have 247 EUR. This is 247*1.33 USD, but how do we calculate it easily?</div><div><br /></div><div>First, we omit the second decimal, i.e. 1.33 is approximately 1.3. So we only need to do 247 * 1.3. First, using the splitting trick, we see that:</div><div><br /></div><a href="http://www.codecogs.com/eqnedit.php?latex=247&space;\cdot&space;1.3&space;=&space;247&space;\cdot&space;(1+0.3)&space;=&space;247&space;+&space;247\cdot&space;0.3" target="_blank"><img src="http://latex.codecogs.com/gif.latex?247&space;\cdot&space;1.3&space;=&space;247&space;\cdot&space;(1+0.3)&space;=&space;247&space;+&space;247\cdot&space;0.3" title="247 \cdot 1.3 = 247 \cdot (1+0.3) = 247 + 247\cdot 0.3" /></a><br /><div><br /></div><div>So the hard part is a little easier now: we need to estimate 247*0.3. Noting that 0.3 = 3/10, and doing the division before multiplication as before, we get:<br /><br /></div></div><a href="http://www.codecogs.com/eqnedit.php?latex=247\cdot&space;0.3&space;=&space;247\cdot&space;\frac{3}{10}&space;\approx&space;25\cdot&space;3&space;=&space;75" target="_blank"><img src="http://latex.codecogs.com/gif.latex?247\cdot&space;0.3&space;=&space;247\cdot&space;\frac{3}{10}&space;\approx&space;25\cdot&space;3&space;=&space;75" title="247\cdot 0.3 = 247\cdot \frac{3}{10} \approx 25\cdot 3 = 75" /></a><br /><br />So the answer is 247 + 75. A rough estimate is given by 250+75 = 250 + 50 + 25 = 325.<br /><br />So 247 EUR is approximately 325 USD. The exact answer is 328.51 USD.<br /><br />It seems then that the trick is to come up with ways to multiply and divide by decimal numbers, such as 0.75 and 0.3. This can always be done, since all conversion rates are rational numbers, meaning they can be written in the form Integer/Integer. By using a fairly simple numerator and denumerator, we can then perform the estimation fast. Multiplication is in general easy, division much harder. Here's a couple of tricks:<br /><br /><b>Divide by 2 (halving)</b><br />This is simply taking half the quantity. If you need to divide say 573 by 2, then use the splitting technique: 573 = 500 + 70 + 3. Halving each gives approximately 250 + 35 + 2 = 250 + 37 = 287.<br /><br /><b>Divide by 4 and 8 (halving many times)</b><br />Also very easy: dividing by 4 is the same as halving twice. Dividing by 8 is the same as halving three times.<br /><br /><b>Divide by 10</b><br /><div>The easiest to do. 128 divided by 10 is 12.8. In general, XYZ divided by 10 will be XY.Z, where X,Y and Z are digits (0,1,2,3,...,9). We then simply round to nearest integer> 12.8 is closest to 13.</div><div><b><br /></b></div><b>Divide by 5 ("divide by 10 and double")</b><br />This one's really easy too: x/5 = x/10*2, so all you need to do is divide by 10 (using the method above), and then double the result. Example: 128/5 can be calculated first dividing 128 by 10, giving 13. Doubling gives 26.<br /><br /><b>Divide by 3 ("take average of 2 and 5")</b><br />This one's a bit harder. A nice way to estimate x/3 that builds on the techniques already mentioned is to take the average of x/2 and x/5.<br />Example: 175 divided by 3 can be approximated by noting that:<br />175/5 = 18*2 = 36<br />175/2 = (100 + 75 + 5) / 2 = 50 + 37 + 3 = 50 + 40 = 90<br />Taking averages, we get (36+90)/2 = 126/2 = (100+20+6)/2 = 50 + 10 + 3 = 63.<br />The exact answer is 58.333.<br />In general, this technique gives (x/2 + x/5)/2 = 7x/20 = 0.35x, instead of 0.3333x.<br />(It may seem more intuitive to take average of x/2 and x/4, but this gives a worst estimate of 0.375x.)<br /><br /><b>Divide by 7 ("take average of 5 and 10")</b><br />Also a bit harder. When doing x/7, we can take the average of dividing by 5 and 10.<br />384/7 is somewhere between:<br />384/10 = 38<br />384/5 = 38*2 = (30+8)*2 = 60 + 16 = 76<br />Taking averages, we get (38+76)/2 = (100 + 14)/2 = 50 + 7 = 57<br /><div>The exact answer is close to 55.</div>This techniques calculates (x/5 + x/10)/2 = 3x/20 = 0.15x instead of x/7 = 0.143, so it's a good estimate for relatively small sums of money.<br /><br />Here's another example. 1 THB (Thai Baht) is 0.032 USD. How much is the saleswoman asking for in USD when she says 650 THB? The answer is 650*0.032 USD. To estimate this quantity, we manipulate the conversion rate 0.032 into rules of the form above. Note that 0.032 = 32/1000 which we approximate as 30/1000 = 3/100 (32/1000 is (30+2)/1000 = 3/100 + 2/1000, so we're skipping a term containing two parts in 1000).<br /><br />Calculating 650 * 3/100 is easy: do the division first (dividing by 100 is dividing by 10 two times, since 100 = 10*10). We then get 6.5, which we round to 7. Multiplying by 3 gives 21. Here's the full estimate in one line:<br /><br /></div><a href="http://www.codecogs.com/eqnedit.php?latex=650\cdot&space;3/100&space;=&space;65&space;\cdot&space;3/10&space;\approx&space;7&space;\cdot&space;3&space;=&space;21" target="_blank"><img src="http://latex.codecogs.com/gif.latex?650\cdot&space;3/100&space;=&space;65&space;\cdot&space;3/10&space;\approx&space;7&space;\cdot&space;3&space;=&space;21" title="650\cdot 3/100 = 65 \cdot 3/10 \approx 7 \cdot 3 = 21" /></a><br /><br />Let's now convert from GBP (Brittish pound) to USD. 1 GBP = 1.56 USD. How much is Harrods asking you in USD when they say 489 GBP?<br /><br />First we deal with the conversion rate. 1.56. This is close to 1.6, which we instead use. 1.6 = 16/10, and looking at prime factors we have:<br /><br /></div></div><a href="http://www.codecogs.com/eqnedit.php?latex=\frac{16}{10}&space;=&space;\frac{2^4}{2\cdot&space;5}&space;=&space;\frac{2^3}{5}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\frac{16}{10}&space;=&space;\frac{2^4}{2\cdot&space;5}&space;=&space;\frac{2^3}{5}" title="\frac{16}{10} = \frac{2^4}{2\cdot 5} = \frac{2^3}{5}" /></a><br /><br />This works: we can divide by 5 and then double 3 times. It's however rather tedious, as if we allow a little more rounding error, 1.56 is also close to 1.5, which gives:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=\frac{15}{10}&space;=&space;\frac{3\cdot&space;5}{2\cdot&space;5}&space;=&space;\frac{3}{2}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\frac{15}{10}&space;=&space;\frac{3\cdot&space;5}{2\cdot&space;5}&space;=&space;\frac{3}{2}" title="\frac{15}{10} = \frac{3\cdot 5}{2\cdot 5} = \frac{3}{2}" /></a><br /><div><br /></div><div>A much simpler calculation: we half, then multiply by 3.</div><div>Thus, estimating 489 GBP is done as follows:</div><div>489/2 = (400 + 80 + 9)/2 = 200 + 40 + 5 = 245.</div><div>245 * 3 = (200 + 40 + 5)*3 = 600 + 120 + 15 = 720 + 15 = 735.</div><div><br /></div><div>489 GBP is approximately 735 USD (although for such large amounts I would suggest you use a calculator!).</div><div><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0tag:blogger.com,1999:blog-6767898351325918278.post-13666870869241201192013-06-24T04:34:00.000+08:002013-06-24T04:38:34.592+08:00Intuitive Explanation for False Positives and False Negatives<div dir="ltr" style="text-align: left;" trbidi="on">Let's say you go to your favorite local bar, and do what most people do in bars: propose marriage to anyone of the opposite sex (that you find attractive enough, whatever that means to you).<br /><br />When everything's said and done, there's only two ways it can go: Yes (success) or No (failure).<br />Furthermore, let us, for the sake of argument, assume that you only approach those you believe will say yes (perhaps to save yourself some embarrassment, although the reason doesn't matter).<br /><br />Let's also say you'd like to keep track of your success rate, so you count the number of people you've approached, as well as the number of "Yes" you got (you don't stop as soon as you get a Yes, perhaps because you want the best Yes possible).<br /><br />The <b>hit rate </b>is the proportion of success you've had:<br /><br />Hit Rate = (Number of Yes) / (Number of Yes + Number of No) = (Number of Yes) / (Number of People Asked)<br /><br />This seems like a good measure for success, except you've forgot something crucial: there's two ways things can go wrong:<br /><ol style="text-align: left;"><li>You asked someone and got a No (that's the above situation).</li><li>You didn't asked someone that would have said Yes.</li></ol><div><br /></div><div>When you ask someone and they say Yes, that's called a <b>true positive</b>. When you ask someone and they say No, it's a <b>false positive</b>: positive because you thought they would say Yes, and false because you were wrong.</div><div><br /></div><div>When you don't ask someone (because you think they will say No), and they would have said no, it's called a <b>true negative</b>. If, however, you never asked but they would have said Yes, that's a missed opportunity, a <b>false negative</b>: negative because you thought they would say No, false because you were wrong.</div><div><br /></div><div>It may seem strange to talk about what would happen if we did something we never did, but the context in which this is performed in science is usually an instrument that is supposed to detect something. For example, if we do a test for a certain disease, a "positive" means that the diseases is there, and a "negative" means that the disease is not there (which to call "positive" and "negative" is largely a matter of definition, it is just a binary classification). We could thus verify the correctness of the "guesses" by using samples for which we know the true state (or by repeating the experiment multiple times, assuming we know the probability of success is more than 50%).</div><div><br /></div><div>If the experiment succeeds it's a "true", otherwise "false". We can visualize the four options: The instrument (or human) "guesses", and the classification happens according to that guess and the real world state.</div><div><br /></div><table border="1"><tbody><tr><td></td><td>True</td><td>False</td></tr><tr><td>Positive</td><td>Guess Yes, Real Yes</td><td>Guess Yes, Real No</td></tr><tr><td>Negative</td><td>Guess No, Real No</td><td>Guess No, Real Yes</td></tr></tbody></table><div><br /></div><div><br />In this more formal language, the hit rate is defined as:<br /><br />Hit Rate = (True Positives) / (True Positives + False Negatives) = (True Positives) / |"Everything Actually Positive"|.<br /><br /></div><div>The danger in situations like the marriage proposals is that we don't observe the Outcome of the Predicted No, since we never experience that alternate reality. Thus the True Negatives and False Negatives are hidden from view, and we can't measure the proportion of missed opportunities:<br /><br /></div><div>Miss Opportunities = (False Negatives) / (False Negatives + True Negatives).<br /><br />This measure is interesting in our bar example but usually not used in other contexts. In general, a <a href="http://en.wikipedia.org/wiki/Confusion_matrix" target="_blank">table of confusion</a> can be used to list all prediction/outcomes. What should be clear is that the Hit Rate is not necessarily the best measure of success in general, as it doesn't account for False Positives.<br /><br />Likewise, an instrument that is good at detecting a disease has few False Negatives, but what about the amount of people that are healthy but the instrument incorrectly classifies as having the disease (False Positives)? This is usually less of a priority in this case, since it is seems less bad to classify healthy people as sick than sick people as healthy.<br /><br />Whether it is the False Positives or False Negatives we would like to detect best depends on how we describe the problem: in this case we call healthy people "negatives" and sick people "positives" (it thus has nothing to do with having a "positive" or "negative" outlook on things!).<br /><br />It is usually a matter of prioritizing: if the machine can aggressively detect traces of a disease, it may also incorrectly find patterns that are in fact not due to the disease. Put differently, the machine is <b>sensitive, </b>meaning that has a high hit rate.<br /><br />The "opposite" of sensitivity (hit rate) is <b>specificity</b>: how many healthy people that are classified as being healthy.<br /><br />Specificity = |"Guessed Healthy"| / |"All Healthy"| = (True Negative) / (True Negative + False Positive)<br /><br />One measure that accounts for everything (False Positives as well as False Negatives) is <b>Accuracy</b>:<br /><br />Accuracy = (True Positives + True Negatives) / (True Positives + True Negatives + False Positives + False Negatives) = |"Correct Predictions"| / |"All Predictions"|<br /><br /><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0tag:blogger.com,1999:blog-6767898351325918278.post-70334885231849072632013-05-09T19:54:00.000+08:002013-05-09T21:02:47.304+08:00How to Estimate the Number of Candies in a Jar (and Pack Tightly)<div dir="ltr" style="text-align: left;" trbidi="on">Two related questions:<br /><ol style="text-align: left;"><li>How do you pack balls (spheres) as tightly as possible?</li><li>How do you estimate the number of candies in a jar?</li></ol><div><br /></div><h2 style="text-align: left;">Packing Spheres Tightly</h2><div>The relevant quantity to study is the following: given a certain amount of volume of space T, how much volume of balls S can we put into that space? The ratio S/T is called the packing factor, and it gives the proportion of space that is occupied by balls.</div><div><br /></div><div>There are various ways to pack. The following picture shows three suggestions (the balls are sliced so that you only see the parts that are inside the cube):</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="http://4.bp.blogspot.com/-kUpgCK5BHUE/UYt_aWuKsOI/AAAAAAAAzdY/EqW5dLzzyx8/s1600/packing.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://4.bp.blogspot.com/-kUpgCK5BHUE/UYt_aWuKsOI/AAAAAAAAzdY/EqW5dLzzyx8/s1600/packing.jpg" /></a></div><div><br /></div><div>In the left most figure (simple cubic), we see the "corners" of eight different balls; the packing is essentially putting balls on top of each other (and next to each other) in a grid like fashion.</div><div><br /></div><div>In the right most (body centered cubic), we put one ball int he middle and surround it by eight other balls. So instead of stacking balls like a grid, they "sink in to each other".</div><div><br /></div><div>The middle one (face centered cubic) is similar to body centered cubic in that the balls "sink in", but here we "smash them together" not only at their corners, but also on the sides (the blue balls).</div><div><br /></div><div>It should be intuitively clear that the simple cubic is not the best way to pack balls as densely as possible.</div><div>In fact, the picture suggests that the face centered cubic is better than the body centered cubic.</div><div>More interesting, we would like to know the percentage of wasted space (or equivalently, used space) for each configuration, as well as the best we can in principle do.</div><div><br /></div><div>Define some useful variables:</div><div>L = Length of one side in the cube.</div><div>R = Radius of a ball.</div><div><br /><h4 style="text-align: left;">Packing Factor for Body Centered Cubic</h4></div><div>Now we calculate the packing factor for the body centered cubic:</div><div><br /><a href="http://www.codecogs.com/eqnedit.php?latex=T%20=%20L^3" target="_blank"><img src="http://latex.codecogs.com/gif.latex?T = L^3" title="T = L^3" /></a><br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=S%20=%20\frac{4\pi%20R^3}{3}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?S = 2\cdot \frac{4\pi R^3}{3}" title="S = 2\cdot \frac{4\pi R^3}{3}" /></a></div><div><br />The factor 2 in S comes the fact that inside the volume T, we have 2 balls: the one in the middle, and the 8 other pieces, which are one eight in side each (it is quite easy to see that they can be put together into a ball by looking at the picture).<br /><br />Dividing S by T should give the answer, except that we haven't figured out the relationship between the length of the cube, L, and the radius of the balls, R.<br /><br />The diagonal from side corner to the opposite (farthest away) will be of length <a href="http://www.codecogs.com/eqnedit.php?latex=\sqrt{3}%20L" target="_blank"><img src="http://latex.codecogs.com/gif.latex?D = \sqrt{3} L" title="D = \sqrt{3} L" /></a>, which can be seen as follow: a diagonal across one surface of the cube is <a href="http://www.codecogs.com/eqnedit.php?latex=\sqrt{2}%20L" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\sqrt{2} L" title="\sqrt{2} L" /></a> by the Pythagorean theorem, and so <a href="http://www.codecogs.com/eqnedit.php?latex=D^2%20=%20L^2%20@plus;%20(\sqrt{2}%20L)^2%20=%203%20L^2" target="_blank"><img src="http://latex.codecogs.com/gif.latex?D^2 = L^2 + (\sqrt{2} L)^2 = 3 L^2" title="D^2 = L^2 + (\sqrt{2} L)^2 = 3 L^2" /></a>.<br /><br />Looking at the picture (c) above, the balls on each corner touch the middle one, so looking at a straight line from one corner to the other (of length D), we are traversing 4R of length: <a href="http://www.codecogs.com/eqnedit.php?latex=D%20=%204L" target="_blank"><img src="http://latex.codecogs.com/gif.latex?D = 4R" title="D = 4R" /></a>. From our two identities involving D, we get:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=L%20=%20\frac{4L}{\sqrt{3}}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?L = \frac{4L}{\sqrt{3}}" title="L = \frac{4L}{\sqrt{3}}" /></a><br /><br />We can now express our T in terms of R too, which gives:<br /><div><br /><a href="http://www.codecogs.com/eqnedit.php?latex=\frac{S}{T}%20=%20\frac{2\cdot%204\pi%20R^3%20\cdot%203\sqrt{3}}{3\cdot%204^3%20R^3}%20=%20\frac{\sqrt{3}\pi}{8}%20\approx%2068\%" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\frac{S}{T} = \frac{2\cdot 4\pi R^3 \cdot 3\sqrt{3}}{3\cdot 4^3 R^3} = \frac{\sqrt{3}\pi}{8} \approx 68\%" title="\frac{S}{T} = \frac{2\cdot 4\pi R^3 \cdot 3\sqrt{3}}{3\cdot 4^3 R^3} = \frac{\sqrt{3}\pi}{8} \approx 68\%" /></a><br /><br /><br />That is, using body centered cubic packing (picture c), we use up about 68% of the available space. Equivalently, we thus waste 32%.<br /><br /><h4 style="text-align: left;">Packing Factor for Face Centered Cubic</h4>Looking at the blue balls in picture b, we see that there's 6 of them (four on the sides, one on top and bottom). Each is half a ball, so this gives 3 balls. We get one more ball by putting together the 8 corner pieces, for a total of 4 balls. Thus:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=S%20=%20\frac{4\pi%20R^3}{3}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?S = 4\cdot \frac{4\pi R^3}{3}" title="S = 4\cdot \frac{4\pi R^3}{3}" /></a><br /><br />The relationship between T and R may however have changed, as we are no longer centering around a specific ball. This time, going from one corner to another (D above) won't work since there are gaps in between the balls when doing so (this can be seen in the picture). However, looking at one side, we see that the side diagonal crosses a length of 4R (with no gaps), and this length is also <a href="http://www.codecogs.com/eqnedit.php?latex=\sqrt{2}L" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\sqrt{2}L" title="\sqrt{2}L" /></a>. We thus get:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=\frac{S}{T}%20=%20\frac{4\cdot%204\pi%20R^3%20\cdot%202%20\sqrt{2}}{3%20\cdot%204^3%20R^3}%20=%20\frac{\sqrt{2}\pi}{6}%20\approx%2074\%" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\frac{S}{T} = \frac{4\cdot 4\pi R^3 \cdot 2 \sqrt{2}}{3 \cdot 4^3 R^3} = \frac{\sqrt{2}\pi}{6} \approx 74\%" title="\frac{S}{T} = \frac{4\cdot 4\pi R^3 \cdot 2 \sqrt{2}}{3 \cdot 4^3 R^3} = \frac{\sqrt{2}\pi}{6} \approx 74\%" /></a><br /><br />Face centered packing is thus more compact than body centered (as the picture suggests), using 6% more of the space.<br /><br /><h4 style="text-align: left;">Can we do even better?</h4></div></div><div>There are other ways of packing which gives the 74% above, but are there even better ways?</div><div>Kepler postulated that the answer should be no, and there is a computer generated proof (a proof by exhaustion) that this is impossible: Hales' proof. Thus the answer is no, except that mathematicians are somewhat suspicious of the proof given the prevalence of bugs in computer code (the non-computer parts of the proof are considered reliable).</div><div><br /></div><div><div>So to answer: almost surely not.</div></div><div><br /></div><div><br /></div><h2 style="text-align: left;">How do you estimate the number of candies in a jar?</h2><div>Ignoring more intrusive methods, such as weighting the jar or x-raying the bottle and counting each candy, we obtain a way based on packing factors:</div><div><ol style="text-align: left;"><li>Estimate the volume of the jar</li><li>Estimate the percentage of the jar that is filled with candies (the packing factor defined above)</li><li>This gives the volume of candies. Divide that volume by the estimated volume of one candy.</li></ol></div><div>You may be tempted to use the 74% above as your packing factor in step 2, but this is in fact not usually the right thing to do, because there's no guarantee that pouring down (round) candies into a jar will give a perfect packing. In fact, it isn't so: simply pouring down balls will give a packing factor of around <b>64%</b>, that is, closer to something like the body rather than face centered packing. So it is 64% you should use as your packing factor - under the reasonable assumption that candies were simply poured into the jar, as opposed to, say, carefully put into their optimal position one by one.</div><div><br /></div><div>Also, if the candies are not round, the packing factor may be somewhat altered. The exact mathematics easily becomes forbiddingly complex (say for teddybear shaped candies), so you're left with two choices: use the approximation 64% (perhaps adjusted according to some reasonable logic), or buy the same jar, fill it with the same candy, pour out all the candies again, and count them.</div><div><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0tag:blogger.com,1999:blog-6767898351325918278.post-83888203815767235352013-04-29T23:29:00.003+08:002013-04-30T00:17:37.448+08:00Intuitive Explanation why Horn and Dual Horn Clause Logics are Linear Time Solvable<div dir="ltr" style="text-align: left;" trbidi="on"><h2 style="text-align: left;">Propositional Logic</h2>In propositional logic (equivalently, boolean algebras), a common problem is to find a true/false assignment to all boolean variables such that all formulas are satisfied.<br /><br />In mathematics, we would call this a <b>solution</b>, in logic, it's called a <b>model</b>:<br /><br />2x + 3y = 16 has an integer solution {x=5, y = 2} (there are infinitely many integer solutions).<br />x and not y has a model {x = true, y = false}<br /><br />In general, the problem of finding models to propositional formulas is NP-complete.<br />It is by many believed to be exponential: in the worst case, the best known algorithms do no better than the truth table method of trying out all 2^N possibilties, where N is the number of boolean variables (in practice DPLL based algorithm like Chaff perform significantly better across almost all problems).<br /><br />There are however a few subclasses of problems that are linear time solvable. Two such that have practical value are Horn and Dual Horn Clause logic, which we consider here.<br /><br />If you don't know it yet, all propositional formulas can be written in <a href="http://en.wikipedia.org/wiki/Conjunctive_normal_form" target="_blank">Conjunctive Normal Form</a>, which you'll need to know about before reading on.<br /><br /><h2 style="text-align: left;">Horn Clause Logic (propositional)</h2><div>Horn clauses are clauses that contain at most one positive literal.</div><div>I'll assume that the empty clause is not part of the database: if it is, there's obviously no model (this can be checked in linear time by simply traversing all clauses).</div><div><br /></div><div>Let's start with the case where all clauses have no positive literals. Then all literals in all clauses are negative, so there is no conflict between any literals: all we need to do in order to satisfy all clauses is set all variables to false.</div><div><br /></div><div>Now assume some (or all) clauses have a positive literal. If they also have a negative literal, we're back to our previous situation: by setting all literals to false, all clauses (include these new ones) are satisfied.</div><div><br /></div><div>So the problem is when we have unit clause with one positive literal (and therefore, no negative literals). Clearly, we have no choice but to set these literals to true. This triggers a cascade of inferences by which other literals become unit. For example:</div><div><br /></div><div>If we have the unit clause [X] and [ -X, Y ], we have no choice but to set X=1.</div><div>Then the second clause reduces to [ Y ], which in turn propagates Y=1.</div><div>And so on. This process is known as <b>unit propagation</b>.</div><div><br /></div><div>After we've finished unit propagating, all necessary variable assignments have been performed. For the remaining variables, we have the same situation as before: they occur in non-unit clauses, so we can trivially handle them by assigning everything else to false. The algorithm is thus:</div><div><br /></div><div><ol style="text-align: left;"><li>Unit Propagate (this may fail, e.g. if we have [X] and [-X], whereby there is no model).</li><li>Set remaining variables to false.</li></ol></div><div><br /></div><h2 style="text-align: left;">Linear Time Unit Propagation</h2><div>The obvious algorithm for unit propagation is to pick all unit clauses, put the literal in a queue (or stack, it doesn't matter), and then go through the queue of literals doing:</div><div><br /></div><div>1. Eliminate all clauses containing the literal (this step can be omitted if we use two literal watch, see below).</div><div>2. Remove negated literal from all clauses.</div><div>3. Scan for new unit clauses, put literal in queue and add to model.</div><div>4. Repeat until queue is empty.</div><div><br /></div><div>This is a loop nested inside another loop: for each literal (in the queue), we linearly scan all clauses to simplify them (step 1 and 2 above). Thus the complexity is O(N^2). The second step, of scanning through all literals, can be avoided by keeping a mapping from all literals (positive and negative literals are distinct) to clauses.</div><div><br /></div><div>For example:</div><div>C1 = [X]</div><div>C2 = [ -X, Y ]</div><div>C3 = [ -X, -Y ]</div><div><br /></div><div>Will give the table:</div><div>X : C1</div><div>Y : C2</div><div>-X: C2, C3</div><div>-Y: C3</div><div><br /></div><div>In step 1 and 2 above, we can thus directly go to the clauses containing the literal (step 1) and its negation (step 2). Note that if this table is stored as a std::map, it will have lookup time log(N). If we use a hash map (with linear chaining, i.e. std::unordered_map), we will ideally get O(1) but worst case will be O(N). A way to get a truly O(1) lookup is to store the literals in a jump table so that we can immediately get a list of the clauses (std::vector<std::vector<clause*>>).</div><div><br /></div><div>This presumes that we remove clauses/literals. If we use lazy datastructures, e.g. two literal watch, the obviously implementation of scanning for another available literal to watch yields a worst case of O(N) operations per literal deletion, so we end up having a O(N^2) worst case anyway. (That is, the obvious algorithm that had O(N^2) would actually be O(N^3) using two literal watch.) This is perhaps mostly academic, as collisions between assignments and watch literals are rare in large clauses. One could add another table to keep track of all available literals for each clause, making updating watched literals O(1). However, this benefit probably translate into worst performance when using the full DPLL algorithm on general problems, as backtracking would be more expensive: now we need to add/remove literals from this new table during every assignment/backtracking.</div><div><br />There is paper about linear time unit propagation <a href="http://www.cfdvs.iitb.ac.in/download/Docs/verification/papers/sat/original-papers/aim96.pdf" target="_blank">here</a>. It uses two literal watch and assumes that literals are updated in constant time.</div><h2 style="text-align: left;">What about Dual Horn Logic?</h2><div>In dual Horn logic, all clauses have at most one negative literal (instead of one positive, as in Horn logic).</div><div>So if we have non-unit literals, there must be at least one positive literal, and setting all variable assignments to true will thus be a model. If we have unit clauses, we unit propagate. The algorithm is:</div><div><br /></div><div><ol style="text-align: left;"><li>Unit Propagate (may fail, see Horn clause example).</li><li>Assign True to all remaining variables.</li></ol></div><div>So the only difference from Horn logic is that after unit propagation, we assign true to all remaining literals instead of false.</div><div><br /></div><h2 style="text-align: left;">Can DPLL solve Horn and Dual Horn Logic in Linear time?</h2><div>Yes, but only if we use the selection strategy that (first) sets all variables to false for Horn logic, and true for dual Horn logic. This works since the DPLL is essentially:</div><div><ol style="text-align: left;"><li>Unit Propagate.</li><li>If propagation fails, backtrack. Go to step 3.</li><li>If there are no more literals to assign, return the model.</li><li>Select unassigned variable (according to selection strategy). Go to step 1.</li></ol></div><div>If DPLL's backtrack fails, it will exit with a "no models available". It's easy to see that DPLL properly handled the Horn (Dual Horn) logic cases for these particular selection strategies, although with an overhead: after each assignment in step 4, DPLL tries to unit propagate, although no such propagation actually takes place. It then re-enters step 4 and assigns false (or true) to the next literal. This process is repeated until all remaining literals are set to false (true). In the efficient algorithms above: all remaining variables are directly set to false (true) with no unnecessary unit propagation between every assignment.<br /><br />As mentioned before, there may be issues with true linear time solvability when using two literal watch with schemes that are efficient for backtracking (O(N) updating) vs schemes that are efficient for quickly updating watched literals (O(1) updating, but constant time penalty when propagating and backtracking).</div><div><br /></div><div>The problem is that DPLL doesn't know that these assignments are safe in the Horn and Dual Horn cases (they are not "safe" in general, unit propagation is necessary, and this may trigger backtracking or solve the problem entirely).</div><div><br /></div><div>So intuitively, the reason why Horn and Dual horn logic are linear time solvable whereas the general case is exponential is because of choice points and backtracking: in the general case, we need to make guesses, and if these guesses are wrong, we need to backtrack and try the other branch. So we are forking the possible paths repeatedly. With Horn and Dual horn logic, no guesses need to be made, and hence, no backtracking occurs.</div><div><br /></div><h2 style="text-align: left;">What about First-order Horn Clause Logic?</h2><div>It is semi-decidable.</div><div><br /></div><div>Proof: one can encode integers and basic arithmetic operations in Prolog. Thus general diophantine equations can be expressed, and a query can be made that succeeds if and only if it has a solution. If queries would be decidable, this would give a decision procedure for diophantine equations, contradicting Matiyasevich's theorem. So it is undecidable. That it is semi-decidable now follows from the Church-Turing theorem.</div><div><br /></div><div><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com2tag:blogger.com,1999:blog-6767898351325918278.post-18374250667358527252013-04-22T21:50:00.000+08:002013-12-09T23:52:14.212+08:00Why Math is Useful in Natural Sciences and not in Social Sciences<div dir="ltr" style="text-align: left;" trbidi="on">I'll touch on some fundamental questions here:<br /><ol style="text-align: left;"><li>Why use Mathematics in Science?</li><li>Why does it work so poorly in Social Sciences (e.g. Economics, Psychology)?</li><li>Why does it work so well in Natural Sciences (e.g. Physics and Molecular Biology)?</li></ol><div><br /></div><h2 style="text-align: left;">Why use Mathematics in Science?</h2><div>You might think that mathematics is just about being more precise than words can convey. For examples, instead of conveying nuances such as "it's very warm today" and "it's a little warm today", we can be more precise and say "it's 32 degrees today" and "it's 23 degrees today". So it seems like numbers are useful to remove the subjective interpretation of "very warm" and "a little warm".</div><div><br /></div><div>But this is not the main strength of using numbers. The main strength is not that you can quantify things more precisely, but that you can reach conclusions that are otherwise completely inaccessible. Consider this:</div><div><br /></div><div>An artillery piece fires a shell right up into the sky. Will the shell return back to earth? You might think it's obvious that it will, in which case you'd be wrong. There's two opposing things here: gravity, which pulls the shell back to earth, and the velocity at which the shell was fired.</div><div><br /></div><div>And here's where the true power of science comes in: one can calculate that the velocity needed to escape earth's gravity is 40 000 km/h. That's a lot, but not impossibly much: in 1957, during Operation Plumbbob, the Pascal-B test detonated an extremely weak nuke (300 tons) which propelled a metal plate into the sky at 240 000 km/h, about 6 times escape velocity. That piece of metal is believed to have disintegrated in the atmosphere, but one things for sure: it did not return to earth.</div><div><br /></div><div>This is the strength of using quantities: we can tell that a pistol bullet will come back to earth, whereas a nuke propelled piece of metal will not. There's two opposing "forces": gravity and velocity, up and down. And because we have a formula to balance them, we can actually get an answer.</div><div><br /></div><div>Now compare this to the following:</div><div>Alice is offered $20 to eat a fresh cockroach. Does she say yes or no? Two opposing forces: money (good) and eating fresh cockroach (bad). Naturally, everybody's different, so the answer depends on the person. There's certainly validity to that argument, and I'll return to this issue when answering the next question.</div><div><br /></div><div>For now, I want to point out why quantitative science is useful: not only because we can quantity results, but because we would not even be able to reach some conclusions without using quantitative science. In particular, we can't balance two or more opposing things without using quantities.</div><div><br /></div><div>Sometimes numbers are not necessary. Here's an example from evolutionary psychology: on average, who is a person closest to among these four people: his/her maternal grandmother, maternal grandfather, paternal grandmother, or paternal grandfather? Which is a person the least close to? Evolutionary psychology is based on a set of basic assumptions (much like Newton's laws, except they're not quantitative but rather purely logical), from which one can derived that people tend to be closest to their maternal grandmothers, and least close to their paternal grandfathers? And indeed the predictions are correct. What is the logic behind this? That people know for sure that their mothers are theirs, but not so for their fathers. Thus there is a certainty gap that arises every time a parent is a male. Because people tend to favor their own kin (also a pillar of evolutionary psychology, follows from the modern insight that the unit of survival and replication is the gene), the conclusion follows.</div><div><br /></div><div>The argument is statistical in nature, but no quantities are involved (it is qualitative). You might argue that, due to the genetic nature of the assumptions, the argument is essentially one of probabilities over shared genes. It would be so if the assumption of evolutionary psychology where that narrow, but they are not: evolutionary psychology studies the interaction of genes in environments (and thus the assumptions are weaker too, but presumably more accurate).</div><div><br /></div><div>So in this example it works great, but we kindly run into dead ends again if we start balancing different aspects that are in conflict. The problem is that here we could conclude that a person should be close to their mother than father, and that the parent should be closer to their mother than father. So:</div><div><br /></div><div>closeness(mother(A)) > closeness(father(A)) => closeness(mother(mother(A))) > closeness(father(father(A)))</div><div></div><div>In short, there is no conflict: the inequalities go in the same direction both times. In form, it is similar to when arguing that if Alice is taller than Bob and Bob taller than Eve, then Alice is taller than Eve:</div><div><br /></div><div>height(A) > height(B) and height(B) > height(E) => height(A) > height(E).</div><div><br /></div><div>This holds true of numbers too, but numbers are not necessary for such arguments to hold, such as in this case, where we don't know the actual heights of anyone, yet can still draw a conclusion based on the fact that heights are ordered. Logically, a partial order is sufficient for this reasoning to go through (by definition, any transitive relation).</div><div><br /></div><div>In general, the more we assume, the more we can conclude. And quantitative sciences (those based on mathematics) make far more assumptions about their domain of study. Consequently, the theories are richer in predictions, and, as a result, research becomes more fruitful as more experiments can be performed.</div><div><br /></div><div>This is the reason why social sciences do not live up to the impressive achievements of physics: they are not quantitative. This, of course, is something to blame social scientists for. Or is it?</div><div><br /></div><h2 style="text-align: left;">Why does math work so poorly in Social Sciences?</h2><div>We know that social sciences cannot make strong predictions when many opposing features need to be considered together, due to their lack of quantitative precision. But why can't we get a quantitative precision?</div><div><br /></div><div>Here's a couple of explanations:</div><div><ol style="text-align: left;"><li>Social scientists are less smart than natural scientists, so they can't find a mathematical foundation.</li><li>Social sciences are more complex than natural sciences.</li><li>Social sciences are by nature non-mathematical, no mathematical foundation exists.</li></ol></div><div>The main reason why explanation (1) doesn't work is that a lot of mathematicians have worked in the fields of economic theory and behavioral sciences. In particular, John von Neumann, widely regarded as one of the greatest mathematicians to ever live, attempted to create a solid mathematical foundation for social sciences: Game Theory. He even showed that randomly bluffing in poker is not a non-mathematical aspect of the game, but actually falls out from the calculations as the optimal strategy.</div><div><br /></div><div>Von Neumann is not the only person who has made signifcant theoretical contributions to (theoretical studies of) social sciences, so we can safely rule out (1).</div><div><br /></div><div>This laid a foundation upon which countless economists have built their academic careers, but there's a reason why almost none have made it in the industry: it's is useless for practical purposes. It may well be that game theory can capture all of social behavior, the problem lies in the fact that it is not much easier to formalize the "game" correctly than it is to simply write down mathematical equations for human behaviors. We have created a mathematical framework that is no simpler than mathematics itself in terms of describing human behavior.</div><div><br /></div><div>So is there a mathematical foundation? Everybody knows how well math works in physics. The laws of nature seem to be mathematical. At the very least they are very well approximated by mathematics. Here's the key insight: social sciences have a mathematical foundation if, and only if, the laws of nature are truly mathematical (i.e. there are laws that not only approximate reality well but actually describe reality exactly). This follows from straightforward reduction of everything (brain states, body movements) to the laws of nature. This is of course a horrible theory for social sciences due to the microscopic view of things (humans are seen as a pile of atoms), and would be completely useless. But it gives a partial answer to (3).</div><div><br /></div><div>In fact, (3) isn't even the right question to ask. It may be that the laws of nature are not mathematical, but we still have good approximations today. Perhaps there are no exact laws of social science, but we should nevertheless be able to find good approximations.</div><div><br /></div><div>An important comparison can be made to physics: the laws of physics are not necessarily "real". Temperature is not a real concept of nature: it is the speed at which atoms bump into each other. Viewed from a microscopic perspective (imagine we seen every atom individually) there is no "heat", just movement of atoms. From a macroscopic perspective, it is the amount of such movements that causes heat in a gas. But "macroscopic perspective" is certainly a human centric view. It may even be a natural point of view for all intelligent biological lifeforms, but it is certainly a creation of our perception, a way to simplify down reality.</div><div><br /></div><div>The same goes for many other quantities: pressure is the amount of "push" atoms have towards the enclosing surface (e.g. the bottle containing the gas). Again, it's all about the atoms and their movements. </div><div><br /></div><div>Why can't we define concepts like "temperature" and "pressure" in social sciences? We could, it's just that they tend not to make any non-trivial predictions. So why is that? Because human brains are not like atoms. In classical physics, atoms only have three useful properties: location, mass, and velocity. Brains need significantly more information than a couple of numbers to describe it.</div><div><br /></div><div>In mathematics, there is a notion of complexity as "the smallest possible computer program that describes it". This captures the intuitive notion that the string "1212121212" has lower complexity than "7387904583".</div><div>The reason is that the first string is described by a computer program that says "output 12 5 times", whereas the second says "output 7387904583". Intuitively, we need a very short program to describe the state of an atom ("location = (x,y,z), mass = m, velocity = v") but a very long and complicated description for the state of a brain.</div><div><br /></div><div>Complexity in this sense is not the reason. Atoms are memoryless: what an atom did yesterday does not affect its current state. But brains have memory: they remember the past, and their behavior today may significantly have been altered by the past.</div><div><br /></div><div>A third factor is that atoms influence each other on a very basic level: they knock each other around (they can also be entangled in quantum mechanics, in which they have a relatively simple relation, e.g. opposite spin, which is only one number that is correlated). Brains can have enormously complex influence on each other.</div><div><br /></div><div>Once you mix plenty of atoms, you get a predictable behavior. The idea that, once you mix plenty of brains, you get predictable behavior, well, is simply not true, as is made obvious to anyone who has invested on the stock market.</div><div><br /></div><div>We don't need to move all the way from atoms to brains to see the effects of complex interactions. An example from statistics will illuminate the problem: if you want to know the behavior of throwing 100 dices, that's pretty straightforward: the sum will be approximated by a normal distribution with average 350 and standard deviation 0.292. But if you start mixing different kind of dices (some land on 6 more often than 1, or vice versa) as well as interactions between the dices (if one gets a 4, then the others are more likely to land on the same number), then you can't say anything about even what the average is going to be. The reason you can't is because you would need to specify the distribution of probabilities of each dice and how they interact.</div><div><br /></div><div>The easiness of solving the first case lies in one hidden assumption that is extremely useful: the outcomes of each individual dice does not affect the others (the dices are independent). Once you have dependency, you have two choices: if you know all the correlations between different outcomes, there's still a chance you might get a mathematical solution. If not, you're out of luck. For the sum of 100 brains (whatever that means, say voting), you're going to have B1+B2+...+B100 and that's where the math begins and ends: no useful interpretation, no useful conclusion.</div><div><br /></div><div> It is in this sense that I think explanation (2) is the correct one: social sciences are more complex than natural sciences. It's not that it is in impossible to analyze such complex situations: it may be possible with a mind that can handle extremely complicated equations and long, counter-intuitive lines of reasoning by combining a wide variety of techniques (numerical optimization, computer algebra systems, automated theorem provers, simulations, and lots of methods that don't even exist today). It's just that our brains are incapable of such herculean tasks, or we would have done it already - it's not for lack of trying we've failed.</div><div><br /></div><div><br /></div><h2 style="text-align: left;">Why does it work so well in Natural Sciences?</h2><div>Nobody knows the answer to this. We live in a world filled with science and technology (it is tempting to call it <b>the</b> age of science and tech, except that we'll have even more in the future). We've gotten so used to the fact that science can accurately predict the motion of planets, when a comet is going to pass by and whether it will be visible to the naked eye, that thousands of commercial airliners can take off and land everyday (they also achieve this without a pilot), satellites orbit our planet pinpointing our exact location, and robots crawl on the surface of Mars.<br /><br />All of this is possible because of one thing: the laws of nature are possible to approximate with mathematics. Put differently: we are capable of creating an approximate representation of the real world in our minds. We can compress reality.<br /><br />We could turn the question and instead ask: what could go wrong when trying to represent reality in our minds?<br /><br />For one thing, the laws of nature could be hopelessly complex ("incompressible"), just like social sciences are. All of the laws of nature as we know them are simple, in the sense that they admit a very short description: Newton's laws, Maxwell's equations, Schrödinger's wave equation, the equations of general relativity. Now, the laws of nature are at the fundamental level of reality: there is by definition nothing "underneath" them. But why would they have short descriptions? Apparently the description is not as short as we once thought, since Newton's laws had to be replaced by general relativity and quantum mechanics. The latter two contain more equations, and the equations have longer descriptions.<br /><br />It is in principle conceivable that there is are no final laws of nature: the universe is not mathematical, we only approximate it by <b>inventing</b> (not discovering) laws that are useful for our purposes. Science is essentially an engineering task from that point of view: all that counts is what works. That's not so bad, since it apparently works really well, but more interesting question now is: How would it be otherwise?<br /><br />It would be that the universe is actually mathematical. The laws of physics as we currently <b>are not</b> nature itself; they only approximate nature. But perhaps that's only because we haven't found them yet, and we're on the right track (that's two different assumptions: mathematical correct laws of nature exist, and we will find them in the future). This would be the case if the universe worked like a computer program (perhaps because we truly live in a Turing-complete simulation). Another explanation, provided by Tegmark, is that all logically consistent theories are actualized as a separate universe. One problem with this explanation, however, is that the vast majority of universes would have extremely complex logical/mathematical laws, making it impossibly unlikely for us to end up in our current universe with simple laws (assuming they ultimately turn out to be simple, perhaps they're not).<br /><br />This is essentially the concern Wigner had when we wrote a paper titled "The Unreasonable Effectiveness of Mathematics in the Natural Sciences". Why does math work so well in physics?<br /><br />Clearly, even if the universe is non-mathematical, there's the mystery of why mathematics approximates reality so well. An interesting thought is that perhaps it doesn't: perhaps all mathematics does is describe our understanding of the universe, as we perceive it with our limited minds. As mentioned before, temperature and pressure are not properties of the universe, they are properties of the world as we perceive it macroscopically. A hypothetical divine being that perceives quantum states directly may not have any notion of temperature, since that would not be needed: it is after all about compressing lots of information into just one number, so we can make sense of it. We keep repeating this pattern over and over: in statistics, "all information" is all the available samples. We then compress this data into mean, median, standard deviation, etc. We even replace real world data with known statistical distributions because they approximate them well. But that is something we do to comprehend reality, the fact that it throws away information makes it perfectly obvious that it is man made.<br /><br />But those notions are not the most fundamental. At the most basic level, we have something like quantum mechanics (general relativity is not the correct fundamental theory). The rest of science (even physics) is built on top of that. The wave equation describing reality does not throw away information. The laws of physics seem reversible, so that we can freely "move" between past and future in the mathematical formulation. That makes it clear that we're no longer throwing away information, so perhaps this is something like reality itself. So even though most of science is "engineering" in the sense that it defines concepts useful to humans, it may also be the case that reality is mathematical: if we ever find unified mathematical laws that describe everything in the universe, so there is no need for further refinements, we will know that the answer is "yes, the universe is mathematical". The question of why, however, may still remain a mystery.<br /><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0tag:blogger.com,1999:blog-6767898351325918278.post-34609547051098666022013-04-17T01:04:00.000+08:002013-04-17T18:41:36.908+08:00Paradoxes and their Solutions<div dir="ltr" style="text-align: left;" trbidi="on">You've almost certainly heard some of these (logical) paradoxes. They're fun brain puzzles, and people like to share them. Surprisingly, most people who share them do not know their solution, nor try to solve them. Here are some of the most well known paradoxes and their solutions.<br /><div><br /></div><div>Note that these are paradoxes of more logical nature (i.e. "mind puzzles"), and consequently, their solutions explore logic and language.<br /><br />There are also paradoxes within scientific theories, with scientific solutions, which I might write about another time.</div><div><br /></div><h2 style="text-align: left;">Liar Paradox</h2><h3 style="text-align: left;">Problem</h3><div><br /></div><div>Perhaps the most well known paradox. The version of the paradox that is the most clear is the following. Consider the sentence: "This sentence is false". We are wondering whether the sentence itself is true or false.</div><div><br /></div><div>Assume the sentence is true.</div><div>Then clearly the sentence is false, a contradiction.</div><div><br /></div><div>Now assume the sentence is false.</div><div>Then the sentence is true, also a contradiction.</div><div><br /></div><div>It seems we cannot assign a binary truth value (true/false) to the sentence.</div><div><br />This paradox is sometimes given in the form: A person says that he is lying. Is he telling the truth or a lie?</div><div><br /><h3 style="text-align: left;">Solution</h3></div><div>There are two things that are unusual about the sentence.</div><div><br /></div><div>First, it makes a reference to itself, by using the word "this sentence". This is known as self reference, and the statement is said to be self-referential.</div><div><br /></div><div>Second, the sentence assumes that it can express the truth/falsehood of other statements. </div><div><br /></div><div>To see this, imagine that we were to express the sentence in a more formal language. It would look something like:</div><div><br /></div><div>Sentence 1: not is_true(get_sentence(1))</div><div><br /></div><div>In other words, we are assuming that we can get a sentence by it's label ("get_sentence(1)", which is the sentence itself, since we gave it label 1) and that we have a predicate "is_true" which returns true/false accurately (we need not assume anything about the computability of these functions, only that they are definable within the language).</div><div><br /></div><div>If you haven't studied logic you might be inclined to think that the first point is a pretty serious problem whereas the second is trivial. You'd be wrong, however, because it's the other way around: assuming we can express "is false" is a much more problematic issue than using self references.</div><div><br /></div><div>Here's why:</div><div>Self-referential statements are everywhere. I can easily get rid of the self reference in the liar paradox by doing the following:</div><div><br /></div><div>Sentence 1: "Sentence 2 is true"</div><div>Sentence 2: "Sentence 1 is false"</div><div><br /></div><div>If Sentence 1 is true, then both are true, which in turn means that Sentence 1 is false, a contradiction.</div><div>If Sentence 2 is false, then both are false, which in turn means that Sentence 1 is true, a contradiction.</div><div><br /></div><div>You might feel like this point is moot, cause all I've done is replaced "this" with a reference to another sentence, which in turn references back to the first sentence. After all, decoupling the sentence does solve much since we can still express everything using get_sentence and is_true:</div><div><br /></div><div>Sentence 1: is_true(get_sentence(2))</div><div>Sentence 2: not(is_true(get_sentence(1)))</div><div><br /></div><div>That's a good point, except that the problem doesn't go away even if you forbid all kinds of references (to the language itself). In particular, almost any logical system can be re-encoded so that the sentences can be seen to implicitly contain self references. This is done using something called Gödel encodings, and it is roughly the equivalent of assigning every sentence in the English alphabet a unique number in a systematic way. get_sentence is easy to construct, even in languages where it was not intended to exist.</div><div><br /></div><div>The same cannot be said for is_true: it does not in general exist. For example, arithmetic does not admit a truth predicate so there is no way of expressing "1+2=3 is true" in arithmetic. This result is known as Tarski's undefinability theorem.</div><div><br /></div><div>You might wonder what all of this has to do with the English language, in which the paradox is presented. Well the point is that if we allow unrestricted use of natural languages (such as English), we can express pretty much anything. The problem is that many things we can express in English won't even be logical sentences, in the sense that they will not have truth values (such as the sentence under consideration). Unrestricted use of language also opens up the possibility for expressing nonsense.</div><div><br /></div><div>What we can learn from the liar paradox is that we must clearly handicap our "language" if we wish to have a "no nonsense" language (e.g. predicate logic, modal logics, Horn clause logic). How do we handicap such languages to make sure the liar paradox doesn't appear? Gödel has shown us that there isn't much point in trying to remove self references; they occur very easily. Taski has shown us that a truth predicate "is true" is not usually available, so this is certainly more of a comfort.</div><div><br /></div><div>In arithmetic, you cannot express sentences such as the liar paradox, not because it is self referential, but because it asserts the truth of arithmetical sentences. Put differenty, in arithmetic, is_true is not definable.</div><div><br /></div><h2 style="text-align: left;">Barber's Paradox / Russell's Paradox</h2><h3 style="text-align: left;">Problem</h3><div>The barber's paradox arises from the following assumption and question: A barber shaves everyone who doesn't shave themselves. Does he shave himself?</div><div><br /></div><div>As we did for the liar paradox, we check to see which is true:</div><div><br /></div><div>Assume the barber shaves himself.</div><div>Then he doesn't have himself, since he only shaves those that don't shave themselves.</div><div>This is a contradiction.</div><div><br /></div><div>Now assume the barber doesn't shave himself.</div><div>Then he shaves himself, a contradiction again.</div><div><br /></div><h3 style="text-align: left;">Solution</h3><div><div>So this sentence is neither true nor false. As mentioned in the liar paradox, that is not so much of a problem in English, because we really shouldn't expect all sentences to even make sense. There is no problem in imagining a barber that shaves everyone who don't shave themselves, except that he shaves himself too. Or a barber that goes to another barber. The problem is within the confines of language, physically, there is no way to create the contradiction.</div></div><div><br /></div><div>The lesson to learn from this was a hard one historically. Russell's paradox is a more formal (mathematical) version of the paradox, which has to do with set theory. Imagine that a set is a box. We can have an empty box, or put boxes inside empty boxes. So denote the empty box with { }. If we have something, X, inside a box, we'll write {X}. And if we have two things, like X and Y, we'll write {X,Y} and so on. Assume we're allowed to create boxes by (perhaps infinitely) repeating the following:</div><div><br /></div><div>The empty box {} is a box.</div><div>{X,Y,...} is a box if X,Y,... are all boxes.</div><div><br /></div><div>So we know that {} is a box, and { {} } is a box (the box that contains an empty box). Then we can put that box inside a box: { {{}}}. But these are not the only ways we can make boxes. We can put all three of these boxes into one box: { {}, {{}}, {{{}}} }. If you're wondering why this is useful to do in mathematics, it's because from this you can create all numbers. For example, define:</div><div><br /></div><div>0 = {}</div><div>1 = {0} = { {} }</div><div>2 = {0,1} = { {}, {{}} }</div><div>3 = {0,1,2} = { {}, {{}}, {{},{{}}} }</div><div>...</div><div><br /></div><div>These seems all good, until you define the following box (set):</div><div><br /></div><div>B is the box containing all boxes that do not contain themselves. Does it contain itself?</div><div><br /></div><div>What is meant by a box that contains itself? It quite literally means that you can find a copy of the box inside the box. If you think that's impossible, it's because you're thinking only about finite boxes. If we have infinite nesting, such as {{{{...}}}}, then if we "box it", i.e. wrap it around one more { }, we can see that the box has a "copy of itself". So let X = {{{....}}}. Then {X} is a box containing itself, but so is {X,X}, {X,{}}, {X,{},{{}}}, and so on. So there's not just one box that contains itself, but an infinite number.</div><div><br /></div><div>Now back to the paradox: Does B contain itself?</div><div>Assume it does. Then B is inside B, i.e. B = {B,...}. But B is the box containing all boxes (say X) that do not contain themselves. Since B contains B, B cannot be X here. Therefore B does not contain itself.</div><div>Now assume B does not contain B. Then, since B contains all X that do not have X, and B is precisely such an X, B contains itself.</div><div><br /></div><div>So we have: B in B if and only if B not in B, i.e. the set contains itself if and only if it does not contain itself. It's usually said to be a contradiction, but it is technically not: you can't prove "B in B" or "B not in B", only that "B in B iff B not in B". It's only a contradiction if you assume that the sentence must be true or false.</div><div><br /></div><div>Regardless, it is an undesirable feature of a logical system to say the least.</div><div>So these simple rules to create boxes actually lead to the possibility of creating a sentence that is neither true nor false (or both). What set theorists learnt from this is that the creation of boxes (sets) must be restricted (see Von Neumann Universe for three intuitive rules that create sets without this paradox).</div><div><br /></div><div>Technical note: Boxes are defined so that they can contain multiple copies, e.g. { {}, {} }, whereas (non-fuzzy) sets either contain or do not contain an element. Boxes are in fact multisets. None of this changes the argument however: from sets one can create multisets, and vice versa.</div><div><br /></div><h2 style="text-align: left;">Zeno's Paradox</h2><h3 style="text-align: left;">Problem</h3><div>Zeno's paradox has many forms, here's one that is easy to understand: A man is going to walk 100 meters. To reach there, he must first cross 50 meters. But to reach that, in turn, he needs to cross 25 meters. It's easy to see that when we keep halving the distances, we get increasingly smaller numbers (but never exactly zero). This suggests that the man most move "infinitely much" in order to reach the 100 meters (in fact, in order to reach anywhere at all!).</div><h3 style="text-align: left;">Solution</h3><div>The paradox itself makes some questionable assumptions about the nature of reality. More specifically, it assumes that we can divide distances into halves indefinitely. The existence of a smallest measurable length, the Planck length, certainly puts doubt on this.</div><div><br /></div><div>But even if we accept the premiss, one could ask where the problem really is anyway.</div><div>Is it that we find it unreasonable that he can move an infinite number of times? We have already accepted that there's infinitely much space to move in, so perhaps we should also accept that he can move infinitely much (in finite time), especially since the distances are getting smaller and smaller. Mathematically, he must move: 100/2 + 100/4 + 100/8 + 100/16 + ... = 100 meters. So if we can complete the infinite series, he'll be just fine. From this point of view, Zeno's paradox really illuminates its own assumptions:</div><div><br /></div><div>1. Space is continuous (we have infinitely many steps to move through).</div><div>2. Time is discrete (we are not allowed to pass through all infinite steps).</div><div><br /></div><div>Modern science suggests both are discrete, in which case there's no problem: at each discrete time step, the person moves forward a finite number of discrete steps. And as shown above, if both are continuous, there isn't any problem either.</div><div><br />Turning the paradox against itself, we can observe that we can indeed move, so the combination of continuous space and discrete time seems incorrect as far as reality goes.<br /><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com2tag:blogger.com,1999:blog-6767898351325918278.post-44410736402722313702013-04-09T00:12:00.000+08:002013-04-09T00:31:08.419+08:00Physics of Projectiles: What You Always Wanted to Learn in School<div dir="ltr" style="text-align: left;" trbidi="on">Here's what you've always wanted to know about physics but were never taught in school.<br /><br />In an attempt to make things more interesting, I'll first state the conclusion, and then explain why it is so.<br /><br />We'll consider a firearm bullet, but the same physics can be used for kinetic energy penetrators ("tank bullets"), asteroid impacts, rail guns, and pretty much any projectile that is not propelled by an engine. This excludes rocket propelled grenades and missiles.<br /><br /><h3 style="text-align: left;">Fact 1: The recoil you feel when firing your gun is exactly as big as the stagger that will be exerted on your target.</h3><br />This comes from a conservation law, more precisely, conservation of momentum. Momentum is effectively "push" (the shooter calls it recoil). Momentum is mass times velocity (speed with direction).<br />If you double the mass of the bullet, recoil is twice as big. Likewise, if you double the speed of the bullet, recoil will be twice as big.<br /><br />Conservation of momentum means that the push exerted on you by the bullet as it leaves your gun is the same as the recoil you will feel (the direction of the bullet's and your velocity have to be opposite):<br /><br />bullet momentum = your momentum (recoil)<br /><br />Which, according to what I just said, is:<br /><br />(bullet mass) x (bullet velocity) = (your mass) x (your velocity)<br /><br />A 9mm bullet weights approximately 9 grams. Muzzle velocity (speed upon exiting the muzzle of the gun) is about 365 meters per second. From this, we can calculate at what speed you would move backward when firing the gun:<br /><br />(your velocity) = (bullet mass) x (bullet velocity) / (your mass)<br /><br />If you weigh 75 kg, this gives a speed of 4 cm / second.<br /><br />Assuming that the bullet doesn't slow down before hitting your target (which it will due to air resistance), the bullet's momentum will be the same upon impact, and again by the law of conservation, this momentum will be transferred to your target (assuming the bullet stops upon impact, otherwise only the proportion of the momentum that was lost is transferred).<br /><br />Yes, this means that if your target weights 75 kg, it is going to be pushed back by precisely the same amount as the recoil you felt.<br /><br />It also means that if your favorite bulletproof superhero is hit by a main tank gun, he's gonna fly back quite the distance through the skyline. The reason a tank doesn't fly back when hit is because it weighs a lot.<br /><br />Momentum is not what causes damage upon impact. It is only the "push" exerted by the projectile.<br /><br />Momentum = Recoil (or Stagger)<br /><br />Technically, air resistance slows down a bullet, thus staggering the target even less than your recoil.<br /><br /><h3 style="text-align: left;">Fact 2: Transfer of Energy is what causes damage, mainly through conversion from kinetic energy (due to movement) into shock waves and heat release.</h3>Energy, like momentum, can be derived from Newton's laws. Like momentum, kinetic energy is dependent on mass and speed, but the relationship is different:<br /><br /><a href="http://www.codecogs.com/eqnedit.php?latex=Energy%20=%20\frac{mass%20\cdot%20(velocity)^2}{2}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?Energy = \frac{mass \cdot (velocity)^2}{2}" title="Energy = \frac{mass \cdot (velocity)^2}{2}" /></a> <br /><br />The fact that Energy depends on the square of bullet velocity (velocity times itself) means the following:<br />when doubling the mass of the bullet, we double the kinetic energy (= the damage caused, as we will see soon). This is just the same as for momentum. But when doubling the speed of the bullet, we get four times more energy!<br /><br />If we get ten times more mass, we get ten times more energy. But, again due to the square, if we get ten times more speed, we get 100 times (=10x10) more energy!<br /><br />There is a law of conservation of energy. It is very fundamental, and it means that, assuming the bullet stops upon impact, all of the energy from the bullet will be delivered to the target. In practice, this means that the kinetic energy ("movement energy") from the bullet is entirely converted into shock waves and heat.<br /><br />"Energy = Damage".<br /><br />Technically, it's not really true that energy = damage, but (energy per time unit) = damage. If you convert nuclear fuel into electricity slowly, well, that's good cause it gives us heat in our homes. If you however release all that energy fast (nuclear detonation), then it's destructive. The difference being that in one case we release the energy over a long period of time, whereas in the latter we release a similar amount of energy over a very short period of time. However, all projectiles cause damage very fast, so this is not really an issue with bullets (as the example above makes clear, it does however matter when it comes to explosions).<br /><br />Power = Energy per time = Damage.<br /><br />Here's an interesting question: if the bullet's momentum is transferred to the shooter when firing, why isn't the energy also transferred to the shooter? The answer is that bullets are fired using a modern day version of gun powder, and the energy needed to speed up the bullet is taken from that powder (chemically stored energy).<br /><br />Speeding up a bullet is thus done by converting energy stored inside the cartridge into kinetic energy. You however still feel recoil due to the bullet having mass (weight). It would in principle be possible to avoid that recoil by firing another bullet backwards at the same time, thus preserving momentum, but this is not done for obvious reasons. When (if?) if we develop handheld high energy firearms (e.g. lasers), we may get (almost entirely) recoil free weapons.<br /><br /><h3 style="text-align: left;">Fact 3: Firearms bullets are small and fast because that minimizes recoil while maximizing damage output.</h3><br />But for now, we don't have high energy weapons, so we have an optimization problem (constraint problem) to solve:<br />Minimize recoil.<br />Maximize damage.<br /><br />Which, after what we've said:<br />Minimize (bullet mass) x (bullet velocity) [Momentum]<br />Maximize (bullet mass) x (bullet velocity)^2 [Energy]<br /><br />The solution should be obvious: since we much faster increase damage by increasing velocity (as opposed to mass), we should:<br />Minimize bullet mass.<br />Maximize bullet velocity.<br /><div><br />There's one catch however: too fast traveling bullets means it could go right through the target, thus not delivery all of its kinetic energy onto the target, and risking to wound bystanders. When hunting large game, this is not usually an issue: a moose, being very thick, is likely to absorb all the kinetic energy. Some police forces uses hollow point bullets (see below), which decreases risk of firing through a target.<br /><br /></div><div>Ways to increase bullet mass include using larger calibers (and longer bullets) and very dense materials (such as tungsten or depleted uranium, not used in firearms). Increasing velocity includes using full metal jacket (pointy) bullets and rifling. More on these below.</div><div><br /></div><h3 style="text-align: left;">Fact 4: Large Caliber = Bigger Hole = More Damage.</h3><div>Sir Isaac Newton took us pretty far down the road, but there's more to the story. As you may know, caliber seems to be the single most mentioned aspect when it comes to firearms. In part this is because caliber constrains mass, but that's not all. A bigger hole causes more bleeding, which is obviously more damaging to organic targets. If you're in the forest facing a bear, the difference between a 9 mm bullet and a 12 gauge shotgun slug makes the difference between you being alive and dead.</div><div><br /></div><div>Moreover, large calibers means more absorption of the kinetic energy upon impact, which in turns means more likelihood of not passing through (recall that if the bullet doesn't stop, not all energy will be delivered), and also that the time for transferring energy from kinetic to damage will be shorter: recall that damage = energy per second, so to maximize energy we need to transfer the energy as fast as possible.</div><div><br /></div><div><h3>Fact 6: Hollow Point bullets = More Damage. Pointy bullets = More Accuracy.</h3></div><div>You've most likely seen pointy bullets in movies. Those are "standard", but not necessarily the best. Pointy means they penetrate the air better (i.e. they are less affected by air resistance). The downside, is that they can be too good at penetrating, going right through the target, thus not delivering all energy.</div><div><br /></div><div>One solution is to use non-pointy bullets (i.e. bullets that are "hollow", so that it easily gets stuck in solids). This quickens the energy transfer (just as for large calibers above, but even more dramatically so), makes it more likely that the bullet will get stuck, but also decreases accuracy as it flies through the air less well.</div><div><br /></div><div>A bullet type that attempts to take the best from both worlds is the plastic-tipped bullet: essentially a hollow point bullet with a plastic tip that easily falls off upon impact.</div><div><br /></div><h3 style="text-align: left;">Fact 7: Rifling = More Accuracy.</h3><div>The aerodynamics of a bullet can be significantly improved by imparting a spin on the bullet. This is due to the fact that bullets travel through air (sometimes water), and air is thick compared to the resistance free vacuum. Air resistance slows down the bullet, which means less kinetic energy (indeed, the energy is absorbed by the air). By spinning the bullet, it "cuts" through the air, much like a drill spins to penetrate wood or metal. Imparting spin on a bullet is achieved by simply rifling the barrel's inside.</div><br /><br /></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com1tag:blogger.com,1999:blog-6767898351325918278.post-2273768891674265882013-04-01T16:46:00.000+08:002013-12-09T23:52:14.214+08:00Discoveries that Killed Entire Philosophies<div dir="ltr" style="text-align: left;" trbidi="on">EDIT: Just realized it's 1st of April today. Had I known that before I wrote this post, I may actually have tried to come up with some clever trick. However, that didn't happen: this post is completely serious, no hidden jokes or traps.<br /><br />Every belief - no matter how innocent looking - implies certain conclusions. Put differently, every belief tells us something about what to expect (or not expect) in a certain situation. This makes it possible to make predictions based on beliefs, that is, to experimentally test our beliefs. This lies at the heart of the scientific method.<br /><br />Yesterday's philosophy is today's science. Physics used to be called "natural philosophy". Cosmology has traditionally been considered part of religion and philosophy. Today they are quantitative science, making accurate predictions down to decimals. This transition occurred because we humans have historically underestimated how much we would one day be able to discover by rational means.<br /><br />Here's some big philosophical beliefs that have been shattered by scientific discoveries.<br /><br /><h2 style="text-align: left;">Clockwork Universe - A Predetermined World</h2><div><b>Birth: 1687 (Newton)</b></div><div><b>Death: 1915-1930 (Einstein, Planck, Bohr)</b></div><div><b>History:</b></div><div>After Newton discovered Newtonian Mechanics in 1687, it became clear from his equations that any future state of the universe could in principle be predicted from the current state. This is best explained by Laplace in 1814:</div><div><br /></div><div>"We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes."</div><div><br /></div><div>A clockwork universe is thus characterized by determinism.</div><div>Einstein's discovery of Special Relativity in 1915 put doubts of this world view, as it coupled space and time, and the discovery of Quantum Mechanics finally destroyed it, as it became clear that, on atomic scale, the best can do is give probabilities for location / momentum of particles. A fundamentally non-deterministic universe had to be accepted.</div><div><br /></div><div>The implications of Quantum Mechanics (QM) are still a matter of ongoing research (it will be so until all but one interpretation of QM has been eliminated). Bell's theorem demonstrates that reality is either a multiverse (all possible outcomes happen in separate universes) or non-local (instantaneous transfer of information between one end of the universe to the other is possible).</div><div><br /></div><div><h2 style="text-align: left;">Divine Origin - The Gift of Life</h2><div><b>Birth: 500 B.C. (Empedocles)</b></div><div><b>Death: 1859 (Darwin)</b></div></div><div><b>History:</b></div><div>The idea that man was created in its full form by divine power is very old. It can be traced back to the ancient Greek philosopher Empedocles, although the idea most likely existed long before that. The beginning of the end for such a world view came with Darwin's discovery that species evolve into other species in time, as they adapt to their environment. Man, as all other species, was thus created from some "seed of life".</div><div><br /></div><div>Darwin also destroyed another idea: the belief that physical traits can be altered throughout an organism's life. Lamarckian evolution is a theory in which Giraffes grow longer necks because they keep trying to reach higher fruits (they stretch out their necks). Darwin made this idea obsolete, by correctly explaining that Giraffes with necks too short simply die out, so that long necked Giraffes become more common in the next generation.</div><div><br /></div><div>Today, evolution can be observed directly in microorganism (as their evolution is much faster than that of larger species). The mechanism by which the information gets passed on was discovered to be the double helix DNA strands. The decoding of DNA information (genotypes) into observable traits (phenotypes) is ongoing research.</div><div><br /></div><h2 style="text-align: left;">Formalism - A Dream of Mechanization</h2><div><b>Birth: 1920 (Hilbert)</b></div><div><b>Death: 1931 (Gödel)</b></div><div><b>History:</b></div><div>At a time when mathematicians rose to the challenge of mechanizing all of mathematical reasoning (and, in doing so, founded what is today known as Artificial Intelligence), Hilbert had a vision: a system of rules to go from one fact to the next, thus discovery all truths. The end result? A machine that, given some input such as "does <a href="http://www.codecogs.com/eqnedit.php?latex=x^n@plus;y^n=z^n" target="_blank"><img src="http://latex.codecogs.com/gif.latex?x^n+y^n=z^n" title="x^n+y^n=z^n" /></a> have any integer solutions?", the machine would output "yes".<br /><br /></div><div>This dream came to an abrupt end in 1931 when Gödel proved two things: first, the machine Hilbert envisioned would never be able to answer some questions, and, second, that machine cannot <b>reliably</b> answer the question "Is every answer you give me correct?". These are Gödel's famous incompleteness theorems.</div><div><br /></div><div>One may then ask an interesting question: If the machine cannot answer some question, what would happen? We can imagine that in the best case, the machine would answer with "Don't Know". So for every question, we have three possible answers: "Yes", "No", and "Don't Know". Church and Turing later proved that even this is not possible: in the "Don't Know" case, the machine will "think" forever (this is known as the Church-Turing theorem).</div><div><br /></div><h2 style="text-align: left;">Static Universe - Eternal Existence</h2><div><b>Birth: 1917 (Einstein)</b></div><div><b>Death:1929 (Hubble)</b></div><div><b>History:</b></div><div>The idea of a universe that has always existed, and will always exist, is probably very old, but Einstein made it explicit when he introduced the so called "Cosmological Constant" into his equations to force the universe to be static in size (non-shrinking and non-growing). This made for great disappointment in 1929, when Hubble observed that everything is moving away from us in space - the universe is expanding!</div><div><br /></div><div>If we reverse the expansion of the universe, we see that it all began at a single point, and so Hubble also discovered the Big Bang theory.</div><div><br /></div><div>It was initially thought that the expansion of the universe would be slowing down due to the discovery of dark matter. This left open the possibility that the universe would one day stop expanding, and start contracting back into the big bang. Later discoveries showed the opposite of such expectations: the expansion of the universe is speeding up all the time. The reason for this is completely unknown, and called "dark energy" (or sometimes also "cosmological constant", although it needs not be constant). What is clear, however, is that the universe will thus continue expanding forever, implying higher entropy / lower temperature / lower energy per cubic meter of space. In short, the universe is dying.</div><div><br /></div><div><br /></div><div><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com2tag:blogger.com,1999:blog-6767898351325918278.post-39353582826658500332013-03-07T22:20:00.000+08:002013-05-13T02:05:40.705+08:00Efficient Memoization using Partial Function Application<div dir="ltr" style="text-align: left;" trbidi="on">Memoization is any technique of storing previously computed values for fast retrieval, rather than recomputing the same value multiple times:<br /><br /><br /><pre style="background-color: white;"><span style="color: maroon; font-weight: bold;">bool</span> f<span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">int</span><span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: dimgrey;">// defined elsewhere</span><br />f<span style="color: #808030;">(</span><span style="color: #008c00;">2</span><span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: dimgrey;">// compute f(2)</span><br />f<span style="color: #808030;">(</span><span style="color: #008c00;">2</span><span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: dimgrey;">// retrieve previously stored value for f(2) instead of recomputing it</span></pre><br /><br />This is useful when calling f is expensive.<br /><br />In C++, the most obvious way to implement memoization is perhaps using a std::map:<br /><br />(I'm assuming correctness is important here, hence, storing only hashes and return values won't do.)<br /><div><br /></div><br /><pre style="background-color: white;"><span style="color: maroon; font-weight: bold;">bool</span> f<span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">int</span><span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: dimgrey;">// defined elsewhere</span><br /><span style="color: maroon; font-weight: bold;">bool</span> memo_f<span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">map</span><span style="color: purple;"><</span><span style="color: maroon; font-weight: bold;">int</span><span style="color: #808030;">,</span><span style="color: maroon; font-weight: bold;">bool</span><span style="color: purple;">></span><span style="color: #808030;">&</span> memo_tbl<span style="color: #808030;">,</span> <span style="color: maroon; font-weight: bold;">int</span> i<span style="color: #808030;">)</span><br /><span style="color: purple;">{</span><br /> <span style="color: maroon; font-weight: bold;">auto</span> at <span style="color: #808030;">=</span> memo_tbl<span style="color: #808030;">.</span><span style="color: #603000;">find</span><span style="color: #808030;">(</span>i<span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> <span style="color: maroon; font-weight: bold;">if</span> <span style="color: #808030;">(</span>at <span style="color: #808030;">=</span><span style="color: #808030;">=</span> memo_tbl<span style="color: #808030;">.</span>end<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span><br /> <span style="color: dimgrey;">// Not previously computed, compute and store</span><br /> at <span style="color: #808030;">=</span> memo_tbl<span style="color: #808030;">.</span>insert<span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>make_pair<span style="color: #808030;">(</span>i<span style="color: #808030;">,</span>f<span style="color: #808030;">(</span>i<span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #808030;">.</span>first<span style="color: purple;">;</span><br /> <span style="color: purple;">}</span><br /> <span style="color: maroon; font-weight: bold;">return</span> at<span style="color: #808030;">-</span><span style="color: #808030;">></span>second<span style="color: purple;">;</span><br /><span style="color: purple;">}</span></pre><pre style="background-color: white;"></pre><br />This works fine for any function from a domain X (int above) to another domain Y (bool above).<br />But what if we want to compute a function that takes multiple arguments?<br />Say we have:<br /><br />bool f(int,int); // defined elsewhere<br /><br />We could then use a std::map<std::pair<int,int>,bool>, where the pair represents the two arguments (more generally, we could use a std::tuple, or even a std::vector for variable size arguments).<br />But these approaches all have one problem that becomes apparent if we replace our int's with heavier objects:<br /><br /><br /><pre style="background-color: white;"><span style="color: maroon; font-weight: bold;">bool</span> f<span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">const</span> Matrix<span style="color: #808030;">&</span><span style="color: #808030;">,</span> <span style="color: maroon; font-weight: bold;">const</span> Graph<span style="color: #808030;">&</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /><br /><span style="color: maroon; font-weight: bold;">bool</span> memo_f<span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">map</span><span style="color: purple;"><</span><span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">pair</span><span style="color: purple;"><</span>Matrix<span style="color: #808030;">,</span>Graph<span style="color: purple;">></span><span style="color: #808030;">,</span><span style="color: maroon; font-weight: bold;">bool</span><span style="color: purple;">></span><span style="color: #808030;">&</span> memo_tbl<span style="color: #808030;">,</span> <span style="color: maroon; font-weight: bold;">const</span> Matrix<span style="color: #808030;">&</span> m<span style="color: #808030;">,</span> <span style="color: maroon; font-weight: bold;">const</span> Graph<span style="color: #808030;">&</span> g<span style="color: #808030;">)</span><br /><span style="color: purple;">{</span><br /> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">pair</span><span style="color: purple;"><</span>Matrix<span style="color: #808030;">,</span>Graph<span style="color: purple;">></span> p<span style="color: #808030;">(</span>m<span style="color: #808030;">,</span>g<span style="color: #808030;">)</span><span style="color: purple;">;</span> <span style="color: dimgrey;">// <-- ouch! Copies the entire matrix and graph!</span><br /> <span style="color: maroon; font-weight: bold;">auto</span> at <span style="color: #808030;">=</span> memo_tbl<span style="color: #808030;">.</span><span style="color: #603000;">find</span><span style="color: #808030;">(</span>p<span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> <span style="color: maroon; font-weight: bold;">if</span> <span style="color: #808030;">(</span>at <span style="color: #808030;">=</span><span style="color: #808030;">=</span> memo_tbl<span style="color: #808030;">.</span>end<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span><br /> <span style="color: dimgrey;">// Not previously computed, compute and store</span><br /> at <span style="color: #808030;">=</span> memo_tbl<span style="color: #808030;">.</span>insert<span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>make_pair<span style="color: #808030;">(</span>i<span style="color: #808030;">,</span>f<span style="color: #808030;">(</span>i<span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #808030;">.</span>first<span style="color: purple;">;</span><br /> <span style="color: purple;">}</span><br /> <span style="color: maroon; font-weight: bold;">return</span> at<span style="color: #808030;">-</span><span style="color: #808030;">></span>second<span style="color: purple;">;</span><br /><span style="color: purple;">}</span></pre><br /><div><br /></div><div>I'm assuming that all objects can be ordered. The argument works equally well using std::unordered_map with a hash function for all objects. How to order or hash objects is not the concern here, but rather, avoiding unnecessary copying of large objects.</div><div><br /></div><div>Using points to solve the problem isn't a good idea:</div><div><br /></div><div><br /><pre><span style="background-color: white; color: maroon; font-weight: bold;">bool</span><span style="background-color: white;"> f</span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white; color: maroon; font-weight: bold;">const</span><span style="background-color: white;"> Matrix</span><span style="background-color: white; color: #808030;">&</span><span style="background-color: white; color: #808030;">,</span><span style="background-color: white;"> </span><span style="background-color: white; color: maroon; font-weight: bold;">const</span><span style="background-color: white;"> Graph</span><span style="background-color: white; color: #808030;">&</span><span style="background-color: white; color: #808030;">)</span><span style="background-color: white; color: purple;">;</span><span style="background-color: white;"><br /></span><span style="background-color: white; color: maroon; font-weight: bold;">typedef</span><span style="background-color: white;"> </span><span style="background-color: white; color: #666616;">std</span><span style="background-color: white; color: purple;">::</span><span style="background-color: white; color: #603000;">pair</span><span style="background-color: white; color: purple;"><</span><span style="background-color: white; color: maroon; font-weight: bold;">const</span><span style="background-color: white;"> Matrix</span><span style="background-color: white; color: #808030;">*</span><span style="background-color: white; color: #808030;">,</span><span style="background-color: white; color: maroon; font-weight: bold;">const</span><span style="background-color: white;"> Graph</span><span style="background-color: white; color: #808030;">*</span><span style="background-color: white; color: purple;">></span><span style="background-color: white;"> inputs</span><span style="background-color: white; color: purple;">;</span><span style="background-color: white;"><br /></span><span style="background-color: white; color: dimgrey;">// pair_ptr_cmp defined elsewhere, dereferences the pointers inside the pair and compares them</span><span style="background-color: white;"><br /><br /></span><span style="background-color: white; color: maroon; font-weight: bold;">bool</span><span style="background-color: white;"> memo_f</span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white; color: #666616;">std</span><span style="background-color: white; color: purple;">::</span><span style="background-color: white; color: #603000;">map</span><span style="background-color: white; color: purple;"><</span><span style="background-color: white;">inputs</span><span style="background-color: white; color: #808030;">,</span><span style="background-color: white; color: maroon; font-weight: bold;">bool</span><span style="background-color: white; color: #808030;">,</span><span style="background-color: white;">pair_ptr_cmp</span><span style="background-color: white; color: purple;">></span><span style="background-color: white; color: #808030;">&</span><span style="background-color: white;"> memo_tbl</span><span style="background-color: white; color: #808030;">,</span><span style="background-color: white;"> </span><span style="background-color: white; color: maroon; font-weight: bold;">const</span><span style="background-color: white;"> Matrix</span><span style="background-color: white; color: #808030;">&</span><span style="background-color: white;"> m</span><span style="background-color: white; color: #808030;">,</span><span style="background-color: white;"> </span><span style="background-color: white; color: maroon; font-weight: bold;">const</span><span style="background-color: white;"> Graph</span><span style="background-color: white; color: #808030;">&</span><span style="background-color: white;"> g</span><span style="background-color: white; color: #808030;">)</span><span style="background-color: white;"><br /></span><span style="background-color: white; color: purple;">{</span><span style="background-color: white;"><br /> inputs p</span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white; color: #808030;">&</span><span style="background-color: white;">m</span><span style="background-color: white; color: #808030;">,</span><span style="background-color: white; color: #808030;">&</span><span style="background-color: white;">g</span><span style="background-color: white; color: #808030;">)</span><span style="background-color: white; color: purple;">;</span><span style="background-color: white;"><br /> </span><span style="background-color: white; color: maroon; font-weight: bold;">auto</span><span style="background-color: white;"> at </span><span style="background-color: white; color: #808030;">=</span><span style="background-color: white;"> memo_tbl</span><span style="background-color: white; color: #808030;">.</span><span style="background-color: white; color: #603000;">find</span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white;">p</span><span style="background-color: white; color: #808030;">)</span><span style="background-color: white; color: purple;">;</span><span style="background-color: white;"><br /> </span><span style="background-color: white; color: maroon; font-weight: bold;">if</span><span style="background-color: white;"> </span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white;">at </span><span style="background-color: white; color: #808030;">=</span><span style="background-color: white; color: #808030;">=</span><span style="background-color: white;"> memo_tbl</span><span style="background-color: white; color: #808030;">.</span><span style="background-color: white;">end</span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white; color: #808030;">)</span><span style="background-color: white; color: #808030;">)</span><span style="background-color: white;"> </span><span style="background-color: white; color: purple;">{</span><span style="background-color: white;"><br /> </span><span style="background-color: white; color: dimgrey;">// Not previously computed, compute and store</span><span style="background-color: white;"><br /> </span><span style="background-color: white; color: maroon; font-weight: bold;">bool</span><span style="background-color: white;"> ans </span><span style="background-color: white; color: #808030;">=</span><span style="background-color: white;"> f</span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white; color: #808030;">*</span><span style="background-color: white;">m</span><span style="background-color: white; color: #808030;">,</span><span style="background-color: white; color: #808030;">*</span><span style="background-color: white;">g</span><span style="background-color: white; color: #808030;">)</span><span style="background-color: white; color: purple;">;</span><span style="background-color: white;"><br /> </span><span style="background-color: white; color: dimgrey;">// Note safe to store soft pointers to external objects, so we need to copy</span><span style="background-color: white;"><br /> p </span><span style="background-color: white; color: #808030;">=</span><span style="background-color: white;"> inputs</span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white; color: maroon; font-weight: bold;">new</span><span style="background-color: white;"> Matrix</span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white; color: #808030;">*</span><span style="background-color: white;">m</span><span style="background-color: white; color: #808030;">)</span><span style="background-color: white; color: #808030;">,</span><span style="background-color: white; color: maroon; font-weight: bold;">new</span><span style="background-color: white;"> Graph</span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white; color: #808030;">*</span><span style="background-color: white;">g</span><span style="background-color: white; color: #808030;">)</span><span style="background-color: white; color: #808030;">)</span><span style="background-color: white; color: purple;">;</span><span style="background-color: white;"><br /> at </span><span style="background-color: white; color: #808030;">=</span><span style="background-color: white;"> memo_tbl</span><span style="background-color: white; color: #808030;">.</span><span style="background-color: white;">insert</span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white; color: #666616;">std</span><span style="background-color: white; color: purple;">::</span><span style="background-color: white;">make_pair</span><span style="background-color: white; color: #808030;">(</span><span style="background-color: white;">p</span><span style="background-color: white; color: #808030;">,</span><span style="background-color: white;">ans</span><span style="background-color: white; color: #808030;">))</span><span style="background-color: white; color: #808030;">.</span><span style="background-color: white;">first<span style="color: purple;">;</span></span></pre><pre><span style="background-color: white;"> </span><span style="background-color: white; font-style: italic; font-weight: bold;">}</span><span style="background-color: white;"><br /> </span><span style="background-color: white; color: maroon; font-weight: bold;">return</span><span style="background-color: white;"> at</span><span style="background-color: white; color: #808030;">-</span><span style="background-color: white; color: #808030;">></span><span style="background-color: white;">second</span><span style="background-color: white; color: purple;">;</span><span style="background-color: white;"><br /></span><span style="background-color: white; font-style: italic; font-weight: bold;">}</span></pre><pre style="background-color: white;"><span style="background-color: #dd0000; background-position: initial initial; background-repeat: initial initial; color: white; font-style: italic; font-weight: bold;"><br /></span></pre></div><div>This works, except we need to define the awkward pair_ptr_cmp, and, the entire block starting right after our declaration of memo_tbl (not shown above) until it won't be used anymore will have to be wrapped in a try - catch (...) block:</div><div><br /></div><div><br /><pre style="background-color: white;"><span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">map</span><span style="color: purple;"><</span>inputs<span style="color: #808030;">,</span><span style="color: maroon; font-weight: bold;">bool</span><span style="color: #808030;">,</span>pair_ptr_cmp<span style="color: purple;">></span> memo_tbl<span style="color: purple;">;</span><br /><span style="color: maroon; font-weight: bold;">try</span> <span style="color: purple;">{</span><br /> <span style="color: dimgrey;">// Want to call f with Matrix m and graph g</span><br /> <span style="color: maroon; font-weight: bold;">bool</span> b <span style="color: #808030;">=</span> memo_f<span style="color: #808030;">(</span>memo_tbl<span style="color: #808030;">,</span> m<span style="color: #808030;">,</span>g<span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> <span style="color: dimgrey;">// ...</span><br /><span style="color: purple;">}</span> <span style="color: maroon; font-weight: bold;">catch</span> <span style="color: #808030;">(</span><span style="color: #808030;">.</span><span style="color: #808030;">.</span><span style="color: #808030;">.</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span><br /> <span style="color: maroon; font-weight: bold;">for</span> <span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">auto</span><span style="color: #808030;">&</span> p <span style="color: purple;">:</span> memo_tbl<span style="color: #808030;">)</span> <span style="color: purple;">{</span><br /> <span style="color: maroon; font-weight: bold;">delete</span> p<span style="color: #808030;">.</span>first<span style="color: purple;">;</span><br /> <span style="color: maroon; font-weight: bold;">delete</span> p<span style="color: #808030;">.</span>second<span style="color: purple;">;</span><br /> <span style="color: purple;">}</span><br /> <span style="color: maroon; font-weight: bold;">throw</span><span style="color: purple;">;</span><br /><span style="color: purple;">}</span></pre><pre style="background-color: white;"><span style="color: purple;"><br /></span></pre></div><div>Needless to say, this is hard to read, error-prone, and just bad C++.</div><div><br /></div><div>A good solution I've found is to use the concept of partial function application. The idea is simple: we may curry a 2 dimensional function (X,Y) -> f(X,Y) as X -> (Y -> f(X,Y)). That is, instead of taking two arguments from domain (of type) X and Y, it takes one argument from X and returns a function. This new function, takes the second argument, Y, and returns the desired f(X,Y).</div><div>Some code will make the point clear, and also demonstrate its elegance:</div><div><br /></div><div><div><br /><pre style="background-color: white;"><span style="color: maroon; font-weight: bold;">typedef</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">string</span> arg1_type<span style="color: purple;">;</span><br /><span style="color: maroon; font-weight: bold;">typedef</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">vector</span><span style="color: purple;"><</span><span style="color: maroon; font-weight: bold;">int</span><span style="color: purple;">></span> arg2_type<span style="color: purple;">;</span><br /><span style="color: maroon; font-weight: bold;">typedef</span> <span style="color: maroon; font-weight: bold;">bool</span> ret_type<span style="color: purple;">;</span><br /><br /><span style="color: dimgrey;">// Function to call</span><br /><span style="color: maroon; font-weight: bold;">bool</span> function<span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">const</span> arg1_type<span style="color: #808030;">&</span> a1<span style="color: #808030;">,</span> <span style="color: maroon; font-weight: bold;">const</span> arg2_type<span style="color: #808030;">&</span> a2<span style="color: #808030;">)</span><span style="color: purple;">;</span><br /><br /><span style="color: dimgrey;">// Memoization of function</span><br /><span style="color: maroon; font-weight: bold;">typedef</span> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">map</span><span style="color: #808030;"><</span>arg1_type<span style="color: #808030;">,</span><span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">map</span><span style="color: #808030;"><</span>arg2_type<span style="color: #808030;">,</span>ret_type<span style="color: #808030;">></span><span style="color: #808030;">></span> memo_tbl<span style="color: purple;">;</span><br />ret_type memo_call<span style="color: #808030;">(</span>memo_tbl<span style="color: #808030;">&</span> f<span style="color: #808030;">,</span> <span style="color: maroon; font-weight: bold;">const</span> arg1_type<span style="color: #808030;">&</span> a1<span style="color: #808030;">,</span> <span style="color: maroon; font-weight: bold;">const</span> arg2_type<span style="color: #808030;">&</span> a2<span style="color: #808030;">)</span><span style="color: purple;">;</span><br /><br />ret_type memo_call<span style="color: #808030;">(</span>memo_tbl<span style="color: #808030;">&</span> f<span style="color: #808030;">,</span> <span style="color: maroon; font-weight: bold;">const</span> arg1_type<span style="color: #808030;">&</span> a1<span style="color: #808030;">,</span> <span style="color: maroon; font-weight: bold;">const</span> arg2_type<span style="color: #808030;">&</span> a2<span style="color: #808030;">)</span><br /><span style="color: purple;">{</span><br /> ret_type ans<span style="color: purple;">;</span><br /> <span style="color: maroon; font-weight: bold;">auto</span> at <span style="color: #808030;">=</span> f<span style="color: #808030;">.</span><span style="color: #603000;">find</span><span style="color: #808030;">(</span>a1<span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> <span style="color: maroon; font-weight: bold;">if</span> <span style="color: #808030;">(</span>at <span style="color: #808030;">!</span><span style="color: #808030;">=</span> f<span style="color: #808030;">.</span>end<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span><br /> <span style="color: maroon; font-weight: bold;">auto</span> at2 <span style="color: #808030;">=</span> at<span style="color: #808030;">-</span><span style="color: #808030;">></span>second<span style="color: #808030;">.</span><span style="color: #603000;">find</span><span style="color: #808030;">(</span>a2<span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> <span style="color: maroon; font-weight: bold;">if</span> <span style="color: #808030;">(</span>at2 <span style="color: #808030;">!</span><span style="color: #808030;">=</span> at<span style="color: #808030;">-</span><span style="color: #808030;">></span>second<span style="color: #808030;">.</span>end<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span> <span style="color: purple;">{</span><br /> ans <span style="color: #808030;">=</span> at2<span style="color: #808030;">-</span><span style="color: #808030;">></span>second<span style="color: purple;">;</span><br /> <span style="color: purple;">}</span> <span style="color: maroon; font-weight: bold;">else</span> <span style="color: purple;">{</span><br /> <span style="color: dimgrey;">// map for a1 exists but value of a2 not stored</span><br /> ans <span style="color: #808030;">=</span> function<span style="color: #808030;">(</span>a1<span style="color: #808030;">,</span>a2<span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> at<span style="color: #808030;">-</span><span style="color: #808030;">></span>second<span style="color: #808030;">.</span>insert<span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>make_pair<span style="color: #808030;">(</span>a2<span style="color: #808030;">,</span>ans<span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> <span style="color: purple;">}</span><br /> <span style="color: purple;">}</span> <span style="color: maroon; font-weight: bold;">else</span> <span style="color: purple;">{</span><br /> <span style="color: dimgrey;">// map for a2 (and a2) does not exist</span><br /> ans <span style="color: #808030;">=</span> function<span style="color: #808030;">(</span>a1<span style="color: #808030;">,</span>a2<span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">map</span><span style="color: purple;"><</span>arg2_type<span style="color: #808030;">,</span>ret_type<span style="color: purple;">></span> m<span style="color: purple;">;</span><br /> m<span style="color: #808030;">.</span>insert<span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>make_pair<span style="color: #808030;">(</span>a2<span style="color: #808030;">,</span>ans<span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> f<span style="color: #808030;">.</span>insert<span style="color: #808030;">(</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>make_pair<span style="color: #808030;">(</span>a1<span style="color: #808030;">,</span><span style="color: #666616;">std</span><span style="color: purple;">::</span>move<span style="color: #808030;">(</span>m<span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /> <span style="color: purple;">}</span><br /> <span style="color: maroon; font-weight: bold;">return</span> ans<span style="color: purple;">;</span><br /><span style="color: purple;">}</span></pre><pre style="background-color: white;"><span style="color: purple;"><br /></span></pre></div></div><div>We can now test it:</div><div><br /></div><div><div><br /><pre style="background-color: white;"><span style="color: maroon; font-weight: bold;">static</span> <span style="color: maroon; font-weight: bold;">int</span> counter <span style="color: #808030;">=</span> <span style="color: #008c00;">0</span><span style="color: purple;">;</span> <span style="color: dimgrey;">// function keeps track of number of calls</span><br /><span style="color: maroon; font-weight: bold;">bool</span> function<span style="color: #808030;">(</span><span style="color: maroon; font-weight: bold;">const</span> arg1_type<span style="color: #808030;">&</span> a1<span style="color: #808030;">,</span> <span style="color: maroon; font-weight: bold;">const</span> arg2_type<span style="color: #808030;">&</span> a2<span style="color: #808030;">)</span><br /><span style="color: purple;">{</span><br /> <span style="color: #808030;">+</span><span style="color: #808030;">+</span>counter<span style="color: purple;">;</span><br /> <span style="color: maroon; font-weight: bold;">return</span> a1<span style="color: #808030;">.</span>size<span style="color: #808030;">(</span><span style="color: #808030;">)</span> <span style="color: #808030;">=</span><span style="color: #808030;">=</span> a2<span style="color: #808030;">.</span>size<span style="color: #808030;">(</span><span style="color: #808030;">)</span><span style="color: purple;">;</span><br /><span style="color: purple;">}</span><br /><br /><span style="color: maroon; font-weight: bold;">int</span> <span style="color: #400000;">main</span><span style="color: #808030;">(</span><span style="color: #808030;">)</span><br /><span style="color: purple;">{</span><br /> memo_tbl memo_tbl<span style="color: purple;">;</span><br /> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">cerr</span> <span style="color: #808030;"><</span><span style="color: #808030;"><</span> memo_call<span style="color: #808030;">(</span>memo_tbl<span style="color: #808030;">,</span>arg1_type<span style="color: #808030;">(</span><span style="color: maroon;">"</span><span style="color: #0000e6;">Hello World!</span><span style="color: maroon;">"</span><span style="color: #808030;">)</span><span style="color: #808030;">,</span>arg2_type<span style="color: #808030;">(</span><span style="color: #008c00;">10</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span> <span style="color: #808030;"><</span><span style="color: #808030;"><</span> <span style="color: maroon;">"</span><span style="color: #0f69ff;">\n</span><span style="color: maroon;">"</span><span style="color: purple;">;</span><br /> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">cerr</span> <span style="color: #808030;"><</span><span style="color: #808030;"><</span> memo_call<span style="color: #808030;">(</span>memo_tbl<span style="color: #808030;">,</span>arg1_type<span style="color: #808030;">(</span><span style="color: maroon;">"</span><span style="color: #0000e6;">Hello World!</span><span style="color: maroon;">"</span><span style="color: #808030;">)</span><span style="color: #808030;">,</span>arg2_type<span style="color: #808030;">(</span><span style="color: #008c00;">12</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span> <span style="color: #808030;"><</span><span style="color: #808030;"><</span> <span style="color: maroon;">"</span><span style="color: #0f69ff;">\n</span><span style="color: maroon;">"</span><span style="color: purple;">;</span><br /> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">cerr</span> <span style="color: #808030;"><</span><span style="color: #808030;"><</span> memo_call<span style="color: #808030;">(</span>memo_tbl<span style="color: #808030;">,</span>arg1_type<span style="color: #808030;">(</span><span style="color: maroon;">"</span><span style="color: #0000e6;">Hello World!</span><span style="color: maroon;">"</span><span style="color: #808030;">)</span><span style="color: #808030;">,</span>arg2_type<span style="color: #808030;">(</span><span style="color: #008c00;">10</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span> <span style="color: #808030;"><</span><span style="color: #808030;"><</span> <span style="color: maroon;">"</span><span style="color: #0f69ff;">\n</span><span style="color: maroon;">"</span><span style="color: purple;">;</span><br /> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">cerr</span> <span style="color: #808030;"><</span><span style="color: #808030;"><</span> memo_call<span style="color: #808030;">(</span>memo_tbl<span style="color: #808030;">,</span>arg1_type<span style="color: #808030;">(</span><span style="color: maroon;">"</span><span style="color: #0000e6;">Bye World!</span><span style="color: maroon;">"</span><span style="color: #808030;">)</span><span style="color: #808030;">,</span>arg2_type<span style="color: #808030;">(</span><span style="color: #008c00;">10</span><span style="color: #808030;">)</span><span style="color: #808030;">)</span> <span style="color: #808030;"><</span><span style="color: #808030;"><</span> <span style="color: maroon;">"</span><span style="color: #0f69ff;">\n</span><span style="color: maroon;">"</span><span style="color: purple;">;</span><br /><br /> <span style="color: #666616;">std</span><span style="color: purple;">::</span><span style="color: #603000;">cerr</span> <span style="color: #808030;"><</span><span style="color: #808030;"><</span> <span style="color: maroon;">"</span><span style="color: #0000e6;">Counter: </span><span style="color: maroon;">"</span> <span style="color: #808030;"><</span><span style="color: #808030;"><</span> counter <span style="color: #808030;"><</span><span style="color: #808030;"><</span> <span style="color: maroon;">"</span><span style="color: #0f69ff;">\n</span><span style="color: maroon;">"</span><span style="color: purple;">;</span><br /><span style="color: purple;">}</span></pre></div></div><div><br /></div><div>Output is:</div><div><div>0</div><div>1</div><div>0</div><div>1</div><div>Counter: 3</div></div><div><br /></div><div>The method generalizes to n-arity functions in the obvious way: just nest std::maps inside each other.<br /><br /><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com1tag:blogger.com,1999:blog-6767898351325918278.post-14157806124461075322013-02-17T02:53:00.000+08:002013-02-18T15:46:29.338+08:00Misconceptions you may have<div dir="ltr" style="text-align: left;" trbidi="on"><br /><h2 style="text-align: left;">If the Universe is infinite, there must be a copy of me out there.</h2>False. An infinite universe doesn't mean that everything necessarily repeats.<br /><br />The incorrect assumption here is that <b>all </b>repetitions must occur. If the universe is infinite but this is the only populated planet (all others contain no life, or at least no human life), then clearly there is no copy of you.<br /><br />What is true is that if there are only a limited number of configurations per space unit (say, all possible quantum states), then at least one such configuration must repeat infinitely in an infinite universe. However, it is possible that the state of total vacuum repeats infinitely much, so that outside a finite region of the universe (our observable one), there is only empty space. Universe is infinite, but most of it is empty and there is only one copy of you.<br /><br />Whether these scenarios are plausible or not is another question, but logic and the laws of physics (as we understand them today) doesn't force multiple copies of you just because the universe is infinite.<br /><br /><h2 style="text-align: left;">If something has non-zero probability of occurring, it will eventually happen for sure.</h2>False.<br /><br />Well, if the event has <b>fixed</b> non-zero probability (it doesn't change with time), then the event is bound to happen eventually.<br /><br />If, however, the event has a time-dependent probability (it changes with time, as is usually the case in real life), this is false, even if the probabilities are always non-zero. The trick is that if the time-dependent probability is lowered fast enough, there is not necessarily a 100% chance of it happening at any point in time. Consider the event "World War breaks out", with probability <a href="http://www.codecogs.com/eqnedit.php?latex=1-e^{-\frac{1}{2^{k@plus;1}}}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?1-e^{-\frac{1}{2^{k+1}}}" title="1-e^{-\frac{1}{2^{k+1}}}" /></a> of breaking out in year k (it may happen more than once, but the events across years do not affect each other for the sake of simplicity). For the first year, the probability is <a href="http://www.codecogs.com/eqnedit.php?latex=1-e^{-\frac{1}{2^{2}}}%20\approx%2022\%" target="_blank"><img src="http://latex.codecogs.com/gif.latex?1-e^{-\frac{1}{2^{2}}} \approx 22\%" title="1-e^{-\frac{1}{2^{2}}} \approx 22\%" /></a>. In the second year, the probability drops to <a href="http://www.codecogs.com/eqnedit.php?latex=1-e^{-\frac{1}{8}}%20\approx%2012\%" target="_blank"><img src="http://latex.codecogs.com/gif.latex?1-e^{-\frac{1}{8}} \approx 12\%" title="1-e^{-\frac{1}{8}} \approx 12\%" /></a>, then to <a href="http://www.codecogs.com/eqnedit.php?latex=1-e^{-\frac{1}{16}}%20\approx%206\%" target="_blank"><img src="http://latex.codecogs.com/gif.latex?1-e^{-\frac{1}{16}} \approx 6\%" title="1-e^{-\frac{1}{16}} \approx 6\%" /></a> in the third year, and so on. The probability that World War never breaks out is then <a href="http://www.codecogs.com/eqnedit.php?latex=\prod%20e^{-\frac{1}{2^{k@plus;1}}}%20=%20e^{\ln%20\prod%20\exp\left(-\frac{1}{2^{k@plus;1}}\right)}%20=%20e^{\sum%20-\frac{1}{2^{k@plus;1}}}%20=%20e^{-\frac{1/2}{1-1/2}}%20=%20e^{-1}%20\approx%2037\%" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\prod e^{-\frac{1}{2^{k+1}}} = e^{\ln \prod \exp\left(-\frac{1}{2^{k+1}}\right)} = e^{\sum -\frac{1}{2^{k+1}}} = e^{-\frac{1/2}{1-1/2}} = e^{-1} \approx 37\%" title="\prod e^{-\frac{1}{2^{k+1}}} = e^{\ln \prod \exp\left(-\frac{1}{2^{k+1}}\right)} = e^{\sum -\frac{1}{2^{k+1}}} = e^{-\frac{1/2}{1-1/2}} = e^{-1} \approx 37\%" /></a>, so the probability that World War will break out (at least once) is 63%. Not for sure at all.<br /><br /><h2 style="text-align: left;">Everything that has a beginning must have an end.</h2><div>False. The natural numbers have a beginning at 0, but, as there is no largest number, there is no end:</div><div>beginning ---> 0,1,2,3,4,5,6,.... ---> just keeps counting, no end.</div><div>Time itself may have a beginning in the big bang, but not necessarily an end (even when the universe reaches its maximal entropy state, quantum fluctuations ensure that change still occurs).</div><div><br /></div><h2 style="text-align: left;">There's only one size of infinity.</h2>False. There's an infinity of infinities. You can't count infinities, but you can pair one value from each of the two infinities, and see if there's anything left in either afterwards. This is analogous to determining whether there are more passengers or seats in an airplane by asking everything to set down, and check whether some people are still standing, or some seats available after the process. You don't count, you only compare relatively.<br /><br />In this sense the set of natural numbers is smaller than the set of real numbers. That's two different infinities. To keep going, one shows that the power set is always larger than the original (infinite) set. The power set is the set of all (unordered) permutations of the original set. For example, the power set of {1,2} is { {}, {1}, {2}, {1,2} }. Hence an infinite hierarchy of sets.<br /><br />Another example: when you enumerate numbers as 0,1,2,3,4,... you will eventually reach the first infinite number (ordinal), but logically speaking, there's nothing preventing you from continuing by doing infinity+1, infinity+2,... until you reach 2*infinity. By the same logic, then, you can have infinity*infinity, infinity^infinity, and so on.<br /><br />Do these "higher" infinities exist in the real world? Maybe not. But in that case, the real numbers are not real at all, since they belong to a higher infinity...<br /><br />See ordinal and cardinal numbers for more information.<br /><br /><h2>Computers only deal with numbers. They can't know that <a href="http://www.codecogs.com/eqnedit.php?latex=\sqrt{\pi^2}=\pi" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\sqrt{\pi^2}=\pi" title="\sqrt{\pi^2}=\pi" /></a>, as we do.</h2>False. This is based on the assumption that the only way a computer can compute the equality is by first squaring pi, and then taking the square root of that number.<br /><br />Now, pi has an infinite non-repeating decimal sequence (it is an irrational number; in fact transcendental), so it is not even possible for a computer to store all the decimals of pi. This reasoning is correct; what is incorrect is the assumption that the computer is limited to only performing such calculations when determining the equality.<br /><br />Just like humans, the computer can store the algebraic rewrite rules<br /><a href="http://www.codecogs.com/eqnedit.php?latex=\sqrt{X}%20\longrightarrow%20X^{\frac{1}{2}}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\sqrt{X}%20\longrightarrow%20X^{\frac{1}{2}}" title="\sqrt{X} \longrightarrow X^{\frac{1}{2}}" /></a> , <a href="http://www.codecogs.com/eqnedit.php?latex=\left(X^p\right)^q%20\longrightarrow%20X^{pq}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\left(X^p\right)^q%20\longrightarrow%20X^{pq}" title="\left(X^p\right)^q \longrightarrow X^{pq}" /></a> , <a href="http://www.codecogs.com/eqnedit.php?latex=X%20\frac{1}{Y}%20\longrightarrow%20\frac{X}{Y}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?X%20\frac{1}{Y}%20\longrightarrow%20\frac{X}{Y}" title="X \frac{1}{Y} \longrightarrow \frac{X}{Y}" /></a> , <a href="http://www.codecogs.com/eqnedit.php?latex=\frac{X}{X}%20\longrightarrow%201" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\frac{X}{X}%20\longrightarrow%201" title="\frac{X}{X} \longrightarrow 1" /></a>, and <a href="http://www.codecogs.com/eqnedit.php?latex=X^1%20\longrightarrow%20X" target="_blank"><img src="http://latex.codecogs.com/gif.latex?X^1%20\longrightarrow%20X" title="X^1 \longrightarrow X" /></a><br /><br /> from which it can easily deduce the equality: <a href="http://www.codecogs.com/eqnedit.php?latex=\sqrt{\pi^2}%20\longrightarrow%20(\pi^2)^\frac{1}{2}%20\longrightarrow%20\pi^{2\frac{1}{2}}%20\longrightarrow%20\pi^{\frac{2}{2}}%20\longrightarrow%20\pi^1%20\longrightarrow%20\pi" target="_blank"><img src="http://latex.codecogs.com/gif.latex?\sqrt{\pi^2} \longrightarrow (\pi^2)^\frac{1}{2} \longrightarrow \pi^{2\frac{1}{2}} \longrightarrow \pi^{\frac{2}{2}} \longrightarrow \pi^1 \longrightarrow \pi" title="\sqrt{\pi^2} \longrightarrow (\pi^2)^\frac{1}{2} \longrightarrow \pi^{2\frac{1}{2}} \longrightarrow \pi^{\frac{2}{2}} \longrightarrow \pi^1 \longrightarrow \pi" /></a><br /><br />In other words, the computer can do it in just the same way as humans do it. And just like the computer can't store the infinite decimal sequence of pi and square that, neither can we.<br /><div><br /></div><h2 style="text-align: left;"><div style="text-align: left;">All numbers have a unique representation. Thus 0.99999... = 1 is not true.<br /><div style="text-align: left;"><span style="font-family: inherit; font-size: small;"><br /></span></div></div><div style="font-weight: normal; text-align: left;"><span style="font-family: inherit; font-size: small;">False. When you write 0.999..., the ellipsis is intended to mean "the 9s continue forever". This is thus a compact way to write 0.9 + 0.09 + 0.009 + ... This infinite series equals 1, so 0.99999... = 1 is in fact true. (Mathematically, the sentence asserts that the partial sums converge to 1, which is true.)</span><br /><span style="font-family: inherit; font-size: small;"><br /></span><span style="font-family: inherit; font-size: small;">The number 1 can be represented in many other (decimal) ways: 1.0, 1.00, 1.000, and so on. If we only use infinite decimals, there are only two ways to represent each number. For example, the number 12.345 can be represented as 12.3450000... or as 12.3449999...</span><br /><div style="font-size: medium;"><br /></div></div></h2><h2 style="text-align: left;">If we can compute an infinity of things in finite time, we can break any problem.</h2>False. Assume for instance we have a hypercomputer blackbox that decides whether any given algorithm (not using the hypercomputer) terminates on all inputs or not (it solves the Halting problem). Such a hypercomputer cannot actually decide whether machines equipped with its own blackbox functionality terminates on all inputs, and thus, it cannot decide this particular "hyper" halting problem.<br /><br />See arithmetical hierarchy for more information.<br /><div><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com9tag:blogger.com,1999:blog-6767898351325918278.post-34991606876741031542013-01-31T01:25:00.001+08:002013-01-31T03:26:39.910+08:00Modes of Reasoning: Deductive, Inductive, and Abductive<div dir="ltr" style="text-align: left;" trbidi="on">Basically, there's three ways of reasoning about things: deductively, inductively, and abductively.<br /><div><br /></div><div>In this post I'll explain the difference between them and how they fit into mathematics, science, and the scientific method.</div><div><br /></div><div>Deduction: Given assumptions, reach a conclusion.</div><div>Example. We know (Prolog code shown in brackets):</div><div>1. All men are mortal. [ mortal(X) :- man(X). ]</div><div>2. Socrates is a man. [ man(socrates). ]</div><div>Thus, we can conclude:</div><div>Socrates is mortal. [ mortal(socrates). ]</div><div><br /></div><div>Put differently, whenever we look at a logical entailment of the form P => Q, deduction is the process of reaching Q (whatever that may be) given P.</div><div><br /></div><div>Mathematics is largely deductive: we start with a set of assumptions (called axioms). At the lowest level these encode the nature of numbers, sets, or other objects that are fundamental to whatever branch of mathematics we want to study (the most well studied axioms are Peano Arithmetic and ZFC).</div><div><br /></div><div>This set of axioms is our P. In Peano Arithmetic, for example, one axiom is:</div><div>A1. "0 is a natural number" [ natural(0). ].</div><div>Another is:</div><div>A2. "if X is a natural number, so is its successor" [ natural(s(X)) :- natural(X). ]</div><div><br /></div><div>Taken together, these define a sequence (0,s(0),s(s(0)),s(s(s(s(0)))),...) which correspond to the natural numbers (0,1,2,3,4,...). Thus 4 is just seen as a convenient way to express the object s(s(s(s(0)))).</div><div><br /></div><div>From these two axioms, we can prove "2 is a natural number" [ natural(s(s(0))). ], as follow:</div><div>Assume 2 is not a natural number [ not natural(s(s(0))). ]</div><div>Then (by A2), 1 is not a natural number [ not natural(s(0)). ]</div><div>Then (by A2), 0 is not a natural number [ not natural(0). ]</div><div>But by A1, 0 is a natural number, a contradiction.</div><div>Thus, 2 is a natural number.</div><div>(I choose a proof by contradiction rather than the more obvious direct proof to also illustrate exactly how Prolog would prove it mechanically.)</div><div><br /></div><div>So from our axioms A1 and A2 (our P in P => Q), we have prove that "2 is a natural number" (our Q).</div><div><br /></div><div>Here's an important point: mathematical induction is not inductive reasoning. Mathematical induction, in fact, is a form of deduction. More specifically, mathematical induction is a rule for drawing a conclusion from known facts (called an inference rule), of the following form:</div><div>For any predicate p(X) (here p(X) could stand for "X is prime" or pretty much anything), if we know that:</div><div>1. p(0) [ p(0). ]</div><div>2. For all naturals X, if p(X), then P(X+1). [ p(s(X)) :- natural(X), p(X). ]</div><div>Then, we may conclude that:</div><div>For all naturals X, P(X). [ p(X) :- natural(X). ]</div><div><br /></div><div>For example, take p(X) to mean "X is equal or greater than 0". First, we observe that p(0) is trivially true (since 0 is equal to 0"). Next, assume we know that X is equal or greater than 0. Then, since X+1 is bigger than X, which in turn is equal or greater than 0, will also be equal or greater than 0 (in fact, always greater).</div><div>Thus, if p(X), it is also true that p(X+1). These are the two conditions that must be satisfied to apply mathematical induction. Now we may conclude that p(X) is true for all natural numbers X, that is, that all natural numbers are equal or greater than 0. (Put differently, the natural numbers are lower bounded by 0, in contrast to, say, the integers which have infinite descent.)</div><div><br /></div><div>So deductive reasoning is finding the Q in the entailment "P => Q".</div><div>Abductive reasoning, is the opposite: finding the P given the Q.</div><div>Simply put: Looking at "P => Q":</div><div>Deductive reasoning: P known. Goal: find Q.</div><div>Abductive reasoning: Q known. Goal: find P.</div><div><br /></div><div>Example:</div><div>1. All men are mortal. [ mortal(X) :- man(X). ]</div><div>2. ??? </div><div>Conclusion:</div><div>Socrates is mortal. [ mortal(socrates). ]</div><div><br /></div><div>Abductive reasoning finds that the ??? might be: Socrates is a man [ man(socrates). ]</div><div><br /></div><div>Note that, unlike deductive reasoning, abduction doesn't guarantee that we get the right hypothesis P.</div><div>It may very well be that Socrates is a cat, and, since all cats are mortal too (but this is never stated above), Socrates is mortal.</div><div><br /></div><div>Mathematicians sometimes employ abductive reasoning too. This is because in practice mathematics is not (at least not as of now) done purely deductively, especially not when dealing with very abstract math. For example, to prove the Hahn-Banach theorem in its most general form, one needs to employ an axiom known as the Axiom of Choice (the C in ZFC). This can be considered a form of abductive reasoning:</div><div><br /></div><div>We know that the Q (conclusion) should be the Hahn-Banach theorem (what it says exactly is not relevant here). The P is the ZF axiom system, which is a set of assumptions upon which mathematics is founded.</div><div>So we have "ZF + ? => Hahn-Banach Theorem". We are asking for something "?" that needs to be added to ZF in order to prove the Hahn-Banach Theorem. One such "?" is the Axiom of Choice.</div><div><br /></div><div>Abductive reasoning, in its more formal sense, can thus be seen as the question "What assumption needs to be made in order for this proof to go through?". Or, more informally: "What assumption would explain these observations?".</div><div><br /></div><div>This leads us to the third mode of reasoning: inductive reasoning. This one's easy now: inductive reasoning is a special case of abductive reasoning, namely, the case where you want to add an assumption which is, logically speaking, the same predicate as the conclusion. Example:</div><div><br /></div><div><div>1. All men are mortal. [ mortal(X) :- man(X). ] <-- predicate is "mortal"</div><div>2. ??? <-- part of P.</div><div>Conclusion:</div><div>Socrates is a mortal [ mortal(socrates). ] <-- Q is about mortality</div></div><div><br /></div><div>Possible explanation: ??? should be "Socrates is a man" [ man(socrates). ] <-- P is about manhood.</div><div>As you can see, our explanation P is about manhood (the predicate man), whereas the conclusion deals with mortality (the predicate mortal). This is an example of abductive reasoning that is NOT inductive reasoning.</div><div><br /></div><div>Here's one that is inductive (and also abductive, as all induction is abduction):</div><div><br /></div><div>1. ??? <--- P</div><div>Conclusion</div><div><div><div>1. Swan1 is white. <-- first part of Q, about whiteness</div></div></div><div>2. Swan2 is white. <-- second part of Q, about whiteness</div><div>3. Swan3 is white. <-- last part of Q, about whiteness</div><div><br /></div><div>One possible explanation P is "All swans are white" [ white(X) :- swan(X). ].</div><div>As the explanation is also about whiteness, just like the conclusions, we have an instance of inductive reasoning.</div><div><br /></div><div>The scientific method creates hypothesis (P's) from observations (Q's). Thus, it is a form of abductive reasoning. It tries to find explanations (P) for what is observed (Q).</div><div><br /></div><div>However, these explanations (P) are then tested in experiments in an attempt to falsify them or give them more credibility. The experiment must then test for a specific prediction that can be drawn from this hypothesis P. Thus, we need to produce a prediction (conclusion) C from P, i.e. we are doing P => C, where we know P and are looking for C. In other words, this is a deductive step.</div><div><br /></div><div>The act of performing the experiment on C is not covered by reasoning: it is usually an empirical task covered by engineering skills. The hypothesis P is then either rejected (if the experiment shows that C is not true) or further refined (more abduction) and new predictions made (more deduction). Thus, the scientific method interleaves abduction and deduction, as well as requires that "step into reality" during experimentation.</div><div><br /></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0tag:blogger.com,1999:blog-6767898351325918278.post-14233856289410774102012-10-16T20:17:00.000+08:002014-05-27T22:15:17.559+08:00The Big Questions, Answered<div dir="ltr" style="text-align: left;" trbidi="on"><br />For as long as we've existed, mankind has been asking the difficult questions of our place in this world: How did it all begin? When? How big is the universe? How will it end? When?<br /><br />Still today, people ask themselves the same questions. What many may not know, is that most of these questions have meaningful scientific answers now.<br /><br /><h2>How did it all begin?</h2><div>Everything in the universe (including spacetime itself) was compressed to an extremely tiny pea, which "exploded" out into our existence. I'm writing "explosion" in quotes because space itself expanded, there wasn't (necessarily) anything outside of the expansion. This "explosion" is known as the Big Bang.</div><div><br /></div><div><h2>When?</h2></div><div>13.7 billion years ago.</div><div><br /></div><h2>How big is the Universe?</h2><div>The exact size is not known, but at the very least 90 billion light years in diameter (from one end to the other). Put differently, at least 45 billion light years in every direction from us (radius). A light year is the distance it takes for light to travel for one year, which is exactly 9460730472580800 meters. Multiply that by 45 billion to get the radius.</div><div><br />(If you expected the answer to be 13.7 billion light years in radius, the reason it's not is due to the expansion of the universe. See the answer to the next question.)<br /><br /></div><div>The perhaps most promising current theory, eternal inflation, suggests that the universe is infinite when viewed from the inside (as we see it), but is in fact only a "bubble universe" that looks finite to an outside observer. This discrepancy is due to Einstein's General Relativity. (The infinity of things we can see from the inside, that an outside observer does not see, is explained by the fact that those things are spread across different times to the outside observer. Roughly, with infinite time, the outsider sees the infinity of things we insiders see.)</div><div><br /></div><h2>How will it end?</h2><div>The universe is still expanding since the big bang. Not only that, the expansion is going faster and faster (accelerating). The expansion doesn't "move out the edge of the universe". Rather, it stretches out space (but not within galaxies, as their gravitational pull overcomes this stretch).<br /><br /></div><div>Expansion of space means that the universe is cooling down (temperature wise), implying that the universe will freeze to death. More specifically, the disorder (entropy) increases due to the second law of thermodynamics, which, ultimately, will reach its stable state of complete disorderness. Except for some minor quantum fluctuations, the universe has quite literally come to a halt. The universal clock stops.</div><div><br /></div><h2>When?</h2><div>Unknown, but existence of intelligence is most likely impossible after <a href="http://www.codecogs.com/eqnedit.php?latex=10^{100}" target="_blank"><img src="http://latex.codecogs.com/gif.latex?10^{100}" title="10^{100}" /></a> years (a 1 followed by 100 zeros. For comparison, a million is 1 followed by 6 zeros, and a billion is 1 followed by 9 zeros). After that time, the last black holes will have evaporated (no suns or other shining objects are left either), and the universe enters the dark era.</div><div><br /></div><div></div></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0tag:blogger.com,1999:blog-6767898351325918278.post-90428651175647196172012-10-01T06:10:00.000+08:002012-10-05T02:22:51.456+08:00The Logic of Showing up Late<div dir="ltr" style="text-align: left;" trbidi="on">I've been pondering about the logic of showing up on time or late to casual meetings. In my own experience, people tag others as "shows up late" vs "shows up on time" and try to mirror this behavior. This might seem a little odd since it would perhaps be both simpler and better for everybody just to show up in time. But is it odd?<br /><br />Consider the following situation (in the sense of game theory, it is a "game"):<br /><br />We have two people that have decided to meet at a certain time. Let us call them A and B. They can each choose to arrive on time, or come late.<br /><br />If both come on time, nobody has to wait for the other, so there is no "suffering" involved. Hence they both receive 0 "happiness points" each. This is usually called "utility" in more formal contexts.<br /><br />If A shows up in time but B is late, A will have to wait for B, which is less pleasant. However, A will be in a worse mood, which will also make B enjoy their time together less. So A, because he has to wait, will suffer a penalty of -5. B, because A will be in a worse mood, will receive a utility of -2.<br />The exact numbers don't really matter for now: what's relevant is the relative order they have. What we are effectively stating is that a person is better off when not having to wait (0) compared to when he has to wait (-5). Being in worse company, due to having dragged down his or her mood by making them wait, falls in between (-2). Because the situation (game) is symmetric, the roles are reversed when A is late and B on time.<br /><br />If A and B are both late, nobody has to wait for the other, so nobody suffers. They both receive a utility of 0. (Note that this is simplified: it's more realistic to assume that they will not be late by equal amounts, so that one may suffer more than the other. What we're saying is that as both show up late, nobody has to wait for the other for a "significant" amount of time, making it equivalent to both showing up in time for all practical purposes.)<br /><br />To summarize:<br /><table border="1"><tbody><tr><td>Strategy</td><td>B In Time</td><td>B Late</td></tr><tr><td>A In Time</td><td>A: 0, B: 0</td><td>A: -5, B: -2</td></tr><tr><td>A Late</td><td>A: -2, B: -5</td><td>A: 0, B: 0</td></tr></tbody></table><br />The assumption in how they choose strategies is as follows: they are both aware of all the information contained in the table above (the utilities that follow from different combinations of the choices both make). Furthermore, they both try to maximize their own utility, they both know the other is, they both know that the other knows this, etc. In other words, the assumptions are the obvious ones: if I come late and you don't, you're not gonna be happy. I know this. You know this. I know you know it. You know I know it. And so on. (This is called "common knowledge".)<br /><br />What we want to know is the following: which strategy do we select (in time or late) to maximize our own happiness? Looking at the table, it should be clear to you that if I come late, you will want to come late too. If you come in time, you'll have to wait for me, receiving -5. If you come late too, you'll receive 0, which is better. On the other hand, if I come in time, you will want to come in time too, so as not to upset me by making me wait. In other words, this is a coordination game: all we need to do is make sure we both do the same thing.<br /><br />Indeed, if we are both used to come in time when we meet, there is no benefit for any of us to start deviating from this behavior (without informing the other). If you show up late, we'll both be worse off, since you'll ruin my mood and that's gonna bite you back. The same goes for me. If we're both used to showing up late, nobody benefits from coming in time (again, without informing the other), since that just means waiting for the other one to show up. This means that both coming in time, and both coming late, are Nash equilibrium - it is the best each of us can do given the circumstances. Note that the same cannot be said of the two other situations: if I tend to come late and you tend to come in time, we can both be better off by either you starting to come late too, or me starting to show up in time. Thus the asymmetric options are not Nash equilibria.<br /><br />In fact, the equilibria are both strict, meaning switching is always worse. Hence this is not only an equilibria in the sense that it requires rational foresight: any insect could "figure it out" by virtue of having the behavior (strategy) hardwired into its nervous system. (Game theorists call this an Evolutionarily Stable Strategy, or ESS.)<br /><br />This answers the question I posed in the beginning of this post: is it odd that people "remember" which friends tend to come in time and not, only to mirror the behavior? The answer is: no. The reason is: because it doesn't matter which is it - in time or late - as long as both do the same thing, both are better off.<br /><br />You might think that this is an artifact of me having chosen the same utility for both coming in time and both coming late. It is not so. We could change the utility of both showing up late to say -1 for both, reflecting the fact that there is less time to enjoy the meeting compared to both arriving in time. We could even assign a utility of 1 to both arriving late, reflecting less stress in getting ready (or whatever other reason you might find). The point is that the utility of both coming late or in time does not matter to our conclusion: it is identical as long as both being in time or both being late is better than the asymmetric cases.<br /><br />In fact, these coordination situations (games of two players) always have another solution: a randomized strategy ("mixed strategy") of sometimes being on time, and sometimes being late. Not surprisingly, however, such a strategy is not a very good idea, because in this situation the parties do not coordinate their arrivals (if they did, it would not be a randomized strategy, but the cases we just looked at). In game theory jargon, there is a mixed Nash equilibrium, but it is not an ESS. The latter is obvious, since those who randomize are worse off than those who always manage to coordinate their arrivals.<br /><br />Finally, there's the question of which is the better solution: for both to come late, or both to come in time? This depends on the utilities received by the two equilibria. In my table above, both receive 0 when in time, and 0 when late. So both solutions are equally good. However, if both would receive 1 when in time (and still 0 when both are late), it would be better for both to arrive in time. If both were to arrive late, realizing they would both benefit from switching to arriving in time (which is the Pareto optimal solution), they could simply both agree on showing up in time, knowing that the other one would for the simple fact that it benefits him/her more too. For example, when two people dislike to stress, it may be preferable for both to arrive late. For people that have scarcity of time, it may be preferable to arrive in time.<br /><br />However, if the game is not symmetric, there need not be such a Pareto optimal solution: if by both being in time, A receives 1 and B 0, whereas when both are late, it is the opposite: A receives 0 and B 1, well, then A would prefer the solution of both arriving in time, and B the solution of both arriving late. Note that both are still worst off by not coordinating, so the Nash equilibria are still the same: both in time, or both late. All that has changed is that it is no longer ubiquitous which of the two is the "better one", simply because what's better for A is worse for B, and vice versa. However, having an asymmetric situation means that there is a fundamental difference between A and B, each player now knows whether it is "A" or "B" (they can't just be swapped arbitrarily), so it does not represent a situation where "everything is the same for each person". For example, A may prefer to arrive late because she is sensitive to stress. But B may prefer to arrive on time, because he has very little time for procrastination. What Nash equilibria can tell us, is that they will (well, should) coordinate so that they both arrive in time or late. However, which of the two they pick, in this situation (of asymmetry), is a complicated issue not answers by game theory (as far as I know). Nobody said agreeing would always be easy.<br /><br />UPDATE: My friend Mads discussed some (psychological) possibilities for an equilibrium in the asymmetric cases (one is one time, the other is late). We can analyze such situations (of psychological preferences) simply by changing the person's utilities. Let's say that B really hates being on time. The utilities are like before, except that B receives an additional punishment of -3 for coming on time:<br /><br /><table border="1"><tbody><tr><td>Strategy</td><td>B In Time</td><td>B Late</td></tr><tr><td>A In Time</td><td>A: 0, B: -3</td><td>A: -5, B: -2</td></tr><tr><td>A Late</td><td>A: -2, B: -8</td><td>A: 0, B: 0</td></tr></tbody></table><br />One can immediately note that B's strategy of being late is better than being in time, no matter what A does: if A is on time, B receives -2, which is better than -3. If A is late, B receives 0 instead of -8 (this is in fact solely due to the punishment of B being in time when A is in time, no additional punishment would have been needed on top of B having to wait, i.e. receiving -5).<br /><br />This means that B's strategy of being late strictly dominates (it is always better), and hence, We can reduce the problem to:<br /><br /><table border="1"><tbody><tr><td>Strategy</td><td>B Late</td></tr><tr><td>A In Time</td><td>A: -5, B: -2</td></tr><tr><td>A Late</td><td>A: 0, B: 0</td></tr></tbody></table><br />Now, by the same logic, A's strategy of being late strictly dominates being on time, so the only Nash equilibrium will be that both show up late. Intuitively, this is because B hates being on time, and A doesn't mind accommodating for B's need to show up on time.<br /><br />It is presumably harder to be on time than showing up late, so that as soon as one person is a time optimist (meaning he shows up late) both will settle for "an implicit 10 more minutes on top of the decided time".<br /><br />Nevertheless, let's assume that, except for B's dislike of being on time, A actually dislikes being late. We thus have a conflict of interest, which we represent by punishing A with -X more for being late:<br /><br /><table border="1"><tbody><tr><td>Strategy</td><td>B In Time</td><td>B Late</td></tr><tr><td>A In Time</td><td>A: 0, B: -3</td><td>A: -5, B: -2</td></tr><tr><td>A Late</td><td>A: -2-X, B: -8</td><td>A: -X, B: 0</td></tr></tbody></table><br />As before, B's strategy of being late dominates being on time:<br /><br /><table border="1"><tbody><tr><td>Strategy</td><td>B Late</td></tr><tr><td>A In Time</td><td>A: -5, B: -2</td></tr><tr><td>A Late</td><td>A: -X, B: 0</td></tr></tbody></table><br />We now see that what it all boils down to is just how much A dislikes being late (quantified by the value of -X). If he dislikes being late more than to be in time and still wait for B, he will (obviously) actually go there and wait for B. We would then arrive at the asymmetric solution where A arrives on time (knowing B will be late) and B showing up late.<br /><br /></div>John Ahlgrenhttps://plus.google.com/109378830052365013943noreply@blogger.com0