Wednesday, October 26, 2016

Golden ratio, Fibonacci numbers, and sums of squares

Today in my Foundations of Arithmetic class we talked a bit about the Golden Ratio. This is a nice number that appears in many places in mathematics, art, and nature!

The defining property of the Golden Ratio is a self-replicating quality of rectangles: We say that a rectangle has a Golden Ratio, or that its dimensions are in a Golden Ratio if after taking away the biggest square possible, the remaining rectangle is proportional to the original one. 

This means that a rectangle like above will satisfy this condition if

Having this property means in particular that it is possible to carry out this trick forever, that is, since the leftover rectangle is similar to the original one, it is possible to take another square out and have a smaller leftover rectangle which is similar to the original one. Then, the process can be applied once more, and once again, indefinitely.

This recursion does not hold for other ratios. For example, if the ratio between the dimensions is rational, this process actually ends. 

For example, for a rectangle with dimensions $11\times 19$ we have that the process ends after 7 iterations

That is, after 7 times we had no leftover rectangle! The light blue $1\times 1$ leftover was a square already, so it is not possible to repeat the process. This happens because $\frac{19}{11}$ is a rational number. In the case of the golden ratio, it is an irrational number, 


so we can never finish the process, as there will always be a leftover rectangle. A nice pattern arises when performing this procedure which is surprisingly related to continued fractions. For the ratio $11\times19$ we have that its simple continued fraction expansion is




Notice that there is 1 $11\times 11$ square, 1 $8\times 8$ square, 2 $3\times 3$ squares, 1 $2\times 2$ square, and 2 $1\times 1$ square, the same as the continued fraction coefficients!

Looking at this more geometrically, the continuous fraction expansion tells a way to write down the area of the rectangle as a sum of squares, that is, 

$$11\times 19=209=11^2+8^2+3^2+3^2+2^2+1^2+1^2\,.$$

This is also true if the dimensions of the rectangle are commensurable, were the coefficients of the continued fraction indicates the number of times a certain square is repeated. If we have a rectangle with dimensions $p\times q$ with $p/q$ is rational. Then these squares will have dimensions that are of the type $d=np-mq$ with $n,m$ natural numbers. 

In the case of a rectangle with dimensions $\phi\times 1$ this process gives


which, not surprisingly enough, can be written using Fibonacci numbers

$$\phi=\sum_{n=0}^\infty \left(F_n \phi-F_{n+1}\right)^2\,.$$

In general, for any real $x>1$ we can do this trick. Let the continued fraction of $x$ be given by


then we have that considering a rectangle with dimensions $1\times x$ gives a square decomposition

$$x=\sum_{n=0}^\infty a_0(b_n x-c_n)^2\,,$$

which is finite if and only if $x$ is rational. An interesting problem would be to study the properties of the sequences $\{b_n\}$ and $\{c_n\}$, but I'll leave that for another time.

Tuesday, April 19, 2016

Segunda ley de termodinámica para los enteros

La entropía es una manera de medir la interacción entre lo pequeño y lo grande. Básicamente calcula el número de maneras en que se pueden obtener las mismas características macroscópicas en un sistema variando las propiedades microscópicas.

Un estado con mayor entropía es un estado altamente incierto, puesto que existen varias maneras de obtener dicho estado a partir de configuraciones y la probabilidad de obtenerlo en cierta configuración es entonces muy reducida. Sin embargo, estados con baja entropía tienen gran cantidad de información, puesto que solamente hay un número reducido de formas de obtener dicho estado, así una configuración en particular es altamente probable de ocurrir.

Es posible utilizar estas ideas con los enteros positivos. Podemos pensar que los micro-estados corresponden a los factores de un número y el macro-estado al número en si. Así el número $3$ tiene solo un micro-estado: $3$, sin embargo el número $12$ tiene $4$ micro-estados: $12$, $2\times 6$, $3\times 4$, $2\times 2\times 3$.

Si un macro-estado tiene $P$ diferentes micro-estados igualmente probables, la entropía asociada se define como

$$S=\ln P\,,$$

de esta manera podemos decir que la entropía de $3$ es $S_3=\ln 1=0$ y $S_{12}=\ln 4\sim 1.39$.

En general, podemos definir la entropía de un número como

$$S_n=\ln k\,,$$

donde $k$ es el número de formas distintas de factorizar $n$. Cuando $n=p$ es un número primo tenemos que $S_p=0$. Acá se puede ver la entropía para los números del 1 al 100 y del 1 al 10,000

Con esto podemos calcular el valor esperado de la entropía de un número.  Para $N$ un natural, sea $m=\max S_n$ para $n\leq N$.  Podemos entonces calcular el valor esperado de la entropía para los números hasta $N$ como

$$\langle S \rangle_N =\sum_{k=1}^N \frac{f_k }{N}\ln k\,,$$

donde $f_k$ es el la cantidad de números $n\leq N$ tales que $S_n=\ln k$. Para valores entre 1 y 100 y para valores entre 1 y 10,000 tenemos que las gráficas del valor esperado de la entropía son

Al principio el valor esperado de la entropía tiende a crecer, sin embargo a partir de $N=8,000$ parece estabilizarse un poco. ¿Pudiéramos tener una segunda ley de termodinámica para los naturales?

Tuesday, March 15, 2016

Probabilities in a loaded die

A few days ago, my friend Jesse and I were talking about how he was going to use actual dice in his class to teach students about probability. He would put them to roll dice and prove that actually each face has about the same probability of coming up.

He told me later that he looked online for loaded dice, that is, dice that would favor some of the faces and not have a uniform distribution. After a while, we thought it would be a nice problem to design a loaded die and to somehow calculate the theoretical probabilities for each face.  This is a  rather complex problem, so I started to think on reasonable simplifications to get something sound.

The main idea is to load a die by changing the center of mass of it. This would definitely impact the chances of getting a particular face. Usually we get that a regular die has a probability of $1/6$ of displaying a given face based on symmetry arguments. But when talking about a loaded die things get loaded.

Instead of considering a 3D die, I tried to analyze the 2D case, by considering a unit square with uniform mass density $\sigma$ and with a point mass $\mu$ at a point $(a,b)$.  This will make the square to have its center of mass at

$$\left(\frac{a \mu +\frac{\sigma }{2}}{\mu +\sigma },\frac{b \mu +\frac{\sigma }{2}}{\mu +\sigma }\right)\,,$$

and not at the center of the square. With this, the symmetries of the square are broken and a new description of its behavior has to be made. 

In order to figure out the probabilities, first we have to define what do we mean by the probability of getting a specific face. One way of doing this is by considering the frequentists point of view of the system: lets roll the die many times and calculate the probabilities as relative frequencies of the outcomes. 

Being this a a posteriori point of view, analyzing the physics of such situation becomes the best approach. 

We want to have the square freely rotating, landing on a flat line, and then recording which face comes up. In order to do this, we have to prescribe an angular velocity of the rotating square, the landing vertex, and the incidence angle. We can simplify the model by just considering the picture when the die lands on the line, forgetting about its spinning past. 

The argument is that by ignoring air resistance, friction, and everything else a good physicist will ignore, the landing angular velocity the square will land with is just a scalar multiple of its original velocity in the air (using conservation of momentum). So a good toy model is starting with a vertex fixed on the line, with an initial angle $\theta_0$ and an initial angular velocity $\omega_0$.  Using Newton's second law, we have that the sum of all torques is equal to the moment of inertia times the angular acceleration of the square. That is

$$I\frac{d^2\theta(t)}{dt^2}=\sum \tau\,,$$

were the moment of inertia for the square in this configuration is given by 


and where we only have one torque acting on the square due to gravity. Putting these together, we have that the system is described by the ODE

$$\frac{d^2\theta(t)}{dt^2}=-\frac{g || r ||}{I}\cos\left(\theta(t)+\delta\right)\,,$$

with initial conditions $\theta(0)=\theta_0$ and $\theta'(0)=\omega_0$, and where $g$ is gravity, $r$ the position of the center of mass of the square, and $\delta$ its argument. 

This is a nonlinear differential equation which can be numerically solved in mathematica. Solving this equation gives us the rotation angle of the square in time. If the angle reaches $\pi/2$, it means that it completely tilted over the next face, and we would have to make the same analysis again but now with new initial conditions $\Theta(0)=0$ and $\omega(0)=\kappa \theta'(T)$, where $T$ is the time that took the previous face to completely reach the next one and $\kappa$ is a constant describing the energy loss the square experiences when landing on the next face.

We can keep this process as long as the equation 


has positive real solutions and the number of times this procedure is carries over will tell us on which face the square is going to stop rolling. 

This approach is completely deterministic, as given any pair of initial conditions $(\theta_0,\omega_0)$, we can calculate on which face the square is going to stop rolling. In order to introduce probabilities, we can asume a probability distribution on the initial angles and the initial velocities. 

It is reasonable to pick a uniform distribution of the incidence angles $\theta_0$, so I choose a uniform probability distribution on $(0,\pi/2)$. As for the initial velocities, the argument relies on the system being cyclic.  It is true that all angular velocities are not equally probable, as when someone grabs an actual die the angular velocities would have a distribution centered towards a preferred value, but when considering the problem only from the landing perspective, what really matters is that the system presents an almost periodic behavior on the initial velocity.  So we can take a uniform distribution on $(0,\omega_{\text{max}})\,.$ We also have to take into account that the die could rol clockwise or anti-clockwise with uniform probability. 

Taking these assumptions into account would make the final probability not to depend on how we roll the die, that is, we are making it up for the lack of the symmetry the system has.

Finally we have to account for the probability of the die landing on a particular vertex. In order to calculate this, we can think that the probability of it landing on a given vertex is proportional to the time this vertex is the closest one to the landing line, while the die is rotating in the air. We can use the moments of mass about the face, and then calculate the probability of a vertex to be the average of its adjacent faces. 

Performing some numerical simulations, we get the following probabilities ($\sigma=1, \mu=100$)

(a,b) 0 1 2 3
(0.5,0) 0.268748 0.34257 0.0461125 0.34257
(0.1,0.1) 0.468528 0.468551 0.031472 0.0314486
(0.5,0.5) 0.25 0.25 0.25 0.25

and here is the code I used in Mathematica (the last function gives a the probability vector {0,1,2,3}):

T[{a_, b_}] = {1 - b, a};

INC[a_, b_, \[Mu]_, \[Sigma]_, c_] := \[Sigma]/
   3 + \[Mu] Norm[Nest[T, {a, b}, c]]^2;

Sol[a_, b_, \[Mu]_, \[Sigma]_, \[Omega]0_, \[Theta]0_, c_] := 
      t] == -((9.8 Norm[CM[Nest[T, {a, b}, c], \[Mu], \[Sigma]]])/
       INC[a, b, \[Mu], \[Sigma], c])
        Cos[\[Theta][t] + 
        ArcTan[Nest[T, {a, b}, c][[2]]/
          Nest[T, {a, b}, c][[1]]]], \[Theta][
      0] == \[Theta]0, \[Theta]'[0] == \[Omega]0}, \[Theta][t], {t, 0,

Rol[a_, b_, \[Mu]_, \[Sigma]_, W_, A_, c_] := Module[{w, Flag, n, X},
  X = Sol[a, b, \[Mu], \[Sigma], W, A, c];
  w = (0.9) ((D[(\[Theta][t] /. X), 
          t]) /. (NSolve[(\[Theta][t] /. X) == \[Pi]/2, t]))[[1]][[1]];
  Flag = NumericQ[w] && Positive[w];
  n = c;
   X = Sol[a, b, \[Mu], \[Sigma], w, 0, n];
   w = (((D[(\[Theta][t] /. X), 
            t]) /. (NSolve[(\[Theta][t] /. X) == \[Pi]/2, t]))[[1]][[
       1]]) (0.9);
   Flag = NumericQ[w] && Positive[w];

Fr[a_, b_, \[Mu]_, \[Sigma]_, \[Theta]0_, c_] := 
 Table[Mod[Rol[a, b, \[Mu], \[Sigma], w, \[Theta]0, c], 4], {w, 0, 1, 

Pr[a_, b_, \[Mu]_, \[Sigma]_, \[Theta]0_, c_] := Module[{temp},
  temp = Fr[a, b, \[Mu], \[Sigma], \[Theta]0, c];
  {Count[temp, 0]/Length[temp], Count[temp, 1]/Length[temp], 
   Count[temp, 2]/Length[temp], Count[temp, 3]/Length[temp]}

Pro[a_, b_, \[Mu]_, \[Sigma]_, c_] := Module[{temp},
  temp = Table[
    Pr[a, b, \[Mu], \[Sigma], \[Pi]/2 (i), c], {i, 0, 1, 0.1}];

M0[a_, b_, \[Mu]_, \[Sigma]_] = (\[Mu] b + \[Sigma]/2)/(
  2 \[Mu] + 2 \[Sigma]);
M1[a_, b_, \[Mu]_, \[Sigma]_] = (\[Mu] a + \[Sigma]/2)/(
  2 \[Mu] + 2 \[Sigma]);
M2[a_, b_, \[Mu]_, \[Sigma]_] = (\[Mu] (1 - b) + \[Sigma]/2)/(
  2 \[Mu] + 2 \[Sigma]);
M3[a_, b_, \[Mu]_, \[Sigma]_] = (\[Mu] (1 - a) + \[Sigma]/2)/(
  2 \[Mu] + 2 \[Sigma]);

C0[a_, b_, \[Mu]_, \[Sigma]_] = (
  M0[a, b, \[Mu], \[Sigma]] + M1[a, b, \[Mu], \[Sigma]])/2;
C1[a_, b_, \[Mu]_, \[Sigma]_] = (
  M1[a, b, \[Mu], \[Sigma]] + M2[a, b, \[Mu], \[Sigma]])/2;
C2[a_, b_, \[Mu]_, \[Sigma]_] = (
  M2[a, b, \[Mu], \[Sigma]] + M3[a, b, \[Mu], \[Sigma]])/2;
C3[a_, b_, \[Mu]_, \[Sigma]_] = (
  M3[a, b, \[Mu], \[Sigma]] + M0[a, b, \[Mu], \[Sigma]])/2;

Prob[a_, b_, \[Mu]_, \[Sigma]_] := (1/
     2) (C0[a, b, \[Mu], \[Sigma]] Pro[a, b, \[Mu], \[Sigma], 0] + 
     C1[a, b, \[Mu], \[Sigma]] Pro[a, b, \[Mu], \[Sigma], 1] + 
     C2[a, b, \[Mu], \[Sigma]] Pro[a, b, \[Mu], \[Sigma], 2] + 
     C3[a, b, \[Mu], \[Sigma]] Pro[a, b, \[Mu], \[Sigma], 3]) + (1/
     2) (C0[1 - a, b, \[Mu], \[Sigma]] Pro[1 - a, b, \[Mu], \[Sigma], 
        0] + C1[1 - a, b, \[Mu], \[Sigma]] Pro[1 - a, 
        b, \[Mu], \[Sigma], 1] + 
      C2[1 - a, b, \[Mu], \[Sigma]] Pro[1 - a, b, \[Mu], \[Sigma], 
        2] + C3[1 - a, b, \[Mu], \[Sigma]] Pro[1 - a, 
        b, \[Mu], \[Sigma], 3])[[{1, 4, 3, 2}]]

Prob[a, b, \[Mu], \[Sigma]]

Saturday, February 6, 2016

Integrales complejas y sumas divergentes

Como parte de mi investigación he estado estudiando formas de regularizar ciertas series que a primera vista resultan divergentes.  Para esto se han desarrollado muchas técnicas de regularización y yo me he enfocado en utilizar funciones zeta.

Actualmente estoy estudiando un problema relacionado con sumas de Tornheim-Mordell y comencé a utilizar técnicas de análisis complejo para resolver el problema. Básicamente las herramientas más poderosas se basan en el Teorema del Residuo y sus muchas versiones. Un pariente cercano de este resultado es la Transformada de Mellin, la cual resulta ser una Transformada de Fourier disfrazada. 

En términos simplistas, lo que dicen estos resultados es que es posible evaluar una integral de contorno por medio de ver los residuos que quedan encerrados.  Desde un punto de vista más físico, es el equivalente al Teorema de Gauss. Esta interpretación física resulta muy interesante, pues da una intuición de qué cosas esperar. Por ejemplo, resulta natural esperar que exista algún tipo de conservación de energía o carga eléctrica.  En el caso de funciones de variable compleja, tomando en cuenta algunas consideraciones, es posible tener esta conservación. En el plano complejo, esta conservación se traduce en que la suma de los residuos de la función suman cero,

$$\sum_{a\in\text{Polos simples}} \text{Res}(f(z),a)=0\,.$$

Sin embargo, en funciones sencillas como $f(z)=1/z$ parece que esto no aplica. Esta aparente falla se debe a que estamos olvidándonos de lo que sucede en $z=\infty$. Las funciones también pueden tener un residuo en infinito, y en este caso el residuo resulta ser

$$\text{Res} (f(z),\infty)=\text{Res}\left(-\frac{1}{z^2}f(1/z),0\right)=\text{Res}\left(-\frac{z}{z^2},0\right)=-1\,.$$

Con esta idea en mente, es posible estudiar lo que ocurre con funciones más interesantes. Por ejemplo, consideremos la función 

$$f(z)=\pi \csc(\pi z)\,.$$

Esta tiene polos simples para $z=n,n\in\mathbb{Z}$ con residuos

$$\text{Res }(\pi \csc(\pi z),n)=(-1)^n\,.$$

Al tratar de calcular el residuo en infinito,  tenemos que da $\pm\infty$, lo que intuitivamente querría decir que tenemos un polo de mayor orden en infinito. Por lo tanto, podemos hacernos un poco los locos y decir que

$$\text{Res }(\pi \csc(\pi z),\infty)=0.$$

Si calculamos la integral de $f(z)$ a lo largo de una recta vertical, podemos pensar que estamos calculando la integral cerrada que pasa por infinito del semiplano izquierdo.  (El contorno $L$ encerraría el semiplano izquierdo dada la convención de que el interior de una región siempre queda a la izquierda de la dirección en que se recorre el borde, es decir, la dirección anti-horaria es positiva). Como $f(z)$ no tiene residuo en infinito, podemos esperar que la integral sobre $L$ será finita siempre y cuando $L$ no pase por ninguno de los polos finitos $z=n$. Por ejemplo tomemos $L$ como la recta vertical $\Re(z)=1/3$.  Entonces

$$\frac{1}{2\pi i}\int_L f(z)dz=\frac{1}{2\pi i}\int _L \pi \csc(\pi z)dz=\frac{1}{2}\,,$$

donde la integral puede facilmente calcularse. Ahora podemos hacer uso de la interpretación anterior y pensar que la integral en realidad puede calcularse sumando los residuos encerrados dentro de la región determinada por $L$, 

$$\sum_{n=0}^\infty (-1)^{-n}=\sum_{n=0}^\infty (-1)^n=\frac{1}{2}\,.$$

La serie de la izquierda se conoce como la Serie de Grandi y ¡claramente es divergente! Una forma sencilla de verlo es el criterio de Leibniz, sin embargo al regularizar esta suma por diversos métodos se obtiene el resultado de $1/2$. Leibniz estudió esta serie y fue el primero en introducir cierto rigor en su cálculo. Fue duramente criticado, sin embargo Euler lo apoyó y fue así que comenzó el estudio se series divergente. 

Podemos decir que la divergencia de la serie de Grandi se refleja en el hecho de que $f(z)$ tiene un polo en infinito, sin embargo al ser un polo de orden mayor, aún resulta posible obtener un valor regularizado para la serie, no en el sentido usual de convergencia, sino en un sentido más profundo.

Sunday, January 24, 2016

Last digits of powers: Mersenne primes and perfect numbers

Recently, a new Mersenne prime has been discovered.  In the highly technological world of today, prime numbers play a, well, prime role.

The search for bigger and bigger primes has been a subject of interest in recent years, mainly due to the increasing computational power that we have achieved, and, oh yes, for the money. One of these big efforts to discover bigger and bigger primes is GIMPS, an online collaborative search for prime numbers.

The newest prime found has the form


In particular Mersenne primes are well suited to be found by computers, as computing powers of 2 results in an easy task (just left shift). 

The first question one would ask is, how big is this number anyway? This is an easy question to solve. In order to find out how many digits does $p_{49}$ have, we only have to find its logarithm base 10. This will give the exponent $e$ such that


which gives the number of digits $p_{49}$ has by calculating $\lceil e \rceil$.  This gives

$$\lceil e \rceil=\lceil \log 2^{74207281}-1\rceil =\lceil \log 2^{74207281}\rceil=\lceil 74207281 \log 2\rceil=22338618\,,$$

where the second equality is true since $2^{74207281}$ is not a number of the form $999\dots 9$.

The second question that could pop into our minds is, what is the last digit of this number? or even more, what are the last m digits of this number?

In order to solve this, we can use two important results in modular arithmetic: the Chinese Remainder Theorem and Euler's Theorem.

The idea is that we do not need to calculate the whole number to find some of its digits. The relevance of Euler's theorem is that estates that powers leave cyclic remainders when divided by a number. It gives a way to calculate this period. For example, to find the last $m$ digits of, say, $2^k$, really what we are asking is for the remainder that $2^k$ leaves when divided by $10^m$, that is

$$2^k\equiv d \text{ mod } 10^m\,,$$

where $d$ gives the last $m$ digits when taken to be in the least residue system. It is easier to analyze this congruence if we write $10^m=2^m\cdot 5^m$.  Thus the problem translates to

$$2^k\equiv d_1 \text{ mod } 2^m\,,$$
$$2^k\equiv d_2 \text{ mod } 5^m\,.$$

We transformed a single congruence in a system of congruences, which at first sight may seem more complicated, but in fact, it is easier to handle. The first congruence is rather straightforward. If $k>m$ we have that $d_1\equiv 0$. For the second one, we can use the magic of Euler's theorem,  since $gcd(2,5^m)=1$ we have that 

$$2^{\phi(5^m)}\equiv 1 \text{ mod }5^m\,,$$

that is, the period for the powers of $2$ modulo $5^m$ is given by Euler's totient


This means that to find out $d_2$, we can reduce the exponent $k$ and only consider its residue modulo $\phi(5^m)$. If 

$$k\equiv e\text{ mod }\phi(5^m)\,,$$

then we have that 

$$2^k\equiv 2^e\text{ mod } 5^m\,.$$

The advantage of this is that instead of calculating the powers with $k$ as the exponent (a huge number), we only need to calculate a power $e$ that is smaller than $4\cdot 5^{m-1}$, which might be much smaller than $k$.

Then, using the Chinese Remainder we get that

$$d\equiv 2^e2^m 2^{4\cdot 5^m-m}\equiv 2^e2^{4\cdot 5^m}\text {mod }10^m\,.$$

This helps us to calculate the last, for example, 3 digits of $p_{49}$ as 352-1=351. 

Another interesting calculation that can be made has to do with perfect numbers.  It is not known a general formula for perfect numbers, but we know that every Mersenne prime creates a perfect number.


gives a perfect number if $2^k-1$ is prime. Hence there is an associated perfect number with $p_{49}$ given by


Similarly, we can calculate that it is 44,677,235 digits long and its last 3 digits are 776. And the best part, we don't have to know the other 44, 677,232 digits!