Archive

Archive for the ‘Probability’ Category

Searching (again!?) for the SS Central America

August 31, 2014 Leave a comment

On Tuesday, September 8th 1857, the steamboat SS Central America left Havana at 9 AM for New York, carrying about 600 passengers and crew members. Inside of this vessel, there was stowed a very precious cargo: a set of manuscripts by John James Audubon, and three tons of gold bars and coins. The manuscripts documented an expedition through the yet uncharted southwestern United States and California, and contained 200 sketches and paintings of its wildlife. The gold, fruit of many years of prospecting and mining during the California Gold Rush, was meant to start anew the lives of many of the passengers aboard.

On the 9th, the vessel ran into a storm which developed into a hurricane. The steamboat endured four hard days at sea, and by Saturday morning the ship was doomed. The captain arranged to have women and children taken off to the brig Marine, which offered them assistance at about noon. In spite of the efforts of the remaining crew and passengers to save the ship, the inevitable happened at about 8 PM that same day. The wreck claimed the lives of 425 men, and carried the valuable cargo to the bottom of the sea.

It was not until late 1980s that technology allowed recovery of shipwrecks at deep sea. But no technology would be of any help without an accurate location of the site. In the following paragraphs we would like to illustrate the power of the scipy stack by performing a simple simulation, that ultimately creates a dataset of possible locations for the wreck of the SS Central America, and mines the data to attempt to pinpoint the most probable target.

We simulate several possible paths of the steamboat (say 10,000 randomly generated possibilities), between 7:00 AM on Saturday, and 13 hours later, at 8:00 pm on Sunday. At 7:00 AM on that Saturday the ship’s captain, William Herndon, took a celestial fix and verbally relayed the position to the schooner El Dorado. The fix was 31º25′ North, 77º10′ West. Because the ship was not operative at that point—no engine, no sails—, for the next thirteen hours its course was solely subjected to the effect of ocean current and winds. With enough information, it is possible to model the drift and leeway on different possible paths.

Read more…

Robot stories

June 29, 2014 2 comments

Every summer before school was over, I was assigned a list of books to read. Mostly nonfiction and historical fiction, but in fourth grade there that was that first science fiction book. I often remember how that book made me feel, and marvel at the impact that it had in my life. I had read some science fiction before—Well’s Time Traveller and War of the Worlds—but this was different. This was a book with witty and thought-provoking short stories by Isaac Asimov. Each of them delivered drama, comedy, mystery and a surprise ending in about ten pages. And they had robots. And those robots had personalities, in spite of their very simple programming: The Three Laws of Robotics.

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

Back in the 1980s, robotics—understood as autonomous mechanical thinking—was no more than a dream. A wonderful dream that fueled many children’s imaginations and probably shaped the career choices of some. I know in my case it did.

Fast forward some thirty-odd years, when I met Astro: one of three research robots manufactured by the French company Aldebaran. This NAO robot found its way into the computer science classroom of Tom Simpson in Heathwood Hall Episcopal School, and quickly learned to navigate mazes, recognize some student’s faces and names, and even dance the Macarena! It did so with effortless coding: a basic command of the computer language python, and some idea of object oriented programming.

I could not let this opportunity pass. I created a small undergraduate team with Danielle Talley from USC (a brilliant sophomore in computer engineering, with a minor in music), and two math majors from Morris College: my Geometry expert Fabian Maple, and a McGyver-style problem solver, Wesley Alexander. Wesley and Fabian are supported by a Department of Energy-Environmental Management grant to Morris College, which funds their summer research experience at USC. Danielle is funded by the National Science Foundation through the Louis Stokes South Carolina-Alliance for Minority Participation (LS-SCAMP).

They spent the best of their first week on this project completing a basic programming course online. At the same time, the four of us reviewed some of the mathematical tools needed to teach Astro new tricks: basic algebra and trigonometry, basic geometry, and basic calculus and statistics. The emphasis—I need to point out in case you missed it—is in the word basic.

https://farm4.staticflickr.com/3921/14343225109_6d4c70558e_d.jpg

Talk the talk

The psychologist seated herself and watched Herbie narrowly as he took a chair at the other side of the table and went through the three books systematically.

At the end of half an hour, he put them down, “Of course, I know why you brought these.”

The corner of Dr. Calvin’s lip twitched, “I was afraid you would. It’s difficult to work with you, Herbie. You’re always a step ahead of me.”

“It’s the same with these books, you know, as with the others. They just don’t interest me. There’s nothing to your textbooks. Your science is just a mass of collected data plastered together by makeshift theory — and all so incredibly simple, that it’s scarcely worth bothering about.”

“It’s your fiction that interests me. Your studies of the interplay of human motives and emotions” – his mighty hand gestured vaguely as he sought the proper words.

Liar!
— Isaac Asimov

Astro can understand what I say, and maintain conversations with me. He has four microphones for voice recognition and sound localization, and two speakers for text-to-speech synthesis. But he sometimes gets confused: I say “acute”, and he understands “abide,” for example. This might sound frustrating by all standards. But mathematicians can’t be judged by any reasonable standard. Far from seeing this as a shortcoming, my team used as motivation the mechanics behind his speech recognition functionality, and they wrote some code to turn him into an expert Mastermind player.

Wait, what? How are these two things possibly connected?

Since Astro’s hearing is far from perfect, when he hears a word which is not clear, he gathers a short list of possible candidates. He then chooses the most likely guess based upon different considerations—using clues from context, for example. Based on similar strategies, he can be taught to be a perfect player of 5-letter-word Mastermind: Pick any five-letter word in the English language. Any word, provided it has no repeated letters. Astro chooses one random word as his first guess, and requests a score like in Mastermind.

To illustrate this process, I will show you how the game went when I chose Fabian’s favorite word: “acute.”

Astro: Is it firth?
Me   : NO
Astro: How many letters of my guess are in your word, in the right position?
       How many letters of my guess are in your word, in the wrong position?
Me   : 1, 0
Astro: Is it foams?
Me   : NO
Astro: How many letters of my guess are in your word, in the right position?
       How many letters of my guess are in your word, in the wrong position?
Me   : 0, 1
Astro: Is it junta?
Me   : NO
Astro: How many letters of my guess are in your word, in the right position?
       How many letters of my guess are in your word, in the wrong position?
Me   : 1, 2
Astro: Is it acute?
Me   : YES
Astro: Wonderful, I found it in 4 attempts!

I don’t want to get too technical here, but will mention some of the ideas. The main algorithm is based on techniques of numerical root finding and solving nonlinear equations — nothing complex: high-school level bracketing by bisection, or Newton’s method. To design better winning strategies, my team exploits the benefits of randomness. The analysis of this part is done with basic probability and statistics.

Walk the walk

Donovan’s pencil pointed nervously. “The red cross is the selenium pool. You marked it yourself.”

“Which one is it?” interrupted Powell. “There were three that MacDougal located for us before he left.”

“I sent Speedy to the nearest, naturally; seventeen miles away. But what difference does that make?” There was tension in his voice. “There are penciled dots that mark Speedy’s position.”

And for the first time Powell’s artificial aplomb was shaken and his hands shot forward for the man.

“Are you serious? This is impossible.”

“There it is,” growled Donovan.

The little dots that marked the position formed a rough circle about the red cross of the selenium pool. And Powell’s fingers went to his brown mustache, the unfailing signal of anxiety.

Donovan added: “In the two hours I checked on him, he circled that damned pool four times. It seems likely to me that he’ll keep that up forever. Do you realize the position we’re in?”

Runaround
— Isaac Asimov

Astro moves around too. It does so thanks to a sophisticated system combining one accelerometer, one gyrometer and four ultrasonic sensors that provide him with stability and positioning within space. He also enjoys eight force-sensing resistors and two bumpers. And that is only for his legs! He can move his arms, bend his elbows, open and close his hands, or move his torso and neck (up to 25 degrees of freedom for the combination of all possible joints). Out of the box, and without much effort, he can be coded to walk around, although in a mechanical way: He moves forward a few feet, stops, rotates in place or steps to a side, etc. A very naïve way to go from A to B retrieving an object at C, could be easily coded in this fashion as the diagram shows:

https://farm4.staticflickr.com/3884/14506683656_32784c832d_d.jpg

Fabian and Wesley devised a different way to code Astro taking full advantage of his inertial measurement unit. This will allow him to move around smoothly, almost like a human would. The key to their success? Polynomial interpolation and plane geometry. For advanced solutions, they need to learn about splines, curvature, and optimization. Nothing they can’t handle.

https://farm6.staticflickr.com/5595/14344128330_bb4845a89d_d.jpg

Sing me a song

He said he could manage three hours and Mortenson said that would be perfect when I gave him the news. We picked a night when she was going to be singing Bach or Handel or one of those old piano-bangers, and was going to have a long and impressive solo.

Mortenson went to the church that night and, of course, I went too. I felt responsible for what was going to happen and I thought I had better oversee the situation.

Mortenson said, gloomily, “I attended the rehearsals. She was just singing the same way she always did; you know, as though she had a tail and someone was stepping on it.”

One Night of Song
— Isaac Asimov

Astro has excellent eyesight and understanding of the world around him. He is equipped with two HD cameras, and a bunch of computer vision algorithms, including facial and shape recognition. Danielle’s dream is to have him read from a music sheet and sing or play the song in a toy piano. She is very close to completing this project: Astro is able now to identify partitures, and extract from them the location of the pentagrams. Danielle is currently working on identifying the notes and the clefs. This is one of her test images, and the result of one of her early experiments:

https://farm3.staticflickr.com/2906/14350961337_97bcb90e88_o_d.jpg https://farm4.staticflickr.com/3856/14537520835_4f0bf31eb1_d.jpg

Most of the techniques Danielle is using are accessible to any student with a decent command of vector calculus, and enough scientific maturity. The extraction of pentagrams and the different notes on them, for example, is performed with the Hough transform. This is a fancy term for an algorithm that basically searches for straight lines and circles by solving an optimization problem in two or three variables.

The only thing left is an actual performance. Danielle will be leading Fabian and Wes, and with the assistance of Mr. Simpson’s awesome students Erica and Robert, Astro will hopefully learn to physically approach the piano, choose the right keys, and play them in the correct order and speed. Talent show, anyone?

Book presentation at the USC Python Users Group

December 7, 2013 Leave a comment

Areas of Mathematics

October 22, 2013 3 comments

For one of my upcoming talks I am trying to include an exhaustive mindmap showing the different areas of Mathematics, and somehow, how they relate to each other. Most of the information I am using has been processed from years of exposure in the field, and a bit of help from Wikipedia.

But I am not entirely happy with what I see: my lack of training in the area of Combinatorics results in a rather dry treatment of that part of the mindmap, for example. I am afraid that the same could be told about other parts of the diagram. Any help from the reader to clarify and polish this information will be very much appreciated.

And as a bonus, I included a \LaTeX script to generate the diagram with the aid of the tikz libraries.

\tikzstyle{level 2 concept}+=[sibling angle=40]
\begin{tikzpicture}[scale=0.49, transform shape]
  \path[mindmap,concept color=black,text=white]
    node[concept] {Pure Mathematics} [clockwise from=45]
      child[concept color=DeepSkyBlue4]{
        node[concept] {Analysis} [clockwise from=180]
          child { 
            node[concept] {Multivariate \& Vector Calculus}
              [clockwise from=120]
              child {node[concept] {ODEs}}}
              child { node[concept] {Functional Analysis}}
              child { node[concept] {Measure Theory}}
              child { node[concept] {Calculus of Variations}}
              child { node[concept] {Harmonic Analysis}}
              child { node[concept] {Complex Analysis}}
              child { node[concept] {Stochastic Analysis}}
              child { node[concept] {Geometric Analysis}
                [clockwise from=-40]
                child {node[concept] {PDEs}}}}
          child[concept color=black!50!green, grow=-40]{ 
            node[concept] {Combinatorics} [clockwise from=10]
              child {node[concept] {Enumerative}}
              child {node[concept] {Extremal}}
              child {node[concept] {Graph Theory}}}
          child[concept color=black!25!red, grow=-90]{ 
            node[concept] {Geometry} [clockwise from=-30]
              child {node[concept] {Convex Geometry}}
              child {node[concept] {Differential Geometry}}
              child {node[concept] {Manifolds}}
              child {node[concept,color=black!50!green!50!red,text=white] {Discrete Geometry}}
              child {
                node[concept] {Topology} [clockwise from=-150]
                  child {node [concept,color=black!25!red!50!brown,text=white]
                    {Algebraic Topology}}}}
          child[concept color=brown,grow=140]{ 
            node[concept] {Algebra} [counterclockwise from=70]
              child {node[concept] {Elementary}}
              child {node[concept] {Number Theory}}
              child {node[concept] {Abstract} [clockwise from=180]
                child {node[concept,color=red!25!brown,text=white] {Algebraic Geometry}}}
              child {node[concept] {Linear}}}
    node[extra concept,concept color=black] at (200:5) {Applied Mathematics} 
      child[grow=145,concept color=black!50!yellow] {
        node[concept] {Probability} [clockwise from=180]
          child {node[concept] {Stochastic Processes}}}
      child[grow=175,concept color=black!50!yellow] {node[concept] {Statistics}}
      child[grow=205,concept color=black!50!yellow] {node[concept] {Numerical Analysis}}
      child[grow=235,concept color=black!50!yellow] {node[concept] {Symbolic Computation}};
\end{tikzpicture}

Have a child, plant a tree, write a book

April 23, 2013 2 comments

Or more importantly: rear your children to become nice people, water those trees, and make sure that your books make a good impact.

I recently enjoyed the rare pleasure of having a child (my first!) and publishing a book almost at the same time. Since this post belongs in my professional blog, I will exclusively comment on the latter: Learning SciPy for Numerical and Scientific Computing, published by Packt in a series of technical books focusing on Open Source software.

Keep in mind that the book is for a very specialized audience: not only do you need a basic knowledge of Python, but also a somewhat advanced command of mathematics/physics, and an interest in engineering or scientific applications. This is an excerpt of the detailed description of the monograph, as it reads in the publisher’s page:

It is essential to incorporate workflow data and code from various sources in order to create fast and effective algorithms to solve complex problems in science and engineering. Data is coming at us faster, dirtier, and at an ever increasing rate. There is no need to employ difficult-to-maintain code, or expensive mathematical engines to solve your numerical computations anymore. SciPy guarantees fast, accurate, and easy-to-code solutions to your numerical and scientific computing applications.

Learning SciPy for Numerical and Scientific Computing unveils secrets to some of the most critical mathematical and scientific computing problems and will play an instrumental role in supporting your research. The book will teach you how to quickly and efficiently use different modules and routines from the SciPy library to cover the vast scope of numerical mathematics with its simplistic practical approach that is easy to follow.

The book starts with a brief description of the SciPy libraries, showing practical demonstrations for acquiring and installing them on your system. This is followed by the second chapter which is a fun and fast-paced primer to array creation, manipulation, and problem-solving based on these techniques.

The rest of the chapters describe the use of all different modules and routines from the SciPy libraries, through the scope of different branches of numerical mathematics. Each big field is represented: numerical analysis, linear algebra, statistics, signal processing, and computational geometry. And for each of these fields all possibilities are illustrated with clear syntax, and plenty of examples. The book then presents combinations of all these techniques to the solution of research problems in real-life scenarios for different sciences or engineering — from image compression, biological classification of species, control theory, design of wings, to structural analysis of oxides.

The book is also being sold online in Amazon, where it has been received with pretty good reviews. I have found other random reviews elsewhere, with similar welcoming comments:

Which one is the fake?

December 7, 2012 5 comments
“Crab on its back” “Willows at sunset” “Still life: Potatoes in a yellow dish”

Naïve Bayes

June 21, 2012 Leave a comment

There is nothing naïve about Naïve Bayes—a very basic, but extremely efficient data mining method to take decisions when a vast amount of data is available. The name comes from the fact that this is the simplest application to this problem, upon (the naïve) assumption of independence of the events. It is based on Bayes’ rule of conditional probability: If you have a hypothesis H and evidence E that bears on that hypothesis, then

\mathrm{Pr} \big( H \lvert E \big) = \displaystyle{ \frac{\mathrm{Pr} \big( E \lvert H\big) \mathrm{Pr}(H)}{\mathrm{Pr}(E)} }

where as usual, \mathrm{Pr}(A) denotes the probability of the event A, and \mathrm{Pr}\big( A \lvert B \big) denotes the probability of the event A conditional to another event B.

I would like to show an example of this technique, of course, with yet another decision-making algorithm oriented to guess my reaction to a movie I have not seen before. From the data obtained in a previous post, I create a simpler table with only those movies that have been scored more than 28 times (by a pool of 87 of the most popular critics featured in www.metacritics.com) [I posted the script to create that table at the end of the post]

Let’s test it:

>>> table=prepTable(scoredMovies,28)
>>> len(table)

49
>>> [entry[0] for entry in table]

[‘rabbit-hole’, ‘carnage-2011’, ‘star-wars-episode-iii—revenge-of-the-sith’,
‘shame’, ‘brokeback-mountain’, ‘drive’, ‘sideways’, ‘salt’,
‘million-dollar-baby’, ‘a-separation’, ‘dark-shadows’,
‘the-lord-of-the-rings-the-return-of-the-king’, ‘true-grit’, ‘inception’,
‘hereafter’, ‘master-and-commander-the-far-side-of-the-world’, ‘batman-begins’,
‘harry-potter-and-the-deathly-hallows-part-2’, ‘the-artist’, ‘the-fighter’,
‘larry-crowne’, ‘the-hunger-games’, ‘the-descendants’, ‘midnight-in-paris’,
‘moneyball’, ‘8-mile’, ‘the-departed’, ‘war-horse’,
‘the-lord-of-the-rings-the-fellowship-of-the-ring’, ‘j-edgar’,
‘the-kings-speech’, ‘super-8’, ‘robin-hood’, ‘american-splendor’, ‘hugo’,
‘eternal-sunshine-of-the-spotless-mind’, ‘the-lovely-bones’, ‘the-tree-of-life’,
‘the-pianist’, ‘the-ides-of-march’, ‘the-quiet-american’, ‘alexander’,
‘lost-in-translation’, ‘seabiscuit’, ‘catch-me-if-you-can’, ‘the-avengers-2012’,
‘the-social-network’, ‘closer’, ‘the-girl-with-the-dragon-tattoo-2011’]
>>> table[0]

[‘rabbit-hole’, ”, ‘B+’, ‘B’, ”, ‘C’, ‘C+’, ”, ‘F’, ‘B+’, ‘F’, ‘C’, ‘F’, ‘D’,
”, ”, ‘A’, ”, ”, ”, ”, ‘B+’, ‘C+’, ”, ”, ”, ”, ”, ”, ‘C+’, ”, ”,
”, ”, ”, ”, ‘A’, ”, ”, ”, ”, ”, ‘A’, ”, ”, ‘B+’, ‘B+’, ‘B’, ”, ”,
”, ‘D’, ‘B+’, ”, ”, ‘C+’, ”, ”, ”, ”, ”, ”, ‘B+’, ”, ”, ”, ”, ”,
”, ‘A’, ”, ”, ”, ”, ”, ”, ”, ‘D’, ”, ”,’C+’, ‘A’, ”, ”, ”, ‘C+’, ”]

Read more…

Math still not the answer

May 16, 2012 1 comment

I wrote a quick (but not very elegant) python script to retrieve locally enough data from www.metacritic.com for pattern recognition purposes. The main goal is to help me decide how much I will enjoy a movie, before watching it. I included the script at the end of the post, in case you want to try it yourself (and maybe improve it too!). It takes a while to complete, although it is quite entertaining to see its progress on screen. At the end, it provides with two lists of the same length: critics—a list of str containing the names of the critics; and scoredMovies—a list of dict containing, at index k, the evaluation of all the movies scored by the critic at index k in the previous list.

For example:

>>> critics[43]

‘James White’
>>> scoredMovies[43]

{‘hall-pass’: 60, ‘the-karate-kid’: 60, ‘the-losers’: 60,
‘the-avengers-2012’: 80, ‘the-other-guys’: 60, ‘shrek-forever-after’: 80,
‘the-lincoln-lawyer’: 80, ‘the-company-men’: 60, ‘jonah-hex’: 40,
‘arthur’: 60, ‘vampires-suck’: 20, ‘american-reunion’: 40,
‘footloose’: 60, ‘real-steel’: 60}

The number of scored films by critic varies: there are individuals that gave their opinion on a few dozen movies, and others that took the trouble to evaluate up to four thousand flicks! Note also that the names of the movies correspond with their web pages in www.metacritic.com. For example, to see what critics have to say about the “Karate Kid” and other relevant information online, point your browser to www.metacritic.com/movie/the-karate-kid. It also comes in very handy if there are several versions of a single title: Which “Karate Kid” does this score refer to, the one in the eighties, or Jackie Chan’s?

Feel free to download a copy of the resulting data [here] (note it is a large file: 1.6MB).

But the fact that we have that data stored locally allows us to gather that information with simple python commands, and perform many complex operations on it.

Read more…

So you want to be an Applied Mathematician

September 16, 2011 10 comments

The way of the Applied Mathematician is one full of challenging and interesting problems. We thrive by association with the Pure Mathematician, and at the same time with the no-nonsense, hands-in, hard-core Engineer. But not everything is happy in Applied Mathematician land: every now and then, we receive the disregard of other professionals that mistake either our background, or our efficiency at attacking real-life problems.

I heard from a colleague (an Algebrist) complains that Applied Mathematicians did nothing but code solutions of partial differential equations in Fortran—his skewed view came up after a naïve observation of a few graduate students working on a project. The truth could not be further from this claim: we do indeed occasionally solve PDEs in Fortran—I give you that—and we are not ashamed to admit it. But before that job has to be addressed, we have gone through a great deal of thinking on how to better code this simple problem. And you would not believe the huge amount of deep Mathematics that are involved in this journey: everything from high-level Linear Algebra, Calculus of Variations, Harmonic Analysis, Differential Geometry, Microlocal Analysis, Functional Analysis, Dynamical Systems, the Theory of Distributions, etc. Not only are we familiar with the basic background on all those fields, but also we are supposed to be able to perform serious research on any of them at a given time.

My soon-to-be-converted Algebrist friend challenged me—not without a hint of smugness in his voice—to illustrate what was my last project at that time. This was one revolving around the idea of frames (think of it as redundant bases if you please), and needed proving a couple of inequalities involving sequences of functions in L_p—spaces, which we attacked using a beautiful technique: Bellman functions. About ninety minutes later he conceded defeat in front of the board where the math was displayed. He promptly admitted that this was no Fortran code, and showed a newfound respect and reverence for the trade.

It doesn’t hurt either that the kind of problems that we attack are more likely to attract funding. And collaboration. And to be noticed in the press.

Alright, so some of you are sold already. What is the next step? I am assuming that at his point you own your Calculus, Analysis, Probability and Statistics, Linear Programming, Topology, Geometry, Physics and you are able to solve most known ODEs. From here, as with any other field, my recommendation is to slowly build a Batman belt: acquire and devour a sequence of books and scientific articles, until you are very familiar with their contents. When facing a new problem, you should be able to recall from your Batman belt what technique could work best, in which book(s) you could get some references, and how it has been used in the past for related problems.

Following these lines, I have included below an interesting collection with the absolutely essential books that, in my opinion, every Applied Mathematician should start studying:

Read more…

Unusual dice

January 27, 2011 Leave a comment

I heard of this problem from Justin James in his TechRepublic post My First IronRuby Application. He tried this fun problem as a toy example to brush up his skills in ruby:

Roll simultaneously a hundred 100-sided dice, and add the resulting values. The set of possible outcomes is, of course, \{100, 101, \dotsc, 10000\}. Note that there is only one way to obtain the outcome 100—all dice showing a 1. There is also only one way to obtain the outcome 10000—all dice showing 100. For the outcome 101, there are exactly 100 different ways to obtain it: all dice except one show a 1, and one die shows a 2. The follow-up question is:

Write a (fast) script that computes the different ways in which we can obtain each of the possible outcomes.

Read more…

Bertrand Paradox

January 21, 2011 Leave a comment

Classically, we define the probability of an event as the ratio of the favorable cases, over the number of all possible cases. Of course, these possible cases need to be all equally likely. This works great for discrete settings, like dice rolls, card games, etc. But when facing non-discrete cases, this definition needs to be revised, as the following example shows:

Consider an equilateral triangle inscribed in a circle. Suppose a chord of the circle is chosen at random. What is the probability that the chord is longer than a side of the triangle?

First example Second example Third example

Read more…