Wednesday, November 24, 2010

Bizarre things that have cut me.

I was preparing dinner just now and somehow managed to cut myself. But I didn't cut myself with a knife as you might have imagined I would. I cut myself on the dried salsa on the salsa bottle as I opened it. Now I can add that to the list of bizarre things that have cut me.

  1. buttered toast

  2. a mattress (note not the mattress springs but the material itself)

  3. dried salsa

It should be noted that I require at least a little bit of blood to be drawn in order to count it as a cut; if no blood is drawn, then it is considered a scratch. In the case of the mattress, whether or not it was a cut is moot since it was really an abrasion, a particularly nasty "rug burn" if you will. But it goes on the list anyway since it bled.

Tuesday, November 9, 2010

Bizarre but fun

Admittedly I spent way too much time playing this game. The preponderance of "media monsters" named QQQ with perfect stats makes me think that perhaps that particular name is some sort of cheat to get balanced stats.

Monday, October 18, 2010

The Pollard-Strassen Polynomial Evaluation Method of Factoring, and a Variation

I admit I was reading Prime Numbers by Pomerance again and I have taken a fancy to one of the factorization methods described in the book. The book only gives a single paragraph on the method and I thought it would be fun to talk about it here. The idea is essentially this: to factor a number n, chop the numbers less than n up into equal-sized blocks, multiply all the numbers in each block together mod n, and then take the greatest common divisor of each block's product with n. Once we hit a block whose product has a gcd > 1 with n, then if the gcd < n we have a proper factor of n; if not, we can just search through the numbers in that block one at a time, checking to find the first number with a gcd with n greater than 1, and voilà!

Since a composite n is guaranteed at least one prime factor among the numbers up to n^(1/2), we only need to worry about breaking the numbers up to that point into blocks. Rather intuitively the best division of work is to make all the blocks equally sized, and so you naturally come to the decision to make the blocks of size B = n^(1/4). In order to make the calculation of the products of these numbers as quick as possible, we take the time to construct the polynomial p(x) = x * (x-1) * (x-2) * (x-3) * ... * (x - B + 1) and reduce its coefficients mod n. Once we have that polynomial we just have to check gcd(p(B*i), n) to see if there is a factor of n in the block of numbers (B*(i-1), B*i] until we find an i for which it works.
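The block-and-gcd logic above is simple enough to sketch in a few lines of Python (a toy illustration of my own, not code from the book; it builds p(x) and evaluates it naively with Horner's rule, so it lacks the fast multipoint evaluation that gives the real method its speed advantage over trial division):

```python
from math import gcd, isqrt

def pollard_strassen_toy(n):
    """Toy Pollard-Strassen: find a nontrivial factor of composite n."""
    B = isqrt(isqrt(n)) + 1            # block size, roughly n^(1/4)
    # Build p(x) = x*(x-1)*...*(x-B+1) as a coefficient list mod n.
    coeffs = [1]
    for k in range(B):                 # multiply the polynomial by (x - k)
        coeffs = [0] + coeffs          # the shift gives x * p(x)
        for j in range(len(coeffs) - 1):
            coeffs[j] = (coeffs[j] - k * coeffs[j + 1]) % n
    def p(x):                          # Horner evaluation of p at x, mod n
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % n
        return acc
    i = 1
    while B * (i - 1) <= isqrt(n):     # blocks must cover everything up to sqrt(n)
        if gcd(p(B * i), n) > 1:       # a factor hides in the block (B*(i-1), B*i]
            for m in range(B * (i - 1) + 1, B * i + 1):
                g = gcd(m, n)
                if 1 < g < n:
                    return g
        i += 1
    return None                        # no factor <= sqrt(n), so n is prime

print(pollard_strassen_toy(91))        # prints 7, since 91 = 7 * 13
```

With B around n^(1/4) there are about n^(1/4) blocks, so the whole search is roughly n^(1/4) polynomial evaluations; the real method makes each of those evaluations fast with a product/remainder tree, which this sketch deliberately skips.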

I rather like the algorithm; it isn't quite as cool as the rho method of factorization, but it is surprisingly elegant and effective, and it has the additional advantage that it is clearly guaranteed to work. Just for fun I want to propose an idea that is almost certain to be worthless as an effective means of factorization but nonetheless appeals to my sense of aesthetics.

These polynomials which we are using to quickly evaluate the product of chunks of numbers exploit the fact that factorials are nice smooth numbers with a lot of factors. In Pomerance's book he motivates the polynomial evaluation method by pointing out that if one could evaluate factorials easily, then factorization could be achieved by a simple binary search, since gcd(k!, n) > 1 if k is at least the smallest prime factor of n and equal to 1 otherwise.
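That binary search is easy to demonstrate in Python (a sketch of my own; computing k! mod n one multiplication at a time defeats the whole purpose, so this is purely to exhibit the monotone gcd structure, not a usable algorithm):

```python
from math import gcd

def least_factor_by_binary_search(n):
    """Binary search on g(k) = gcd(k!, n), which jumps from 1 to > 1
    exactly at the smallest prime factor of n (assumes n > 1)."""
    def fact_mod(k):                   # k! mod n -- the step that is too slow
        f = 1
        for m in range(2, k + 1):
            f = f * m % n
        return f
    lo, hi = 1, n                      # gcd(1!, n) == 1 and gcd(n!, n) == n
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if gcd(fact_mod(mid), n) > 1:
            hi = mid                   # the smallest prime factor is <= mid
        else:
            lo = mid                   # every prime factor of n exceeds mid
    return hi                          # least prime factor (or n itself if prime)

print(least_factor_by_binary_search(91))   # prints 7
```

Note that gcd(k! mod n, n) = gcd(k!, n), so working mod n is safe here; the problem is only that honestly computing k! mod n costs k multiplications, which is exactly what the polynomial trick is designed to avoid.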

Being a physicist I rarely have to care about the exact integer value of a factorial, and instead am perfectly alright with using Stirling's approximation to the factorial as though it were exact, since for any numbers for which it would be arduous to calculate the factorial exactly, the relative error in the approximation is the most minuscule fraction, far below the error that one could hope to measure in a physical experiment.

But of course if we are interested in the factorization of n!, then being off by even 1 makes the factorization completely different. Clearly trying to directly apply the binary search idea using the Stirling series is useless. The fractional error may drop dramatically with size, but the absolute error still grows exponentially, so trying to keep enough terms in the approximation to get within 1 of the actual value of the factorial is a losing game. We could simply try all the numbers "close" to the estimated value, but besides the fact that we might have to try an uncomfortably large number of candidates in order to be certain of having tried the actual factorial, there is a more serious problem. If we were trying to factor a 10 digit number, the factorials involved would be millions of digits long, which is just a tad bit of overkill for factoring a 10 digit number. So if we want to keep the operations manageable we should really evaluate the factorial mod n. But because the factorials are so much larger than n, even a tiny percentage-wise difference in the value of the factorial would correspond to many times n, and so give answers which are well mixed mod n.

From all of these considerations it would appear that we would be much, much better off simply doing trial division, or for that matter actually using the Pollard-Strassen polynomial evaluation method to factor our number. But still, one might potentially be able to use Stirling's approximation or something like it (perhaps an appropriate adaptation to keep coherence mod n) and have it work out in the end... perhaps even make for an actually useful factorization algorithm... I doubt it though. Still, it is fun to think about, and I would consider it something of a triumph to design a factorization algorithm using the Stirling approximation that just wasn't worse than the sieve of Eratosthenes.

Saturday, October 2, 2010

Listening to bubbles to find dark matter

The departmental colloquium this last week was given by Peter Cooper, who is part of the COUPP search for dark matter. Just like all the other dark matter searches, the idea is to make your detector have as much material in it as possible and try to shield it as much as possible from any known interactions: cosmic rays, background radiation, etc. Whatever you can't shield away you want to be able to discriminate against as a known interaction, and then whatever you are left with must be dark matter.

In the case of the COUPP experiment they are using bubble chambers deep underground with several tons of scintillator oil on top. A bubble chamber works by keeping the fluid inside it at a temperature just above its boiling point. But because a bubble of gas takes more space than the liquid, it takes a little bit of extra energy for a bubble to form. When a cosmic ray or other particle comes through the liquid it deposits energy, and this extra little kick can make bubbles form in the liquid.

Usually in a bubble chamber you would be interested in the track that a particle leaves in the detector. But anything that leaves a track of bubbles interacts way too strongly to be dark matter. So the question is how do you discriminate between dark matter single bubble events and more boring single bubble events? The answer is you listen to the noise the bubbles make when they are made! Alpha events in the chamber, caused by trace amounts of radioactive materials in the device, show up as louder bubbles!

I can't help but picture a physicist carefully listening to the vibrations of a tiny bubble floating deep in the dark, hoping to hear snatches of unknown harmonies of nature.

Thursday, September 30, 2010

I got the most AMAZING seats to see Hamlet. (and for $5!)

Thursday last week I went to see Hamlet at the Pioneer Memorial Theatre. The performance was quite good and I think so was the editing of the play. I have long been quite the fan of Hamlet. I have read the play more than once and gave serious effort to memorizing it at one point (I actually managed to memorize the first scene of the first act before I gave up). Until a week ago I had only seen it performed via recordings. But last Thursday I got to see it live, and much to my good fortune not only did I get to see it live but I got to see it from an EXCELLENT seat. Jenna went to the theater ahead of me and purchased the tickets a little less than 30 minutes before the show started, and then when I arrived she nonchalantly led me to the best seats in the house: two seats at the dead center of the very front row! The theater has a program where U students can buy tickets for the best seats still available 30 minutes before the performance for $5 a seat. Apparently the people with the season tickets who normally would have occupied that space had canceled. For any play this would have made me happy, but that this happened for HAMLET!! It is just so amazing. The last time Hamlet was performed at the PMT was in the 80's, and it no doubt will not be performed again for a decade or two more. To give a perspective on how difficult such seats are to obtain: a conversation with our neighbors revealed that they had season tickets and had been sitting in the same seats for 30 years! Furthermore, the people who normally sat in the seats we were occupying had also been sitting there for many years (though how many years exactly I alas did not ask).

Wednesday, September 22, 2010

Is Hinchliffe's Rule True?

This paper actually got published in 1995, and the Title, Author Name, and Abstract make up the entire paper. I must say I thoroughly approve.

Tuesday, September 14, 2010

The Poo From Hell

I recently moved into a new apartment, and up until this point I have been reasonably happy with it. I really like having a kitchen and bathroom of my own instead of sharing them with strangers as I have been doing up until now.

I found it endlessly annoying that I could never keep a roll of toilet paper in the bathroom and finish it off myself, because I was at the house so much less than the other people using the bathroom, and invariably someone else would use my toilet paper. At the last house I lived at I bought around 30 rolls of toilet paper to put in the bathroom. Of those rolls, I myself used the last few sheets only 2 times, which gives a rough estimate that other people were using more than 90% of my toilet paper. Eventually I started taking rolls of toilet paper into the bathroom from my room with me, and using whatever was there if I forgot.

I must say that at my new apartment I was quite enjoying having my own bathroom (I am enjoying having my own kitchen much more, but that is another rant entirely). The ability to shower whenever I want without waiting for someone else and, better, not having to wait for someone to finish their shower when I need to go to the bathroom is very nice.

I never thought of it as an advantage that when I found a great big nasty dump sitting unflushed in the toilet I at least had people to blame for it. I had thought (silly me!) that when I had my own toilet I would never have to encounter such a situation again. But beginning about a week ago little floating turds started showing up in my toilet bowl. They are very hard to flush because they are so buoyant, so at first I assumed that they belonged to my very own bowels and had simply been left behind when I flushed, being so very floaty. I don't generally check to make sure that everything that went down when I flushed doesn't come back up a few seconds later, so I must have flushed and not noticed when these things remained behind. However, something of an entirely different magnitude has been happening. When I came back from school Monday night I was immediately greeted by a horrible eye-burning stench permeating my apartment. I thought that something in the garbage must be going rotten, that perhaps the dishes in the sink had been allowed to sit for too long, etc. After a quick cleaning of the kitchen and turning on the air conditioning to help diffuse the smell, it lessened by a little bit.

The smell was ebbing away only very slowly with this treatment and so when it cooled down somewhat outside I opened up the door to my balcony and went to turn off the air conditioning. But this time when I passed by the bathroom the stench became extremely strong and I realized suddenly that something rotten in the kitchen was not the culprit at all but instead it was something rotten in the toilet. I flicked on the light and saw a disgusting poo soup floating in the toilet bowl. The soup was complete with disgusting brown/orange colored water and several extremely buoyant logs of brown sludge. After flushing a few times I finally got it all to go away and immediately my apartment began to smell much better.

After finding such a thing waiting in my toilet for me to discover, I couldn't help but wonder if someone had come into my apartment while I was away and had intentionally left this there for me. I wasn't sure if I had locked the door or not, and it was possible that someone had just come in and decided to help themselves to my restroom facilities. I hoped that the problem was temporary and had been caused perhaps by a plumbing problem in an adjacent apartment.

But today, when I came home from school, for the second day in a row I was greeted by a foul eye-burning stench and then shortly thereafter by a floating mess in my toilet bowl. This time however the poo is even more buoyant, and no matter how many times I flush the toilet some of it comes floating back to taunt me. It is as if someone ate a life preserver and a whole bunch of rotting meat to make the most persistent and foul smelling poo imaginable, and then took a nice long dump in my toilet while I was away.

Today I am certain that I locked the door when I left, which, combined with the regularity of the occurrences, means that most likely this is a plumbing problem of some sort. Though I must say it also has to be related to the extraordinary buoyancy of the poo involved, and perhaps someone in one of the other apartments has a secret passion for spaghetti made with foam pool noodles.

Wednesday, September 8, 2010

LD #18, maybe another time....

I never really got started on the game for this competition. I haven't made a post about my abject failure to do anything of note because, between school and moving into a new apartment, I have had very, very little time. For the competition I decided I wanted to do something with sound, and very quickly came to the conclusion that enemies should make sounds and that in order to defeat them you would have to cancel their sound out. The sounds you could make would be limited by the monsters you had defeated. I hadn't really figured out how to make the game interesting or fun, but the idea seemed original, and I have long wanted to learn to manipulate sound via programming. However, the choice to use sound was my undoing. I near instantly discovered that my desktop has some issue with using sound in pygame... after reinstalling the module I just sort of gave up. The theme was basically my least favorite out of all the possible themes (even counting the joke theme, double zombie rainbow), and the inability to even run the simplest sound programs in the programming language of my choice pretty much drowned my remaining enthusiasm, so I figured I would just play video games for the weekend instead of making them.

However I still feel that the game might be fun to write at some future point, so... maybe I will actually code it someday (I give it odds of about 40 to 1 against).

Friday, August 20, 2010

Ludum Dare #18 "Enemies as Weapons" Begins

Since this game competition is conveniently set for the weekend before school begins and I don't have anything else I absolutely have to do, I have decided to participate in Ludum Dare 18 (a 48 hour game making competition). The theme is officially "Enemies as Weapons" and now the cogitation begins.... What should I do with that?

Friday, August 6, 2010

11:11 make a wish

I don't really remember who introduced me to this but I remember being out in the parking lot of my high school late one night and I looked at the clock and it was 11:11. When I mentioned this to a friend who was with me they told me to make a wish. I have since adopted the 11:11 wish tradition and since I just made an 11:11 wish I decided I should try and spread the tradition.

The rules of the 11:11 wish are these: 1) You must have randomly glanced at the time and discovered that it is 11:11; you cannot contrive to see the time in order to make the wish. 2) The 11:11 wish can only be made at night; if it is 11:11 in the morning it doesn't count (of course you could argue that really then it should be the 23:11 wish, but that doesn't sound as cool now does it?). 3) The wish must be made before the time changes to 11:12.

If you choose to adopt the 11:11 wish feel free to modify the rules as you like but these are the rules I adhere to.

Thursday, August 5, 2010

We must save the oceans

I have long been afraid that the real fight to save the environment will not start until it is too late. It is easiest to see the damage we are doing on land, because that is where we live. It is easy to recognize that a barren landscape stripped of trees and devoid of greenery has suffered real damage. But a lake looks the same when it is full of fish as when it is empty. Still, a lot of the devastation we are causing is easy to see, if we look.

I have often been asked to help save the rainforest by helping to buy protected forest land. I just saved 7.4 square feet of rainforest by clicking the link on this site. The biodiversity of the rainforest is astounding and so is its rate of destruction; this is also true of the ecosystems of the ocean. Unfortunately, unlike the rainforest, I can't simply donate to a charity (or click a link) to have a couple of acres of ocean added to a sort of wildlife reserve. Many barriers stand in the way of creating for the oceans a program like the one that already exists for rainforests. The ocean belongs to no country and no one, and unfortunately this simply makes it all the easier for the ocean to be the victim of a tragedy of the commons.

Sylvia Earle has been building an organization in an attempt to create these protected areas of ocean, which she calls "hope spots". In 2009 she was a TED Prize winner and she gave this TED talk.

Creating these protected waters is a wonderful idea, and I think it is probably something we will have to do if we want to really save the oceans, but it isn't the only thing we will need to do. The ocean is getting screwed over not just by our practices of directly exploiting it but also by the way we live. Plastic and garbage that find their way to the ocean stay in the ocean, and we now have something called the Great Pacific Garbage Patch. Of course the patch is called "great" not because it is rather nifty but because it is gigantic. Estimates of the size of the garbage patch vary, but the LOW estimate is that it is about the size of Texas, and there is another similar patch in the Atlantic. The buildup of plastic and other detritus in the oceans is killing birds and whales, and no doubt is causing other sorts of trouble which are more subtle. Just as worrying is the acidification of the oceans caused by the uptake of atmospheric CO2. The acidification will adversely affect the organisms which create corals. While I think global warming is tragic because of the loss of the polar ice caps, and to a lesser extent because of the rising of the sea levels and the displacement of people that will cause, I am more afraid of the loss of our coral reefs and of screwing up the chemistry of the oceans in general.

Thursday, July 29, 2010

An excellent talk by Douglas Adams

I was perusing the talks on the TED site (a wonderful site by the way, if you aren't familiar with it) and I ran across this gem. It is a bit long but very entertaining.

I never knew Adams was an ecological activist.

Wednesday, July 28, 2010

Great Stelline Recipe

A while ago we purchased some star shaped pasta named Stelline. I tried out a recipe using it and it turned out great, so I am going to share.


  • 12 oz box of Stelline pasta
  • 3 tablespoons olive oil
  • 3 cloves of garlic, minced
  • 1 onion, chopped
  • 1 zucchini, diced
  • 1/2 red bell pepper, diced
  • 2 cups water
  • 2 teaspoons vegetable soup base*
  • 2 leaves fresh basil, chopped

*I used some stuff called "better than bouillon vegetable base" which rather obviously is a bouillon substitute and I am not sure how this would translate into bouillon, but you could definitely use that instead or replace the soup base and water with an appropriate amount of vegetable broth (or chicken broth if you don't care about the dish being vegetarian)

1) Sauté the onion and the garlic with the olive oil in a saucepan until the onion is slightly translucent.

2) Add the zucchini and red bell pepper and sauté 4 minutes.

3) If necessary, move the mixture into a pot large enough to contain all the ingredients (my saucepan is too small to accommodate everything). Add the water and the soup base and bring to a boil.

4) Once the mixture begins boiling, add the stelline and cook for 4 minutes, stirring constantly.

5) Mix in the basil and let sit for 5 minutes.

6) Enjoy (parmesan cheese is great on top!)

Monday, July 26, 2010

Turning a sphere inside out

I first encountered the problem of turning the sphere inside out in a class on knot theory some time ago. At the time I saw just a small clip of the procedure they show in this video, but I didn't really understand exactly what was going on; I just thought it was cool. I ran across this video quite a while ago and I must say it is extremely entertaining, and they explain the problem very well.

Saturday, July 17, 2010

The keys to the kingdom, Lord Sunday

I finished reading the last of the books in The Keys to the Kingdom series today, and I will say this is probably one of my all-time favorite series. I've never read anything by Garth Nix that I didn't like, and these books are no exception.

If you want to read the books, I highly recommend them, and they don't take long to read: they are written on the level of a children's book and you can read through one of them in just a few hours. If you don't care to read the books or have already read them, then read on; otherwise consider skipping this post until I won't be spoiling the books for you.

The essential plot of the series is that the universe was created by a being called the Architect, who also created a whole other dimension called "The House" which watches over the universe. The House is divided into 7 realms corresponding to the 7 days of the week, and a particular denizen of the House is given one of seven keys giving them power over one of the days of the week in the universe and the corresponding domain within the House. After a nice long 15 billion year run the Architect decides to call it quits and commit suicide. Apparently the Architect is unable to commit suicide directly, since she imprisoned a part of herself in a prison which she made to last until the end of her creation. Unable to kill herself without taking everything else with her, she breaks herself into 7 parts, each one being a part of the "will" of the Architect, which we are meant to take in the sense of "last will and testament". These 7 parts of the will are given to the 7 rulers of the days, each part telling a different ruler to relinquish their power to a mortal "rightful heir", whereupon all of creation would be destroyed and the rightful heir allowed to build a new universe as they saw fit.

There is a bit of a complication in the carrying out of the will, since those entrusted to carry it out decide not to commit suicide and destroy the universe. Instead they lock away the different parts of the will for 10,000 years, at which point the story line of the books actually begins, when one part of the will escapes and manages to dub a mortal the rightful heir. The rightful heir then goes on to slowly gain power and dominance in the House, collecting each of the keys and each part of the will in turn, all the while blissfully unaware of the fact that the ultimate goal of the will of the Architect is the destruction of everything. Our hero Arthur, the rightful heir, works over the course of all 7 books in an effort to save and protect, which he does seem to manage to do, at least right up until the last part of the last book when all of creation is expunged.

I have read rather few books where the entire universe gets destroyed. I must say I didn't expect that to happen, what with heroes generally managing to save the day, especially in a children's book. And a 7 book series has got to be the most elaborate suicide story I have ever heard. (Though since the true purpose of the will is concealed right up until the last few pages of the last book, the books really read just like any other children's adventure story.)

Wednesday, July 14, 2010

I want to go to other stars, but the moon will do.

Looking up at the stars at night and watching the sunset, I often feel a strong desire to visit other stars and see alien sunsets. It doesn't even really have to be me that visits these other stars; it is inspiring just to think that at some point a pair of human eyes will be watching a sun set that is not our own. It ultimately doesn't bother me much that humanity will one day perish. But it is somehow deeply saddening to think that by humanity's end we might never have stepped away from our home planet.

I dearly hope we will prove ourselves capable of leaving our home solar system, and while I think humanity will hopefully spread to other stars one day, such a feat would be nothing short of the most monumental undertaking in human history. But humanity needn't go to another star to leave home; we needn't even head off to Mars (though that would be fun too). We simply need to go back to the moon and put down some roots.

The other day at sunset I was aching with the desire to go to another star system and despairing at the impossible distances involved. The moon happened to be out as well, and I caught a glimpse of it looking beautiful and suddenly so very alien itself. Going to another star will remain a distant fantasy all my life, but going to the moon is something I really could do in my lifetime. So I'm going to add another long shot goal to my list: set foot on the moon.

Saturday, July 10, 2010

Bad Jokes as a pastime for waiting in line; or 10.1 tropical fruit jokes

While waiting in line for rides at a theme park, my companions and I were entertaining ourselves by inventing bad jokes of a particular strain. Since this pastime rather reminds me of the 101 themed jokes posts a friend used to make, I shall reproduce a part of them here as 10.1 tropical fruit jokes. (I always thought that 101 bad jokes on a theme was really too many for a blog post.)

Q. What tropical fruit gives the worst diarrhea?

A. The Poopaya

Q. What is the most religious tropical fruit?

A. The Popepaya

Q. What tropical fruit do plumbers prefer?

A. The Pipepaya

Q. What tropical fruit do people smoke?

A. The Pipepaya

Q. What is the spikiest tropical fruit?

A. The Porcupaya

Q. What tropical fruit is the best at karate?

A. the pap-HI-YAHH!

Q. What tropical fruit would you want on your team for a tug of war?

A. The Pullpaya

Q. Which tropical fruit is the hardest to keep from peeing on the rug?

A. The pup-paya

Q. What tropical fruit is the most agreeable?

A. The papayeah

Q. What tropical fruit do you use to go to the bathroom?

A. The Portipaya


A. The Pollpaya

Sunday, June 20, 2010

Delicious Vegetarian Stuffed Mushroom Recipe

I had some portabello mushrooms to use before they went bad, so I looked online for a stuffed mushroom recipe and found one that looked delicious. I decided to modify the recipe to make it vegetarian and out of stuff I had on hand. I was extremely happy with the result, so I'm going to share.

Step 1) Put a stick of butter, 6 oz of cream cheese, 1/4 cup water, 1 tablespoon sugar, and 1 tablespoon grapefruit juice in a saucepan and heat and stir until it thickens a bit.

Step 2) Add a diced red bell pepper, plus garlic powder and chili powder to taste, to the sauce and let simmer.

Step 3) Dice and sauté an onion in a separate pan. Remove the stems from your portabello mushrooms, chop them up, add them to the onion, and let them cook for a bit.

Step 4) Add the onion and mushrooms to the cream sauce and fill the mushrooms with the mixture.

Step 5) Top the mushrooms with parmesan cheese and bread crumbs.

Step 6) Broil the mushrooms until the bread crumbs are a golden brown.

I was making 4 stuffed portabellos with the above recipe and I had a bit too much sauce, but some of my mushrooms were much smaller than the others, so your mileage may vary.


Tuesday, May 11, 2010


This last Friday I finally participated in the commencement ceremonies for the degree I earned a year ago. There was a university-wide commencement ceremony, which I attended but had to leave early from, and a convocation ceremony, which I got to sit all the way through. After coming out the other side of the graduation ceremony, I think I would have been better served to simply go to the university-wide commencement ceremony and ignore the existence of the convocation. At the convocation I got to walk across a stage while someone read my name... it wasn't really much to get too excited about. I would rather go to the ceremony where the bulk of the program is made up of speeches instead of disconnected, contextless names.

Although it is somewhat nice to have some personal recognition, in hindsight it definitely wasn't worth sitting through everyone else's names in order to get my few seconds to cross the stage. Or at least it isn't worth it when I could just as easily get a ceremonial conferral of my degree by attending the commencement, albeit not a personal one.

Thursday, March 11, 2010

My poker book got delivered to a gynecologist: or why UPS should be spelled phonetically

I ordered "The Mathematics Of Poker" a little while ago and on Wednesday of last week it got delivered.... or at least on Friday when I checked the tracking data online to find out what had become of it I discovered it had been delivered and signed for... but not by me. At 3:00 on the dot Wednesday someone named "Kristie" had signed for my decidedly absent package.

I live in a house in which multiple rooms are rented out independently of each other. I wasn't certain if a Kristie had perhaps moved in recently and for some reason had decided to retain my package for safekeeping in her room instead of on the mantel in the front room. Before I went about complaining, I at least needed to find out if in fact I lived with a Kristie. After asking a number of my roommates and someone at the coffee shop that shares the plot of land on which this house is built, I discovered that there was indeed a new female resident of the house, but no one knew her name. Eventually the matter was solved by calling the landlord, who informed me that the only woman living here went by the name of Angela (we shall ignore for the moment the fact that Yu-ling has been living here for longer than I have and she is definitely female).

Satisfied that there was no Kristie in the house the question became what address had my package really been delivered to? A little information could be gleaned from the tracking page.

SALT LAKE CITY, UT, US 03/03/2010 3:00 P.M. DELIVERY

A CORRECT STREET NUMBER IS NEEDED FOR DELIVERY

03/02/2010 6:59 A.M. OUT FOR DELIVERY

03/02/2010 4:01 A.M. ARRIVAL SCAN

SPARKS, NV, US 03/01/2010 7:04 P.M. DEPARTURE SCAN

03/01/2010 12:59 P.M. ORIGIN SCAN
A correct street number is needed for delivery? What on earth did that mean? How did UPS manage to lose my address? I decided to go look and see what was at 1031 E on the adjacent streets, and I found that there was in fact a house at 1031 one street over, and in the other direction there was a hospital. There was no response when I knocked at the 1031 house, and I assumed that delivery to the hospital was essentially impossible. Checking online showed that the address of the hospital was 1050 E, giving it no commonality with my own.

As I was walking back from the 1031 abode, however, I saw the UPS carrier making his rounds and hurried to talk to him. I asked whether any packages had given him trouble on Wednesday and he responded that none had. I told him that I had been expecting a package and that it had been delivered and signed for by a Kristie, but that I had no idea who that was. Very fortunately for me he knew who this Kristie was, and told me that my package must have been delivered to a Doctor Hinson. Why this could possibly have happened neither he nor I had the slightest idea, though he assured me that the package had been addressed to 1050 E 100 S and to Doctor Hinson.

I got online and found out that Doctor Hinson was an OB/GYN, though I failed to find the location of her office. I let the matter sit until Monday, when my package arrived battered and bruised, with a UPS label pasted over the top of the partially ripped-off original label bearing my correct name and address. The box had been opened and retaped, though the book inside was no worse for wear.

Checking the tracking page now, there are two delivery lines one right after the other:

SALT LAKE CITY, UT, US 03/08/2010 3:42 P.M. DELIVERY
SALT LAKE CITY, UT, US 03/03/2010 3:00 P.M. DELIVERY

And the new delivery line is dated two days after I had already received my book.


I still have no idea what really happened in this whole mess, but I thought the oddity was sufficient to merit sharing. At the very least it justifies my long-time habit of half-jokingly (now somewhat less than half-jokingly) pronouncing UPS as "oops".

Sunday, March 7, 2010

Kissing Numbers

The kissing number in a given number of dimensions is the greatest number of spheres that can be brought to touch (or "kiss") a central sphere if all the spheres are the same size. In one dimension the kissing number is 2. In two dimensions the problem is slightly less trivial but hardly difficult. It will probably not come as much of a surprise that the best arrangement in 2 dimensions is the familiar hexagonal arrangement.

A little thought will show that the hexagonal arrangement takes up all the angular space around the central circle. The centers of three equally sized, mutually touching circles form an equilateral triangle, so the minimum angle subtended at the center of the central circle by the centers of any two circles touching it is Pi/3, the angle of an equilateral triangle. The angles between the centers of consecutive circles touching the central circle must add up to a full 2*Pi, and each of these angles must be at least Pi/3, so the most circles we could possibly get to touch the central circle is 2*Pi/(Pi/3) = 6. Since the hexagonal arrangement actually achieves this maximum we have our proof.

In three dimensions a good solution is found by taking the optimal 2 dimensional hexagonal arrangement of spheres around a central one in a plane; it is then possible to add spheres above and below the plane that touch the central sphere. Three spheres can be placed above and three below the plane of the hexagonally arranged spheres while still contacting the central sphere. This arrangement gives 6 + 3 + 3 = 12 as a lower bound for the kissing number in three dimensions. The centers of the spheres arrayed around the central one now form the vertices of one of the Archimedean solids, the cuboctahedron.

Is this the optimal arrangement, or is there possibly some more clever arrangement which could fit in an extra sphere or two? Generalizing our tactic from the two dimensional case we can ask what is the minimum angular space taken up by each sphere. A calculation of the solid angle taken up by a sphere touching the central sphere gives Pi*(2 - sqrt(3)), about 0.841 steradians. Dividing the total 4*Pi steradians by this value gives 4*Pi/(Pi*(2 - sqrt(3))), about 14.93, so this gives us a definite upper limit of 14 spheres that we can fit around a central sphere.
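Both bounds amount to a couple of lines of arithmetic, sketched here in Python:

```python
from math import pi, sqrt

# 2D: each circle touching the central circle uses up an angle of at
# least pi/3, so at most 2*pi / (pi/3) = 6 circles can fit.
bound_2d = (2 * pi) / (pi / 3)

# 3D: a sphere touching the central sphere subtends a solid angle of
# pi*(2 - sqrt(3)) steradians as seen from the central sphere's center.
solid_angle = pi * (2 - sqrt(3))
bound_3d = (4 * pi) / solid_angle

print(bound_2d)               # 6.0
print(solid_angle, bound_3d)  # ~0.8418 and ~14.93, so at most 14 spheres
```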

But this bound ignores the fact that spheres do not fit flush with each other, so we can't hope to fill all the angular space around the central sphere. Consider an arrangement of equal sized spheres placed at the vertices of a regular tetrahedron such that all the spheres touch each other. If the spheres around the central sphere could all be made to form such tetrahedral arrangements, so that the lines between the centers of touching exterior spheres formed equilateral triangles, that arrangement would clearly be optimal. I worked on a proof for a bit, but already in three dimensions the problem is quite hard and I am not quite sure how to approach it. Nevertheless such tetrahedral arrangements are not possible in 3 dimensions, nor in any higher dimension either; the triangular packing of 2 dimensions is unique.

Wednesday, March 3, 2010

The Robots are gone

So the vast swarm of high school students is finally gone. It did end up turning out that my office was pretty much unusable between the hours of 2:30 and 9:00, and even during the day there was usually someone running around. Apparently there were roughly 150 students involved in the project in total, though the number I saw there usually averaged much closer to 30 at a time.

I resisted posting over and over again about the inconvenience. I don't mind working in the library but there is a certain indignity in not being able to use my own office. Of course as everyone would tell me I was more than welcome to come and use my desk but when there are dozens of people coming and going and working together and discussing things and when those people are furthermore high school students... I think it will not be hard to convince you that the library was a much more conducive work/study area.

I really found it very irksome that the administration didn't deem this intrusion into my work space significant enough to merit a simple e-mail warning me of the swarm in advance. I first learned of the plan to share my office with these students when someone came by to install a box with the key to my office right next to my door. I wondered if perhaps my presence in this office had somehow been overlooked and on some official roster somewhere I was really supposed to be in one of the more populated grad student offices.

But after the last students had left, and the copious amounts of garbage, and carpet, and plywood, and electronics, and dirty dishes, etc. had been cleaned out of my office, I got a letter from the department chair in my mailbox.

Of course my first response was that this was some sort of official departmental action and therefore bad news. When I opened it I discovered it had a gift card in it. It actually turned out to be something of an apology.

Tim Anderton
Department of Physics and Astronomy
115 S 1400 E #201
University of Utah
Salt Lake City, Utah 84112

Dear Tim,

I would like to thank you for your accommodating the West High Robotics team during the past six weeks as they built and tested their robot in the James Fletcher Building. The group was very appreciative of the use of the grad student offices and the lab next door as this facilitated the construction and testing of the robot. Last year, the labs and office spaces for the robot build were located in Chemistry, but the mechanical construction took place in the Department of Physics and Astronomy machine shop.

I had not envisioned the amount of time the team would spend in the Department, and the number of people and/or computers that eventually were crowded into your small office. I know the imposition was rather severe at times. On behalf of the Physics Department and West High Robotics, I have enclosed a small gift card to the University of Utah bookstore. I sincerely appreciate your patience in the midst of all the chaos!

with Best Regards,
Dave Kieda
Chair, Department of Physics and Astronomy
Professor of Physics
University of Utah
Salt Lake City, Utah 84112

I am of somewhat mixed feelings about this. After the robotics team had gone I was happy to have shared my office with them. True, it was quite a pain at times, but at the same time I definitely approve of the project and think it is a good idea for them to have access to such a space in the department. The thing I most would have wanted was to have been asked permission or, failing that, at least to have been warned. On the other hand, of course, if I had been warned then I very probably would not have received this gift and apology. Especially since I would probably have okayed the use of the office anyway, perhaps this is preferable. I can't help but wonder, though, if perhaps this is a manifestation of the fact that it is often easier to ask forgiveness than permission. Ah well, c'est la vie.

P.S. On the off chance that Dave Kieda is reading this, Thanks for the letter and the card! I appreciate it despite my griping about my lowly status within the department.

ISS flyover

The International Space Station (ISS) will be visible over Salt Lake for about 7 minutes for the next few nights.

It's pretty fun to watch, at least if you are inclined to be impressed by stuff like that, which I definitely am.

Wednesday, February 17, 2010

Recession a Speedo advertising ploy?

The red is the number of jobs created/lost during the last months of the Bush administration and the blue is the jobs created/lost during the first part of the Obama administration. Is it just a coincidence that the graph looks like a red and blue Speedo with the crotch at the inauguration? I think not...

That being, of course, either because the stimulus plan worked or because Speedo just ran the most expensive ad campaign in history.

The original graph can be found here.

Thursday, February 11, 2010

Mouse activity: Surfing and chatting

I was surfing the net using StumbleUpon and I stumbled upon this little program which logs your mouse position. The lines are mouse paths and the blobs are places where the mouse sat for a while.

Thursday, February 4, 2010

a best k term approximation problem

Here is an interesting little question.

Over all dictionaries of M vectors chosen from an N dimensional vector space, what is the minimum value of the maximum relative error of the best k term approximation of any vector in the N dimensional space?

The term "dictionary" in this usage comes from signal processing (or at least that is where I got it from). A "dictionary" in this case is just any set of vectors in terms of which we can express other vectors. In the case where the vectors are not linearly dependent on one another we call such a collection of vectors a "basis"; a dictionary is the same except we remove the restriction that the vectors be linearly independent.

For a particular dictionary, the best k term approximation of a vector X is the vector, among all linear combinations of k vectors chosen from the dictionary, whose difference from X has the smallest norm. For instance take the plane, with the usual basis {(0,1), (1,0)} as our dictionary: the best 1 term approximation to the vector X = (x, y) is (x, 0) if |x| > |y| and (0, y) otherwise. Obviously no other dictionary with only 2 vectors in it will do a better job making 1 term approximations. Also obviously, the best 2 (or more) term approximation of any vector is just that vector itself, since our dictionary vectors span the plane.
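The definition can be sketched directly as brute force: a least-squares fit over every k-subset of the dictionary (the helper name is mine; enumerating all C(M, k) subsets is only sensible for tiny examples):

```python
import itertools
import numpy as np

def best_k_term(x, dictionary, k):
    """Brute-force best k-term approximation: least-squares fit of x over
    every k-subset of the dictionary, keeping the smallest residual."""
    best_err, best_approx = np.inf, None
    for subset in itertools.combinations(range(len(dictionary)), k):
        A = np.column_stack([dictionary[i] for i in subset])
        coeffs, *_ = np.linalg.lstsq(A, x, rcond=None)
        approx = A @ coeffs
        err = np.linalg.norm(x - approx)
        if err < best_err:
            best_err, best_approx = err, approx
    return best_approx, best_err

# Standard basis of the plane, k = 1: (3, 2) is best approximated by (3, 0).
D = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
approx, err = best_k_term(np.array([3.0, 2.0]), D, 1)
print(approx, err)  # [3. 0.] 2.0
```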

In fact if you take a moment to reflect you will quickly see that the answer to the general question is 0 whenever k >= N: in that case M >= k >= N, so we can choose a basis of our N dimensional space from the dictionary, and the best k term approximation of any vector X is just X itself. Slightly more interesting, but with just as obvious a solution, is the case M < N. Since M vectors span at most an M dimensional subspace, any vector orthogonal to that subspace has a relative error of exactly 1, so the maximum relative error is 1 no matter which particular vectors we put in the dictionary; any M dimensional subspace is "as good" as any other. (The relative error is |X-Y|/|X|, where X is the vector to be approximated and Y is the approximating vector.) Choosing linearly independent vectors matters only if we also care about the average relative error of the best k term approximation, not just its maximum value.

Let's return to the plane and examine the first interesting case, N = 2, M = 3, k = 1. A first thought is to choose three vectors forming an equilateral triangle. With this choice no vector will ever be more than 30 degrees away from its best approximation among the 3 vectors: adding in -1 times the three vectors gives 6 equally distributed vectors with 360/6 = 60 degrees between them, so we can be at most 30 degrees off from one of them. This gives a maximum relative error of sin(30) = 1/2. The observation that we should distribute the vectors as equally as possible solves the problem for odd M, but such solutions obviously do not carry over to even M, since then half of our vectors would just be -1 times other vectors, leaving us with a very obviously non-optimal dictionary.

A slight modification of our procedure remedies this problem: we simply equidistribute the vectors not through the angles 0 to 2*Pi but from 0 to Pi, treating the first vector as occupying both the position at 0 and the position at Pi. This works for both odd and even M, and it is not hard to see that it is in fact the optimal solution. Since for k >= 2 and N = 2 we can always match any vector exactly, this completely solves the problem in 2 dimensions.
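For M directions equidistributed over [0, Pi) the worst case sits halfway between adjacent lines, giving maximum relative error sin(Pi/(2M)); for M = 3 that is sin(Pi/6) = 1/2, matching the argument above. A numerical check (approximate, because of the sampling grid):

```python
import numpy as np

def max_relative_error(M, samples=200_000):
    """Empirical max relative error of the best 1-term approximation in the
    plane, for M unit vectors equidistributed over the angles [0, pi)."""
    angles = np.arange(M) * np.pi / M
    dictionary = np.column_stack([np.cos(angles), np.sin(angles)])  # M x 2
    thetas = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    X = np.column_stack([np.cos(thetas), np.sin(thetas)])           # S x 2
    # Best 1-term error for a unit vector x is sqrt(1 - max_d (d . x)^2).
    best = np.abs(X @ dictionary.T).max(axis=1)
    return np.sqrt(np.clip(1.0 - best**2, 0.0, None)).max()

for M in (2, 3, 4):
    print(M, max_relative_error(M), np.sin(np.pi / (2 * M)))
```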

One might expect similar unique, and hopefully even simple and structured, solutions in higher dimensions. On this I cannot pretend to know what the solution is like, but I am certain that in 3 and higher dimensions, whatever the solution is, it isn't simple. For the 2d case finding the most equidistributed set of M vectors was not hard; there is a unique (up to rotation) such configuration for every M. For 3 dimensions and higher this is no longer the case. Though if we have, say, 6 vectors, a good bet for the optimal solution is to put them at the centers of half the faces of a dodecahedron. Very probably the platonic solids each correspond to a provably optimal and unique (again up to rotation) solution of the problem for specific numbers of vectors M.

Clearly though these solutions do not provide the general solution. Since solving the general case in exact terms would certainly be at best very difficult and at worst practically impossible, here we move into the realm of heuristics for a bit. The most obvious first approximation to picking equidistributed vectors is simply to choose vectors from the unit n-sphere with a completely uniform probability density.

If M >> N then this is probably a good enough approximation, and a dictionary of random vectors picked from N dimensional space will give us close to the minimum possible value of the maximum error. Just how much larger than N M would have to be for this to be sufficiently good is certainly not clear.
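Sampling such a dictionary and estimating its maximum 1-term error is straightforward (a sketch; the error estimate itself is Monte Carlo, so treat the numbers as approximate):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_dictionary(M, N):
    """M unit vectors uniform on the (N-1)-sphere: normalize standard
    Gaussian samples (the usual trick for uniform random directions)."""
    v = rng.normal(size=(M, N))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def empirical_max_error_k1(dictionary, trials=50_000):
    """Estimate the max relative error of the best 1-term approximation
    by testing random unit vectors against the dictionary."""
    x = rng.normal(size=(trials, dictionary.shape[1]))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    best = np.abs(x @ dictionary.T).max(axis=1)
    return np.sqrt(np.clip(1.0 - best**2, 0.0, None)).max()

# 50 random directions in the plane, to compare against the optimal
# equidistributed value sin(pi/100) ~ 0.0314; random is noticeably worse.
D = random_dictionary(50, 2)
worst = empirical_max_error_k1(D)
print(worst)
```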

What if we were to choose a first vector at random and then choose each subsequent vector as the one whose largest magnitude dot product with the vectors already in our dictionary is smallest, with ties broken randomly? Clearly this would first build an orthogonal basis of the space; once a complete orthogonal basis has been chosen we would begin picking vectors from a complementary basis, and so on down the line. Such a construction is tempting until you realize that it doesn't give the right answer even for the simple 2D case.

On the other hand, a probabilistic version of the same algorithm is perhaps worth considering: the same algorithm, but modified so that every time we have chosen M vectors we randomly delete half of them, again choose deterministically, again randomly remove vectors, and continue this process for some large number of steps.
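Both the deterministic greedy rule and the random-deletion variant can be sketched together. The continuous minimization is approximated here by scoring a batch of random candidate directions at each step, so this is illustrative rather than exact:

```python
import numpy as np

rng = np.random.default_rng(1)

def _unit(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def greedy_dictionary(M, N, candidates=2000, prune_rounds=0):
    """Greedy construction: repeatedly add the direction whose largest
    |dot product| with the current dictionary is smallest, approximating
    the continuous argmin by sampling `candidates` random directions.
    With prune_rounds > 0, half the dictionary is randomly deleted and
    regrown (the probabilistic variant)."""
    dictionary = [_unit(rng.normal(size=N))]

    def grow():
        while len(dictionary) < M:
            cands = _unit(rng.normal(size=(candidates, N)))
            worst = np.abs(cands @ np.array(dictionary).T).max(axis=1)
            dictionary.append(cands[worst.argmin()])

    grow()
    for _ in range(prune_rounds):
        keep = rng.choice(len(dictionary), size=M // 2, replace=False)
        dictionary[:] = [dictionary[i] for i in keep]
        grow()
    return np.array(dictionary)

# In 2D the greedy step first builds an orthogonal pair, so for M = 3 it
# cannot reach the optimal 60-degree configuration (|dot| = 0.5 everywhere).
D = greedy_dictionary(3, 2)
print(np.abs(D @ D.T).round(2))
```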

Wednesday, February 3, 2010

A man of 18 Letters (maybe 52)

I just got my bachelor's degree diplomas from the university in the mail. I do admit I now have the urge to go out and get some frames that fit them and hang them up on my wall. That would also help make sure I don't lose them. (Note to self: begin using a better filing system than the bedroom floor provides.)

They are identical to one another except one has the word "Mathematics" in the middle and the other has the word "Physics" in the middle. I guess now I'm a man of letters, 18 to be exact: 7 in physics and 11 in mathematics. Of course you could include the letters in "Bachelor of Science", which gives an extra 17 letters for each degree, almost tripling my total count to 52.

Sunday, January 24, 2010

Pollard rho method of factorization: iterated functions and factorization

The Pollard rho method of factorization is a heuristic factorization method based on the idea of iterated functions. Say you take some particular function f(x) which maps the integers in the interval [0, n-1] to themselves. Then for any particular starting number x0, the sequence x0, f(x0), f(f(x0)), ... must start repeating itself within n+1 iterations of the function, since there are only n possible values.

Of course not all numbers in the interval need belong to a cycle; if the function maps multiple numbers to a single number then there must exist numbers that do not belong to any cycle. Starting the iteration at one of these numbers produces a string of numbers, the first few of which are never repeated, followed by an endless loop around a cycle. This "shape" of the graph of such an iterated function is what gives the method its name, since the Greek letter rho is likewise made up of a short tail leading into a loop.
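The tail-plus-cycle structure is easy to see directly (a small sketch; the function and modulus here are arbitrary choices):

```python
def tail_and_cycle(f, x0, n):
    """Iterate x -> f(x) mod n from x0; return (tail_length, cycle_length),
    the two parts of the orbit's 'rho' shape."""
    seen = {}  # value -> index at which it first appeared
    x, i = x0 % n, 0
    while x not in seen:
        seen[x] = i
        x = f(x) % n
        i += 1
    tail = seen[x]
    return tail, i - tail

# The classic choice f(x) = x^2 + 1 modulo a small n.
tail, cycle = tail_and_cycle(lambda x: x * x + 1, 2, 1001)
print(tail, cycle)
```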

It is certainly not obvious how one could hope to use such an iterated function to factorize a number. The motivation lies with the birthday paradox: if we pick a random collection of roughly sqrt(n) numbers from [0, n-1] then the probability that 2 of them are equal is about 50%. So if we think of our iterated function as a pseudo-random function, then after about sqrt(n) iterations we have around a 50% chance of hitting a value we have hit before.
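The birthday estimate is easy to check empirically. One detail the rough sqrt(n) argument glosses over: hitting exactly 50% takes about 1.18*sqrt(n) draws, while sqrt(n) draws alone give roughly 39%:

```python
import math
import random

def collision_probability(n, draws, trials=2000, seed=0):
    """Estimate the chance that `draws` uniform picks from [0, n-1]
    contain a repeat (the birthday paradox)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        picks = [rng.randrange(n) for _ in range(draws)]
        if len(set(picks)) < draws:
            hits += 1
    return hits / trials

n = 10_000
p = collision_probability(n, round(1.18 * math.isqrt(n)))
print(p)  # close to 0.5
```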

Finally the idea of how we can use iterated functions to factor numbers begins to form. The key idea is that if p is a prime factor of the number n we are trying to factor, then after about sqrt(p) iterations we can expect to have found 2 numbers which are congruent to each other mod p; we call this a collision mod p. Of course we do not have access to p, but we do have access to n, so instead of analyzing sequences mod p we look at sequences mod n. Since p divides n, reducing the sequence mod n does not change its values mod p. So although we can expect a cycle mod n only after around sqrt(n) steps, more importantly we can expect a collision (and therefore a cycle) mod p in about sqrt(p) steps.

How to detect such a collision mod p takes a bit of thought, since we do not have the value of p and therefore cannot check for collisions directly. Let g(i) be the i'th iterate of f starting from x0, and suppose g(i) and g(j) collide mod p. Then g(i) - g(j) = 0 mod p, so g(i) - g(j) = kp for some integer k. Taking the greatest common divisor gcd(g(i) - g(j), n) therefore recovers a factor of n which, if we are lucky, will not be equal to n itself.

There is one last piece of the puzzle to put into place before this is of any use, however. If we were simply to check the value of gcd(g(i) - g(j), n) for all pairs i < j, the number of gcd computations would grow quadratically with the length of the sequence. Instead we use Floyd's cycle-finding trick: compare g(i) with g(2i). Let T be the length of the tail and L the length of the cycle mod p. As soon as i is both at least T and a multiple of L, say i = kL >= T, then g(i) is the (kL - T)'th element of the cycle and g(2i) is the (2kL - T)'th element; they differ by kL steps, a multiple of the cycle length, and therefore must correspond to the same element of the cycle.
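The tortoise-and-hare comparison of g(i) with g(2i) can be written generically (the modulus and starting point here are arbitrary):

```python
def floyd_meeting(f, x0):
    """Floyd's tortoise and hare: return the smallest i > 0 with
    g(i) == g(2i), where g(i) is the i'th iterate of f from x0.  Such
    an i always exists for an eventually periodic sequence: once i is
    at least the tail length and a multiple of the cycle length,
    g(i) and g(2i) land on the same cycle element."""
    tortoise, hare = f(x0), f(f(x0))
    i = 1
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))
        i += 1
    return i

f = lambda x: (x * x + 1) % 97
i = floyd_meeting(f, 2)
print(i)
```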

At this point we have sufficiently analyzed the algorithm: if we readily had access to "random" functions from the interval [0, n-1] to itself, then we could expect to produce a factor of n, with p the least prime factor of n, in O(sqrt(p)) operations. Since p <= sqrt(n), this is O(n^(1/4)) operations. Of course the method can fail (for instance when the gcd comes out equal to n), but we can simply restart with a new function or starting value after say 100*n^(1/4) steps, and performing hundreds or thousands of such restarts can push the failure probability as low as you like.

That said, it should be obvious that we do not have access to such suitably random functions. Generating such a function in its entirety would take O(n) time, which is much greater than the O(sqrt(n)) time required for trial division. Because of this, in practice the arbitrary random function is replaced with some fixed function which is assumed to behave sufficiently "randomly". A common and reasonably good choice is f(x) = x^2 + 1 (mod n): it is relatively easy to calculate, gives good mixing, and in practice actually produces good results. Unfortunately there is no proof that x^2 + 1 operates as advertised, so with any particular choice of function the running time claims I just made should be viewed with scrutiny. For instance if the function is f(x) = x + a (mod n) then the iteration is very clearly not "sufficiently random", and one would expect a collision only after O(n) steps instead of O(sqrt(n)). With f(x) = x^2 + 1 one expects much better behavior, though still probably not quite as good as a truly random function, and the actual performance is not known rigorously. Thus this factorization method remains purely heuristic, but ultimately that is of little importance in practice, since a factorization is just as valuable if it is found by a heuristic method as if it is found by more rigorous means.
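Assembling the pieces, here is a minimal sketch of the whole method, with f(x) = x^2 + c and Floyd's tortoise-and-hare comparison:

```python
from math import gcd

def pollard_rho(n, x0=2, c=1):
    """Pollard's rho with Floyd cycle detection and f(x) = x^2 + c mod n.
    Returns a nontrivial factor of n, or None if this (x0, c) pair fails,
    in which case one simply retries with different parameters."""
    if n % 2 == 0:
        return 2
    x = y = x0
    d = 1
    while d == 1:
        x = (x * x + c) % n       # tortoise: one step
        y = (y * y + c) % n       # hare: two steps
        y = (y * y + c) % n
        d = gcd(abs(x - y), n)
    return d if d != n else None

print(pollard_rho(8051))   # 97 (8051 = 83 * 97)
print(pollard_rho(10403))  # a factor of 10403 = 101 * 103
```

Note that for a prime n the loop only terminates when the sequences fully collide mod n, returning None, so real implementations run a primality test first.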

The real value of the Pollard rho method, though, is its simplicity, intrinsic interest, and elegance. There are other factorization algorithms which have been rigorously analyzed and are known to be faster (for instance the number field sieve method, which I am working up to). In practice one wonders how important it actually is that the iterated function really be "random": one could for instance use a higher order polynomial, but there is no reason to believe this would significantly improve the randomness, and since higher order polynomials take longer to compute, in practice the simple x^2 + a is very likely best.

One could conceivably generate an actually random function dynamically, by choosing the value at a particular argument the first time the function is called with that argument and then saving that value. Since this would require about O(sqrt(n)*ln(n)) bits of memory, the memory requirements could easily become unrealistic. For a number of say 20 digits we would expect to require about 20*10^10 bits of memory on average before a factor was found; for perspective, 1 gigabyte of memory is about 10^10 bits. A number with just 30 digits (a factorization task easy enough for my graphing calculator to handle) would require an average of 30*10^15 bits of memory, which is roughly 3,000,000 gigabytes. Clearly it is not realistic to randomly generate and store a function in this manner.

More interestingly one might consider picking a random function by picking random coefficients in some set of basis functions. Obviously any complete basis of an n dimensional vector space must have n vectors in it. So if we want to be able to specify any possible function taking values at n discrete points we are not going to be able to do better than a straight table of values. On the other hand if we are willing to settle for sparsely populating the function space with functions such that they are sprinkled uniformly over the function space then we are much more in luck.

For instance we could pick random Fourier coefficients in a Fourier basis and then interpret the function mod n. Based on how large a difference between functions we are willing to tolerate, we could choose an appropriate number of coefficients to approximate the functions sufficiently well. With careful analysis a Fourier basis could possibly be helpful, but since the basis functions are inherently periodic there is a very real danger of creating very structured behavior in the iteration, which would interfere with the motivation for picking a "random" function in the first place.

It would be intriguing to attempt something like this with one or another class of orthogonal polynomials. For instance one could use the Legendre polynomials and pick weights for the first few of them to generate a polynomial to serve as the randomly chosen function. If the probability distribution of the size of these coefficients had an appropriately chosen decay rate then one could very effectively sprinkle functions uniformly over the function space. Of course uniformity of the distribution of the functions over the function space is no guarantee that the iteration of the function is in any sense "random". And ultimately this suffers from the same problem: probably the increased randomness is not worth the added time it takes to calculate the function.

So if we are going to spend a significant amount of time generating and/or calculating the function we use, then we had better expect some speed-up in the generation of collisions for doing it. For sufficiently large n the added randomness of higher order uniformly distributed polynomials would probably be worth the extra calculation, but the best we can hope for from truly randomly distributed functions is O(n^(1/4)), which, while not too shabby, perhaps we could improve upon.

For instance, in the case of Fermat numbers it is known that the prime factors of the k'th Fermat number are of the form j*2^(k+2) + 1. With that in mind it seems reasonable to restrict attention to numbers congruent to 1 mod 2^(k+2), which is easily accomplished by making your function f(x) = x^(2^(k+2)) + 1. Of course the function is now much more computationally intensive, but the speed-up is well worth it.
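A sketch of this Fermat-number variant, with the big exponentiation done by modular pow (the starting value is an arbitrary choice):

```python
from math import gcd

def pollard_rho_fermat(k, x0=3):
    """Pollard rho tuned to the k'th Fermat number F_k = 2^(2^k) + 1.
    Since prime factors of F_k have the form j*2^(k+2) + 1, iterate
    f(x) = x^(2^(k+2)) + 1 mod F_k instead of x^2 + 1."""
    n = 2 ** (2 ** k) + 1
    e = 2 ** (k + 2)
    f = lambda x: (pow(x, e, n) + 1) % n
    x = y = x0
    d = 1
    while d == 1:
        x = f(x)          # tortoise
        y = f(f(y))       # hare
        d = gcd(abs(x - y), n)
    return d if d != n else None

# F_5 = 4294967297 = 641 * 6700417, with 641 = 5 * 2^7 + 1.
print(pollard_rho_fermat(5))
```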

In general what is desirable is that when we view the iterated function sequence mod p, we don't have to wait very long before there is a collision; that is, we would like the average cycle length mod p to be low for most primes. Such functions certainly exist: the function f(x) = x + 1 if 2 | x, and x - 1 otherwise, has a cycle length of 2 for every starting number, but it obviously fails the "sufficiently random" property. But while it is fairly easy to show that every iterated function over a finite set produces paths consisting of tails leading into cycles, it is not at all easy to figure out, say, how long the average cycle is going to be.

I started writing this blog post around 11:00 this morning and it is now 11:00 at night, so I don't think I shall be figuring out any interesting new twists to this 35 year old algorithm before bed. Of course here you can cue the usual rampant speculation: perhaps certain properties of the Fourier spectrum of the function would reveal properties of its iteration, or better yet perhaps there is some function basis in which functions with small iteration cycle lengths are sparse, etc. etc.

Wednesday, January 20, 2010

Verdict is in, high school students are noisy

Although today was better than yesterday, the high school students I am having to share my office with are extremely noisy. Besides their intrinsic loudness there is also the fact that they have power tools to play with. Fortunately the power tool fun is mostly in the lab adjacent to my office, but the walls are thin. Although I do feel slightly interested in what they are doing, really that only serves to make them all the more annoying.

Sad to say, but I am afraid to do my real work for fear that they would see it and comprehend it. For some reason I feel the need to project a sense that whatever I am doing must be complex and arcane. I don't want to actually do my physics homework, since they might look over my shoulder and discover that my physics is mostly just algebra, integrals, curls, divergences and the like (all definitely accessible at a high school level). For similar reasons I don't want to do anything else sufficiently simple to be comprehended by those around me. But reading and understanding the complex stuff takes more concentration and work than I can easily muster in the midst of such mayhem. I only hope the students won't stay so late that they will still be hanging around my office late at night. I walked to the library so that I could read my number theory books in peace; hopefully these next six weeks just translate to the loss of the use of my office between the hours of 3:00 and 6:00 or so, not from 3:00 on.

Friday, January 15, 2010

My office is getting taken over by robots

As I was sitting in my office today I was informed of the fact that for the next 6 weeks I would be sharing the office with a bunch of high school students. The high school students are apparently working to make robots for some competition. They put a keybox next to the office door so that anyone who knows the combination to the box can get out the key and unlock the door. I suppose that on the bright side this means that if I lock myself out of my office during the first 6 weeks of the semester I won't have to go get a master key in order to get back in.

I am a little apprehensive about the presence of these students; it might be interesting to have them around or it could be monumentally annoying. Since I have become something of an artificial intelligence fanatic over the past couple of years, though, I probably won't be able to keep myself from asking them how they are coding up the AI. Perhaps they should be getting a little primer in Q-learning or kernel learning...

Wednesday, January 13, 2010

My Web Identity Equity

I just put "Tim Anderton" (my name) into Google and hit search. A quick perusal of the results is interesting but not terribly heartening, although results that actually have to do with me at least happen to be on the first page. Results 2, 7, and 8 on the first page were related to me. Not surprisingly, the most prominent result was the course web page for the class I was a TA for last semester; result 7 was a page showing past REU participants and 8 was my facebook page.

Sifting through the first hundred results drops the percentage of relevant pages down from 30% to 17% which is still surprisingly high. But simply taking the percentage of items which are related to me is rather naive since it doesn't take into account the order of results. Clearly if the 17 results actually related to me were all at the end of the 100 results then that would be a much weaker net presence than if they were the first 17. How exactly I should deal with the strength of the order dependence is unclear.

After some thought I have settled on a system for measuring this quality of prominence. For lack of a better name I shall call it Web Identity Equity. Of course this is not restricted to persons: if you entered the term "Thermal Physics" and then looked through the results to see where the textbook of that name by Kittel and Kroemer falls, you could measure the identity equity of that book on the web, thus the name. Instead of representing the Web Identity Equity (WIE) as just a single number I have decided it is better to treat it as a vector.

The first element of the vector should definitely be the percentage. Even though it doesn't take into account any sort of ordering it is easy to interpret and gives good information about the density of results.

After a little deliberation I decided the second element of the vector should be a weighted sum, with weights of 1 over the search rank. So the first search result gets weight 1, the second result weight 1/2, the third weight 1/3, and so on. This has the advantage that early results are greatly favored over later results, but there is not much difference between the importance of the 14th and 15th result. To make the number nicer to process and compare, the sum should be normalized. The sum of 1/n from 1 to N limits to ln(N) + gamma, where gamma is the Euler constant, so, not caring to carry out the normalization exactly, I will just divide the sum of the reciprocals of the relevant rankings by ln(N) + gamma and call it close enough. The error in this approximation is very small for any appreciable number of results: about 2% for 10 results and already down to 0.1% for 100, so it wouldn't be noticeable anyway.

The 1/r weighting, however, falls off extremely slowly in the tail, as the fact that the series diverges attests. So the natural next step is to use the weighting 1/r^2 (r being the ranking of the result, of course). This has the nice property that the sum of the ranking weights converges, for an infinite number of results, to pi^2/6. Since the tail end of the distribution is now given in some sense zero weight, this number represents a heavy favoring of early search results.

Finally, for the fourth number, I feel I need to take into account the fact that people will often not go to the next page of search results if they can help it, and that results on the same page will often get similar perusal. So for the last number we take a weighted average of the density of relevant results on each page. Since moving from page to page is much less likely than just looking at the next result, the weights should have a strong bias toward the early result pages. But unlike the earlier measures, here I feel it is appropriate to treat the number of pages visited as Poisson distributed. Somewhat obviously the mean of the Poisson distribution should be greater than one, since you can't visit fewer than 1 search page and the probability of visiting more than 1 should be nonzero; but the mean number of search pages is definitely less than 2, and if we just let the mean be 1, the appropriate weight for page r is rather nicely just 1/r! (that is, 1 over r factorial).

It is tempting to include an extremely rank-dependent measure that models the search depth as Poisson distributed with some larger mean, but frankly it is too much of a pain to calculate and normalize for large numbers of results unless I were willing to write a script. I might do that in the future, but for the moment this will have to do.
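Such a script is actually only a few lines. Here is a minimal Python sketch of the four measures above; the function name `wie_vector` and the example ranks are my own hypothetical choices:

```python
import math

def wie_vector(relevant_ranks, n_results=100, page_size=10):
    """A sketch of the four Web Identity Equity components.

    relevant_ranks: 1-indexed ranks of results about the subject.
    Returns (density, harmonic, inverse-square, page-weighted) scores.
    """
    gamma = 0.5772156649015329  # Euler-Mascheroni constant
    # 1. plain density of relevant results
    density = len(relevant_ranks) / n_results
    # 2. 1/r weights, normalized by ln(N) + gamma (approximates H_N)
    harmonic = sum(1 / r for r in relevant_ranks) / (math.log(n_results) + gamma)
    # 3. 1/r^2 weights, normalized by pi^2/6 (the infinite sum)
    inv_square = sum(1 / r**2 for r in relevant_ranks) / (math.pi**2 / 6)
    # 4. per-page densities weighted by 1/p! (mean-1 Poisson flavor)
    n_pages = math.ceil(n_results / page_size)
    weights = [1 / math.factorial(p) for p in range(1, n_pages + 1)]
    densities = []
    for p in range(1, n_pages + 1):
        lo, hi = (p - 1) * page_size + 1, p * page_size
        on_page = sum(1 for r in relevant_ranks if lo <= r <= hi)
        densities.append(on_page / page_size)
    page_score = sum(w * d for w, d in zip(weights, densities)) / sum(weights)
    return density, harmonic, inv_square, page_score
```

For example, `wie_vector([2, 7, 8], n_results=10)` scores just the first results page described above; each component lands between 0 and 1, with 1 meaning every result is relevant.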

Now, getting back to the actual search results for my name: my Web Identity Equity vector using the first 100 Google results is

0.170, 0.212, 0.181, 0.252

I can't help but be somewhat amazed by the consistency in the different numbers. Apparently my web identity equity is about 20% no matter how you measure it.

Of course, while fun, this is only really interesting when you have other things to compare it with. This sort of thing would be perfect for a little web script, so you could do it for any random thing you felt like, but alas I have no such skill.

Tuesday, January 12, 2010

Utility Access Only: or My New Office

Over the Christmas break I was moved out of my old comfy office, a converted storage room in the basement, to a room that is old lab space. I suppose this room does have the advantage (disadvantage?) that I actually get reception for my cell phone here. But rather ironically, this room is probably the least accessible spot in the whole damn building. On top of the office's positional awkwardness, which I assure you is severe, my office doesn't appear at first glance to have a working door.

My office has two doors: one is clearly marked JFB 205 and the other is rather humorously marked "Utility Access". Judging from the exposed strut running down the middle of the room between the two doors, I think it is plausible that once upon a time the utility access door actually was what it purports to be. At any rate, the door marked as though it led to an office rather than a janitorial closet is now completely inoperable. When the work crew came in to install the desks, for some reason they figured it would be best to keep the desks as close as possible to the door and as far away as possible from the far wall... I don't think they had quite worked out that the desks poke out into the room farther than the door is from the wall. Thus if you try to access the room through the obvious entrance, you find the door jams after opening about an inch and a half, giving the impression that perhaps the occupants have barricaded the door to defend against the imminent zombie apocalypse.

Fortunately for me, I had tried to access the room before the desks were installed, and so I was aware that the apparent janitorial closet right next to the office was in fact a secret entryway. I hope the desk-placement stupidity isn't corrected too quickly, since this will inevitably lead to other people learning how to enter the room.

Fermat Number Factorizations

Fermat once conjectured that all numbers of the form 2^(2^n) + 1 were prime. This was based more or less on the fact that the first few such numbers, 3, 5, 17, 257, 65537, were all prime. This was as far as the primality of such numbers could be tested in Fermat's day, since the very next number in the sequence is 4294967297, which rather understandably Fermat did not recognize as being 641*6700417.

So far as we know, the only prime Fermat numbers are in fact the first few known to Fermat. Primality testing of the Fermat numbers, and more challengingly their factorization, has become something of a pastime in number theory, especially since computers got into the fray.

I should take a moment to talk about just how large the Fermat numbers really are. It is convenient to think of Fermat numbers as binary strings. The Fermat numbers in binary are 11, 101, 10001, 100000001, 10000000000000001, ... and so on. The recipe for producing them is to make a string of 2^n + 1 zeros and then replace the first and last zeros with 1's: voila, a binary expansion of the n'th Fermat number. Rather obviously these numbers grow very, very large very, very quickly. Because of the difficulty of factoring large numbers and the phenomenal growth of these numbers, we have managed to completely factor only the Fermat numbers up through F11. This may not seem like much of an accomplishment, but consider that "unbreakable" RSA cryptography uses numbers of a few hundred digits. F11 has 2049 binary digits, which translates to over 600 decimal digits. One should keep in mind, of course, that RSA uses numbers which are a worst case for factorization algorithms, so factoring a 600-digit RSA number would be much harder than factoring a 600-digit Fermat number.

F12 is a 4097-digit number in binary (1234 decimal digits), and so even after dividing out all the known factors 114689, 26017793, 63766529, 190274191361, and 1256132134125569, we are still left with the incredible task of factoring a number of over 1200 decimal digits. Although finding these gargantuan factors of F12 is certainly a laudable feat, we have barely scratched the surface of the problem.
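All of these claims are easy to verify directly, since Python's integers are arbitrary precision; the following is just a sanity-check sketch:

```python
def fermat(n):
    """The n-th Fermat number, 2**(2**n) + 1."""
    return 2 ** (2 ** n) + 1

# the five Fermat primes known to Fermat
assert [fermat(n) for n in range(5)] == [3, 5, 17, 257, 65537]

# Euler's counterexample: F5 is composite
assert fermat(5) == 4294967297 == 641 * 6700417

# binary form: a 1, then 2**n - 1 zeros, then a 1
assert bin(fermat(3))[2:] == "1" + "0" * (2 ** 3 - 1) + "1"

# the five known prime factors of F12 all divide it...
for p in (114689, 26017793, 63766529, 190274191361, 1256132134125569):
    assert fermat(12) % p == 0

# ...and F12 has 1234 decimal digits
assert len(str(fermat(12))) == 1234
```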

Friday, January 8, 2010

Binary Budokan

This ad for chrome cracks me up.

Thursday, January 7, 2010

The Time Machine

I just watched the most recent version of "The Time Machine". I have also watched both the old black-and-white version and the one produced in the sixties. I thought the movies were all right, and of course it wouldn't seem right for me not to have seen them, being the time-travel fanatic that I am. This most recent movie is much, much better than the other two though, the reason being that it doesn't quite follow the story line set out by H.G. Wells. Now, I think the original story line of The Time Machine is decent, but the plot is not what the story is about. The story is about time travel, not about Weena. I think I am going to have to buy this Time Machine movie. The time-travel sequences are very pretty, and the story is well done and even a bit original, although the Eloi and the Morlocks are still at the center of the whole thing.

I have always been something of a time-travel fanatic. I have fond memories of Kip Thorne's "Black Holes and Time Warps". I remember checking the book out of the library in 2nd or 3rd grade and reading the first 40 pages or so; I had to check it out multiple times even to get that far. Mostly I would find some out-of-the-way corner and sit musing and dreaming while holding the book instead of actually reading it. I would read a few lines, or a paragraph or two, until something was said that I didn't understand or that made me think of something else, and I would close the book with my finger in the pages at the point where I was reading. I remember learning scientific notation from the book, which kindly explained for non-technical readers that 6x10^100 meant a 6 with a hundred zeroes after it. I kept going back to that little passage and rereading it over and over again, because here was some tiny glimmer of the workings of real science, something of the mechanism that made physics and math really work. I admit that scientific notation is about as simple as you get, but to a second-grade version of me it was something I could fully understand with a little thought, and that was exciting. If I had read it in an algebra book at school it would perhaps have been interesting, but no reason to daydream about unlocking the secrets of the universe.

Wednesday, January 6, 2010

Combine Ball

A couple of years ago a team I was on made a Half-Life 2 mod that was a kind of dodgeball game. You threw combine balls around with the gravity gun. It was actually pretty fun. I figured it was time to post the video.

Course Evaluations are In

So I got my first course evaluations for the discussion sections I taught in the fall. Overall I took away the message that I really do need to be more careful about being clear, be more carefully prepared, and do more discussion and less working of problems. Of course, I actually got more complaints that I wasn't doing enough homework problems than complaints that I wasn't doing the discussion well. But the complaints that I was not working enough homework problems were completely groundless, since I was in fact doing almost nothing but working homework problems. It requires much less preparation to show how a few homework problems are worked than to prepare a real discussion.

I think my first semester of teaching went less well than it might have, but I think it has made me a better teacher. I was a much better discussion leader towards the end of the semester than at the beginning. Partly because my discussion sections were held at 7:30 and 8:35 in the morning, I lost a great many of the people in my classes. Since there were more evaluations than there ultimately were students attending the discussions, I am pretty sure that some of the more negative reviews of my teaching were from people who came two or three times and then never again. I'm eager to see how the discussion sections I will be teaching in the spring evolve; hopefully the classes won't go from not very full to barely there. Of course there will be some natural attrition, since a large number of people will decide the discussion section is not worth their time no matter how good it is. Unfortunately I won't really be able to productively compare the spring sections to the fall ones, since I will be teaching 2020 discussion sections instead of 2220 sections. Very probably the attrition rate in the algebra-based physics (2000 series) is going to be higher than in the calculus-based versions (2200 series).

Tuesday, January 5, 2010

Playing Poker for a Living: Who are you playing against?

Because poker is played against other players instead of against the house, you don't need to be phenomenally good at poker to have a positive expectation; you only need to be better than the people you are playing against. As the existence of casinos demonstrates, many people are willing to lose money on average in order to have a chance of winning some. The fact is that a lot of people find it fun to gamble, and even more people find it fun to fantasize about winning at gambling. So very good poker players simply end up taking money from players who are less good.

Online poker is the best way to try to make a living at poker. The cost of playing online is smaller than the cost of playing at a physical casino, and many players who do not have access to a casino are willing to play online. Online activity also isn't bound to one physical place, so peak hours fall at different times for different players, giving more uniform behavior over time.

There are a large number of people who play poker occasionally for fun and a few people who play constantly, some for a living. I would be willing to bet that the distribution of the number of players who play with a certain frequency roughly follows a power law: N(f) = a*f^k for some choice of constants a and k, where N(f) is the number of players who play with frequency f. I expect this to hold mostly for relatively small frequencies. Obviously there are many more low-frequency players than high-frequency players, which means we expect k to be negative.

Let's take a moment to consider what this sort of distribution would mean for who you are likely to be playing when you log in to a poker site. Higher-frequency players play more often, but there are a lot more lower-frequency players. If the power-law distribution holds (and there is some reason to think it would), then a player who plays with frequency f would be picked with probability density proportional to f*N(f), since the probability of a particular player being picked combines the probability of that player being on, which is f, with the number of players at that frequency. Normalizing this density (taking f to range over [0, 1]) we get

P(f) = (k+2)*f^(k+1).

Taking this a step further, we calculate the average frequency of play as

integral from 0 to 1 of f*P(f) df = (k+2)/(k+3) = f_average

This of course isn't very helpful unless we have some idea of the value of k. As we already said, k must be negative to make lower frequencies more common than higher frequencies. But this means there is a singularity at f=0, and if we want the total number of players to remain finite we need a k which is greater than -1.

For the limiting values of k = 0 and k = -1 we get average play frequencies of 2/3 and 1/2 respectively; intermediate values of k give averages in between, with the middle-of-the-road k = -1/2 giving 3/5. Rather obviously no player can be playing with frequency 1, and it is doubtful that the average player is playing 2/3 of their day, since that would mean they take out 8 hours to sleep and that's it. Even the low estimate of 1/2 is too high. Qualitatively, though, the point is that high-frequency players are on more often than low-frequency players. If we instead take these frequencies to be fractions of the highest normal play frequency, the numbers make more sense. The highest realistic value for the top player frequency is probably about 1/2, which would correspond to playing 12 hours of poker every day. A regular 40-hour work week corresponds to a frequency of 5/21. Using this as a baseline, I would say the highest tier of poker players probably have a playing frequency of around 1/3. This gives much more realistic average frequencies of around 2/9, 1/5, and 1/6 for k = 0, -1/2, and -1 respectively.
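The (k+2)/(k+3) formula is easy to sanity-check numerically. This Python sketch samples frequencies from P(f) by inverting its cumulative distribution, F(f) = f^(k+2); the sample size is an arbitrary choice of mine:

```python
import random

def sample_frequency(k):
    """Draw one frequency from P(f) = (k+2) * f**(k+1) on [0, 1].

    The CDF is F(f) = f**(k+2), so inverting a uniform draw u
    gives f = u**(1/(k+2)).
    """
    u = random.random()
    return u ** (1.0 / (k + 2))

random.seed(0)
n = 200_000
for k in (0.0, -0.5, -1.0):
    mean = sum(sample_frequency(k) for _ in range(n)) / n
    # should match (k+2)/(k+3): 2/3, 3/5, and 1/2 respectively
    assert abs(mean - (k + 2) / (k + 3)) < 0.01
```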

At any rate, this suggests that most of the people you play will be people who play a very significant fraction of their time. This doesn't, however, take into account the fact that people do not play evenly at all times of day. Taking that into account will skew the distribution back towards low-frequency players during peak times and towards high-frequency players at off times.

Ludum Dare #16 Voting Results

Much to my surprise after the voting happened for Ludum Dare 16 it turns out my game entry came in 12th overall!

Top 20 Entries

Considering there were 121 total entries, that this was my first game-making competition, and that quite a number of the people who enter these competitions code or even make games for a living, I'd say that is reason to be proud. Also, my game was the only one in the top 20 which didn't come with some sort of executable. Of course I would like to say that this was a major disadvantage, because it means many fewer people ran my game than might otherwise have. But I suppose it may also be an advantage, since only people who really wanted to play my game would have run it.

I'm going to try to participate in the next LD too, and this time I will not have to relearn Python or learn pygame; and, assuming I can find a few hours to detach it from my LD 16 entry, I will have a rough game engine. This time I will spend a few hours beforehand making sure that I can make an executable.

Sunday, January 3, 2010

Stochastic Factorization Algorithms

There are a number of probabilistic algorithms for primality testing. Probabilistic primality tests are based on Fermat's little theorem and its relatives.

Fermat's little theorem

Basically, there are properties of prime numbers that can be tested for in a computationally efficient manner. Thanks to this sort of test, we can probabilistically "recognize" prime numbers.
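As an illustration, here is the basic Fermat test in Python. This is only a sketch: a serious implementation would use the stronger Miller-Rabin variant, since Carmichael numbers can fool the plain Fermat test for every coprime base.

```python
import random

def fermat_test(n, trials=20):
    """Probabilistic primality test via Fermat's little theorem.

    If n is prime, then a**(n-1) = 1 (mod n) for every a coprime
    to n. A single failing base proves n composite; passing all
    trials means only "probably prime".
    """
    if n < 4:
        return n in (2, 3)
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:  # fast modular exponentiation
            return False  # a is a witness: n is definitely composite
    return True  # probably prime
```

Note that each trial costs only one modular exponentiation, which is why the test is so cheap even for numbers of hundreds of digits.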

Factorization, on the other hand, is a completely different story. Primality testing and factorization are both difficult problems, but their difficulty comes in completely different packages. In primality testing we are trying to distinguish between just two possibilities, whereas in factorization there are a vast number of potential answers. Relatedly, it is extremely easy to check a factorization solution: we simply multiply our proposed prime factors together, and if we have found the correct factorization, the product will be the original number. On the other hand, if we are handed a number and told that it is prime, checking that solution to the primality-testing problem is no easier than our original problem.

Since knowing the factorization of a prime number is equivalent to knowing that the number is prime, the factorization problem cannot be any easier than primality testing. But since we know of no way to make factorization run in polynomial time, while probabilistic primality testing takes only polynomial time in the number of digits... this is not much of a restriction. Obviously it would be nice if we could make good probabilistic factorization work. Since factorizations are easy to check, a correct factorization suggested by probabilistic methods would quickly become a known factorization.

What we want, then, is a way to look at a number and simply "recognize" its factors. For numbers written in our regular decimal system there are a few nice ways to recognize some factors. For example, if the last digit of a number is 0, 2, 4, 6, or 8, then one of its factors is 2. Somewhat less trivially, if the sum of the digits of a number is divisible by three, then the number itself is divisible by three. A test you might not be familiar with is one for 11: alternately add and subtract the digits of the number, and if the result is divisible by 11, then the number itself is divisible by 11.
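These recognition tricks are one-liners in code; a quick Python sketch (for nonnegative integers):

```python
def divisible_by_3(n):
    # a number is divisible by 3 iff its digit sum is
    return sum(int(d) for d in str(n)) % 3 == 0

def divisible_by_11(n):
    # alternately add and subtract the digits; divisible by 11
    # iff the alternating sum is
    alt = sum((-1) ** i * int(d) for i, d in enumerate(str(n)))
    return alt % 11 == 0

assert divisible_by_3(414) and not divisible_by_3(415)   # 4+1+4 = 9
assert divisible_by_11(121) and not divisible_by_11(122)  # 1-2+1 = 0
```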

Although we could come up with tricks like these for every prime, that wouldn't give us a factorization procedure much more effective than just checking for factors by dividing by all the numbers up to the square root of the number we want to factor, since we would have to perform a separate recognition procedure for each divisor.

But this brings us to the question: what if we could probabilistically test for divisibility by whole classes of divisors at a time? There must be a way that numbers containing different classes of primes "look" different. I don't have a nice rigorous mathematical way to look at a number and test whether it contains, say, a twin prime, or more usefully a 4n+1 prime, nor for that matter any general category of prime. But it is reasonable to believe there are statistically recognizable properties of numbers containing primes of different types. I just spent a few hours coding and trying to find such properties, but didn't find anything by hand.

Even so, the question remains whether it might be possible to write a program which could learn good heuristics for suggesting factorizations. Rather obviously this tactic wouldn't be much good for factoring the numbers used in cryptography, since those are composed of two enormous primes and it is just not likely that any heuristic program, no matter how good, could realistically "recognize" the factors of such a number. Realistically, this sort of strategy probably only has a chance of learning to quickly factor relatively small numbers, say on the order of 20 digits, which means that ordinary sieve factorization methods will be more than capable of handling the same numbers. Despite the fact that nothing useful is likely to come of it, this conjecture has taken my fancy. I've already put some effort into working it out, and I am going to put some real mental muscle into it.