## Sunday, January 3, 2010

### Stochastic Factorization Algorithms

There are a number of probabilistic algorithms for primality testing. Probabilistic primality tests are based on Fermat's little theorem and its relatives.

Fermat's little theorem: if p is prime and a is any integer not divisible by p, then a^(p-1) ≡ 1 (mod p).

Basically, there are properties of prime numbers that can be tested for in a computationally efficient manner. Thanks to tests of this sort we can probabilistically "recognize" prime numbers.
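As a concrete sketch of what such a test looks like, here is the simplest version, the Fermat test itself. Any composite number will usually fail the congruence above for a randomly chosen base (Carmichael numbers are the rare exception), so surviving many rounds makes primality very likely:

```python
import random

def fermat_test(n, rounds=20):
    """Probabilistic primality test based on Fermat's little theorem.

    If n is prime, then a^(n-1) ≡ 1 (mod n) for every a not divisible
    by n.  A composite n will usually fail this for a random a, so
    surviving every round makes n very likely prime.
    """
    if n < 2:
        return False
    if n in (2, 3):
        return True
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:
            return False  # witnessed: definitely composite
    return True  # probably prime
```

Each round is one modular exponentiation, which is fast even for numbers hundreds of digits long; that is the "computationally efficient" part.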

Factorization, on the other hand, is a completely different story. Primality testing and factorization are both difficult problems, but their difficulty comes in completely different packages. In primality testing we are trying to distinguish between just two possibilities, whereas in factorization there is a vast number of potential answers. Relatedly, it is extremely easy to check a factorization solution: we simply multiply our proposed prime factors together, and if we have found the correct factorization of our number, then the product of the prime factors will be that number. On the other hand, if we are handed a number and told that it is prime, checking that solution to the primality testing problem is no easier than our original problem.
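The check really is just one multiplication (plus, strictly, verifying each proposed factor is itself prime, which the sketch below leaves to a primality test):

```python
from math import prod

def check_factorization(n, factors):
    """A proposed factorization of n is correct only if the product of
    the proposed prime factors equals n.  (A complete check would also
    verify that each factor is prime.)"""
    return prod(factors) == n
```

So any procedure that merely *suggests* factorizations can be wrapped in this check, and a correct suggestion is immediately confirmed.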

Since knowing the factorization of a number tells us whether that number is prime, the factorization problem cannot be any easier than primality testing. But since we know of no way to factor in polynomial time, while probabilistic primality testing runs in time polynomial in the number of digits... this is not much of a restriction. Obviously it would be nice if we could make good probabilistic factorization work. Since factorizations are easy to check, a correct factorization suggested by probabilistic methods would quickly become a known factorization.

What we want, then, is a way to look at a number and simply "recognize" its factors. For numbers written in our regular decimal system there are a few nice ways to recognize some of the factors of a number. For example, if the last digit of a number is 0, 2, 4, 6, or 8, then one of the factors of the number is 2. Somewhat less trivially, if the sum of the digits of a number is divisible by three, then the original number is divisible by three. A test you might not be familiar with is the one for 11: alternately add and subtract the digits of the number, and if the result is divisible by 11, then the number itself is divisible by 11.
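These three digit tricks are easy to write down directly from the decimal representation; a minimal sketch:

```python
def divisible_by_2(n):
    # last decimal digit is even
    return str(n)[-1] in "02468"

def divisible_by_3(n):
    # digit sum is divisible by 3
    return sum(int(d) for d in str(n)) % 3 == 0

def divisible_by_11(n):
    # alternating sum of digits is divisible by 11
    alt = sum((-1) ** i * int(d) for i, d in enumerate(str(n)))
    return alt % 11 == 0
```

Each test "recognizes" a factor from the digits alone, without performing the division.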

Although we could come up with tricks like these for every prime number, that wouldn't give us a factorization procedure much more effective than just checking for factors by dividing by every number up to the square root of the number we want to factorize, since we would have to perform a separate recognition procedure for each divisor individually.
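For comparison, the baseline being beaten here is plain trial division, which tests each candidate divisor one at a time:

```python
def trial_division(n):
    """Factor n by dividing out each candidate divisor up to sqrt(n).

    The running time grows with sqrt(n), i.e. exponentially in the
    number of digits, which is why this only works for small numbers.
    """
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is prime
    return factors
```

A pile of per-prime digit tricks would amount to roughly the same amount of work as this loop, one check per candidate divisor.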

But this brings us to the question: what if we could probabilistically test for divisibility by whole classes of divisors at a time? There must be a way that numbers containing different classes of primes "look" different. I don't have a nice rigorous mathematical way to look at a number and test whether or not it contains, say, a twin prime or, more usefully, a 4n+1 prime, nor for that matter any general category of prime. But it is reasonable to believe that there are statistically recognizable properties of numbers containing primes of different types. I just spent a few hours coding and trying to find such properties, but didn't find anything by hand.

Even so, the question remains open whether it might be possible to write a program that could learn good heuristics for suggesting factorizations. Rather obviously, this sort of tactic wouldn't be much good for factoring the numbers used in cryptography, since those are products of two enormous primes, and it is just not likely that any heuristic program, no matter how good, could realistically "recognize" the factors of such a number. Realistically, this strategy probably only has a real chance of learning to quickly factor relatively small numbers, say on the order of 20 digits, which means that ordinary sieve factorization methods will be more than capable of handling those same numbers. Despite the fact that nothing useful is likely to come out of it, this conjecture has taken my fancy. I've already put some effort into working it out, and I'm going to put some real mental muscle into it.