Friday, December 18, 2015

More Free Resources!!

I love the internet, everything there is either free by default or can be found for free. For example, here is the algorithms textbook used by Princeton University in awesome online format.

Sunday, December 13, 2015

Post Receive

I've got my website running as a git server, basically when I git push to it git copies the project files into a folder I specify. All you have to do is set up your git repo on the server using git init --bare, then add a file to the hooks directory and give it executable permissions. Check out the contents of the post-receive file below. 

GIT_WORK_TREE="../serve" git checkout -f master

That's it, just one line in the file and my whole project gets updated and put into the ../serve directory.

Friday, December 11, 2015

Meander

Meandering is a word I like to think of as synonymous with procrastination. When I meander from subject to subject in an attempt to avoid actual work (as I am doing this very morning) I typically come up with ideas for blog posts. This idea however came up while listening to an awesome podcast by the two greatest podcasters of all time, Life of Alexander. If you're not familiar with these guys go listen to Life of Caesar (recently moved to Life of Augustus) to get a sense of their amazing work.

Now back to Meandering. The word meander gets its meaning from the Meander river in Turkey which is well known for its winding, random path. Back in Alexander's day, the Meander connected the city Priene to the Mediterranean Sea, but after much silting over a couple thousand years it has diverted elsewhere. Priene is important to Alexander because, as often stated in the Life of Alexander podcast, it was well known for its temple of Athena which burnt down the night of Alexander's birth. Feeling very bad for having distracted the Gods with his birth (Alexander thought of himself as a demigod and as being descended from Zeus and (actually) Achilles), Alexander offered to repair the temple, but the locals refused (kind of them considering how broke Alexander's campaign was).

What some people don't know about meander, is that the term also refers to a formation in a river called... you guessed it, a meander. There's a shocking amount of complicated mathematics, geology involving sine waves and angular momentum, but the easiest way to understand just what a meander is, is to look at the picture below.


The wikipedia also has a picture of what they call meander scars, along the Rio Negro river in Argentina.


Pretty random, pretty cool stuff.

Sunday, December 6, 2015

The Discovery Of Neptune

Not everybody knows of the discovery of Neptune by Laplace. The man literally discovered it on paper before ever seeing it in the sky, then told people when and where to look in the sky, and suddenly, Neptune was born. Check this out for the short story, and this for a legit paper.

Tuesday, November 24, 2015

NP Complete

An exerpt I found in mit opencourseware notes on NP-Complete problems in nature...

"1.2 In Nature

Nature deals with NP-complete problems constantly; is there anything we can learn from it? Biology employs what computer scientists know as genetic algorithms. Sometimes biology simply mutates the candidate solutions, but other times it combines them (which we call sexual reproduction).

Biology has adopted the sexual reproduction technique so widely that it must have something going for it, even if we don’t yet fully understand what it is. In experiments, local search algorithms that employ only mutations (i.e. “asexual reproduction”) consistently do better at solving NPcomplete problems than do algorithms that incorporate sex.

The best theory people currently have as to why nature relies so heavily on sex is that nature isn’t trying to solve a fixed problem; the constraints are always changing. For example, the features that were ideal for an organism living in the dinosaur age are different from the features that are ideal for modern-day organisms. This is still very much an open issue.

As another example of nature apparently solving NP-complete problems, let’s look at protein folding. Proteins are one of the basic building blocks of a cell. DNA gets converted into RNA, which gets converted into a protein, which is a chain of amino acids. We can think of that chain of amino acids as just encoded information. At some point, however, that information needs to be converted into a chemical that interacts with the body in some way. It does this by folding up into a molecule or other special shape that has the requisite chemical properties. The folding is done based on the amino acids in the chain, and it’s an extremely complicated process.

The key computational problem is the following: given a chain of amino acids, can you predict how it’s going to fold up? In principle, you could simulate all the low-level physics, but that’s computationally prohibitive. Biologists like to simplify the problem by saying that every folded state has potential energy, and the folding is designed to minimize that energy. For idealized versions of the problem, it’s possible to prove that finding a minimum-energy configuration is NPcomplete. But is every cell in our body really solving hard instances of NP-complete problems millions of times per second?

There is another explanation. Evolution wouldn’t select for proteins that are extremely hard to fold (e.g., such that any optimal folding must encode a proof of the Riemann Hypothesis). Instead, it would select for proteins whose folding problems are computationally easy, since those are the ones that will fold reliably into the desired configuration.

Note that when things go wrong, and the folding does get caught in a local optimum, you can have anomalies such as prions, which are the cause of Mad Cow Disease."

Pretty interesting stuff if you're into it...

SQL Reversion Scripts

When people new to sql think of databases they tend to think of a table of numbers that ends up in a graph of some sort. What they don't think of is the configuration and user settings that are also stored in that database. So lets say you have a column in the database that should be a series of comma separated values, but was for whatever reason is actually tab delimited. You want to update each column, but you don't want to run the risk of accidentally changing everybody, so here we can actually make a query that generates a series of queries that set the column back to the original value. Check out the example/gist below. You can simply copy paste your results into a new sql file, then run that sql file if anything goes wrong! Keep in mind that the triple quotes are for the case that your value to be changed is a string. If the value is an integer you won't be needing them.

Thursday, November 12, 2015

Timsort

So quicksort is pretty cool. Everybody knows "how" it works, meaning we've all seen a visualization of quicksort picking a pivot, going through and finding values less than or greater than and moving them around the pivot. Unfortunately very few people correctly implement the algorithm correctly on their first try, or if they did, they've never implemented it since. 

Less common that quicksort is the man behind the curtain (edit: I made this post when I was pretty exhausted, is "man behind the curtain" even a phrase? Does it make sense here?)... Timsort. Timsort is based on Tim Peter's list sort for python described here. It even runs O(n lg n) in the worst case, and often faster on real world data! As an adaptive sorting routine, Timsort uses a binary sort when it recurses deeply enough and combines sets of data that were ordered to begin with. (More edits... Tim sort is also a stable sorting algorithm, meaning that if you want to compare array vs array like with Strings, you can compare each piece of that array. This means objects with equal keys will not be sorted arbitrarily.)

Aside from the coolness of having beaten quicksort Timsort has its downsides, for example it does also use n/2 extra space where qiucksort does sorting in-place. In most applications I would have, the extra space would be no problem, however for big data analysis n/2 extra space could be GBs or TBs.