2009-07-14

Logging modules for Perl

I recently decided not to create a logging class of my own, since there must already be plenty out there. Not surprisingly, this proved to be the correct decision.

I started looking for packages already available for Ubuntu, and found Log::Handler, Log::Log4perl, and Log::LogLite. I started drafting prototypes for them.
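To give an idea of what these prototypes look like, here is a minimal sketch using Log::Log4perl in its 'easy' mode (the file name and the layout string are placeholders of my own choosing, not anything the module mandates):

#!/usr/bin/perl
# Minimal Log::Log4perl prototype, "easy" mode.
use strict;
use warnings;
use Log::Log4perl qw(:easy);

Log::Log4perl->easy_init({
    level  => $INFO,            # log $INFO and above
    file   => '>>test.log',     # '>>' appends instead of clobbering
    layout => '%d %p %m%n',     # date, priority, message, newline
});

INFO  'prototype started';
WARN  'something looks fishy';
DEBUG 'below $INFO, so this one is filtered out';

Easy mode is enough for a prototype; for anything serious, Log::Log4perl can also be initialized from a full log4j-style configuration file.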

2009-07-01

F: {alignments} -> {tree topologies} is continuous

If the title is not clear enough, it means that the function that maps alignments of genetic information to phylogenetic tree topologies is continuous. Well, kind of. It's a scientific argument, not a mathematical proof, so it's based on evidence. But as far as evidence goes, it makes a compelling case for the thesis.

The paper is Wong et al., 2008 (Science), "Alignment Uncertainty and Genomic Analysis", DOI: 10.1126/science.1151532.

The relevance IMHO of this paper is not to establish that 'common sense' is usually correct (although it is important in its own right to know when it's possible to infer things solely from intuition). The authors make the case for interpreting alignments themselves as random variables, and in doing so, they conclude that (in a very precise way) small variations in alignments produce small variations in the tree topologies inferred from those alignments. Moreover, they indicate this result is robust with respect to the method of inferring phylogenetic trees.

For the mathematically inclined: they defined a metric on the set of alignments and a metric on tree topologies, generated alignments (via MCMC) that were near the reference alignment, and observed that the resulting trees all had topologies near the reference topology in the given metric.
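In metric-space language (my own paraphrase, not the paper's notation), with a metric $d_A$ on alignments, a metric $d_T$ on tree topologies, and an inference map $F$, the observed behavior is what continuity of $F$ would predict:

$$ d_A(A, A') < \delta \implies d_T\bigl(F(A), F(A')\bigr) < \varepsilon $$

where the MCMC-generated alignments probe the $\delta$-ball around the reference alignment $A$.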

This CPU-intensive work was done for alignments of whole genomes, with several alignment techniques and tree-inference algorithms, with similar results in most cases. Definitely worth a read.

(UPDATE: fixed title typo)

Migrating mail filters from MUA to MDA

I like freedom of choice. A lot. I even like being able to choose different alternatives for common tasks during the course of the day. Call it a whim, but when it comes to mail programs I can never really decide what's best for me. Maybe that's because I never found one that suits all my needs.

Nevertheless, I used to use KDE before I became an Ubuntu user, and from that time I kept my whole PIM suite in the KDE stack, from Dapper 6.06 to Hardy 8.04. Enough is enough: if I'm not using KDE, I'm wasting precious RAM and cycles keeping all those libraries loaded just for kmail and amarok. I will surely miss amarok, and kmail was great, but it's time to move on.

I wanted my PIM to sync with my Palm PDA and across my home desktop, laptop, and workstation at work. I'm not a big fan of third-party cloud services if I can't be sure who has access to my data, so I decided to go back in time and migrate from a fully-featured-MUA centered setup to a home cloud-like setup. Provided I can access my PIM data from whatever MUA I choose, I can use pretty much anything that talks IMAP (which is anything these days). I'll then be truly free to use a great GUI app for the daily routine, a light GUI app if I need RAM for more resource-intensive work, and good old pine (resurrected as alpine) if I need to access my mail remotely, via ssh.

The project is basically to store everything on the desktop: an IMAP server on my desktop machine, with POP3 accounts fetched by fetchmail and fed to an MDA (procmail or maildrop).

I already had kmail storing my mail in maildir format, so getting an IMAP server up was a matter of installing the package. After a brief poll on the ubuntu-users mailing list I installed dovecot, and kept using kmail for a year or so to pull my gmail mail via POP3. Since the upgrade to Jaunty I'm only accessing my mail via IMAP, so my home cloud project is kind of stalled.

I'm in the process of testing gnome-pilot and opensync to keep my PIM and Palm PDA synced.

All that remains is to migrate my mail filters from kmail to an MDA (be it procmail or maildrop), and then use fetchmail to get mail via POP. This is where I'm stumped. So far, the only thing I've found that resembles a solution is this perl script, except it's the opposite of what I want. I'm not even sure I can use it to create the reverse solution I need.

I've drafted a little perl script, and I'll probably have to do it from scratch, since my needs are pretty modest. I have 100+ filters (too many to convert manually), but most or all are the simple 'match and move to a folder' kind. I'm just trying to scratch my own itch, not create a general-purpose application, so it's probably a matter of simple Perl, once I decide what exactly I will output to.
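For the record, here is the flavor of sketch I have in mind, targeting procmail. The kmailrc key names (field0, contents0, action-args-0, and so on) are my guess from eyeballing my own config file, and the output assumes Maildir++-style dot-folders, so treat it as a starting point, not a converter:

#!/usr/bin/perl
# Hypothetical sketch: turn simple KMail 'match a header, move to a
# folder' filters into procmail recipes. Key names below are guessed
# from my own kmailrc; check yours before trusting this.
use strict;
use warnings;

my (%filter, $section);
while (<>) {                                    # feed it your kmailrc
    chomp;
    if (/^\[Filter #(\d+)\]/) { $section = $1; next; }
    if (/^\[/)                { undef $section; next; }  # other sections
    next unless defined $section;
    $filter{$section}{$1} = $2 if /^([^=]+)=(.*)/;
}

for my $n (sort { $a <=> $b } keys %filter) {
    my $f = $filter{$n};
    # only the simple case: one header test, one 'move to folder' action
    next unless ($f->{'action-name-0'} // '') eq 'transfer';
    my $header  = $f->{field0}                     or next;
    my $pattern = quotemeta($f->{contents0} // '') or next;
    my $folder  = $f->{'action-args-0'}            or next;
    $folder =~ s{/}{.}g;                        # nested folders -> dots
    print ":0:\n* ^\Q$header\E:.*$pattern\n.$folder/\n\n";
}

Each emitted recipe is the plain procmail 'match a header, deliver to a maildir folder' idiom; maildrop would need a different output template, which is part of why I haven't picked between the two yet.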

Since time is always in short supply, if anyone knows a way to convert filters to either procmail or maildrop, I'll be glad to hear it. I also welcome suggestions on which of these MDAs to use.

2009-06-29

Stochasticity vs. Determinism in elementary math

Most of the problems I'm having in studying theoretical topics nowadays stem from the fact that I had a null measure of Statistics content as a math student. I saw interesting topics such as Analysis and Differential Geometry in undergrad, and useful stuff like Linear Algebra and Computational Linear Algebra in the master's course, but absolutely no Probability and Inference.

Nothing related to Data (not the android) at all. And this is mostly my fault. I knew before the master's course that I wouldn't be pursuing a PhD in Math, and I knew my Math teachers wouldn't deal with topics I would most likely need in the near future (the one I'm living now).

I'm chasing the lost time, with books on Bayesian Data Analysis and Inference, but although most of the time I understand what I read, I never seem to grok it. I will, certainly, in time, but time is a commodity a grad student doesn't have; I need Statistics now. As well as Biology, Ecology, Evolutionary Biology, and (why not?) a little Computer Science. So it's fair to say I'm chasing the lost time, and losing.

Before leaving, I saw the creation of a new undergrad course, Applied Math, at my former university; it started with concentrations in Finance, Mathematical Biology, and Scientific Computing, and it soon became obvious to the faculty that some basic knowledge of Probability and Inference was a must, so they were introduced as mandatory courses for all concentrations. My former advisor there told me he thought it was overkill at first, but soon (within the first year or so) realized how suitable it was.

This is why I think the idea this nice Arthur Benjamin fella presented in this TED talk is terrific (and definitely worth spreading).



Obviously I don't think you should rip out everything related to Calculus (I like it, after all :) ). If you really want to grok Probability, you need a strong base in Calculus (from integrals to maximizing likelihood functions). But a paradigm change is definitely well deserved. Our modern western societies are still studying according to old rules, rules that fit the time and reality in which they were idealized, but are probably just outdated now. We are a new society on overdrive, with a new (still changing) set of moral rules, new problems and challenges, new perspectives, new age limits. Why stick with 19th-century education philosophies? At the very least, let's realize it's about time to discuss whether they're worth changing. See also another TED talk on this subject.

This all also brings up an old question I've always had: is it a good thing for education curricula to be centralized? It's good to know beforehand what people must (might? should?) have learned by looking at their curriculum. It's useful for the teacher/professor to know what to expect the student to know, and the same applies to the student. I've been bitten before, when the prerequisites for a course weren't clear. I also have firsthand experience of how good and dynamic an improvised class can be, when given by a motivated (and skilled) professor. OTOH, I also have firsthand experience of improvised classes that sucked.

Which is the lesser evil?

2009-06-20

RFC: New paradigm for Ubuntu release cycle

Would it be possible to break Ubuntu into two independent sections: 'Interface' and 'Infrastructure'?

Think of Interface as everything the ordinary user 'wants' updated in the 6-month cycle (GUI, apps, kernel, etc), and Infrastructure as everything else that keeps the engines running, i.e., glibc and the like.

I think it would be nice if Ubuntu could make 6-month releases binary compatible with each other, and to do that, you must compile everything with the same library versions. I can see why we might want a newer kernel every now and then, but do we really need to upgrade glibc every 6 months? Do the applications really need all the new features all the time? Can there be a separation of some sort, so that some packages are upgraded and others preserved? I think it would benefit everyone, since using the same packages for two or three (or so) releases would decrease the support burden, security- and stability-wise.

It would also have many other side benefits: easier package backports, reduced storage space on mirrors, and fewer upgrade issues (since fewer packages would be upgraded). It would definitely be a plus on the corporate side, where ISVs would have a much more stable infrastructure to develop on, much closer to the approaches of the two most popular closed OSes.

One drawback is that bugfixes would be harder to propagate downstream, and longer maintenance cycles aren't always easy to do, especially in the open source community, where development is often a continual burst of unordered creativity. It would also not necessarily be easy to decide where to draw the line (e.g., would poppler/cairo be interface or infrastructure? And XUL? GTK?).

Still, I think this would be an improvement, and this post is intended to spark healthy discussions on mailing lists and Ubuntu-related fora.

2009-06-19

My Ubuntu wishlist for the next LTS

I have quite high hopes for the next Ubuntu LTS release. Currently, I expect that:

• LTS releases will never ever be used as a platform for propagation of new technologies...

... such as the pulseaudio fiasco in Hardy. I still don't understand why pulseaudio is a Depends, not a Recommends. I hope this kind of thing will never happen again in an LTS. Seriously.

• Ayatana be widely used in most or all officially supported applications.

I hope it will be used in all default installed applications. I'd like to see this happening as soon as 10.04, the proposed target for the next LTS. This will leave LTS-only users with jaws dropped for sure.

Tracked here.

• Fully featured PIM suite.

Seriously, Evolution is well integrated with Gnome, but I guess (citation needed) Thunderbird has a more widespread user base. It would be good to also have Thunderbird as the center of a whole alternative full PIM suite, one that could sync with Evolution, mobile devices, and the cloud (Google calendar/contacts, MS Live, Apple, cough... Squirrelmail cough..., etc).

I'd like to sync my PIM applications with mobile devices, like my Palm PDA and, in the future, a cell phone. However, gpilot is less than complete (in comparison with e.g. kpilot), and since PalmOS is basically dead upstream, I don't expect much development on that front. OpenSync looked like it would be the way to go, but now it looks like vaporware: its stable release 0.22 is years old, and the 0.3x development branch progresses slowly. It worked in my tests on Ubuntu 9.04, didn't on 8.04, and it seriously lacks a proper interface. Granted, I'm geeky enough to use it, but it feels very arcane. Definitely not ready for widespread Ubuntu usage (Kubuntu users might disagree, since kitchensync seems interesting, YMMV).

Also, I'd like my PIM to be stored (or synced) in such a way that it would be accessible with either evolution, thunderbird, alpine, my Palm TX, or my cellphone. (Akonadi, anyone?)

I know most of what I just mentioned is not Ubuntu-specific but upstream work; still, if Canonical considers itself in a position to drive development, it should use its weight to push for a clean PIM infrastructure, be it opensync finally releasing something new and usable (either 0.2x or 0.4x), or something else. Please do it for the next LTS and I'll stick with it for quite a few years.

• Firefox should time out the addons upgrade window

Every now and then, when I open Firefox, it helpfully informs me that there are newer versions of addons I have installed, and offers to upgrade them. But because I cultivate obscene amounts of open tabs, I often don't wait to watch Firefox open, since it usually takes quite some time. This is when this window makes me throw something out the window: it prevents Firefox from opening until I take some action, either upgrading or dismissing. It should clearly time out and dismiss itself after a few seconds (I guess 15 would be more than enough for someone to aim at the OK button and click it). Even if the user misses the time out, there is a chance to upgrade afterwards, so it's not really necessary to do it at startup.

• Home cloud

Sure, why not? Canonical is soon launching Ubuntu One, and I sure would like them to bundle a similar sync suite that could sync my PIM data across my desktop and laptop (and my PDA), make media storage available over the wire for my music and videos, and serve various other potential applications. Who has the time to sync their whole hard drive with the cloud? Let me sync my important (small) stuff with the cloud, and let the more heavyweight stuff remain at home. Palm is doing this kind of PIM cloud sync with the new Pré smartphone.

And please give me the option to not have to trust anyone with my PIM stuff; it's too important to be left on third-party servers, thank you. Home cloud, I always loved you, even though I never met you.

• Quickly be released quickly (get it? get it?)

This is something I've missed since I started programming for my project; it will really help me keep my test machines and workstations up to date. I've loved it since the blueprint.

It's targeted to Karmic, AFAICT.

• Separation between 'Interface' and 'Infrastructure'

This is something worth a more in-depth explanation, so it warrants its own post.

Upgrading Ubuntu from Hardy to Jaunty directly

I originally intended to upgrade only my laptop because of NM, and waited until the Jaunty beta was released for the laptop upgrade, but then decided the corrected procedure was safe enough for my purposes to do it on my desktop as well.

The following is what I compiled after doing it twice and correcting the errors from both attempts. What this means is that this is what I think I should have done both times, not exactly what I did; i.e., on both attempts I had issues, and was able to overcome them.

I decided to loosely follow the upgrade instructions from Debian's Release Notes, and it mostly worked. I found out, unfortunately, that some tweaks were necessary, and the second time I tried, I could bypass most of the problems I had.

First, a brief review of the process, then a brief description of the problems I had, and how I solved them.

The instructions start with obvious recommendations on backups, notification of users, and contingency plans. I back up my mission-critical files fairly often, and did a full backup of my home dir and /etc. Since it was just my personal laptop, I didn't have to bother much about a contingency plan or warning anyone else. Check, check, check.

If you're trying these instructions, you should definitely read the Debian Release Notes sections 4.1 to 4.4, skipping only 4.2.4 and 4.2.5, which are Debian-specific, before doing anything to your system. Instead of those, make sure you disable every unofficial repository (meaning everything that doesn't look like archive.ubuntu.com). I made an exception for the partner repo, archive.canonical.com, which is where I get adobe-flashplugin from. If in doubt, read the Repo Howto.

A nice tip from Debian is to do the upgrade from inside screen(1), and also to use script(1) in order to have a full log of everything. Just copy and paste the commands from section 4.5.1:


screen -S upgrade
script -t 2>~/upgrade-jaunty.time -a ~/upgrade-jaunty.script
sudo aptitude update


Then proceed to section 4.5.3 to make sure you won't face an epic fail because of disk space, especially if you have a separate /var partition.

From 4.5.4 onwards it gets very Debian-specific, and although it might be interesting to read, we have to stick with our Ubuntu-specific issues. This brings us to the point where I describe what can go wrong if you follow the Debian instructions.

All the problems I had stemmed from the fact that the packages python and python-minimal were never pulled as dependencies in this whole process, and were only upgraded in the final step, too late for most Python apps, which expected a clean Python environment while the above-mentioned metapackages were kept at their 2.5 versions. I was actually surprised that this turned out to be a problem, especially because I thought all these apps were also in Debian.

I saw many errors in package upgrades, returning messages like:


Setting up language-selector-common (0.4.2.1) ...
INFO: using unknown version '/usr/bin/python2.6' (debian_defaults not up-to-date?)
Traceback (most recent call last):
  File "/usr/bin/fontconfig-voodoo", line 101, in <module>
    main()
  File "/usr/bin/fontconfig-voodoo", line 56, in main
    fc = FontConfig.FontConfigHack()
  File "/usr/lib/python2.5/site-packages/LanguageSelector/FontConfig.py", line 36, in __init__
    self.li = LocaleInfo("%s/data/languagelist" % datadir)
  File "/usr/lib/python2.5/site-packages/LanguageSelector/LocaleInfo.py", line 41, in __init__
    et = ElementTree(file="/usr/share/xml/iso-codes/iso_639_3.xml")
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 546, in __init__
    self.parse(file)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 586, in parse
    parser.feed(data)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 1245, in feed
    self._parser.Parse(data, 0)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 52, column 11


And a lot of other packages failed with less verbose errors, containing only the first INFO warning line. Crossing my fingers wasn't enough, and when I saw most of the interface apps were hosed, I got this kind of error when trying to run apps from the command line:


Traceback (most recent call last):
  File "/usr/bin/software-properties-gtk", line 44, in <module>
    from softwareproperties.gtk.SoftwarePropertiesGtk import SoftwarePropertiesGtk
ImportError: No module named softwareproperties.gtk.SoftwarePropertiesGtk


These messages made it somewhat clear that Python modules were missing, but I made sure their packages were in fact installed. Reinstalling the package ( sudo aptitude reinstall [package] ) made the problem disappear for that application. I'm not knowledgeable in Python, but I think this has something to do with Python byte-compiling the scripts during package installation (please correct me if I'm wrong in the comments).

Luckily I didn't have to reinstall the whole OS, just the Python apps that were misbehaving. I had to go back and forth through dependency trees in order to find everything that depended on python and python-minimal, and reinstall them, and there were a lot of them. This was a good opportunity to use the script(1) log that was created in the process. It took me more than an hour, but I was able to fix everything without reformatting and reinstalling (kudos to aptitude's interactive interface!).
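For the record, the back-and-forth can be scripted. A throwaway helper along these lines (my own sketch: it only walks direct reverse dependencies, and assumes apt-cache's --installed filter behaves as I remember) would have saved me part of that hour:

#!/usr/bin/perl
# Print an 'aptitude reinstall' line for every installed package that
# directly depends on python or python-minimal.
use strict;
use warnings;

my (%seen, @pkgs);
for my $target (qw(python python-minimal)) {
    open my $rdep, '-|', 'apt-cache', '--installed', 'rdepends', $target
        or die "apt-cache: $!";
    while (<$rdep>) {
        next unless s/^\s+//;   # dependency entries are indented
        chomp;
        s/^\|//;                # drop the alternative-dependency marker
        push @pkgs, $_ unless $seen{$_}++;
    }
    close $rdep;
}
print "sudo aptitude reinstall @pkgs\n";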

I'm not sure why this issue affects Ubuntu and not Debian, but I ran into it in both of my upgrade adventures, so I thought it was worth sharing the (probably) correct routine.

Now that the issue is covered, let's see what the upgrade routine would look like.

The first step in the Debian instructions is to upgrade apt, which will also pull libc and other essential stuff. However, from the experience above, it would be useful to also upgrade python and python-minimal as soon as possible. It was only later that I found out I could probably have done both in a single step:

sudo aptitude install ubuntu-minimal

This is a meta-package that depends on apt and python/python-minimal, so I think it would suffice. It will pull python2.6 and python2.6-minimal as dependencies, so all Python apps will already have their bed made when it's required.

As a second step, let's make sure a second base meta-package kicks in, which will pull another batch of essential stuff:

sudo aptitude install ubuntu-standard

After that, you can safely issue the first step of the Debian release notes upgrade:

sudo aptitude safe-upgrade

Depending on which packages you have installed, you might need to repeat this step until it upgrades no more packages. This is because although aptitude has a great dependency resolution algorithm, it's not perfect (but this is being worked on ;-) ).

When you reach the point where safe-upgrade can't upgrade any more packages, it's time for most of the other stuff to be resolved, which is:

sudo aptitude dist-upgrade

This command should be the first point where packages are removed by conflict resolution. Again, you might need to issue it more than once, but there's nothing to worry about (unless you get errors, but they are being logged by script, and you can go over them whenever you want).

Now, as a final step, make sure the basic meta-package is upgraded, if it wasn't already:

sudo aptitude install ubuntu-desktop

Your kernel will be upgraded in this step, if you use the generic flavor, but your current version won't be removed. You might want to remove it manually afterwards. You can keep more than one, but it's mostly clutter, especially if you, like me, have a separate /boot partition.

Now you can log out of script, log out of screen, and boot your new Jaunty. Computer Janitor (System>Administration>Computer Janitor, or whatever it is translated to in your language) will help you clean up afterwards.

Note: Expect this mini howto to be updated.

2009-06-18

Why I upgraded my Hardy to Jaunty

I really tried to hold back my upgrade mania and stick with the LTS releases of Ubuntu. I was convinced that LTS was the way to go for academic purposes, where one needs the stability of a non-changing environment to work on one's own projects instead of working on one's system, until I decided to upgrade my Ubuntu 8.04 installs to Ubuntu 9.04. A few reasons drove me to this decision, but I still think 8.04 was good enough for my purposes, and my petty problems were manageable.

I was constantly annoyed by a few usability bugs in evince, since viewing PDFs is a frequent part of my daily routine. I managed to completely bypass any pseudo-need I had to upgrade to Intrepid 8.10, since my main motivations were a group of fixes for evince, and Perl 5.10.

Most of these needs were either not critical or easily solved by a backport (which I used to scratch my poppler/evince itch). Other positive aspects of 9.04 that directly affect me were the new NetworkManager, with improved wireless support as well as support for GSM modems, which makes a Huawei modem I borrowed work in a truly Plug n Play fashion.

But Perl 5.10, poppler/evince fixes, and the new NetworkManager were all rationalizations I made to justify the hassle of a major, untested, undocumented dist-upgrade of this sort. Truth be told, I was mostly interested in checking out Ayatana.

Mark Shuttleworth announced bold plans for a new notification system that could at the same time look great and clean up the clutter in the notification area. I gave some serious thought to why his original idea was both a good thing and a bad thing. I was not alone. Although I never expressed my opinions on the topic publicly, they are often partially mirrored by others who can express them better (in a clearer and more concise way), like Celeste Lyn Paul and Scott Kitterman.

Most important is the fact that the original proposal was revised, and what I consider some of the most controversial sections were abandoned in the implementation that was released with 9.04 (notably the no-history requisite).

The final version is good enough to provide a compelling case for a usability experiment; it's good enough to question what's been around so far. I think everyone has always thought about notifications in basically the same way, as if there were a 'right' way to do it. Mark is trying to do it differently, and acknowledges he may be wrong. I disagree with some points, but the major points are valid enough for a trial.

Not all of it is working, since applications must be aware of the new system, and it appears that Canonical only developed the proper plugins for Evolution and Pidgin. I'd like to see similar integration for other applications, such as Mozilla Thunderbird, and for hardware discovery notifications (flash drives, cameras, etc). I'm sure these will come soon, though. I also hope that a history GUI will be created in time, hopefully something like the Log File Viewer app, which keeps logs separated by source and date. The way it is now, it's very hard to even check the log, since it's a dump of every notification from every application, including pidgin's messages, so it grows very fast for me. Also, the notifications from Evolution are less than useful, since they only tell me I have a new message, but don't point me to which account the new message is from, like pidgin does.

I've been using Jaunty for more than two months now (since beta), and I think I'm used to the new notification system by now. I switched my main IM app from amsn to pidgin, and get notifications of logins and messages. I get notifications of bzr commits in my programming routine (not necessary, since I'm the only one developing it, but nice anyway), and I get notifications for hardware buttons, like sound volume, wifi connection, and screen brightness. So far I like it, and I'm glad I upgraded. It feels modern and cool.

Update: s/Notify-OSD/Ayatana/ and links.

2009-05-31

Enter Wolfram|Alpha: 3 - target public

Who will use Wolfram|Alpha? What, ideally, is its target audience? Certainly it's useful for statisticians and engineers, who deal with numeric data, understand normalization, and have experience with various graphical formats for data and results.

The big question here is: Will it be a big hit for general netizens? A major improvement on how ordinary people surf the interweb? In other words: will grandma use Wolfram Alpha? Should she?

I think not, or more specifically, not now. Lay people are happy with Google (at least I assume; citation needed), and although there's an overlap in the uses of both services, they are obviously not competing in their main focus, since they have different purposes (despite some sites painting a non-existent competition scenario).

This fictitious struggle with Google shouldn't even be mentioned, since it's obviously a non-issue. Wolfram Alpha does not use similar technology, does not use similar data sets, and does not tackle the same problem. It's actually hard to compare the two services, although possible in some cases.

Wolfram Alpha is an answer engine: it uses Web Semantics technology to parse and interpret queries, and accesses a curated database. Google is like a big grep on steroids; it also parses the input query and does some normalization (notably accentuation and spell check), but what it does is basic text search on a huge database of text collected automatically over the web. It doesn't do something new (web search dates back to gopher); it innovates in the sorting algorithm (which, like many other great ideas, seems awfully simple once explained). In a nutshell, Google lets you find results over the web that best match your query, while Wolfram Alpha tries to provide interpreted, quantitative answers to quantitative questions. It's all about number crunching.

Not all is sweet in the media hype, and I've seen at least two "bad" reviews in the wild (Ryan Tate for Gawker and John Timmer for Ars Technica) worth reading. Both make valid points, but I think they miss the key point: it is not supposed to be *the* answer machine, but the first big one (not even the first one, mind you; examples in wikipedia here, and here, and a nice pointer to a SciAm article for a more general approach to the topic).

It's not for me to prophesy that one day we will interact with computers using solely human language, because of the excess of ambiguities in both written and spoken language. Ordinary languages are not well suited for this, as Ryan Tate summarizes:
human language itself lacks the precision to enable what Wolfram is attempting
I expect WA to improve over the coming years, both in the acquisition of new data (for the existing data sets, and new unrelated data sets) and in the code linking data sets to each other and parsing human language.

I also expect other technologies to improve in parallel, and I will not be surprised if other big answer engines appear, especially in the Open Source world. Standardization is key for these technologies, and open source developers often show an aptitude for crunching open standards, even ones still in draft, into something usable by tech-savvy people. But for now this is just wishful thinking.

Enter Wolfram|Alpha: 2 - hands on

This post is a follow-up to the introduction to the Wolfram Alpha website interface.

Let's look at some fun examples of how to use the new Answer Engine.

Query: where is god?

Now, would you be surprised if a computer system knew the answer? I was, because it did. It helpfully offers coordinates and a satellite image for God. (hint: far from the Middle East)

Answer: Hungary.


Can it predict major astronomical events?

Q: next solar eclipse in rio de janeiro
A: 11:53 am BRT | Sunday, February 26, 2017 (7.748 years from now)

Noted, but I will probably forget by then. Also, this is not a total eclipse. The next total eclipse observable from Rio will be at 2:41 pm BRT | Saturday, August 12, 2045 (36.22 years from now).

Similarly, we can get a log of such astronomical events.

Q: last solar eclipse in rio de janeiro
A: 9:31 am BRT | Tuesday, September 11, 2007 (1.721 years ago)

I missed it, unfortunately. I know, I suck.

How about my daily lunch's nutritional facts?

Q: rice + black beans + chicken breast + carrot + beet + lettuce + tomato
A: full nutrition table

Nice. It gives the amounts not only of calories (who cares about those anyway?), but also of all the vitamins, proteins, and fiber. Looks like my lunch of choice is not that bad.

How about a Math example? A friend of mine recently asked for my help calculating the roots of a polynomial of degree large enough that Maple was barfing for her. Maybe there is a way to tweak Maple to calculate this symbolically. Maybe it's big enough that it can only be done numerically. I'm no expert in either CAS or algebra, but 20 doesn't look like an excessive degree, especially for such a simple polynomial, so obviously I must be wrong.

Q: z^23=1

Cool. It not only gives the roots in a nice I-did-it-with-pencil-and-paper-not-matlab way, but also plots them in the complex plane. It was also fast.
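For reference, what it plots are the 23rd roots of unity, evenly spaced on the unit circle:

$$ z_k = e^{2\pi i k/23} = \cos\frac{2\pi k}{23} + i\,\sin\frac{2\pi k}{23}, \qquad k = 0, 1, \ldots, 22 $$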


I bet that with the proper query syntax, my friend could solve the proper system to get the roots that interest her in her Finite Fields problem. (hint: separate polynomials with commas)

Does it keep up to date?

Q: swine flu
A: (as of 26/05/2009)



Usability bonus: it noted from my IP address that I'm in Brazil, and provided the summary for my country. I can attest that it still showed a total of 9 cases throughout the day on which the count was updated to 11 in the local media, which means it doesn't get epidemiological data in real time from, say, the CDC, which would be cool. However, it probably also means the data must be manually curated, which is good for accuracy purposes.

To end this not-a-tutorial list of examples, what about the most profound question one can ask of any system?

Q: life the universe and everything

WA knows the correct answer. It couldn't get any better. I won't spoil the answer, you'll have to see it for yourself.

Do you know other fun queries? Post in the comments.

Update: corrected images.

2009-05-16

Enter Wolfram|Alpha: 1 - intro

Today is the first time I could finally use the interface of Wolfram Alpha (wkp) for my own personal tests and tricks.

Those of you who haven't seen as much SciFi as I have may not be as astounded as you should be. If you don't know what Wolfram|Alpha is yet, don't worry; it appears to have been kept secret until the preview. There's a completely mind-boggling screencast available, showing off some basic usage, and an extensive compendium of example queries.

The example areas range widely across several scientific fields, such as Mathematics, Physics, Engineering, and Statistics (not surprisingly, since they're the creators of the arguably popular Mathematica app (wkp), which is also at least one of the backends of the whole deal), and of course, the very food for thought that made it possible, like CS, the Web, and Linguistics. There are also some very data-centric areas, such as Unit conversion and Nutrition facts. Again, not at all surprising (but wait, we're not there yet).

Considering the background from which it came, there are also other less obvious areas represented in the example sections, such as Life Sciences, Earth Sciences, Finance, Geography and Socioeconomic data, and Meteorology. From a first look, the example section seems to encompass everything: the one encyclopedia to rule them all. Understandable absences are the humanities, but those areas have problems of their own.

If you paid attention to the screencast above, you noted that you don't have to make queries in standard query form (the simplified version of the full phrases you actually think before querying): you can actually ask questions in standard English. Although this is not new (Altavista could do it back in the day), the way it parses the information in the query is very impressive. Not only does it have an almost telepathic ability to understand what you mean, even if you're not really sure how to ask, it also shows (and this is new to me) how it is parsing your query.

For example, from the screencast: if you type Springfield, you'll find out it's a very popular name for a city, and Wolfram Alpha tells you it's assuming you're referring to the one in Massachusetts, and displays a wealth of data about it. If you want another Springfield, you can simply choose from a dropdown, and it will then show the pertinent data available.

For another example, I asked it "who are you" (trying to be cheeky), and was immediately knocked out of my chair when the result came:



It took me a few seconds to realize it was actually an error (those cheeky Wolfram people). After I caught my breath, I pressed reload and got the correct answer:



OK, no singularity there, folks. Phew...

The most freakish experience so far is when you cross completely unrelated databases and get coherent answers. Like those abusive queries where movie and TV detectives cross impossibly huge databases to narrow down suspects, or find relationships between perps and vics. Or even those impossible queries from Star Trek, when Data asks the computer to... extrapolate! (Wolfram people, tell me this is on your TODO list.)

And for those of you who can't stand computers when off duty, know that Wolfram Alpha may very well become the substitute for your favorite personal psychotherapist companion, since it apparently has a limited ability to "understand" famous quotes:


Asimov would be proud. I mean, as long as Wolfram Alpha abides by the Three Laws. But of course, no one would be crazy enough to create an unconstrained singularity, right? (Right, Cyberdyne??)

Next, a follow up post with some interesting queries to get started.

2009-05-12

\pi = 3.0. Exactly.

Nothing to say about this. How can people even take anything for granted, surrendering their own individuality? Free will, anyone?

2009-04-28

creativity@TED

This presentation has got to be the funniest TED talk (if not, please provide a link in the comments).

But this of course is just a bonus point, because as most (or all) TED talks, this one's very insightful.

This is a talk by Sir Ken Robinson (wkp), about creativity, particularly how our typical school systems undermine most of the innate creativity children have.

"(...) the whole purpose of public education, throughout the world, is to produce university professors"




He goes on to say (after a few good jokes about university professors) why this is so. It appears that our public education as a whole is centered on the idea of academic ability. I still have to sleep on that for a while, but maybe this is one of those cases where something is too obvious to be noticed.

It was around the 19th century that public education was "invented (...) to meet the needs of industrialism". He goes on with other interesting insights, but that's enough spoilers from me for now.


On a related video, another good tip for those of you who like to think about creativity. Here's Amy Tan (wkp).



Of course, I still want to know what Creativity *really* is, which always brings me back to what the hell Intelligence really is. A friend of mine is now completely obsessed with AGI and, thanks to a tip I gave him from a Scientific American podcast, with the Singularity (wkp), so in a while I'll probably have some insights from him on these interesting topics.

2009-04-26

Nostalgic rant of the interweb

I miss those good old days when HTML was white background, black foreground and blue links, not these devilish flash websites and heavy webapps we have today.

I miss the good old days of ASCII emails, not this HTML embedded with attached images and off-site images. Oh yeah, those spam-free, text-only, simple and efficient e-mails; that was email. Who the hell needs scripts auto-executed when reading email?

Heck, I miss the good old mainframe days, when men were real men and coded their own interfaces, patched and recompiled the programs they used until they worked to satisfaction. Of course, this was done in the macho language C, not these bogus, unreliable, unpredictable scripting languages that don't enforce variable declarations. These modern languages that let the ignorant remain ignorant.

Maybe I'm 20 years older than my real age.

Disclaimer: I'm not old enough to actually have used - or even seen, FTM - a mainframe, but that's beside the point. I can't code in C, although I have some basic familiarity with the syntax. I have seen someone use Python, but I don't grok it either. </hypocrisy> I wish I had enough skills to fix my own issues, but there's only so much time.

Vampire eco dynamics

What do you get when you mix fiction with science? You may have said sci-fi, but that's not all of it. You might also get fun science (as in, not necessarily useful, purposeful, or publishable science; just the kind of fun stuff you do in your free time).

Linked from Pharyngula, I got the tip for a nice fun article someone posted on his (her?) family site. It looks like it may be a follow-up to one of those meaningless conversations everyone likes to have among loved ones and friends, but this guy actually did his homework to make his point: he did the math for the model, and ran simulations to get the equilibrium points. But that's not what I liked most about it (although I like this kind of stuff a lot). I liked that it's a funny read.

Just take a look at what I'm talking about:

In principle, ecologists might employ two basic strategies to get at a problem like this. The empiricists would go out and find a field site where they could actually observe predators and their prey, and just tally the results over time. The theoreticians would chuckle at the empiricists, and construct mathematical models that probably approximate the behavior of populations in the field, keeping their hands more or less clean in the process.

In real life, most ecologists use both strategies off and on. Unfortunately, I don't know of any real life vampire populations in the field, so we're going to have to pretend that we are strict theoreticians. That means that we'll be using math: some algebra, some calculus, and some matrix theory. This is O.K.! It hurts a lot less than, say, getting bitten by a vampire as you're trying to fit the bugger with a radio collar.

PDF of the article.
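To give the flavor of what 'doing the math' means here (this is my own generic predator-prey sketch, not the article's actual model), with $H$ humans and $V$ vampires, a standard system would be

$$ \frac{dH}{dt} = rH - aHV, \qquad \frac{dV}{dt} = baHV - mV $$

and the equilibrium points are where both derivatives vanish: $(0,0)$ and $(H^*, V^*) = (m/(ba),\ r/a)$. The simulations then tell you how the populations actually behave around those points.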

He displayed little of the math and simulations, although I can see why he didn't. I would at least give the curious reader the option to learn more: say, put the dirty work in a supplemental PDF, or an appendix. Even though, it's