Scientist Wannabe: 2009-05

2009-05-31

Enter Wolfram|Alpha: 3 - target public

Who will use Wolfram|Alpha? What is its ideal target audience? Certainly it's useful for statisticians and engineers, who deal with numeric data, understand normalization and have experience with various graphical formats for data and results.

The big question here is: Will it be a big hit for general netizens? A major improvement on how ordinary people surf the interweb? In other words: will grandma use Wolfram Alpha? Should she?

I think not, or more specifically, not now. Lay people are happy with Google (at least I assume, citation needed), and although there's an overlap in the uses of both services, they are obviously not competing in their main focus, since they have different purposes (despite some sites painting a non-existent competition scenario).

This fictitious struggle with Google shouldn't even be mentioned, since it's obviously a non-issue. Wolfram Alpha does not use similar technology, does not use similar data sets and does not tackle the same problem. It's actually hard to compare the two services, although it is possible in some cases.

Wolfram Alpha is an answer engine that uses Web Semantics technology to parse and interpret queries, and accesses a curated database. Google is like a big grep on steroids - it also parses the input query and does some normalization (notably accentuation and spell check), but what it does is basic text search on a huge database of text collected automatically over the web. It doesn't do something new (web search dates back to gopher); it innovates in the sorting algorithm (which, like many other great ideas, seems awfully simple once explained). In a nutshell, Google lets you find the results over the web that best match your query; Wolfram Alpha tries to provide interpretation of quantitative answers to quantitative questions. It's all about number crunching.

Not all is sweet in the media hype, and I've seen at least two "bad" reviews in the wild (Ryan Tate for Gawker and John Timmer for Ars Technica) worth reading. Both make valid points, but I think they miss the key one: it is not supposed to be *the* answer machine, but the first big one (not even the first one, mind you - examples in Wikipedia here, and here, and a nice pointer to a SciAm article for a more general approach to the topic).

It's not for me to prophesy that one day we will interact with the computer using solely human language, because of the excessive ambiguity of both written and spoken language. Ordinary languages are not well suited for this, as Ryan Tate summarizes:

"human language itself lacks the precision to enable what Wolfram is attempting"

I expect WA to improve over the next few years, both through the acquisition of new data for the existing data sets and of new unrelated data sets, and in the code that links data sets to each other and parses human language.

I also expect other technologies to improve in parallel, and will not be surprised if other big answer engines appear, especially in the Open Source world. Standardization is key for these technologies, and open source developers often show an aptitude to crunch and implement open standards in drafts usable by tech-savvy people. But for now this is just wishful thinking.

Enter Wolfram|Alpha: 2 - hands on

This post is a follow-up to the introduction to the Wolfram Alpha web interface.

Let's look at some fun examples of how to use the new Answer Engine.

Query: where is god?

Now would you be surprised if a computer system knew the answer? I was, because it did. It helpfully offers to give the coordinates and a satellite image of God. (hint: far from the Middle East)

Answer: Hungary.

Can it predict major astronomical events?

Q: next solar eclipse in rio de janeiro
A: 11:53 am BRT | Sunday, February 26, 2017 (7.748 years from now)

Noted, but I will probably forget by then. Also, this is not a total eclipse. The next total eclipse observable from Rio will be at 2:41 pm BRT | Saturday, August 12, 2045 (36.22 years from now).

Similarly, we can get a log of such astronomical events.

Q: last solar eclipse in rio de janeiro
A: 9:31 am BRT | Tuesday, September 11, 2007 (1.721 years ago)

I missed it, unfortunately. I know, I suck.

How about my daily lunch's nutritional facts?

Q: rice + black beans + chicken breast + carrot + beet + lettuce + tomato
A: full nutrition table

Nice. It gives the amounts not only of calories (who cares about that anyway?), but also of all the vitamins, proteins and fibers. Looks like my lunch of choice is not that bad.

How about a Math example? A friend of mine recently asked for my help calculating the roots of a polynomial with a degree large enough that Maple was barfing on her. Maybe there is a way to tweak Maple in order to calculate this symbolically. Maybe this is big enough that it can only be done numerically. I'm no expert in either CAS or algebra, and 20 doesn't look like an excessive degree, especially for such a simple polynomial, but obviously I must be wrong.

Q: z^23=1

Cool. It not only gives the roots in a nice I-did-it-with-pencil-and-paper-not-matlab way, but also plots them in the complex plane. It was also fast.
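For the curious, the roots of z^23 = 1 have a simple closed form: they are the points exp(2*pi*i*k/23) for k = 0..22, evenly spaced on the unit circle. Below is a minimal Python sketch that computes them numerically. This is my own illustration, not what Wolfram Alpha does internally, and the function name `roots_of_unity` is made up for this example.

```python
import cmath
import math


def roots_of_unity(n):
    """Return the n complex solutions of z**n = 1, ordered by increasing angle."""
    # Each root is exp(2*pi*i*k/n): a point on the unit circle at angle 2*pi*k/n.
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]


roots = roots_of_unity(23)

for k, z in enumerate(roots):
    # Print real and imaginary parts, plus how far z**23 is from 1 (rounding error).
    residual = abs(z ** 23 - 1)
    print(f"k={k:2d}  z = {z.real:+.6f} {z.imag:+.6f}i  |z^23 - 1| = {residual:.1e}")
```

The residual column is only a sanity check: it should be tiny, limited by floating-point rounding. Note that this gives numerical approximations, not the symbolic forms shown by Wolfram Alpha.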

I bet that, with the proper query syntax, my friend could solve the proper system to get the roots that are of interest for her Finite Fields problem. (hint: separate polynomials with commas).

Does it keep up to date?

Q: swine flu
A: (as of 26/05/2009)

Usability bonus: it noted from my IP address that I'm in Brazil, and provided the summary for my country. I can attest that it still showed a total of 9 cases throughout the day on which the local media updated the count to 11, which means it doesn't get epidemiological data in real time from, say, the CDC, which would be cool. However, it probably also means that the data is manually curated, which is good for accuracy.

To end this not-a-tutorial list of examples, what about the most profound question one can ask of any system?

Q: life the universe and everything

WA knows the correct answer. It couldn't get any better. I won't spoil the answer, you'll have to see it for yourself.

Do you know other fun queries? Post in the comments.

Update: corrected images.

2009-05-16

Enter Wolfram|Alpha: 1 - intro

Today is the first time I could finally use the interface of Wolfram Alpha (wkp) for my own personal tests and tricks.

Those of you who haven't seen as much SciFi as I have may not be as astounded as you should be. If you don't know what Wolfram|Alpha is yet, don't worry, it appears to have been kept secret until the preview. There's a completely mind-boggling screencast available, showing off some basic usage, and an extensive compendium of example queries.

The example areas range widely across several scientific fields, such as Mathematics, Physics, Engineering and Statistics (not surprisingly, since they're the creators of the arguably popular Mathematica app (wkp), which is also at least one of the backends of the whole deal), and of course, the very food for thought that made it possible, like CS, Web and Linguistics. There are also some very data-centric areas such as Unit conversion and Nutrition facts. Again, not at all surprising (but wait, we're not there yet).

Considering the background it came from, there are also other, less obvious areas represented in the example sections, such as Life Sciences, Earth Sciences, Finance, Geography, Socioeconomic data and Meteorology. At first look, the example section seems to encompass everything, the one encyclopedia to rule them all. An understandable absence is the Humanities, but those areas have problems of their own.

If you paid attention to the screencast above, you noticed that you don't have to make queries in standard query form (a simplified version of the full phrases you actually think of before querying): you can actually ask questions in standard English. Although this is not new (Altavista could back in the day), the way it parses the information in the query is very impressive. Not only does it have an almost telepathic ability to understand what you mean, even if you're not really sure how to ask, it also shows (and this is new for me) how it is parsing.

For example, from the screencast, if you type Springfield, you'll find out it's a very popular name for a city, and Wolfram Alpha tells you it's assuming you're referring to the one in Massachusetts, and displays a wealth of data about it. If you want another Springfield, you can simply choose from a dropdown, and it will then show the pertinent data available.

For another example, I asked it "who are you" (trying to be cheeky), and was immediately knocked out of my chair when the result came:

It took me a few seconds to realize it was actually an error (those cheeky Wolfram people). After I caught my breath, I pressed reload and got the correct answer:

OK, no singularity there, folks. Pheew...

The freakiest experience so far is when you try to cross completely unrelated databases and get coherent answers. Like those abusive queries in which movie and TV detectives cross impossibly huge databases to narrow down suspects, or find relationships between perps and vics. Or even those impossible queries from Star Trek, when Data asks the computer to... extrapolate! (Wolfram people, tell me this is on your TODO list).

And for those of you who can't stand computers when off duty, know that Wolfram Alpha may very well become the substitute for your ~~personal~~ favorite ~~psychotherapist~~ companion, since it apparently has a limited ability to "understand" famous quotes:

Asimov would be proud. I mean, as long as Wolfram Alpha abides by the Three Laws. But of course, no one would be crazy enough to create an unconstrained singularity, right? (Right, Cyberdyne??)

Next, a follow-up post with some interesting queries to get started.

2009-05-12

\pi = 3.0. Exactly.

Nothing to say about this. How can people even take anything for granted, surrendering their own individuality? Free will, anyone?