Factual – a new place to find data

Factual collects data on a variety of topics, organizes them, and allows easy access. If you ever wanted to do a histogram of calorie content in Starbucks coffees or plot warnings with a live feed of earthquake data – your life should be a bit simpler now.

Also see DataMarket, InfoChimps, and a few older links in The Future of Data Analysis.

If you access the data through the API, you can build live visualizations like this:

Of course, you could just go to the source. Roy Mendelssohn writes (with minor edits):

Since you are both interested in data access, please look at our service ERDDAP:

http://coastwatch.pfeg.noaa.gov/erddap/index.html

http://upwell.pfeg.noaa.gov/erddap/index.html

Please do not be fooled by the web pages. Everything is a service (including search and graphics) and the URL completely defines the request, and response formats are easily changed just by changing the “file extension”. The web pages are just html and javascript that use the services. For example, put this URL in your browser:

http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.png?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]
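To make the "URL completely defines the request" point concrete, here is a minimal Python sketch that assembles the same kind of URL from its parts and swaps only the file extension. The helper name `griddap_url` and its parameters are made up for illustration and are not part of ERDDAP; the sketch only builds strings and does not contact the server. Some HTTP clients may need the square brackets percent-encoded.

```python
def griddap_url(server, dataset, ext, variable, ranges):
    """Build a griddap-style request URL (hypothetical helper, string building only).

    ranges is a list of (start, stride, stop) tuples, one per dimension,
    in the order used in the URL above: time, altitude, latitude, longitude.
    """
    constraint = "".join(
        f"[({start}):{stride}:({stop})]" for start, stride, stop in ranges
    )
    return f"{server}/griddap/{dataset}.{ext}?{variable}{constraint}"


SERVER = "http://coastwatch.pfeg.noaa.gov/erddap"
RANGES = [
    ("2010-01-16T12:00:00Z", 1, "2010-01-16T12:00:00Z"),  # time
    ("0.0", 1, "0.0"),                                    # altitude
    ("30", 1, "50.0"),                                    # latitude
    ("220", 1, "240.0"),                                  # longitude
]

# Same request, three response formats: only the extension changes.
for ext in ("png", "nc", "mat"):
    print(griddap_url(SERVER, "erdBAsstamday", ext, "sst", RANGES))
```

The first printed line reproduces the browser URL above; the other two are the same request in the NetCDF and Matlab formats.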

Now if you use R:


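library(ncdf4)
library(lattice)
# Fetch the subset as NetCDF; mode="wb" keeps the binary file intact on Windows
download.file(url="http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.nc?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]", destfile="AGssta.nc", mode="wb")
AGsstaFile<-nc_open('AGssta.nc')
# Read the whole sst variable; the length-1 time and altitude dimensions are dropped
sst<-ncvar_get(AGsstaFile,'sst',start=c(1,1,1,1),count=c(-1,-1,-1,-1))
lonval<-ncvar_get(AGsstaFile,'longitude',1,-1)
latval<-ncvar_get(AGsstaFile,'latitude',1,-1)
nc_close(AGsstaFile)
# Plot sea surface temperature on the longitude/latitude grid
image(lonval,latval,sst,col=rainbow(30))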

Or if you use Matlab:

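% Same request as above, but asking for a .mat response
link='http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.mat?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]';
% Save the response to disk; urlwrite returns the path of the saved file
F=urlwrite(link,'cwatch.mat');
load('-MAT',F);
% Drop the length-1 time and altitude dimensions, leaving a 201-by-201 grid
ssta=reshape(erdBAsstamday.sst,201,201);
pcolor(double(ssta));shading flat;colorbar;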
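The 201 in the reshape call is not explained in the text. A plausible reading, and it is only an assumption inferred from the numbers, is that the dataset sits on a 0.1-degree grid, so a 20-degree span with both endpoints included has 201 points. A quick Python check of that arithmetic:

```python
def grid_points(start, stop, step):
    """Number of grid points from start to stop inclusive at a fixed step."""
    return round((stop - start) / step) + 1


# ASSUMPTION: 0.1-degree spacing, inferred from the 201 above, not stated in the post.
ASSUMED_STEP = 0.1

print(grid_points(30.0, 50.0, ASSUMED_STEP))    # latitude range from the URL
print(grid_points(220.0, 240.0, ASSUMED_STEP))  # longitude range from the URL
```

Both values come out to 201 under that assumption. If you request a different range, the reshape dimensions need to change to match.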

The two services above allow access to literally petabytes of data, some observed, some from model output. I realize you guys don’t usually work in these fields, but this is part of a significant NOAA effort to make as much of its data available as possible. One more thing: if you use “last” as the time, you will always get the latest data. This allows people to set up web pages that track the latest (algal bloom) conditions, as one of my colleagues has done.
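As a rough illustration of the “last” trick, the Python sketch below builds the earlier sea-surface-temperature request with the timestamp replaced by `last`. It assumes that `last` goes in the same parenthesized position as the timestamp. The function name `latest_sst_url` is hypothetical, and the code only constructs the URL string.

```python
def latest_sst_url(ext="png"):
    """Return the SST request from the post with the time set to 'last'.

    ASSUMPTION: 'last' is accepted in the same parenthesized slot as the
    explicit timestamp used in the earlier URLs.
    """
    server = "http://coastwatch.pfeg.noaa.gov/erddap"
    time_range = "[(last):1:(last)]"
    # Altitude, latitude and longitude ranges copied from the URLs in the post.
    space_ranges = "[(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]"
    return f"{server}/griddap/erdBAsstamday.{ext}?sst{time_range}{space_ranges}"


print(latest_sst_url("png"))
```

A page that embeds this URL as an image source would, under that assumption, show the newest available map each time it is loaded, with no date to update by hand.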

BTW – for people who want a GUI to help with the extract from within the app, there is a product called the Environmental Data Connector that runs in ArcGIS, Matlab, R and Excel.

Roy’s links inspired me to write another blog post, which is forthcoming.

This post is by Aleks Jakulin; follow him at @aleksj.

2 Comments

  1. diligentdave says:

    If you would forgive me, I’d like to refer back to a discussion on your blog of the relationship between Demographics and Economics—

    http://andrewgelman.com/2010/07/demographics_wh/

    Since comment on that is closed, I have to try to interject something here.

    While reading a paper titled: “On the Origins of the Great Depression” written in 1978 by then University of Manitoba professor & economist, Clarence L. Barber, I noticed this most important quote at the end of his paper—

    “I can think of no better [conclusion than] the following sentence from a footnote on the final page of the text of Hicks’ Value and Capital. ‘Nevertheless, one cannot repress the thought that perhaps the whole Industrial Revolution of the last two hundred years has been nothing else but a vast secular boom, largely induced by the unparalleled rise in population.’”

    Earlier in his paper, Barber argued that a decline in “household formation”, including a decline in both marriages and birth rates, contributed to, and was possibly the main underlying cause of, the “Great Depression” (supposedly of only the 1930s). For example, Barber pointed out that demand for housing in the U.S. began to decline in 1926 and plummeted in early 1929, followed, one assumes consequentially, by the stock market crash of October 1929. Barber also underscores the fact that DEMAND for loaned money declined before the AVAILABILITY of money reached one nadir in 1933: in the intervening years the banks had failed to lend enough to earn the profits that would have given them capital to lend later (1933 and beyond).

    In 2009, the economist David P. Goldman wrote a very insightful article at Firstthings.com titled “Demographics & Depression”, in which he pointed out how life-cycle factors shaped by demographics can greatly affect economic activity. Here, in part, is what Goldman says:

    “Sometimes it helps to look at the world with a kind of simplicity. Think of it this way: Credit markets derive from the cycle of human life. Young people need to borrow capital to start families and businesses; old people need to earn income on the capital they have saved. We invest our retirement savings in the formation of new households. All the armamentarium of modern capital markets boils down to investing in a new generation so that they will provide for us when we are old.”

    He then pointed to this fact—

    “Families with children are the fulcrum of the housing market. Because single-parent families tend to be poor, the buying power is concentrated in two-parent families with children.
    Now, consider this fact: America’s population has risen from 200 million to 300 million since 1970, while the total number of two-parent families with children is the same today as it was when Richard Nixon took office, at 25 million.”

    A little further along in his article, Goldman adds—

    “If capital markets derive from the cycle of human life, what happens if the cycle goes wrong? Investors may be unreasonably panicked about the future, and governments can allay this panic by guaranteeing bank deposits, increasing incentives to invest, and so forth. But something different is in play when investors are reasonably panicked. What if there really is something wrong with our future —if the next generation fails to appear in sufficient numbers? The answer is that we get poorer.
    The declining demographics of the traditional American family raise a dismal possibility: Perhaps the world is poorer now because the present generation did not bother to rear a new generation. All else is bookkeeping and ultimately trivial. This unwelcome and unprecedented change underlies the present global economic crisis. We are grayer, and less fecund, and as a result we are poorer, and will get poorer still —no matter what economic policies we put in place.”

    Goldman summarizes, “Failing to rear a new generation in sufficient numbers to replace the present one violates that order, and it has consequences for wealth, among many other things. Americans who rejected the mild yoke of family responsibility in pursuit of atavistic (i.e., “old age”) enjoyment will find at last that this is not to be theirs, either.”

    He also points to why the bubble in the US housing market expanded and burst so dramatically:

    “In the industrial world, there are more than 400 million people in their peak savings years, 40 to 64 years of age, and the number is growing. There are fewer than 350 million young earners in the 19-to-40-year bracket, and their number is shrinking.

    “The graying of the industrial world creates an inexhaustible supply of savings and demand for assets in which to invest them —which is to say, for young people able to borrow and pay loans with interest. The tragedy is that most of the world’s young people live in countries without capital markets, enforcement of property rights, or reliable governments. Japanese investors will not buy mortgages from Africa or Latin America, or even China. A rich Chinese won’t lend money to a poor Chinese unless, of course, the poor Chinese first moves to the United States.
    Until recently, that left the United States the main destination for the aging savers of the industrial world. America became the magnet for savings accumulated by aging Europeans and Japanese. To this must be added the rainy-day savings of the Chinese government, whose desire to accumulate large amounts of foreign-exchange reserves is more than justified in retrospect by the present crisis.
    America has roughly 120 million adults in the 19-to-41 age bracket, the prime borrowing years. That is not a large number against the 420 million prospective savers in the aging developed world as a whole. There simply aren’t enough young Americans to absorb the savings of the rest of the world. In demographic terms, America is only the leper with the most fingers.”

    Goldman basically concludes with this all important fact—

    “The trouble is not that aging baby boomers need to save. The problem is that the families with children who need to spend never were formed in sufficient numbers to sustain growth.”

    “The origin of the crisis is demographic, and its solution can only be demographic.”

    It is like a farmer planting wheat who harvests less grain than he planted. If this happens year after year, harvest season after harvest season, the supply of grain to plant keeps diminishing, and each subsequent harvest yields even less seed grain.

    So it is in terms of humans and human capital, where not only economically developed nations or “advanced” economies are producing babies at sub-replacement levels, but where more and more developing nations are doing likewise! This is truly cause for genuine alarm AND ACTION!

    We can ignore the accelerating decline in the number of children (babies) being born, and hence in the number of future wage earners and taxpayers, only at our peril. As this growing birth dearth did not happen overnight, so its cure cannot be effected overnight.

    I assert that the cause of the first “Great Depression” was falling birth rates. And, I believe, while WWII masked that depression, it by no means cured it (how could killing and destruction do so?). But I also assert that it was the *so-called ‘baby boom’ (1946 – 1964) that got us and kept us out of it. *(I say “so-called ‘baby boom’” because, compared to the previous ‘normal’ period of birth rates, 1900 – 1910, the post-WWII ‘baby boom’ had lower overall birth rates. But have you ever heard of the baby boom of 1900 – 1910? No! Why not? Because, compared to previous decades, it was even lower. The post-WWII ‘baby boom’ was merely returning to “more normal” birth rates. Actually, the ‘baby boom’ of that era appears so large MUCH MORE because of the birth dearths that BOTH preceded and followed it. Since the early 1970s, for example, birth rates among white women in the US have been between 1.70 and 1.80 babies born per woman per lifetime.)
