Question period with the BD team

Lately BuzzData’s been getting more inquiries from you about what we’re about and where we’re headed in the future. Most of them we get through emails we can respond to directly, but occasionally we do get questions submitted anonymously (that still deserve to be answered!).

As such, I’m taking it upon myself to forward anonymous questions to our staff to get the best answers possible and make them public. Here are the first couple. Enjoy!

Anonymous asked buzzdatablog:
Q: Any plan on internationalization/localization of the site? Like having it in other languages?

Yes, absolutely.

We decided early on to build BuzzData with the expectation that there would be many languages. We have the technical capability to handle additional LTR (left-to-right) languages today. In fact, we already have a (Colombian) Spanish translation 75% done and we have plans for German, Italian and Russian language support in the pipeline.

We hope to support as many languages as Facebook or Google some day, because we believe that many of the truly exciting opportunities for data are in the developing world and we want to be a primary enabler of data innovation.

However, there’s a very good reason that we haven’t launched support for additional languages yet: our limited ability to address technical issues around the clock in everyone’s native language. BuzzData takes customer service extremely seriously and when we do launch into other languages, they will be first-class citizens in our world.

-Pete Forde, BuzzData CTO

Anonymous asked buzzdatablog:
Q: Love the work BD is doing! Working with a non-profit that needs to create tutorials/videos for their app … what do you use/recommend?

Thanks for the kudos! To be honest, I think my video tutorials are in dire need of gussying up myself (I just use QuickTime Player, which has an automatic screencast recording option and a one-click export to YouTube). I plan to change the format for upcoming tutorials later on down the line. I’ve been told IShowU is a great tool to use for screencast videos. 

In addition, 37Signals has a great step-by-step blog post on how to make slick online videos which might be helpful for you and your company.

Their process incorporates a fair amount of pricey software, however, so I recommend looking up cheaper software alternatives on AlternativeTo and seeing how you get by. 

-Momoko Price, BuzzData communications director

 

 

 

Can you spin stories out of data? Prove it.

                      BUZZDATA’S FIRST DATA-STORYTELLING CONTEST IS ON!

It was only a matter of time until I tried something like this. Hopefully this will be the start of something awesome here in Toronto (and perhaps abroad, if others want in): people evaluating and competing to tell the best stories with data.

THE GOAL:

To tell the story behind the data through your own BuzzData project

THE PRIZE:

$100 iTunes gift card, inspired by this guy’s prize for his “Pop and Lock-toberfest” circa 2010 (Sorry. Couldn’t help myself.)

THE RULES:

The number of data sources you can include in your project is unlimited, but you must use at least one of the following, and you have to include all data sources used in your final submission.

1) Toronto Water Billing Data for the last 10 years

2) EIU’s democracy index rankings 2008 and/or 2010

3) Canada Revenue Agency Contracts

4) World Bank Development Indicators

THE DEADLINE: 

Midnight EST, Sunday October 23, 2011

HOW TO SUBMIT YOUR ENTRY:

Submit your project by inviting me (Momoko on BuzzData) and the original publisher of your chosen dataset (as above) to check out your project on BuzzData when it’s done. 

If you know how to use BuzzData, submit as follows:

1. Clone one of the datasets above directly from the publisher on BuzzData. Make it private if you don’t want people to see it until it’s ready

2. Build your project on BuzzData (posting links and viz’s as appropriate).

3. When it’s done, note in the Overview which visualization/article/attachment is the final product(s), then, before deadline, invite me and the data publisher to check it out. And don’t forget to make it public if you want to show it to the world!

4. Tweet and link to your project elsewhere if you want to build interest in it (optional, but always a good idea)

If you’ve never used BuzzData before, here’s a quick video that shows how to start, build and submit your data project (don’t worry, it’s super easy!):

THE WINNER WILL BE VOTED ON AT OUR NEXT MEETUP ON OCT. 24! 

(Short notice, but no worries, this will happen monthly)

Want to attend BuzzData’s workshop next week? Here are the details:

Where: Room 120 at the Centre for Social Innovation, 215 Spadina Ave. (north of Queen St. West in Toronto)

When: Drop in from 6pm onward on Monday, Oct. 24. We’ll be closing up shop by 9pm

RSVP to our Hacks/Hackers meetup group here, please!

Hope to see you there!

-Momoko Price

 (Have you tried BuzzData yet? What are you waiting for, silly?)

BuzzData star: James McKinney

Buzzworthy Act: Simply put, BuzzData user James McKinney took some data I had and made it better. A lot better.

Now, revising a dataset doesn’t sound as sexy as, say, publishing a data visualization or coding an app. However, at this early stage of building a visible group workflow culture for data, the implications of thoughtfully revising data might actually be more significant over the long term than making it visual or entertaining.

Above is the original version of a dataset I put up a few weeks ago called “Open Data Hubs Worldwide”: a simple reference list of regions and URLs for journalists and hackers looking for open data; nothing special.  

McKinney, a long-time open-data hacker, immediately followed my dataset and scrutinized it carefully. “I was just getting started on the platform, and since I had a list of Canadian open data cities, I downloaded the dataset and checked it against mine,” he recalls. He saw right away that my dataset was rife with organizational flaws that made working with it quite painful.

“The dataset didn’t have a ‘Country’ column, so to find those belonging to Canada, in my case, I had to go laboriously through all 100 rows. The column header ‘Region/Institution’ meant I couldn’t count on a keyword like ‘Canada’ or a code like ‘CA’ appearing in each case,” he said. (Lesson learned: sorting alphabetically alone rarely makes any sense for data.) “I mentioned this in a comment on BuzzData and reported my findings.”

I realized that McKinney could do a lot to improve my data, so I invited him to collaborate and make changes himself. 


“I started cleaning up, reformatting, and adding to the dataset,” McKinney says. “I created two new columns for country and subdivision, using standard ISO 3166 codes to make it easier for people to match other datasets to it. I also labelled data hubs as government-sanctioned or not, as many data users prefer primary source material.”
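For anyone who wants to see why a dedicated country column matters, here is a minimal Python sketch. The column names (`name`, `country`, `subdivision`, `government_sanctioned`) and the sample rows are hypothetical placeholders, not the actual contents of the dataset; only the code formats (`CA` for a country, `CA-ON` for a subdivision) follow ISO 3166.

```python
import csv
import io

# Hypothetical sample rows; the real dataset's columns and values may differ.
SAMPLE_CSV = """name,country,subdivision,government_sanctioned
Example City Open Data,CA,CA-ON,yes
Example Province Portal,CA,CA-QC,yes
Example Community Hub,US,US-NY,no
"""


def hubs_for_country(csv_text, country_code):
    """Return every row whose ISO 3166-1 alpha-2 country code matches."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["country"] == country_code]


canadian_hubs = hubs_for_country(SAMPLE_CSV, "CA")
for hub in canadian_hubs:
    print(hub["name"], hub["subdivision"])
```

With a country column in place, finding the Canadian entries becomes a one-line filter instead of a row-by-row read.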

Now, with more than 30 followers, six clones and contributions from various users, this machine-readable dataset has become a legitimate resource unto itself, rather than data hidden behind a map, or a visualization, or a white paper. If you Google “open data hub,” this comprehensive dataset is now the top search result. 

Perhaps what I appreciated most during this collaboration with McKinney was the fact that I automatically learned better “data etiquette” from it. So often when we work with data, we organize it to suit our own local, immediate needs. We rarely think about how to set it up so that others can use it in the future. 

McKinney recalls making the same type of list last year, but the experience was different: “I had done something similar for the Open Government Data Working Group of the Open Knowledge Foundation a year ago, but we were using Google Spreadsheets at the time,” he says. “Google Spreadsheets’ strength is real-time document collaboration, but it’s weak on other social aspects. For example, if you close the spreadsheet, you lose your chat history.

“Although BuzzData is only in beta, already the conversation sidebar acts as a useful backchannel between collaborators and the overview page is a convenient opportunity to introduce editors to formatting guidelines and design decisions.”

While McKinney recognizes BuzzData’s potential, he also has his own suggestions on how to steer its development to foster a more cohesive data community: “In order to build a strong and active data authoring community, BuzzData should focus on pushing these collaborative and conversational aspects,” he says.

“One quick-win would be to allow editors to write update messages when uploading new versions, like commit messages in version control systems such as Git.

“Another obvious, but challenging, feature would be to display changes between versions. I believe these two features, being able to see and read about changes, are necessary steps to broader and deeper participation in data sharing and authoring.”
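To make the two suggestions concrete, here is a rough Python sketch that pairs an update message with a line-by-line diff of two dataset versions. It is an illustration only and says nothing about how BuzzData does or will implement this; the function name `describe_update` and the sample rows are made up for the example, and the diff comes from Python's standard `difflib` module.

```python
import difflib

# Two hypothetical versions of a small CSV, one string per line.
old_version = [
    "name,url",
    "Toronto,http://example.com/toronto",
]
new_version = [
    "name,country,url",
    "Toronto,CA,http://example.com/toronto",
]


def describe_update(old_rows, new_rows, message):
    """Combine an update message with a unified diff of the two versions."""
    diff = difflib.unified_diff(
        old_rows,
        new_rows,
        fromfile="version 1",
        tofile="version 2",
        lineterm="",
    )
    return message + "\n" + "\n".join(diff)


report = describe_update(old_version, new_version, "Add a country column")
print(report)
```

Lines starting with `-` were removed and lines starting with `+` were added, so a collaborator can read both what changed and why.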

-by Momoko Price

Bio: James McKinney, 26, is the cofounder of Open North, a non-profit whose mission is to build online tools that make democracy better. A long-time resident of Montreal, McKinney is the creator of BudgetPlateau.com and PatinerMontreal.com and maintainer of Resto-Net.ca. His username on BuzzData is jpmckinney.

Another day, another iteration!

[The following post is a rundown of updates from our most recent newsletter. Enjoy!] 

Highlight of the week: BuzzData on Flowing Data 

In case you missed it, BuzzData recently caught the eye of Nathan Yau, author of the popular blog Flowing Data and the new hit data-viz book Visualize This! Yau wrote a nice review of our platform, curious about whether BuzzData will forge a new path from existing or now-defunct data platforms. [Short answer: yes. Long answer: read this.] We were psyched about the coverage. Thanks, Nathan!

Development News

Over the last week, our team has really refined our dataset upload engine. Now when you upload your data, you’ll clearly see how much time is left to go in your upload and how far along it is. In addition, our ingest process has become far more flexible, so if you have some funky formatting hidden in your dataset, we’re better equipped than ever to accommodate it.

Adding links and formatting your datasets

We’ve had a couple of users recently ask about when they’ll be able to add links and headers to their datasets. Actually, you can already do this on BuzzData, using a simple web-writing syntax called Markdown.

If you’ve never tried Markdown, don’t worry, it’s easy. To add links to your dataset overview, just write it as shown:

[Image: example of Markdown link syntax]
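In Markdown, an inline link is the link text in square brackets followed by the URL in parentheses, and a header is one or more `#` characters followed by the title. The short Python sketch below simply assembles those two pieces of syntax into an overview; the helper names and the example.com URL are placeholders made up for illustration, not part of BuzzData.

```python
def markdown_link(text, url):
    """Return an inline Markdown link: [text](url)."""
    return f"[{text}]({url})"


def markdown_header(text, level=2):
    """Return a Markdown header with `level` leading # characters."""
    return f"{'#' * level} {text}"


# Build a tiny dataset overview; the URL is a placeholder.
overview = "\n".join([
    markdown_header("Sources"),
    "Data compiled from " + markdown_link("an example site", "http://example.com/data") + ".",
])
print(overview)
```

You can paste the printed text straight into a dataset overview, or skip the script and type the same syntax by hand.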

Want to learn how to add headers, fonts, images and more? John Gruber has a great primer on Markdown, as well as a funky web app for testing it out. Give it a whirl. And if you need a hand, just tap us on the shoulder at support@buzzdata.com and we’ll help you out. 

Okay, that’s it for now. See you on the site! 

The BuzzData Team 

What BuzzData will (and won’t) be

“If you want to build a ship, don’t drum up people together to collect wood and don’t assign them tasks and work, but rather, teach them to long for the endless immensity of the sea.”

-Antoine de Saint-Exupery

The BuzzData beta has been public for a few weeks now. Its general reception so far has ranged from evangelistic enthusiasm for its early activity to tentative, thoughtful speculation about its future direction. 

BlogPulse co-creator Matthew Hurst earlier this month attempted, understandably, to position BuzzData in the data value-chain alongside pre-existing startup models: “It is going to be very interesting to see how the site grows and evolves,” he wrote. “Is it a commercial version of IBM’s Many Eyes? A twist on DataMarket or InfoChimps? A re-implementation of Swivels (the YouTube of data)?”

Social datasets: so what?

A few of our early beta users have probably mulled over similar questions since we launched. Many of our early adopters (usually hackers who use Github) get it: they can see where we’re headed and dove right in, while others, likely less familiar with collaborative workflow apps like Github, might upload a test dataset, follow a couple of other users, and then think: “okay, so now what?”

BuzzData’s social features and easy-to-use UI are familiar value-adds in a post-Twitter world, but we as a team have come to realize that our True Big-Picture Mission is not nearly as easy to recognize for our early adopters. To be clear, with these social features (and many more to come), the BuzzData master plan is nothing less than to gradually infuse the data community (and beyond) with the same real-time, social, collaborative energy that revolutionized innovation for web developers a decade ago. 

What BuzzData will (and won’t) be

In answer to Hurst’s above question, BuzzData is not going to be anything like ManyEyes (or InfoChimps or DataMarket, for that matter). Sometimes I personally think BuzzData might be better described as “ManyHands” for data: as in, “many hands make light work.” 

The real vision of BD’s co-founders Pete Forde and Mark Opausky is an online hub whose purpose goes far beyond that of any static catalogue or “data marketplace,” as existing data-startups are now called. Our goal is to create a place where users — whether they’re individuals, news agencies, science labs, governments — have the power to publish, build, revise and expand existing data into information that’s more current, accurate, accessible and ultimately useful than any version of data they might create alone.

In general, data management is still a relatively isolated, esoteric process — if only someone (hint hint) was focused on connecting people more intuitively and efficiently to their data, their interests and each other, future innovation and knowledge discovery might move more quickly and reliably, while requiring less unpleasant gruntwork per individual person.

Wouldn’t that be nice?

Keeping our eyes on the prize

To improve the speed of data collaboration on BuzzData, one user recently suggested we add Google-Spreadsheet-like editing functionality to BuzzData. We definitely agree that this seems like an intuitive move, but we actually have our own plans in mind. Google Spreadsheets is great for on-the-spot, one-off group editing; we’re really bent on creating a place where the best, most current, most accurate data floats to the top, as easily accessible to its audience as it is attributable to its publisher.

That said, there are many ways to skin a cat, and problems can often be solved by multiple routes. We’re really looking forward to hearing what our users think of the route we’ve taken once it’s fully unveiled. 

Social functionality and easy dataset publishing are just Stage 1 of BuzzData’s ultimate vision. We really hope you’re enjoying it. Stay tuned, because there’s a lot more in store for you.

-Momoko Price

Got some ideas about improving data workflow? Try out the site (it’s free) and tell us your ideas at support@buzzdata.com (or feel free to bug me directly at momoko@buzzdata.com).

BuzzData Site Superstars (Vol. 1)

BuzzData isn’t just a data platform, it’s a community. As such, we’ll be regularly highlighting users who show exceptional creativity and initiative on the site. Here’s the first pair from last week (more to come): 

David Joerg  Alexander Smith 

Joerg, founder of The Data Collective, got active on BuzzData pretty much immediately, scrutinizing datasets and asking a host of intelligent questions. We love this kind of activity on BuzzData, not only because it gets people thinking, but because it prompts other users to maintain good “data etiquette,” e.g. sourcing your data, specifying header rows, explaining your data appropriately. This is something we take for granted on media sites like Vimeo, and should be actively encouraged in the only-recently visible data community.

Because Joerg knew the value of simple visualization, he graphed some of the Globe and Mail’s data, quickly finding an apparent yet-unreported spike in sugar, which he promptly notified the Globe about. Shortly after this, Alexander Smith, CEO of Graphient, added oil-price-per-barrel data to the graph to further highlight the trend. He’s since looked into possible leads for what’s behind it. You can read about the whole development in a recent post on open-data advocate David Eaves’s blog.

Perhaps one of the coolest things about this kind of activity, besides the data mashups and cross-disciplinary collaboration, is the very encouraging lift in constructive, informed dialogue with news media.

News website comment threads are often riddled with emotional and ideological blanket statements, etc. (as well as great contributor insights, let’s not forget). Data doesn’t just attract the data-literate (when was the last time you saw someone conceding in a newspaper comment thread that their convictions could benefit from a little regression analysis?). It also has the fantastic capacity to ground dialogue and keep the talk focused on numbers and reality, rather than people and beliefs. 

Here’s hoping we see more of this in the future! 

 

BuzzData’s now live!

Well, this private party’s been fun, but it’s time to stop being so coy and show the world what we’re about. The BuzzData beta is officially public, open to data lovers (and the data-curious) everywhere!

In the last two weeks, we’ve gotten some incredibly engaged and knowledgeable feedback from our private-beta users. Some of the more memorable, warm fuzzy-inducing excerpts:

“I’m sure you hear the word ‘slick’ and/or ‘sleek’ all the time and are perhaps sick of it by now. But that’s what it is, darn it!”

— “I tried uploading my massive 1,315,816-row CSV today, and it worked! :-D”

— “I kind of wished the sign-up process were more arduous just so I could fill in some more forms.  O_o  That’s some magic fairy dust, that is.”

— “So far I’ve loved what I’ve seen on the site, I’m kicking myself for not getting on there sooner”

And perhaps the most validating one of all:

“Dude, I love using this!”

We hope to get plenty more feedback as we roll out bigger features — every bit helps us build a product that genuinely meets the needs of the expanding data community. Talk to us, we’re listening. 

Curious about our latest iteration? Check it out for yourself. Here’s one fascinating dataset currently on BuzzData: annual food price indices as published by the Globe and Mail:

Below — an overview (cross-indexed by topic and licensed appropriately):

Then of course, the data itself (feel free to clone or download):

Last but not least, the dataset’s followers:

(To date, two beta users have already graphed and mashed up the indices data, unveiling a yet unaccounted-for spike in sugar prices. You can read about the implications of this collaborative investigative effort on open-data advocate David Eaves’s blog today.)

Intrigued? You should be. And now that we’re live, you can invite your friends and colleagues to check out the site, too; no invite code required. What are you waiting for? 

A few small caveats to consider while we’re in public beta:

— This is still a beta, so there will be bugs here and there (let us know when you come across bugs, we’ll tackle them ASAP.)

— As a beta, we’re still fine-tuning site accommodation in different browsers. BuzzData works by far the best in Chrome, does well in Firefox 5 and Internet Explorer 9, and is functional in Firefox 3.6 and Internet Explorer 8.  

— We’re still sticking to tabular data (csv, tsv, and simple xls) for now. More to come, we promise.

Within the next day or two, we’ll also be rolling out new features that will let you reap the benefits of the platform and get more seamlessly connected to your existing social circle.

No more faceless emailing: BuzzData is giving data users the visibility and voice (and credit) they need (and deserve).  

Embracing the end of the ‘end user’

In step with our imminent public beta launch, BuzzData has recently been written about in VisionCloud, an EU-funded project that focuses on innovations “for the future Internet.” We met VisionCloud contributor and information architect Mirko Lorenz at the Open Knowledge Conference earlier this summer. Lorenz, a speaker at OKCon this year, has high hopes for BuzzData’s impact on data journalism. We hope we can deliver. 

Below is an excerpt from Lorenz’s interview with BuzzData CTO Pete Forde. You can read the whole piece on VisionCloud.

At OKCon, the big open data gathering in Berlin at the beginning of July, we presented our ideas and concepts related to future cloud storage, data handling and data-journalism in particular. 

This is how we met Pete Forde, co-founder/CTO of BuzzData.

“Do you want to see what we have been working on? I think we solved a few of the problems you were just talking about.”

The next minute, on the back of a Biergarten table, Forde briefly demo-ed BuzzData, a soon to be launched platform enabling collaborative data interrogation. The system takes the open-data approach further, overcomes limitations of platforms such as Google Docs and could spark interesting collaborations in communities around the world. 

An age without end-users

BuzzData is addressing a larger theme evolving around the web, open data and new uses of all the tools that are now available: Effectively it allows you to take a data set, publish it and then dig into the information concealed in the figures in public, alone or by sharing it with others. Datasets can be copied as a clone, thus opening many new ways to play with them.

The service fills a need that is gaining importance. It is increasingly important to know how numbers affect our daily routines. This is not confined to a single area of life; many areas will be affected: Business, government, health, your work, your community.

Journalists and media companies are among the first to feel the growing pressure to make use of such new possibilities. Jeff Jarvis, journalist, professor and book author, for example, says that in the future media world “the article will be luxury”. Instead we will see a process in which journalists and users work together to really find out about a problem or development affecting the community.

Every user is the start of something new

On the IT side of things, an interesting article addressing another angle of this change says: “There are no ‘end users’ anymore. With good BI, and especially with newer business discovery or self-service tools, no user is at the ‘end’ of anything. Every user is the start of something new.” (Source: Information Management)

Interview with Pete Forde:

Can you briefly describe the benefits of BuzzData?

Forde: BuzzData treats datasets as destinations where a community of interest can form. People used to hunting for data in a vacuum will love being able to discuss and annotate datasets. They can attach articles, visualizations, apps and even source code.

Meanwhile, they see that they are getting timely, accurate, complete data with proper licensing straight from the publisher. There’s no scraped data on BuzzData. Publishers love it because they can finally see who is interested in their data, and what they are doing with it.

How did you get the idea?

Forde: I was writing a book about open data and how it should be for all people, and how we could use it to fix some of the world’s problems. I was working on a related project that got me really interested in the open data movement, right around the time startups like InfoChimps were being announced. To me, it seemed like the data marketplaces were missing an obvious opportunity. Sure, some people want to buy datasets from a cart. However, the biggest problem in data today isn’t finding it, but connecting the communities and educating the public. There’s a BuzzData-shaped piece of the data value chain missing that’s obvious, if you’re looking.

Was it difficult to get support or funding for this?

Forde: We raised an angel round from four exceptional Toronto investors. It was exceptionally difficult! There are not many active angels in Toronto, unfortunately. We scored a major coup when I recruited Mark Opausky to be our CEO. Mark built a $50M software company before we met, and so I get to learn from the best.

If you could make a wish: Which kind of users should use BuzzData?

Forde: I think that initially it’ll be very popular with journalists, bloggers and data hackers. However, we’re working very hard to make sure that we’re solving huge problems for scientists and academics. I have a crush on all librarians (it’s the glasses) so I’m making extra sure to think like an archivist when we design our features. Ultimately, I’d love to be responsible for seeing a dataset homepage manifest in Google search results right beside Wikipedia.

Then we’d have everyone using BuzzData, regardless of their tech ability.

That’s my dream.

Read more at VisionCloud.

Data-driven journalism, done faster

From the start, we went out of our way to enlist the participation of groups and businesses for the BuzzData beta — after all, BuzzData is all about improving group collaboration around data, right?

Having said that, bringing businesses on board at the beta stage, let alone post-commercial launch, is no small feat for a funky, outside-the-box app like BuzzData. The concept of open data is still relatively new, and simple workflow tools for data wrangling and sharing are rare. Finding organizations that were hip to the movement and up for trying a new, untested digital app was a fun challenge, needless to say.

Lucky for us, a small number of influential, forward-thinking organizations came forward to test the beta right from the start, including:

The Economist Intelligence Unit 

The Globe and Mail (Canada’s national newspaper)

Global News (Canadian broadcast and online news)

The City of Vancouver

And while the beta’s only been active less than a week, we’ve already witnessed instances of unscripted cross-pollination between media, government and data-literate citizens. This is hugely exciting to us. 

The Globe and Mail’s account in particular, hosted by Toronto Hacks/Hackers organizer and Globe mobile editor Mason Wright, has been off to a promising start, largely because Wright clearly gets the give/take aspect of social networking, posting Globe articles to other users’ data and making an effort to put the Globe’s data in context with accompanying articles and visualizations.

It’s fascinating to watch this happen in the context of data. We’re so used to static catalogues and repositories that appear to move at a glacial pace. In contrast, on BuzzData you tell a user something — whether it’s your best friend or a national newspaper — and they talk back to you as a visible, dynamic, listening entity, a single degree of separation away. Not a new phenomenon to social media, certainly, but a refreshing change of pace for data communication. 

As an example, last week the Globe uploaded food price indices data as an accompaniment to a recent Report on Business article. The article itself focused on short-term food prices, but New York-based beta tester David Joerg took the data and, by simply plotting the data over time in Excel, uncovered a startling spike in sugar prices no one had yet noticed: 

Even Wright was surprised to see this. So the question remains: what’s driving the price inflation of sugar? Perhaps Joerg’s cursory data-viz will trigger an entirely new business investigation by the Globe in the near future. That would be incredibly cool, and a truly unique example of collaborative data journalism — one that, in an instant, transcended national boundaries and professional disciplines.

Not bad for the first five days of a beta. 

Conquering the BuzzData Kanban

You may remember a few weeks back when our back-end developer John McDowall introduced Taiichi Ohno’s workflow management system, “Kanban,” to the BuzzData office.

Kanban-ifying the office in tandem with implementing our new project manager Sarah as “Github ticket Gatekeeper” (seriously, don’t f_ with her post-its) has done wonders for our product development and workflow. We’re thinking bigger and getting better each day. The original KB system we had is practically a disgrace compared to the well-oiled machine we’ve got going now: 

SHABAM! Seeing that glorious vacant space (that was once covered in tickets) felt pretty good yesterday. We’re officially in solid shape for the beta! 

Awwww yeah. I think everyone in the office took pics of this. 

Of course, we still have plenty to do for the commercial launch:

I guess this is just what happens when you have a development team of perfectionists and a management team on a mission to change the world …