Here are the transcript and slides from the talk I gave this morning at OpenCon 2014. I was a little nervous as to how well this would be received -- nothing like challenging the meaning of a word that makes up the title of the conference.
This is one of my most popular tweets:
Openwashing: n., having an appearance of open-source and open-licensing for marketing purposes, while continuing proprietary practices.
It hasn’t gone viral by any means. But the two-and-a-half-year-old observation is resurfaced and retweeted pretty regularly.
I think the tweet resonated in part because we readily understand what “open washing” means through what we know about the word’s antecedents: “greenwashing,” “pinkwashing," “whitewashing." We recognize with these terms that industry forces are quick to wrap themselves in language and imagery in the hopes it makes them appear more palatable, more friendly, more progressive. More “green,” for example, more “open.”
My tweet also gets at some of the frustrations that many of us experience when we see the word “open” used to describe things we feel are not “open” at all. It’s a reflection of the ongoing challenges — conflicts even — that any “open” movement faces both internally and externally, as to what exactly is meant when that word is used.
And that’s the thing. The definition and designation of “open” is fraught. Incredibly so. Even among those of us who consider ourselves advocates for openness in some form or another, we still scrap over what counts as really truly “open.”
In fairness, my tweet about “openwashing” wasn’t aimed at the debates about AGPL3 or Attribution-Non Commercial. It was a subtweet, if you will, a reference to the learning management system Blackboard’s acquisition of Moodlerooms and Netspot, two companies that help provide support and deployment services for schools that use the open-source LMS Moodle. "Ours is no mere dalliance with open source,” the company said. “Openwashing,” I muttered under my breath.
Blackboard is hardly alone here. In education technology -- my field, that is -- I can list for you any number of examples of companies and organizations that have attached that word “open” to their products and services: OpenClass, a learning management system built by Pearson, the largest education company in the world and one of the largest publishers of proprietary textbooks. I don’t know what “open” refers to there in OpenClass. The Open Education Alliance -- an industry group founded by the online education startup Udacity. I don’t know what “open” refers to there in the Open Education Alliance. The startup Open English, an online English-language learning site and one of the most highly funded startups in the last few years. I don’t know what “open” refers to there in Open English.
All these append “open” to a name without really even trying to append “openness,” let alone embrace “openness," to their practices or mission. Whatever “openness” means.
Let me repeat that, because it’s important: whatever “openness” means.
Does “open” mean openly licensed content or code? And, again, which license is really “open”? Does “open” mean "made public"? Does “open” mean shared? Does “open” mean “accessible”? Accessible how? To whom? Does “open” mean editable? Negotiable? Does “open” mean “free”? Does “open” mean “open-ended”? Does “open” mean transparent? Does “open” mean “open-minded”? “Open” to new ideas and to intellectual exchange? Open to interpretation? Does “open” mean open to participation — by everyone equally? Open doors? Open opportunity? Open to suggestion? Or does it mean “open for business”?
That’s the problem. “Open” means all those things. And on one hand, multivalence is good. Having many meanings, many interpretations can be a strength. On the other hand, it’s a weakness when the term becomes so widely applied that it is rendered meaningless. I worry often that that’s what we’re faced with. “Open” has ended up being a bit like Supreme Court Justice Potter Stewart’s famous assertion that “I know [obscenity] when I see it.” That is, we hear a lot of “I know ‘open' when I see it” sorts of claims. If those of us who work within “open” efforts cannot always agree on what that adjective means, how do we expect others to? Should we expect others to?
I’ve actually come to believe, in the two plus years since I tweeted my critique of “openwashing,” that the answer here isn’t actually a clearer definition of “open”; the answer isn't more fights for a more rigid adherence to a particular license, good grief no.
I think the answer is more transparency about our politics. I think, in fact, the answer is politics.
We act -- at our peril -- as if “open” is politically neutral, let alone politically good or progressive. Indeed, we sometimes use the word to stand in place of a politics of participatory democracy. We presume that, because something is “open,” it necessarily contains all the conditions for equality or freedom or justice. We use “open” as though it is free of ideology, ignoring how much “openness,” particularly as it’s used by technologists, is closely intertwined with “meritocracy” -- this notion, a false one, that “open” wipes away inequalities, institutions, biases, history, that “open” “levels the playing field.”
If we believe in equality, if we believe in participatory democracy and participatory culture, if we believe in people and progressive social change, if we believe in sustainability in all its environmental and economic and psychological manifestations, then we need to do better than slap that adjective “open” onto our projects and act as though that’s sufficient or — and this is hard, I know — even sound.
I want to make an argument here today that we need to be more explicit about these politics. We can’t pretend like “open” is going to do that work for us. In fact, we need to recognize: it might not be doing that work at all.
In particular, I want to look at how “open” is invoked around education data, and I want to suggest that instead of a push for more “open data” in education, we need -- and this is a phrase I am borrowing from Utah Valley University researcher Jeffrey Alan Johnson -- to push for “information justice.”
When we talk about "opening" education data, I'd argue that we always have to tread very carefully. Education data lives in this tricky and powerful in-between space, as it is both-and. That is, it is often data generated at and collected by publicly-funded institutions. It is also deeply personal data, if not legally protected private data. Furthermore, the data that is collected often fulfills institutional needs, rather than learners'. That collection is often compelled, for reasons that might be progressive, and for politics that might not be.
And now, thanks to the proliferation of educational technologies, the sorts of data and the compulsions to collect it are increasing.
The push for more education data collection is not new. Not remotely. The National Center for Education Statistics has existed since 1867, when Congress passed legislation providing “That there shall be established at the City of Washington, a department of education, for the purpose of collecting such statistics and facts as shall show the condition and progress of education in the several States and Territories, and of diffusing such information respecting the organization and management of schools and school systems, and methods of teaching, as shall aid the people of the United States in the establishment and maintenance of efficient school systems, and otherwise promote the cause of education throughout the country.” Over a hundred years before there was a Department of Education, that is, the federal government was collecting education data.
As such, local, state, and federal governments, along with educational institutions themselves, have long tracked “data” about students. Since the advent of No Child Left Behind under George W. Bush, data collection has become part of a larger disciplinary effort, to identify and punish “failing schools.” And under Barack Obama’s Race to the Top policy, the data collection has only continued, an effort that dovetails quite nicely with schools’ increasing adoption of computer technologies and, as such, students’ increasing generation of “data exhaust.”
The current administration is interested in more than just data at the school, district, and state level. It’s actively promoting the collection and analysis of student data at the individual level, arguing that if we just have more data -- if we “open up” the classroom, the software, the databases, the educational practices -- we will unlock the secrets of how every student learns. We can then build software that caters to that, something that will make learning more efficient and more personalized -- or that’s the argument at least. We should remember that this is mostly speculative. And we should recognize here that words like “personalization” function much like “open.” That is, they sound great in press releases, but they should prompt us to ask more questions rather than assume that they’re necessarily good.
In 2012, the Department of Education announced the Education Data Initiative, part of the larger Open Data Initiative that in its words will “'liberate' government data and voluntarily-contributed non-government data as fuel to spur entrepreneurship, create value, and create jobs while improving educational outcomes for students." That is, "open education data" isn't simply about citizens reviewing the success or failure or funding or outcomes of schools. It's not about shifting power, thanks to "openness," from the federal government -- those data hoarders -- to the people, to communities. To teachers, parents, students.
It is, however, a shift in power.
The push to “open” more education data has happened at the state level too. With a nod from the Council of Chief State School Officers (that is, an organization of state superintendents of education which has also been a major strategic proponent of the recent Common Core State Standards), and funded with $100 million from the Carnegie Corporation and Gates Foundation, the Shared Learning Collaborative — later rebranded to inBloom — launched in 2011, promising to create a massive warehouse of student data that would be “open” to third-party developers.
The infrastructure would be open-source, replacing what is, in so many cases, an ailing infrastructure of often proprietary databases, applications, and systems that many school districts work with to manage students’ records.
And here, immediately, we can see some of the problems with “open.” Because the code for inBloom was meant to be open source, it does offer some leverage against the proprietary infrastructure that most schools are saddled with: Pearson PowerSchool or eScholar for starters. Ideally, thanks to open source, any school could install the inBloom codebase and be free of the inBloom organization and all its attachments to News Corp (that's who wrote a great deal of the code), to the Gates Foundation (that's who funded the project), and so on.
But then what? Open source doesn’t actually get us out of the conundrum that is education data collection. Open source doesn't opt you out of reporting mandates, for example. Indeed, “open” might put us farther into the weeds.
InBloom’s data specification included hundreds of data points about students -- enough to make parents and privacy groups balk at exactly what was being collected and shared and why. It probably didn’t help that some of the development work was done by Wireless Generation, a company that had been acquired by News Corporation -- right in the middle of that company’s phone hacking scandal. And it probably didn’t help when those in education technology made ridiculously triumphant claims about all the data-mining they plan to do.
Take, for example, the CEO of Knewton, which is a company that promises to take student data and provide “adaptive” pathways through textbook lessons, pronouncing that “We literally know everything about what you know and how you learn best, everything.” Knewton boasts that it gathers millions of data points on millions of children each day. He calls education “the world’s most data-mineable industry by far.” “We have five orders of magnitude more data about you than Google has,” the Knewton CEO said at a Department of Education “Datapalooza” event. “We literally have more data about our students than any company has about anybody else about anything, and it’s not even close.”
The argument -- espoused by the Department of Education, handily doing the bidding of administration and administrative fetishes for data as well as the bidding of education technology companies like Knewton and inBloom and others -- the argument is that more data works in the service of “better education,” that the problems that schools have long faced stem, in part, from a failure to collect and make use of data.
Data is kept in silos — in spreadsheets, in student information systems, in handwritten grade books — so the story goes (I believe that story), and therefore there hasn’t been a way to understand each child (that's bullshit), to see a full data profile of a particular student, let alone create algorithms and software best suited to move that student through school.
Again, the collection of education data isn’t new. Indeed, inBloom used a data model that was based in part upon SIF — the schools interoperability framework — a specification that is over a decade old. What was new here was the push to have this data be “open” more easily to third party developers and not simply the one company that won the contract for the student information system and the government.
But to challenge inBloom and others in education technology who are interested in educational data collection and data-mining, we need to do more than raise red flags about privacy. That's been the loudest complaint. A parent-led effort did just that, successfully organizing protests in the states and districts that were piloting the inBloom technology. One by one, these customers backed out. Louisiana. Colorado. New York. Illinois. By April of this year, inBloom had no customers left, and it announced that it was closing its doors. $100 million. For what it’s worth, some of the code is available on GitHub.
But I want to raise more questions about the data itself. Data is not neutral. Data -- its collection, storage, retrieval, usage -- is not neutral. There can be, as Jeffrey Alan Johnson argues, “injustices embedded in the data itself,” and when we "open data," it does not necessarily ameliorate these. In fact, open data might serve to reinscribe these, to reinforce privilege in no small part because data, open or not, is often designed around meeting the needs of businesses and institutions and not those of citizens, or in this case students.
What “counts” as education data? Let’s start there. What do schools collect?
As I said earlier, the inBloom data spec included hundreds of data points. A small sampling: Academic Honors, Attendance Type, Behavior Incident Type, Career Pathway, Disability Type, Disciplinary Action, Grade Point Average, Incident Location, Personal Information Verification Type, Reason for Restraint, Eligibility for Free or Reduced School Lunch, Special Accommodation, Student Characteristic, Weapon Type.
I think it’s clear, as I list these, that the moments when students generate “education data” are, historically, moments when they come into contact with the school and more broadly the school and the state as a disciplinary system. We need to think more critically, more carefully about what it means to open up this data -- data that is often mandated by the state to be collected -- to others, to businesses. Again, is “open data” about liberating data, as the Department of Education suggests, "to spur entrepreneurship, create value, and create jobs while improving educational outcomes for students”?
As Johnson argues, “the opening of data can function as a tool of disciplinary power. Open data enhances the capacity of disciplinary systems” — and school certainly functions as one of those — “to observe and evaluate institutions’ and individuals’ conformity to norms that become the core values and assumptions of the institutional system whether or not they reflect the circumstances of those institutions and individuals."
Did you speak out of turn in class? Are you a child of an illegal immigrant? Did you get written up for wearing a halter top? Are you pregnant? Did you miss school? Why? Why? Why?
What classes did you take? What grades did you make? Why? Why? Why?
(Is the answer to "why" a data point? And — here’s the rub — is that “data point” ever connected to an ethics of care or a sense of social justice?)
Education data often highlights the ways in which we view students as objects, not as subjects of their own learning. I’ll repeat my refrain: education data is not neutral. Opening education data does not necessarily benefit students or schools or communities; it does not benefit all students, all schools, all communities equally. Open source education data warehouses are not neutral. And similarly, the source code does not benefit students equally.
If we are to move, as Johnson suggests we do, from “open data” to “information justice,” we cannot depend on technology alone. Nor can we rely on that word “open” to serve as the metric by which we evaluate our practices and policies. This isn’t an argument for “closed” or “proprietary” systems. Not by any stretch. It’s an argument for building capacity and agency. We need to consider, for example, what data looks like in communities' hands, in students’ hands, what information students would want to collect on themselves, for themselves, who they would want to share it with and why. And in doing so, we need to recognize the messiness of our learning — of our data — and not normalize that for the sake of analysis, not open it -- counterintuitively I recognize — for the sake of control.
Read this way, "openwashing” signals something else. Something I find just as frightening as a corporation’s invocation of “open” as an adjective to describe their latest, clearly “not open” project.
What happens when something is “open" in all the ways that open education and open source and open data advocates would approve? All the right open licenses. All the right levels of accessibility. All the right nods from all the right powerful players within “open.”
And yet, the project is still not equitable. What if, in fact, it’s making things worse?
What are we going to do when we recognize that “open" is not enough? I hope that we recognize that what we need is social justice. We need politics, not simply a license. We need politics, not simply technology solutions. We need an ethics of care, of justice, not simply an assumption that “open” does the work of those for us.