As a self-professed metadata geek, I’ve recently been participating in an online discussion about metadata and the Learning Registry. I have to say, it feels as if I’m on a merry-go-round that won’t stop, because for the past 10 years I’ve engaged in dozens if not hundreds of conversations about the use of OER (open education resources) metadata concerning these same issues: Do we need it? How should it be licensed? Who owns it?
Metadata are data used to describe attributes of a resource. Take, for example, the side of a Starbucks cup, which features checkboxes for customer preferences, such as type of milk and number of espresso shots. The use of those "metadata" ensures that each product conforms to the customer’s specification (such as grande, double, skinny, no foam). In the case of digital libraries, metadata are used (although sometimes invisible to the user) to allow users to discover resources by their attributes. Basic metadata might include title, author, and grade level, but metadata can take on more significant meaning once they are enhanced, refined, rated, reviewed, aligned, and associated with user-generated content and use patterns.
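For readers who think in code, here is a minimal sketch of what "discovering resources by their attributes" can look like. Everything in it is an assumption made for illustration: the `ResourceMetadata` record, the `find_resources` helper, and the three sample entries are invented here and do not reflect the actual OER Commons schema or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceMetadata:
    """Hypothetical record holding only the basic fields named above."""
    title: str
    author: str
    grade_level: str


def find_resources(records, **criteria):
    """Return the records whose attributes equal every given criterion."""
    return [
        record
        for record in records
        if all(getattr(record, field) == value for field, value in criteria.items())
    ]


# Made-up sample catalog, for illustration only.
CATALOG = [
    ResourceMetadata("Intro to Fractions", "A. Rivera", "Grade 4"),
    ResourceMetadata("Photosynthesis Lab", "J. Chen", "Grade 7"),
    ResourceMetadata("Fractions in Cooking", "A. Rivera", "Grade 7"),
]

if __name__ == "__main__":
    # Discover resources by a single attribute, then by a combination.
    for match in find_resources(CATALOG, grade_level="Grade 7"):
        print(match.title)
    for match in find_resources(CATALOG, author="A. Rivera", grade_level="Grade 4"):
        print(match.title)
```

Even in this toy version, a search is only as good as the descriptions behind it: a resource with a missing or inconsistent grade level simply never shows up.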
As president of ISKME, the Institute for the Study of Knowledge Management in Education, which produced the first open education resource library, OER Commons, I’ve watched the sentiment on metadata go in and out like the tides. At first, OER metadata was coveted as the secret sauce necessary to keep resources searchable and discoverable. Back in 2004, with a grant from the William and Flora Hewlett Foundation, our research team spent over six months analyzing the metadata of various content creators, such as MITE, Connexions, Sofia, and Utah State University, to create a map of what was unique across them. Then we looked at several content standards bodies (Dublin Core, IEEE, etc.) to see what was common, unique, and cutting edge. From that research and analysis, the structure of OER Commons was born.
Over the years, we have actively curated content, shared resources, and nurtured partnerships that would enable us to build a commons for all educators and learners. This has required countless hours of refinement, enhancement, and technical tutoring for brilliant content creators who just wanted their work to be seen and shared by others. We realized that if OER were to be widely used, it would need to be described well, and as such, we had to make an investment in resource description.
Next came the "naysayer period" of metadata, when some argued that controlled vocabularies and rich descriptions would be rendered meaningless as new AI and machine learning techniques moved to center stage. From our perspective, however, working on the ground with teachers for the past decade, we keenly understood how teachers used terms to find resources, so we struggled to keep the conversation going about the importance of metadata. Our goal was to keep a vibrant ecosystem of teachers and learners engaged in this eccentric thing called OER.
Then, metadata became important again, but this time, on the sly. For example, Google convened daylong meetings to understand what we did with OER metadata, and they realized, "Wow, that’s a lot of work."
And we’d respond, "Ask any librarian: of course it’s work to curate meaningful, high-quality resources."
In other cases, various people and organizations asked to use our metadata to experiment with building their own tools and services. We felt this was kind of cool because the field was still in its infancy, and experiments with recommender systems and the like were just beginning. Then came other repository-builders who used our metadata to duplicate and build on what we’d done. Even though we would have liked to get some credit (the reuse was frequently done anonymously), it was exciting to witness the uptake of OER that was taking place.
It was around this time, a few years back, that conversations about the licensing of OER metadata got started. While Creative Commons' legal team and others argued that one could not license metadata at all (which was somewhat true depending on how you define metadata, and mostly not true in countries outside the U.S.), sustainable business models in OER began to emerge.
Then, fast forward to a more robust OER environment, where people are actually using and reusing the stuff (!). Add in terms like paradata, descriptive data, and resource data, and it all starts to get a bit confusing. Why do I say this? Let’s face it: there are more than a few elephants in the (class)room.
Organizations like ours (and we are not alone) are having second thoughts about how much metadata we share and with whom. Some are becoming skittish, because the tides have turned once again and quality metadata is in high demand. We decided at OER Commons, for example, to place an "all rights reserved" notice on our website (meaning, you can’t scrape metadata and reuse it), and a license for non-commercial use on our metadata, with the goal of working with partners who desire something more than faux collaboration.
Of course, some would counter that we should not be in the business of "open" if we aren’t willing to share our metadata openly, given that the OER themselves are licensed with varying flavors of open. Aside from the fact that "open" licenses do provide for non-commercial use of metadata, our goal is to increase access to education for any and all aspiring learners, putting the needs of the learner above all else. Meaning, we do not willingly participate in sharing our metadata if the eventual cost of accessing a service based on our metadata (and hence, resources) could be passed on to the learner.
That being said, we do openly share a portion of our metadata in a variety of ways. For example, we share metadata related to the Common Core State Standards in the Learning Registry. When founder Steve Midgley first asked us to participate in the Learning Registry (a truly innovative attempt to create an exchange of resource metadata for the good of education), we heartily agreed, because for the past 10 years we have enjoyed putting ourselves out there in experimental pools, just for fun and to help realize the potential that OER could bring to education. And we anticipate that we will continue to do so if we can license the metadata for non-commercial use.
However, at the same time, we have begun to witness a growing concern: a wave of open source pillaging. The power of open source software has been the ability it gives people to build on code to create something better, different, or new. But so far, about 90% of the reuse of OER metadata I have seen in action (not in theory) is about commercial publishers looking to resell it, disguised as a service. If you don’t believe me, see how our non-commercial resources are used inappropriately here. That is not the spirit of OER as some of us intended it. To me, OER is about access to education for all, in the public domain, forever, for free. It is not just an enticement to get something for free and then later be seduced into coming back to buy the more 'high quality education.' After all, isn’t that what we have been trying so hard to change in U.S. education, where typically those who can pay end up with better access?
As for efforts like the Learning Registry, it seems very likely to me that an effort like that will end up serving the commercial sector first and foremost. Do I applaud the efforts? Yes! Keep innovating, everyone. Let’s just call it like it is.
In other words, I’m not at all opposed to commercial for-profit efforts, but let’s just not pretend that these are something else. What we are seeing more frequently these days is an OER storefront, supporting a freemium model with something that isn’t even theirs. It’s as if Barnes and Noble were to invite the local public library to set up a display in the front of the store, so when you first walk in you see this terrific selection of highly curated books, serving as a public good. But then when you step past the facade, you see it’s just provided as an entryway to the commercial store, akin to using OER as a marketing mechanism for a future sale. As a colleague said to me, "There is nothing wrong with elephants in the room, they just shouldn’t be feeding at each other’s troughs without an agreement to do so."
In conclusion, our decision to share metadata with a non-commercial license is motivated by our commitment to provide education as a public good while maintaining the ability to create our own sustainability models to do so. And as such, we want to ensure that barriers to access are not erected in the name of open education.