— Adam Machanic (@AdamMachanic) February 23, 2017
Yesterday Adam Machanic called for new #tsql2sday hosts. I applaud his continued efforts organizing this monthly blog party. It inspires new bloggers, gently guilts those who have lapsed, and provides a topic and schedule for the uncertain. But it routinely surprised me that there was no canonical list of past events with a link to the current topic.
There was of course Adam’s debut post and the recent rules of engagement update, but though some folks owned relevant domains, those domains went nowhere. The closest we had to an official archive was Steve Jones’ T-SQL Tuesday Topic List. I’m grateful he was diligent in maintaining this, but its existence hinted at a problem. I appreciate that the event is supposed to be a little organic, driving blog and Twitter traffic with hashtags and attention, but this comment by Robert Bishop sums up a friction that might have been limiting its growth:
So is the twitter tag #tsql2sday the best way to learn about the next blog party? I always seem to find out about them after the fact and can’t get a blog post out in time.
That’s great, but as data professionals, we should want this history in some sort of structured data format. Which brings us to Adam’s above “challenge” to me.
For some time, I’ve wanted an easy way to host structured data (for free or very cheap) in a public way, where the community could contribute and discuss suggested changes, but where the final editing decisions were made by trusted caretakers. Wikipedia comes close, but (1) it has rules on the type of content which can be hosted, (2) it’s a little too open for my tastes, and (3) it presents data in a human-readable format, not a machine-readable one. Hosting my own wiki would solve (1) and (2), but not (3), and it introduces a new issue: (4) I have to become a wiki software host. There are some free or cheap third-party wiki hosting services, but my quick review didn’t find anything that was machine-readable.
I would love to see a SaaS (software as a service) version of this: a hosted, structured database which would have a GUI for admins to design and configure the database, an API layer to extract information, a nice-looking GUI for the public to review the data, a GUI for contributors to suggest edits (complete with discussion and maybe voting), and a GUI for moderators to approve them. If there’s something like this and I’ve missed it, oops. Let me know in the comments. But I don’t know of one, and my initial searches didn’t find anything, so I thought about how I could make some of this happen.
GitHub is Git repository hosting as a service. Git is designed as a source code control system, but it stores pretty much any data. GitHub has made a name in the industry with its free hosting of open source projects. Put this together and GitHub is a free host of pretty much any public data. But that’s not all. It also has a way to control and delegate editing access. Check! And it has a GUI for contributors to suggest edits (pull requests). Check! You can host data in any format, which can include CSV, JSON, or XML, i.e., structured data. Check! If only there were a GUI for users to conveniently review the data.
So that’s the challenge: create a GitHub repository which stores T-SQL Tuesday data in JSON format, T-SQL which can be used to review and adjust the JSON-formatted data, and a web page which consumes and displays that data cleanly. And here’s the proof of concept repository and its visual display.
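To make the idea a little more concrete, here is a minimal sketch of how a consumer might read such a JSON file and find the current topic. It is only an illustration under stated assumptions: the field names (`number`, `date`, `host`, `topic`, `url`), the helper functions, and the two placeholder records are hypothetical, not the schema or contents of the actual repository.

```python
import json

# Hypothetical file contents. The field names and the records below are
# placeholders for illustration only; they are NOT real T-SQL Tuesday data
# and NOT the schema used by the proof of concept repository.
EVENTS_JSON = """
[
  {"number": 2, "date": "2000-02-08", "host": "Example Host B",
   "topic": "Placeholder topic B", "url": "https://example.com/b"},
  {"number": 1, "date": "2000-01-11", "host": "Example Host A",
   "topic": "Placeholder topic A", "url": "https://example.com/a"}
]
"""


def load_events(text):
    """Parse the JSON text and return the events sorted by event number."""
    events = json.loads(text)
    return sorted(events, key=lambda event: event["number"])


def latest_event(events):
    """Return the event with the highest number, i.e. the current topic."""
    return max(events, key=lambda event: event["number"])


if __name__ == "__main__":
    events = load_events(EVENTS_JSON)
    for event in events:
        print(f'#{event["number"]}: {event["topic"]} (host: {event["host"]})')
    print("Current topic:", latest_event(events)["topic"])
```

In a real setup the text would come from the file in the repository rather than an inline string, and a pull request that adds a new record would be the "suggested edit" that a caretaker reviews and merges.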