Branch (branchandroot) wrote 2012-07-09 01:26 pm
Entry tags:
Incandescent Outrage, Film at 11
You know what infuriates me most about AO3 (today)? The more I read the little things that wranglers anonymous and not are stepping up to tell us, the clearer it becomes that it could work. I don't mean that in a fuzzy procedural way, either, I mean the actual structure of the archive is completely compatible with changing the archive over to canonical, navigable tags and usable fandom hierarchy navigation and not making the wranglers do every damn thing. IT COULD BE DONE RIGHT NOW. Most of the structure is already in place, it's just completely invisible to the users!
It would not take any extra hand-work on the part of the wranglers. It would not require more downtime than any other commit, or break existing functions. The big change would not even be a very difficult bit of code to write! (Terrifying, perhaps, but not difficult.)
How, you ask? Let me tell you, because the top of my goddamn head is about to blow off with the force of my indignation over the pointless ideological stonewalling that's stopping the archive in its tracks!
1) The absolutely necessary first step, on which all else is predicated, is that the public voices of the Archive must speak up to say that Responsible Parties were, after all, mistaken and that the tagging system needs to change. Announce what the changes will be, and a loose timeline of when, so users can take whatever action seems wise to them (e.g. doing the canonization of their stories themselves so there are no mistakes). Apologize for taking users down a dead-end path for so long and explain the logic behind the upcoming changes (i.e. readers actually being able to find the authors' stories and wranglers not dying of overwork as the archive grows). Admit that fixing things will inevitably cause some new mistakes, and surface old mistakes. Refrain from any \o/ whatsoever. Do it all again in email.
2) Next step. Make a "request new canonical tag" page. It should be quite simple, as close to one-click as can be. One text-field for each tag type (Fandom, Character, Relationship, Genre/Flavor/Whatever). If Character or Relationship is filled in, a prompt comes up for what Fandom this should be a child of. If a Fandom is filled, a prompt should come up asking for any character/relationships the suggester can think of offhand to populate it with. I'm thinking it should only be visible to logged in users; anything else is spambait. On the wrangler end, this should be equally one-click, as they review requests. Prospective buttons: Approve (immediately creates canonical), Request Review (pops up a flag in whatever task-flow forum exists, asking another wrangler or maybe staff for another opinion), Approve-Needs Forming (creates canonical which the wrangler may alter the phrasing of and optionally pops up a flag asking people to help find synonyms among the non-canonicals to wrangle into the new canonical), Reject (pops up a "reason" form which sends an email to the suggester with the reason filled in; also adds suggestion to blacklist table, with reason; previously blacklisted suggestions do not go through, just pop up a page with the reason for rejection [malformed, malicious, etc.]; the email and this form should both have a link to Support, in case someone wants to argue or get clarification).
3) Next step. Edit the posting form so that Fandom, Character, and Relationship fields can only be populated from the canonicals (possibly re-using the code for selecting a collection name). Fandom(s) remains the only required tag and must be entered first, to create the pool of Character and Relationship options; a warning to this effect should come up if someone clicks into those fields without entering a Fandom. The only other new bit of code required would be a new field for Genre/Flavor/hold a user poll to decide what to call this one. The place where all the No Fandom canonicals will go, at any rate. The Additional tags field can remain as is, in all its freeform glory, perhaps with a note to the effect that Additional tags will not be wrangled, now. Add a prominent link to the "request new canonical" form, and links beside each tag field that will lead to a page of that tag-type for the fandom(s) entered, so people can check what's available instead of having to guess forever. One new note, one new field, four new links, three altered field types. That's it! Test it and send that puppy live. Now all new stories will be in the appropriate format, working off the already existing canonical tag structure and requiring no further wrangling.
4) Adjust the import function to look for and set Fandom first, and then try to match any other tags discovered to the fandom's canonicals pool, so as to allow as much of the filling-in process as can be done. Results will probably be about the same as they are now.
5) Write (another) filter sidebar in which only the canonicals show. A fandom index might show Characters, Relationships, and Genre/Flavor canonicals. A character index might show Relationships that character appears in and Genre/Flavor canonicals. A relationship index might also show associated Genre/Flavor canonicals. None of them should show Additional tags at all; instead perhaps there can be a link to the Tags section and a link to the Search form. Revise the in-menu "or" function so that it usefully searches for "character X OR character Y" AND "flavor A OR flavor B" instead of just "anything with X or Y or A or B", which is useless. All of these will be very straightforward MySQL queries, instead of monstrous, ass-end-to walker functions. Hold the execution of this until tags are retrofitted.
6) Prep work done! On to the nerve-wracking, if not really difficult, part. Write a query to replace every non-canonical tag id with its canonical version in the story table of the database. Depending on the database structure, this could be as simple as "UPDATE story JOIN tags ON story.tag_id = tags.tag_id SET story.tag_id = tags.canonical WHERE tags.canonical <> ''". Take a good drink to settle your nerves. Clone the story table. Run the query on the clone; this will probably take a few days to get through. Make a news post that The Time Has Come. Take another drink. Shut down the posting form, import any new entries since cloning, swap the names of the clone and the live tables, revive the posting form, now breathe. That should take maybe fifteen minutes, if people prepare beforehand, and that's allowing a margin for a mild case of hysterics or two. If you want to close posting during the replacement process, you won't even have to do the import step. Lo, the archive is running seamlessly on all canonicals! Take another drink.
7) Now it's time for clean-up. Write a query to delete all empty and/or non-canonical tags. Write another to find all stories with character/relationship meta-tags, along with their authors' emails, and send those authors auto-emails informing them that they have meta-tags Y on Story X that should please be re-set to the appropriate canonicals in order for the story to be searchable. Delete these particular meta-tags as they empty. Send general emails to inform all authors that retrofitting is done and they should please check their stories. Warn the wranglers and support people beforehand, because there will invariably be some mistakes showing up and probably some irate users needing to vent. In fact, make a special news post for them to vent in, including any strategies people can think of for easy checking and clean-up, and link to it in the email. Chairs should be on hand for their volunteers with tea/hard liquor/kleenex/adorable kitten pictures. Schedule this, people.
8) Once the fallout looks to be dealt with, announce completion and success. Now you can \o/. Send the new sidebar live!
9) It's now time to reform the navigation. The Fandoms lists can probably stay as they are, at least for now, but every single index page, whether for a fandom, a character, a genre, or whatever else, should show the branch of the hierarchy it is in, as a breadcrumb at the top of the page. For example, selecting "Bishoujo Senshi Sailor Moon" should show, in the breadcrumb, "Sailor Moon - All Media Types >> Bishoujo Senshi Sailor Moon". Selecting "Tennou Haruka" from one of those stories should show "Sailor Moon - All Media Types >> Characters >> Tennou Haruka" (or whatever the parent structure is, in that example). Beside the breadcrumb should be a link to the landing page of the fandom. All of these functions and pages exist already; all that is required is to make them visible to the users as well as the volunteers, with a few conditionals to conceal the wrangling links. The public view of the fandom landing page should also have a link to the "request new canonical" form.
10) While you're thinking about it, fix the advanced search, also, so that it has sub-fields for different kinds of tags, and searches for discrete tags as opposed to doing breakage-prone all-field string comparisons.
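The filter in step 5 and the search fix in step 10 both come down to matching discrete tag ids, group by group. Here is a minimal sketch of "(character X OR character Y) AND (flavor A OR flavor B)". It is not the Archive's real schema: the `story_tags` join table, the function names, and every id below are made up for illustration, and it runs on Python's built-in sqlite3 rather than MySQL so that anyone can try it.

```python
import sqlite3


def stories_matching(conn, groups):
    """Return sorted story ids that match EVERY group of tag ids.

    A story matches a group when it carries at least one tag from it, so
    [[x, y], [a, b]] means (x OR y) AND (a OR b).
    """
    # An empty group places no constraint on the result, so drop it.
    groups = [list(g) for g in groups if g]
    if not groups:
        return []
    parts, params = [], []
    for tag_ids in groups:
        marks = ",".join("?" for _ in tag_ids)
        parts.append(
            f"SELECT DISTINCT story_id FROM story_tags WHERE tag_id IN ({marks})"
        )
        params.extend(tag_ids)
    # INTERSECT keeps only the stories present in every group's result.
    sql = " INTERSECT ".join(parts) + " ORDER BY story_id"
    return [row[0] for row in conn.execute(sql, params)]


def demo_connection():
    """Build a tiny in-memory example (hypothetical ids and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE story_tags (story_id INTEGER, tag_id INTEGER)")
    # Tags 1 and 2 stand in for characters, 3 and 4 for flavors.
    conn.executemany(
        "INSERT INTO story_tags VALUES (?, ?)",
        [(10, 1), (10, 3), (11, 2), (12, 4), (13, 1), (13, 2), (13, 4)],
    )
    conn.commit()
    return conn
```

With the demo data, `stories_matching(conn, [[1, 2], [3, 4]])` picks out stories 10 and 13 only, where the flat "anything with X or Y or A or B" version (`[[1, 2, 3, 4]]`) returns all four stories.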
Congratulations. You now have a working, navigable, professional-looking goddamn archive that can run fast and sleek with as many readers/users as want to come; be proud of yourselves!
Now. That involved only a small amount of new coding, all of it straightforward, and it will fix both server-load and worker-load. The majority of the fix is one query to canonize existing story tags, and a slightly edited form to select new ones, using canonicals and hierarchy that are already established in every case. The rest of it is simply showing users the navigation that's already there. This change-over would not break any existing archive function. It could be nearly seamless. It would even surface things that are currently mis-wrangled but don't readily show it on the front end as the tags stand. And the wranglers would have the far more manageable job of reviewing requests for new canonicals and maybe populating new fandoms instead of trying to make sense of every senseless tag with their hands tied behind their backs. Everything is in place already, to make this work!
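For the record, the "one query" plus clone-and-swap from step 6 could look roughly like this. It is a sketch under loud assumptions, not the Archive's actual database: the `tags` and `story_tags` tables, the `canonical_id` column (NULL when a tag has no canonical to point at), and both function names are hypothetical, and it uses Python's built-in sqlite3 instead of MySQL so it can be run as is.

```python
import sqlite3


def canonize_on_clone(conn):
    """Copy story_tags, then rewrite the copy so synonyms become canonicals."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS story_tags_clone")
        conn.execute(
            "CREATE TABLE story_tags_clone AS "
            "SELECT story_id, tag_id FROM story_tags"
        )
        # Replace each synonym id with the canonical id it points at.
        conn.execute(
            """
            UPDATE story_tags_clone
            SET tag_id = (
                SELECT canonical_id FROM tags
                WHERE tags.tag_id = story_tags_clone.tag_id
            )
            WHERE tag_id IN (
                SELECT tag_id FROM tags WHERE canonical_id IS NOT NULL
            )
            """
        )
        # A story that had both a synonym and its canonical now has the
        # same row twice; keep one copy of each (story, tag) pair.
        conn.execute(
            """
            DELETE FROM story_tags_clone
            WHERE rowid NOT IN (
                SELECT MIN(rowid) FROM story_tags_clone
                GROUP BY story_id, tag_id
            )
            """
        )


def swap_in_clone(conn):
    """Swap names so the canonized clone becomes the live table."""
    with conn:
        conn.execute("ALTER TABLE story_tags RENAME TO story_tags_old")
        conn.execute("ALTER TABLE story_tags_clone RENAME TO story_tags")
```

The live table is never touched until the swap, and the original survives as `story_tags_old`, so backing out is just a matter of renaming the tables back. Importing entries posted between the clone and the swap is left out of the sketch.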
It could be done so easily. It could be started right now. WHAT IS STOPPING YOU?
no subject
The problem I see right now is: that self-wrangling has never been done. Never been suggested or debated in public. Never appeared as any kind of goal, to the users. The archive is growing fast, and none of its incoming users have been habituated to that way of thinking or doing things. Instead, tagging has been presented as something no one needs to think about or know anything about, except the wranglers. That's a big problem for implementing the idea now. Maybe it could still be done, but unless the archive makes synonyming your own tags a mandatory step, which I can't see happening, it's going to be just as fragile and prone to failure as the current system.
Which brings us to this: The thing is, whether it ruffles coders and librarians’ feathers or not, different ways to format relationship and fandom tags have semantic meaning. It's not about feathers getting ruffled. It's about whether or not the archive is going to actually work for finding the fic that's posted there, and whether it can continue to work as it grows larger. What a lot of taxonomy experts are saying is that there are reasons why what the Archive is doing with tags isn't done elsewhere, and those reasons are not a lack of creativity. It requires the kind of additional labor that becomes prohibitive very quickly as size increases, and even if the labor can be distributed across users it's horribly breakage- and accident-prone, leaving a significant minority of fic un-findable.
The ideal of letting people use all the fine gradations of naming is a beautiful one. But I am not willing to consider (at a guess) five to ten percent of the archive's stories, which are not going to show up on searches or in tag indexes or sometimes even in fandom indexes, acceptable collateral damage in pursuit of that ideal. If the navigation doesn't work for everything, then it doesn't work.
I would also offer this thought. The underlying structure of the archive navigation is already exactly what I'm suggesting, here, and it shows. If someone clicks on a synonym tag, the title and url and name of the page they reach will not be that synonym; it will be the canonical. The canonicals are already what show up in the filter sidebar, and on the fandom index lists. The only place that alternative forms appear, even now, is on a story's own blurb, and what I am suggesting still leaves the Additional tag field open for exactly those kinds of tags. Those are not navigation tags, and I don't think we should try to force them to be. Those are, as you point out, for meta and commentary and nuance. By all means, let people continue to tag their fic with the forms that will declare their own approach. But I actually think it's a little more honest to the users to admit, and let the division of tags reflect the fact, that the site navigation is already running on lowest-common-denominator canonicals. Because it has to, to work.
no subject
a) it has to work, and
b) the mechanism has to be sufficiently transparent to the user that they can participate in the sense-making of the archive.
Right now, the heavy lifting of navigation-related sense-making is relegated to the wranglers and made heavier by the (ignorance*) of the users.
There are systemic arrangements that leave, as you say, collateral damage, and those really need to be fixed. I appreciate the folksonomic variations (free-style tagging) that enrich a controlled vocabulary (canonical tagging), and I think that your proposal to use controlled vocab for some fields and have free-form tagging available elsewhere is an ideal solution.
* ignorance only in the sense that because the system is opaque to the users, they can not participate in the process of helping it work better.
no subject
I understand your argument about not wanting a certain percentage of works to be unfindable, of course. I guess I'm unconvinced that a user-proposed synonymity system + user-powered crowdsource merging capabilities would really drop that much fic.
(The importance of which is not negated by pointing out, though, that not everyone who archives at AO3 does it to be findable in that way. Some people are perfectly content to be findable by people who know the shibboleth of their fandom in precisely the right way, too. Fandom runs on shibboleths.)
I'm way more inclined to think a (different) solution is, as