branchandroot: wings of fire (fire wings)
Branch ([personal profile] branchandroot) wrote 2012-07-09 01:26 pm

Incandescent Outrage, Film at 11

You know what infuriates me most about AO3 (today)? The more I read the little things that wranglers anonymous and not are stepping up to tell us, the clearer it becomes that it could work. I don't mean that in a fuzzy procedural way, either, I mean the actual structure of the archive is completely compatible with changing the archive over to canonical, navigable tags and usable fandom hierarchy navigation and not making the wranglers do every damn thing. IT COULD BE DONE RIGHT NOW. Most of the structure is already in place, it's just completely invisible to the users!

It would not take any extra hand-work on the part of the wranglers. It would not require more downtime than any other commit, or break existing functions. The big change would not even be a very difficult bit of code to write! (Terrifying, perhaps, but not difficult.)

How, you ask? Let me tell you, because the top of my goddamn head is about to blow off with the force of my indignation over the pointless ideological stonewalling that's stopping the archive in its tracks!

1) The absolutely necessary first step, on which all else is predicated, is that the public voices of the Archive must speak up to say that Responsible Parties were, after all, mistaken and that the tagging system needs to change. Announce what the changes will be, and a loose timeline of when, so users can take whatever action seems wise to them (eg doing the canonization of their stories themselves so there are no mistakes). Apologize for taking users down a dead-end path for so long and explain the logic behind the upcoming changes (ie readers actually being able to find the authors' stories and wranglers not dying of overwork as the archive grows). Admit that fixing things will inevitably cause some new mistakes, and surface old mistakes. Refrain from any \o/ whatsoever. Do it all again in email.

2) Next step. Make a "request new canonical tag" page. It should be quite simple, as close to one-click as can be. One text-field for each tag type (Fandom, Character, Relationship, Genre/Flavor/Whatever). If Character or Relationship is filled in, a prompt comes up for what Fandom this should be a child of. If a Fandom is filled, a prompt should come up asking for any character/relationships the suggester can think of offhand to populate it with. I'm thinking it should only be visible to logged in users; anything else is spambait. On the wrangler end, this should be equally one-click, as they review requests. Prospective buttons: Approve (immediately creates canonical), Request Review (pops up a flag in whatever task-flow forum exists, asking another wrangler or maybe staff for another opinion), Approve-Needs Forming (creates canonical which the wrangler may alter the phrasing of and optionally pops up a flag asking people to help find synonyms among the non-canonicals to wrangle into the new canonical), Reject (pops up a "reason" form which sends an email to the suggester with the reason filled in; also adds suggestion to blacklist table, with reason; previously blacklisted suggestions do not go through, just pop up a page with the reason for rejection [malformed, malicious, etc.]; the email and this form should both have a link to Support, in case someone wants to argue or get clarification).
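To make the request flow above concrete, here is a minimal sketch of the Approve and Reject paths plus the blacklist check. Everything in it (the `TagRequestQueue` class, its method names, matching names case-insensitively) is a hypothetical illustration and not anything from the Archive's actual code; the Request Review and Approve-Needs Forming buttons and the emails are left out.

```python
from dataclasses import dataclass, field


@dataclass
class TagRequestQueue:
    """Hypothetical queue of requests for new canonical tags."""
    canonicals: set = field(default_factory=set)    # approved names
    blacklist: dict = field(default_factory=dict)   # rejected name -> reason
    pending: list = field(default_factory=list)     # awaiting wrangler review

    @staticmethod
    def _key(name):
        # Assumption: names are compared case-insensitively.
        return name.strip().lower()

    def submit(self, name):
        key = self._key(name)
        if key in self.blacklist:
            # Previously rejected suggestions do not go through;
            # the suggester just sees the recorded reason.
            return "rejected: " + self.blacklist[key]
        if key in self.canonicals:
            return "already canonical"
        if key not in self.pending:
            self.pending.append(key)
        return "pending"

    def approve(self, name):
        # "Approve" immediately creates the canonical.
        key = self._key(name)
        self.pending.remove(key)
        self.canonicals.add(key)

    def reject(self, name, reason):
        # "Reject" records the reason so repeat requests are answered at once.
        key = self._key(name)
        self.pending.remove(key)
        self.blacklist[key] = reason
```

A wrangler's one-click review then amounts to calling `approve` or `reject` on an entry in `pending`.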

3) Next step. Edit the posting form so that Fandom, Character, and Relationship fields can only be populated from the canonicals (possibly re-using the code for selecting a collection name). Fandom(s) remains the only required tag and must be entered first, to create the pool of Character and Relationship options; a warning to this effect should come up if someone clicks into those fields without entering a Fandom. The only other new bit of code required would be a new field for Genre/Flavor/hold a user poll to decide what to call this one. The place where all the No Fandom canonicals will go, at any rate. The Additional tags field can remain as is, in all its freeform glory, perhaps with a note to the effect that Additional tags will not be wrangled, now. Add a prominent link to the "request new canonical" form, and links beside each tag field that will lead to a page of that tag-type for the fandom(s) entered, so people can check what's available instead of having to guess forever. One new note, one new field, four new links, three altered field types. That's it! Test it and send that puppy live. Now all new stories will be in the appropriate format, working off the already existing canonical tag structure and requiring no further wrangling.
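The "Fandom first, then a pool of Character and Relationship options" rule can be sketched as a validation function. This is only an illustration under assumed names: `CANONICALS`, `check_posting`, and the sample relationship tag are made up for the example, and a real form would do this with autocomplete rather than after-the-fact checks.

```python
# Hypothetical sample data: canonical characters and relationships per fandom.
CANONICALS = {
    "Bishoujo Senshi Sailor Moon": {
        "characters": {"Tennou Haruka", "Kaiou Michiru"},
        "relationships": {"Kaiou Michiru/Tennou Haruka"},
    },
}


def check_posting(fandoms, characters, relationships, canonicals=CANONICALS):
    """Return a list of problems; an empty list means the tags are acceptable."""
    if not fandoms:
        # Fandom is the only required tag and must come first.
        return ["Enter a Fandom first; it sets the Character and Relationship options."]

    problems = [f"Unknown fandom: {f}" for f in fandoms if f not in canonicals]
    known = [canonicals[f] for f in fandoms if f in canonicals]

    # The pool is the union of canonicals across every fandom entered.
    char_pool = set().union(*(k["characters"] for k in known))
    rel_pool = set().union(*(k["relationships"] for k in known))

    problems += [f"Not a canonical character: {c}"
                 for c in characters if c not in char_pool]
    problems += [f"Not a canonical relationship: {r}"
                 for r in relationships if r not in rel_pool]
    return problems
```

Additional tags are deliberately not passed in at all, since they stay freeform.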

4) Adjust import function to look for and set Fandom first, and then try to match any other tags discovered to the fandom's canonicals pool, so as to allow as much of the filling-in process as can be done. Results will probably be about the same as they are now.

5) Write (another) filter sidebar in which only the canonicals show. A fandom index might show Characters, Relationships, and Genre/Flavor canonicals. A character index might show Relationships that character appears in and Genre/Flavor canonicals. A relationship index might also show associated Genre/Flavor canonicals. None of them should show Additional tags at all; instead perhaps there can be a link to the Tags section and a link to the Search form. Revise the in-menu "or" function so that it usefully searches for "character X OR character Y" AND "flavor A OR flavor B" instead of just "anything with X or Y or A or B" which is useless. All of these will be very straightforward MySQL queries, instead of monstrous, ass-end-to walker functions. Hold the execution of this until tags are retrofitted.
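The revised "or" behaviour (OR within a tag type, AND across tag types) is easy to show in a few lines. This is a sketch of the logic only, with hypothetical names (`matches`, the dict-of-sets layout); on the real site it would be a database query, not a Python loop.

```python
def matches(work_tags, selected):
    """True if the work satisfies every tag type that has a selection.

    work_tags: dict mapping tag type -> set of canonical tags on the work
    selected:  dict mapping tag type -> set of tags ticked in the sidebar
    """
    for tag_type, chosen in selected.items():
        if not chosen:
            continue  # nothing ticked for this type, so it does not filter
        # OR within a type: any one of the chosen tags is enough.
        if not (work_tags.get(tag_type, set()) & chosen):
            # AND across types: failing one type fails the whole filter.
            return False
    return True


def filter_works(works, selected):
    """Keep only the works that pass the sidebar selection."""
    return [w for w in works if matches(w["tags"], selected)]
```

With characters {X, Y} and flavors {A, B} ticked, a work tagged X and A passes, while a work tagged X and some other flavor C does not, which is the difference from the flat "X or Y or A or B" search.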

6) Prep work done! On to the nerve-wracking, if not really difficult, part. Write a query to replace every non-canonical tag id with its canonical version in the story table of the database. Depending on the database structure, this could be as simple as "story.tag_id = tags.canonical where story.tag_id = tags.tag_id and not tags.canonical = ''". Take a good drink to settle your nerves. Clone the story table. Run the query on the clone; this will probably take a few days to get through. Make a news post that The Time Has Come. Take another drink. Shut down the posting form, import any new entries since cloning, swap the names of the clone and the live tables, revive the posting form, now breathe. That should take maybe fifteen minutes, if people prepare beforehand, and that's allowing a margin for a mild case of hysterics or two. If you want to close posting during the replacement process, you won't even have to do the import step. Lo, the archive is running seamlessly on all canonicals! Take another drink.
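Here is a small sketch of the clone, rewrite, and swap sequence, run against an in-memory SQLite database so it can be tried safely. The schema is an assumption made for the example (a `tags` table with a nullable `canonical_id`, and a `story_tags` join table); the Archive's real tables are surely different, and the query above marks "no canonical" with an empty string where this sketch uses NULL.

```python
import sqlite3


def canonicalize(conn):
    """Rewrite story tags to their canonical ids on a clone, then swap tables."""
    cur = conn.cursor()
    # Clone the live table so the long-running rewrite never touches it.
    cur.execute("CREATE TABLE story_tags_clone AS SELECT * FROM story_tags")
    # Replace every tag that has a canonical with that canonical's id.
    cur.execute("""
        UPDATE story_tags_clone
        SET tag_id = (SELECT canonical_id FROM tags
                      WHERE tags.tag_id = story_tags_clone.tag_id)
        WHERE tag_id IN (SELECT tag_id FROM tags
                         WHERE canonical_id IS NOT NULL)
    """)
    # Swap names: the clone becomes live, the old table is kept as a backup.
    cur.execute("ALTER TABLE story_tags RENAME TO story_tags_old")
    cur.execute("ALTER TABLE story_tags_clone RENAME TO story_tags")
    conn.commit()


def demo():
    """Build a tiny hypothetical database and run the retrofit on it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, name TEXT,
                           canonical_id INTEGER);
        CREATE TABLE story_tags (story_id INTEGER, tag_id INTEGER);
        INSERT INTO tags VALUES (1, 'Canonical Name', NULL);
        INSERT INTO tags VALUES (2, 'Synonym Name', 1);
        INSERT INTO story_tags VALUES (10, 2);
        INSERT INTO story_tags VALUES (11, 1);
    """)
    canonicalize(conn)
    return conn.execute(
        "SELECT story_id, tag_id FROM story_tags ORDER BY story_id"
    ).fetchall()
```

One thing the sketch does not handle: a story that carried both a synonym and its canonical ends up with the same tag twice, so a de-duplication pass would be needed before the swap.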

7) Now it's time for clean-up. Write a query to delete all empty and/or non-canonical tags. Write another to find all stories with character/relationship meta-tags, look up the authors' emails, and send them auto-emails informing them that they have meta-tags Y on Story X that should please be re-set to the appropriate canonicals in order for the story to be searchable. Delete these particular meta-tags as they empty. Send general emails to inform all authors that retrofitting is done and they should please check their stories. Warn the wranglers and support people beforehand, because there will invariably be some mistakes showing up and probably some irate users needing to vent. In fact, make a special news post for them to vent in, including any strategies people can think of for easy checking and clean-up, and link to it in the email. Chairs should be on hand for their volunteers with tea/hard liquor/kleenex/adorable kitten pictures. Schedule this, people.

8) Once the fallout looks to be dealt with, announce completion and success. Now you can \o/. Send the new sidebar live!

9) It's now time to reform the navigation. The Fandoms lists can probably stay as are, at least for now, but every single index page, whether for a fandom, a character, a genre, or whatever else, every index page should show the branch of the hierarchy it is in, as a breadcrumb at the top of the page. For example, selecting "Bishoujo Senshi Sailor Moon" should show, in the breadcrumb "Sailor Moon - All Media Types >> Bishoujo Senshi Sailor Moon". Selecting "Tennou Haruka" from one of those stories should show "Sailor Moon - All Media Types >> Characters >> Tennou Haruka" (or whatever the parent structure is, in that example). Beside the breadcrumb should be a link to the landing page of the fandom. All of these functions and pages exist already, all that is required is to make them visible to the users as well as the volunteers, with a few conditionals to conceal the wrangling links. The public view of the fandom landing page should also have a link to the "request new canonical" form.
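Building the breadcrumb is just a walk up the parent links. A minimal sketch, assuming the hierarchy can be read as a simple child-to-parent mapping; the `PARENTS` dict and `breadcrumb` function are hypothetical, and the sample entries simply mirror the example above.

```python
# Hypothetical child -> parent mapping mirroring the example in the text.
PARENTS = {
    "Bishoujo Senshi Sailor Moon": "Sailor Moon - All Media Types",
    "Characters": "Sailor Moon - All Media Types",
    "Tennou Haruka": "Characters",
}


def breadcrumb(tag, parents=PARENTS, separator=" >> "):
    """Return the trail from the top of the hierarchy down to `tag`."""
    trail = [tag]
    seen = {tag}
    while trail[-1] in parents:
        parent = parents[trail[-1]]
        if parent in seen:
            break  # guard against a mis-wrangled loop in the hierarchy
        trail.append(parent)
        seen.add(parent)
    return separator.join(reversed(trail))
```

A tag with more than one parent would need a list of trails instead of one; this sketch assumes a single parent per tag.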

10) While you're thinking about it, fix the advanced search, also, so that it has sub-fields for different kinds of tags, and searches for discrete tags as opposed to doing breakage-prone all-field string comparisons.

Congratulations. You now have a working, navigable, professional looking goddamn archive, that can run fast and sleek with as many readers/users as want to come; be proud of yourselves!


Now. That involved only a small amount of new coding, all of it straightforward, and it will fix both server-load and worker-load. The majority of the fix is one query to canonize existing story tags, and a slightly edited form to select new ones, using canonicals and hierarchy that are already established in every case. The rest of it is simply showing users the navigation that's already there. This change-over would not break any existing archive function. It could be nearly seamless. It would even surface things that are currently mis-wrangled but don't readily show it on the front end as the tags stand. And the wranglers would have the far more manageable job of reviewing requests for new canonicals and maybe populating new fandoms instead of trying to make sense of every senseless tag with their hands tied behind their backs. Everything is in place already, to make this work!

It could be done so easily. It could be started right now. WHAT IS STOPPING YOU?
anatsuno: (thinking about it)

[personal profile] anatsuno 2012-07-10 02:21 pm (UTC)
I should be running out the door and I’m not - at all - skilled at this on the technical level like you are; I completely agree that the way things work NOW is unsustainable and it needs to change, and I haven’t read the comments (this is my whole disclaimer), but here’s where and why I respectfully disagree with you here - I hope I’ll be articulate enough.

I understand that in other archives since time immemorial the work the archive did, the infrastructure it provided, was to have / relied on having canonical silos. But the AO3 was created precisely with the philosophy that it would support and encourage diversity and not enforce formulations (much) in those so-called canonical things. This was explicitly asked for/debated/decided at the time; I remember it clearly because it was and remains crucial to me.

What I mean is, the fact that I can designate a work of mine as being LotRiPs (fandom) and not Lord of the Rings RPF (fandom) is not unimportant to me - it is one of the basic, fundamental reasons I archive my work at the AO3 when I have never archived it in a fandom-specific or pairing-specific archive before.

What you’re suggesting would totally make AO3 life easier but I can’t agree with your step #3, at all. I wish wrangling would change but I disagree strongly w/ making tags enforceable from the top. Still & forever. The promise of a lack of top-down imposed vocabulary / taxonomy was made, and I want AO3 to keep it.

(aside: that promise is not even implemented enough imo: I am disappointed that while I can - I do - indicate that my fic is a LotRiPS fic on the page itself and in the header, the title of the page in my browser, which also becomes the text of the bookmark if I bookmark the page on Pinboard, say, or the filename if I save the fic, remains the “canonical formulation” - that’s a top-down imposition I resent, not just because I would resent it no matter what, but precisely because it goes against the promise that was made)

The thing is, whether it ruffles coders and librarians’ feathers or not, different ways to format relationship and fandom tags have semantic meaning, sometimes a LOT of it, see many a fandom kerfuffle - so imo users must be able to tag using those formats if we mean to respect them as creators. The whole point of the tag system is to invisibly unify that diversity, not to erase it.

I’m fascinated with [personal profile] sara’s comment about taxonomy having value as work that is looked down upon in the OTW, and I agree that is a problem - work is work, and librarians indeed have worked long and hard on these issues, we might not have to reinvent the wheel. But I think we’re also doing something that has never been done, precisely because we’re not trying to impose (too much of) a top-down decision on semantic differences that are meaningful to a very wide, very varied and very opinionated crowd. As a friend of mine (a coder, too) put it after reading your post: “So they’re proposing to remove the part of the tagging system that’s more sophisticated than any other site in or out of fandom? No.”

What would help more than enforcing CANONICAL TAGS ARE ALL THAT APPEAR is letting people designate synonyms themselves. So that if I tagged my story with a variant tag, it would say “Is this related to X”? Or, “is there a canon tag you want to hook to?” But I could still tag my stories Star Trek: Alternate Original Series instead of Star Trek (2009), damnit. Just let ME wrangle them as I submit my work; allow users to suggest merges like LibraryThing does, etc. But don’t force my hand.

It seems, from what I hear from within, that building infrastructure for interactive support is regularly backburnered - but it could totally address this.

So yeah, there is a lot to discuss, here. Also, I admit, I'm not just loath to see some of the changes you propose, I’m also irritated that the history of the AO3 seems forgotten; that most people seem to think it all was done wrong because omg some people are egotistical/stupid/inefficient and the Org is a slow dim-witted behemoth, and so on and so forth. There are underlying philosophies behind some historical decisions, if not behind all, and there are ethical discussions that were had at the origin of the archive that we all seem to have forgotten entirely, here.

So yeah, there are reasons why I can tag my fic so the header says it’s a LoTRiPS fic, or a Star Trek: Alternate Original Series fic. There are reasons why I need to be able to say “this is a domlijah podfic” and not a “Dominic Monaghan/Elijah Wood” podfic. And these reasons are why I (and a number of others) archive my work at the AO3. They make the computer work difficult, and organizing wrangling the way it was organized has also required + made human work difficult, so we both agree that some things need to change - but that doesn’t mean disappearing diversity is the answer. We can change in other ways.

At least I very much hope so.


Edited (typo) 2012-07-10 14:28 (UTC)
kenllama: llama, with caption "I feel pretty" (Default)

[personal profile] kenllama 2012-07-10 06:13 pm (UTC)
I think you've nailed it here.

a) it has to work, and
b) the mechanism has to be sufficiently transparent to the user that they can participate in the sense-making of the archive.

Right now, the heavy lifting of navigation-related sense-making is relegated to the wranglers and made heavier by the (ignorance*) of the users.

There are systemic arrangements that leave, as you say, collateral damage, and those really need to be fixed. I appreciate the folksonomic variations (free-style tagging) that enrich a controlled vocabulary (canonical tagging), and I think that your proposal to use controlled vocab for some fields and have free-form tagging available elsewhere is an ideal solution.

* ignorance only in the sense that because the system is opaque to the users, they can not participate in the process of helping it work better.
anatsuno: Troy & Abed stuck in a vending machine (Troy & Abed in the MAchine!)

[personal profile] anatsuno 2012-07-10 07:54 pm (UTC)
You're right about the underlying structure - it's what I'm pointing the finger at when I say the promise hasn't been fulfilled the way it could/should have been (imo, ofc). You can see your offer to make a cleavage between 'personal flavor' and site navigation as more honest, but personally I would still see it as a letdown. It's not the same to have an engine that knows that my tag lotrips is like someone else's Lord of the Rings RPF tag or to have a separate field for local color. It's not the same semantically and socially, not just computationally.

I understand your argument about not wanting a certain percentage of works to be unfindable, of course. I guess I'm unconvinced that a user-proposed synonymity system + user-powered crowdsourced merging capabilities would really drop that much fic.

(The importance of which is not negated by pointing out, though, that not everyone who archives at AO3 does it to be findable in that way. Some people are perfectly content to be findable by people who know the shibboleth of their fandom in precisely the right way, too. Fandom runs on shibboleths.)

I'm way more inclined to think a (different) solution is, as [personal profile] troisroyaumes says, to split display name from canonical name. It's not the same as splitting the fields where the info lives into site-wide & user-specific, imo.
troisroyaumes: Painting of a duck, with the hanzi for "summer" in the top left (Default)

[personal profile] troisroyaumes 2012-07-10 06:39 pm (UTC)
(Dropping in from [tumblr.com profile] unofficialotwnews) I feel like there should be a way to modify [personal profile] branchandroot's proposal so that you have the canonical dropdown that determines where your fic appears but also have another field to type in the display fandom/character/relationship name if you dislike the canonical form. If you don't specify a display name, it'll default to whatever canonical you chose. So canonicals get used for searching, browsing, etc. and serve as the fanwork metadata, but the additional fields are used for the actual information display on your fic. That would stay true to the Archive's philosophy of user choice/expression while actually fixing the search/navigation issues.
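A rough sketch of what I mean by the display name defaulting to the canonical. The `WorkTag` class and its field names are made up for illustration and say nothing about how the Archive's database is actually laid out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkTag:
    """Hypothetical tag on a work: canonical for navigation, display for show."""
    canonical: str                  # used for searching, browsing, metadata
    display: Optional[str] = None   # optional author-chosen wording

    @property
    def shown(self):
        # Fall back to the canonical when no display name was entered.
        return self.display or self.canonical


def header_line(tags):
    """Join the names an author wants shown in the work header."""
    return ", ".join(tag.shown for tag in tags)
```

So `WorkTag("Lord of the Rings RPF", display="LotRiPS")` would still be filed and found under the canonical but would read "LotRiPS" on the work itself.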

However, I think this would involve more coding work than [personal profile] branchandroot's current proposal. I mean, I am not involved with AD&T in any way, so I don't actually know how their database is set up but I suspect it would require more database wrangling if you had to separate out display from metadata.

ETA: I want to add...I really like this proposal (with the above modification or not) and am hoping that AD&T staffers will take a look and give it some consideration.
Edited 2012-07-10 18:43 (UTC)
troisroyaumes: Painting of a duck, with the hanzi for "summer" in the top left (Default)

[personal profile] troisroyaumes 2012-07-10 07:14 pm (UTC)
I think it wouldn't increase server load since the Archive already has to retrieve the original tags used by the author for display purposes. Changing the system so that these are just display strings that don't relate to anything else in the database and don't need to be indexed might actually speed up performance. These would be extra columns in whatever table holds the summary, notes, etc. so it shouldn't increase the number of queries. But yes, I don't know about large scale site optimization either to know for sure.

According to [personal profile] ira_gladkova's latest post, AD&T is already talking to [staff profile] mark so that's a hopeful sign.
troisroyaumes: Painting of a duck, with the hanzi for "summer" in the top left (Default)

[personal profile] troisroyaumes 2012-07-10 07:49 pm (UTC)
Good point...an on-demand field sounds like a good idea to me; one could do the same for the "canonical request" field as well. A UX consultant could probably weigh in on how to include the functionality while keeping the form simple, and I know there are definitely UX design professionals in fandom...

Hmm, maybe I'll post about this proposal in the volunteer forums and see if it gets anywhere. Granted, not too many people use the volunteer forums at the moment, but I've seen the AD&T chair post there.
synecdochic: torso of a man wearing jeans, hands bound with belt (Default)

[personal profile] synecdochic 2012-07-10 09:38 pm (UTC)
Out of curiosity, because I've always wondered: why is "LotRiPS" vs "Lord of the Rings RPF" a meaningful distinction? I mean, I've always noticed that it is usually called the former in most of the circles I've encountered it in, but not always, and I've always been curious :)
annotated_em: cross-section of a lemon (Default)

[personal profile] annotated_em 2012-07-10 09:40 pm (UTC)
Thank god I'm not the only person who has this question.
anatsuno: a women reads, skeptically (drawing by Kate Beaton) (Default)

[personal profile] anatsuno 2012-07-10 09:43 pm (UTC)
hahaha, this is where I show myself to be completely ridiculous - I'm not aware that there is any difference in meaning, *except* that one is a shibboleth and the other isn't - which isn't to say that I prefer it because it's a filtering device (that's not why I like it). I prefer it because it's the name of the fandom as I knew it. I was never part of the Lord of the Rings RPF fandom, I was (am) "a lotrips fan". Even though both things supposedly mean the same thing, for me, calling that fandom something other than lotrips is like having to change the name I call my aunt by - which I would be happy to do *if my aunt was choosing a new name*, but be horrified at if the government made that choice for her (and me) - do you see what I mean? Lotrips is what the fandom I was in called itself. Lord of the Rings RPF is what outsiders call us.
synecdochic: torso of a man wearing jeans, hands bound with belt (Default)

[personal profile] synecdochic 2012-07-10 09:54 pm (UTC)
Ah, got it! Thanks for clearing that up, it's been bugging me for years :)
anatsuno: a small white dog with a long blonde wig and oversized white plastic shades  (middlename: overkill)

[personal profile] anatsuno 2012-07-10 09:53 pm (UTC)
I feel compelled to add that just because one might find my personal peeve re: the name of my fandom ridiculous, that doesn't negate the validity of my argument. There are other examples where the canonical names for a fandom pose much more serious diversity issues (reflect a colonial holdover mentality for example) and alienate people in different ways that aren't just 'anatsuno loves the silly insider name for her fandom too much'. I hope you won't dismiss the idea out of hand - it's a real outreach/diversity issue. It might not be *enough* of an issue for it to take precedence, what with all the problems to solve with the archive, but it's not a ridiculous issue either.
lady_ganesh: A Clue card featuring Miss Scarlett. (mmm what?)

[personal profile] lady_ganesh 2012-07-11 03:07 pm (UTC)
IDK, though, that happens anyway. I still grind my teeth when doing pairing tags for Weiss Kreuz - yes, I can tag the damn story however I want, and the character tag for Omi | Mamoru is now helpfully one tag with a pipe, but on any search and inside the system, my "Naoe Nagi/Takatori Mamoru" story is going to get hooked to "Naoe Nagi/Tsukiyono Omi", and my feelings about the differences in those pairings are essentially irrelevant.