Branch (branchandroot) wrote 2012-07-11 12:37 pm
Gotta be a table
*drums fingers*
The more I think about the idea of a display-name/alias for canonical tags, the more I think: it's going to have to be a new table.
I'm pretty sure that it would be faster to store any given story's aliases in a single field of the Story table as a string of key-value pairs. But those pairs would need delimiters that would not ever show up in the aliases themselves and that... that's not something I really want to bet on, when it comes to fandom, names, and the evolution of pairing syntax.
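A minimal sketch of why that bet is risky. It assumes a hypothetical packing format of `id=alias` pairs joined with `;`; the function names and the separators are illustrations, not anything from the archive's code. An alias that happens to contain the pair separator no longer survives a round trip.

```python
def pack_aliases(aliases, pair_sep=";", kv_sep="="):
    """Naively join {canon_tag_id: alias} into one string field."""
    return pair_sep.join(
        f"{tag_id}{kv_sep}{alias}" for tag_id, alias in aliases.items()
    )


def unpack_aliases(packed, pair_sep=";", kv_sep="="):
    """Naively split the string field back into {canon_tag_id: alias}."""
    if not packed:
        return {}
    result = {}
    for pair in packed.split(pair_sep):
        tag_id, _, alias = pair.partition(kv_sep)
        result[int(tag_id)] = alias  # raises ValueError on a mangled pair
    return result


def round_trips(aliases):
    """True if packing and unpacking returns exactly what went in."""
    try:
        return unpack_aliases(pack_aliases(aliases)) == aliases
    except ValueError:
        return False


# Ordinary aliases survive the trip.
print(round_trips({101: "Naruto/Sasuke", 102: "Team 7"}))
# An alias containing the pair separator does not.
print(round_trips({101: "Sasuke;Naruto"}))
```

Escaping the separators would patch this particular hole, but every reader and writer of the field would then have to agree on the escaping rules, which is exactly the kind of accident a separate table avoids.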
It would also be more accident-prone during the canonizing of the tags. (I'm starting to feel like I should capitalize that phrase. Like the Running of the Bulls or something.)
So. Separate table, nice and simple, with columns for story id, canon-tag id, and alias. I'm thinking story id should be the primary key. The most frequent use is going to be producing story-blurbs, and I'm guessing this sucker is going to have to be partitioned. So, partition by story id range and crank that into the one extra query this will create per story (or at least prepare for it; it might not be necessary yet).
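Here is one way the table could look, sketched with Python's built-in `sqlite3` so it runs anywhere. The table and column names (`story_tag_alias`, `story_id`, `canon_tag_id`, `alias`) are assumptions. One caveat: a story will usually carry several tags, so story id on its own can't be unique; this sketch uses a composite primary key with story id as the leading column, which still makes the per-story blurb lookup an index lookup. SQLite has no table partitioning, so that part is left out.

```python
import sqlite3


def create_alias_table(conn):
    """Create the (hypothetical) alias table: one row per story/tag pair."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS story_tag_alias (
            story_id     INTEGER NOT NULL,
            canon_tag_id INTEGER NOT NULL,
            alias        TEXT    NOT NULL,
            PRIMARY KEY (story_id, canon_tag_id)
        )
        """
    )


def add_alias(conn, story_id, canon_tag_id, alias):
    """Set (or overwrite) the display name one story uses for one tag."""
    conn.execute(
        "INSERT OR REPLACE INTO story_tag_alias VALUES (?, ?, ?)",
        (story_id, canon_tag_id, alias),
    )


def aliases_for_story(conn, story_id):
    """The one extra query per story blurb: {canon_tag_id: alias}."""
    rows = conn.execute(
        "SELECT canon_tag_id, alias FROM story_tag_alias WHERE story_id = ?",
        (story_id,),
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
create_alias_table(conn)
add_alias(conn, 1, 10, "Sasuke/Naruto")
add_alias(conn, 1, 11, "Team 7")
print(aliases_for_story(conn, 1))
```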
Putting aliases in their own table will also make them far more easily searchable.
petronia pointed out that some fan cultures, for example the Chinese-language fans, will want and expect to be able to search for things like "Sasuke/Naruto" as distinct from "Naruto/Sasuke", and also search for things like "*/Sasuke" (that is, anyone/Sasuke). Putting a specific "search display names" option on the Advanced search, and putting the aliases into a table of their own with one alias per field, will make that a lot more feasible. It won't be perfect, because it will rely on authors to alias, but there should be some significant overlap between authors who will alias like that and authors writing the kind of fic members of that fan culture want to see.
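A rough sketch of what a "search display names" query could look like once each alias sits in its own field. It assumes the hypothetical `story_tag_alias` table described above and translates a user-facing `*` into SQL's `%` wildcard, escaping any literal `%` or `_` in the query first. Note that SQLite's `LIKE` is case-insensitive for ASCII by default; other databases behave differently.

```python
import sqlite3


def to_like_pattern(query):
    """Turn a user query such as '*/Sasuke' into a safe LIKE pattern."""
    escaped = (
        query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    return escaped.replace("*", "%")


def search_aliases(conn, query):
    """Return ids of stories with at least one alias matching the query."""
    rows = conn.execute(
        "SELECT DISTINCT story_id FROM story_tag_alias "
        "WHERE alias LIKE ? ESCAPE '\\' ORDER BY story_id",
        (to_like_pattern(query),),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE story_tag_alias ("
    "story_id INTEGER, canon_tag_id INTEGER, alias TEXT, "
    "PRIMARY KEY (story_id, canon_tag_id))"
)
conn.executemany(
    "INSERT INTO story_tag_alias VALUES (?, ?, ?)",
    [
        (1, 10, "Sasuke/Naruto"),
        (2, 10, "Naruto/Sasuke"),
        (3, 11, "Kakashi/Sasuke"),
    ],
)
print(search_aliases(conn, "Sasuke/Naruto"))  # order-sensitive exact match
print(search_aliases(conn, "*/Sasuke"))       # anyone/Sasuke
```

Both stories 1 and 2 point at the same canonical tag here, yet the order-sensitive search can still tell them apart because it looks at the alias, not the tag id.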
Now the first step of the Canonizing of the Tags is a lot simpler. Relatively speaking. For each story, get the current tags; get the names and ids of those tags; write them to the alias table. That can run in the background as long as it takes, since it won't affect anything yet.
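That first pass could be a single insert-from-select, sketched below. The source tables are assumptions: a `tag` table with `id` and `name`, and a `story_tag` join table linking stories to tags. The real schema may keep this information elsewhere. `INSERT OR IGNORE` makes the job safe to re-run if the background task is interrupted partway.

```python
import sqlite3


def create_demo_schema(conn):
    """Hypothetical minimal schema, only for illustrating the backfill."""
    conn.executescript(
        """
        CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE story_tag (story_id INTEGER, tag_id INTEGER);
        CREATE TABLE story_tag_alias (
            story_id     INTEGER NOT NULL,
            canon_tag_id INTEGER NOT NULL,
            alias        TEXT    NOT NULL,
            PRIMARY KEY (story_id, canon_tag_id)
        );
        """
    )


def backfill_aliases(conn):
    """Copy each story's current tag ids and names into the alias table."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            """
            INSERT OR IGNORE INTO story_tag_alias (story_id, canon_tag_id, alias)
            SELECT st.story_id, t.id, t.name
            FROM story_tag AS st
            JOIN tag AS t ON t.id = st.tag_id
            """
        )


def alias_count(conn):
    """Number of rows currently in the alias table."""
    return conn.execute("SELECT COUNT(*) FROM story_tag_alias").fetchone()[0]


conn = sqlite3.connect(":memory:")
create_demo_schema(conn)
conn.executemany("INSERT INTO tag VALUES (?, ?)", [(10, "Naruto/Sasuke"), (11, "Angst")])
conn.executemany("INSERT INTO story_tag VALUES (?, ?)", [(1, 10), (1, 11), (2, 10)])
backfill_aliases(conn)
print(alias_count(conn))
```

At this point every alias is just the tag's existing name, so nothing changes for readers; the table is simply ready for the later steps.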
The new posting form has to be prepared next, so that it's ready to present canonicals and "talk" to the alias table. The new code to display story blurbs and pages will need to be done up, but that should be nice and straightforward. Incidentally, I quite like busaikko's idea of putting Additional tags, possibly labeled Author Tags, under the summary to make them clearly separated from the menu tags and more clearly part of the author's own meta-information about their work. That would prepare the way to possibly show Reader Tags, and make them clearly distinct from anything the author zirself put on the story.
And now the canonization query should run smoothly, as each child-tag id is replaced with the id of its final parent tag, including the ids in the alias table. Which will, after the query runs, match up with the changed ids in the Story table. And without needing any dangerous, and slow, additions like "match any number that comes before X delimiter in this long string". *dusts hands*
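To make the id swap concrete, here is a small sketch under stated assumptions: a hypothetical `tag` table whose `parent_id` is NULL for a canonical tag, plus the `story_tag` and `story_tag_alias` tables assumed earlier. It walks each child up to its final parent and rewrites the id in both tables inside one transaction, leaving the alias text untouched. It does not decide what should happen when one story carries two child tags that merge into the same parent; with the composite key used here that case raises an error and rolls everything back.

```python
import sqlite3


def final_parent(parents, tag_id):
    """Follow child -> parent links until reaching a tag with no parent."""
    seen = set()
    while parents.get(tag_id) is not None:
        if tag_id in seen:
            raise ValueError(f"cycle in tag hierarchy at tag {tag_id}")
        seen.add(tag_id)
        tag_id = parents[tag_id]
    return tag_id


def canonize(conn):
    """Replace every child-tag id with the id of its final parent tag."""
    parents = dict(conn.execute("SELECT id, parent_id FROM tag"))
    targets = (("story_tag", "tag_id"), ("story_tag_alias", "canon_tag_id"))
    with conn:  # one transaction: both tables change together or not at all
        for child in parents:
            root = final_parent(parents, child)
            if root == child:
                continue
            for table, column in targets:  # fixed names, not user input
                conn.execute(
                    f"UPDATE {table} SET {column} = ? WHERE {column} = ?",
                    (root, child),
                )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tag (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    CREATE TABLE story_tag (story_id INTEGER, tag_id INTEGER);
    CREATE TABLE story_tag_alias (
        story_id INTEGER, canon_tag_id INTEGER, alias TEXT,
        PRIMARY KEY (story_id, canon_tag_id)
    );
    INSERT INTO tag VALUES (1, NULL, 'Naruto/Sasuke'), (2, 1, 'NaruSasu'), (3, 2, 'SasuNaru');
    INSERT INTO story_tag VALUES (100, 3);
    INSERT INTO story_tag_alias VALUES (100, 3, 'SasuNaru');
    """
)
canonize(conn)
print(conn.execute("SELECT * FROM story_tag_alias").fetchall())
```

The author's display name stays exactly as written while both tables end up pointing at the same canonical id, and no string matching is involved anywhere.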
Best of all, as niqaeli points out, this can be considered an improvement in user control of their own content. Users would now be able to absolutely control what canonicals are associated with their stories, instead of being forced to leave that up to the wranglers. (Which must surely be less nerve-wracking for the wranglers too.) The user will still have control of exactly how all the text of their content appears and, since no user actually has control of how the navigation structure appears right now, no one will lose any control they had.
So there is an increase in user control, plus an improvement in searchability, a considerable improvement in stability and performance, and a huge improvement in the efficient use of wrangler time that might let them be more pro-active in populating new fandoms with suitable canonicals ready for author use. This should even, much as it disgusts me, let the OTW leadership avoid Step One from my previous post. So get cracking, people.
no subject
I get what you're saying about 'matronizing'... it just read kind of strange. Maybe because 'patronizing' has the sense of 'condescending' (as in, I know better than you) while 'matronizing' made me think of motherly nagging. Which isn't really an "I know better" so much as an "I might not know anything at all but I'm still going to get on your case about it, over and over and over, until you agree". Heh. Which is cultural gender-bias in that interpretation, admittedly.
As for the controlling behavior... it's a bizarre little subculture that's grown up, hasn't it? Seems to me that the original or stated intention -- you can do what you want! with your database! -- has caused a kneejerk kind of reaction amongst the higherups. Like, in trying to allow too much chaos in one area, or trying to let people interact without set boundaries, they ended up going in the total opposite direction to counteract that chaos. Sort of acts like an entire object lesson in how firm boundaries and consistent patterns are required in many things, including interpersonal relationships, archive management, and database tables. Let one area go totally crazy with lack o' boundaries and lack o' consistency, and you end up overcompensating elsewhere. So I guess the way I see it is that if the archive and its structure were more, well, structured, then the management could relax, because things would be clearer. They wouldn't be locking down out of some unspoken emotional need for consistency in the face of so much ambiguity.
The problem (above and beyond the problem of what rules to set) is breaking out of that existing rut of too much one way, not enough another way, to find the balance. Especially since that means reducing a fuckload of drama about the imbalance and the inconsistencies and the control, and that kind of imbalance tends to attract people who are imbalanced themselves -- which just creates a feedback loop between the systems (the archive itself) and the people running/using it.
Which is to say: ultimately, adding or revising tables in the database isn't a simple act of just adding/revising tables. It's something that could, and probably would, have a far-reaching effect on the overall archival-cultural system as a whole, because the database structure is endemic to the entire system's culture and mindset. And thus, my dear, even a seemingly minor change could have ripples far beyond whether we call it 'character/???' or 'character/unknown'.
no subject
Which is really a shame, because the "???" is so much more evocative. I wonder if using entities instead would work...
if the archive and its structure were more, well, structured, then the management could relax, because things would be clearer
*nodnod* I do think that would help a lot. And the same in other areas of the organization, too. There seems to be so much that people are trying to do "by hand", as it were, that should be automated, while critical decisions get "I don't know, what do /you/ think"-ed to death. There are so many places where bylaws should exist and don't, and yet there are umpteen gajillion meetings being held over the tiniest details that the volunteers could really be trusted with. I think you're right on the money about the control having gotten into the wrong places!
no subject
(Learned that as well, from using a symbol instead of a slash -- I think we used a square bullet -- as a compromise between / and x and + and the various fandom connotations for each. Unfortunately, within a few months, half the bullets weren't square but were random bizarre nonsense strings, and I was writing entire functions just to find and change them back after any save. grrrrr.)
no subject