Wednesday, 11 February 2015

Failure to Communicate

So the Senate Committee Report into captioning reform in Australia has been squeezed out. To the surprise only of those who think the world is comprised of caring and fundamentally good people trying to do the right thing, the report is a snivelling apologia for the broadcast industry, the committee politely rolling over, licking their pointy boots, thanking them for its beatings (how else will it learn?) and asking polite permission to be allowed to please put the lotion on its skin.

Right away, Rupert.


It breezes right the fuck past all submissions from consumer groups. Weirdly it quotes them, provides no particular counter-arguments, then decrees whatever free ride the broadcasters were going to get anyway. I’m not going to go through it in detail, because honestly it’s too inane and that’s an exercise that would endumben us both. But for a sampler, the Minister (a man more renowned for Not Being Leader and for an admittedly fabulous leather jacket than for any particular intellect) concludes that:
I want to make it quite clear that broadcasting licensees will still be required to meet the same specified level of captioning for television programs to assist viewers with hearing impairment.
Now I don’t want to give you a sense of déjà vu, learned reader, but I went into some detail about how that was not actually, in the tangible world to which we are confined, the case. As a random sample, the ability to average the captioning output across linked sports channels (a measure which in overseas markets has led to reduced output), a get-out-of-jail clause for technical faults (news flash – the regulated free market can incentivise really clever measures to reduce those technical faults), the expanded exemptions for new channels (such a missed opportunity to build captions into the foundations) and the reduced standards for live captions (keep me honest, ffs!), ALL represent a drop in the Minister’s “specified level of captioning”.

Ah, but in your coat pocket...your card!


…But I don’t want to retread old ground. Indeed, here would be the point, in a well-conducted public debate, where I look at the rebuttals to points such as mine*, develop my defences for that which remains defensible, and abandon that which is not. But there’s very little scope for it. The views of the deaf fell on deaf ears, while erstwhile bastions of fairness among the ranks of broadcasters, like ABC and SBS, proved fair-weather friends. Arguments such as mine were raised in the same way that a budgie smuggled into a coal mine squawks about climate change – unheard, disregarded, directly inactive and dead on arrival. They were not engaged with, but instead placed side-by-side with the submissions of the Grown-Ups and talked past, as if tacitly admonished to run along and play outside.

* In fact, not only were my arguments mirrored in many submissions to the committee, my own words were quoted as part of one group’s submission, so the debate is discombobulatingly direct.



If there is one new argument to be made, it is this. The requirement for consumers to report, rather than broadcasters to annually self-report, was justified on the basis that between 6:00am and midnight, primary channels require 100% captioning, so it’s “easy” for consumers to see when it’s erroneously missing. That argument holds some water. Not enough, but some. But that exact reasoning constitutes a really good reason why averaging of captioning quotas on sport channels is a bad idea. The committee effectively acknowledges that reliable captioning quotas are the easiest kind for consumers to help enforce. Fair enough. Well, then the committee specifically endorses making it harder to report captioning problems on sport channels, because there that reliability gives way to “flexibility” (to offer less, of course – Murdoch’s sport networks already have the flexibility to exceed requirements).

He’s a flexible guy.


We also apparently have “no evidence” that viewers will be less effective at enforcing compliance than comprehensive industry record-keeping. Whereas, you see, we have lots of evidence from broadcasters that it will reduce the regulatory burden. That sound you can hear is a captioner screaming something about unequal consultation, in the darkness, after finding himself tragically unable to shake shake shake it off.



Must highlight this little gem:
The committee agrees that the breadth of consultation in relation to this bill has been insufficient. As a consequence, the effect of some proposed amendments appear to have been misunderstood…
Catch that? Consultation was too short. Not, however, because of any shortcoming in gathering the views of stakeholders and forming a broad consensus model for the legislation. No, “consultation” here means unilaterally talking, and too little consultation means they haven’t browbeaten people sufficiently into submission. For the record, my colleagues, my viewers and I understand. We do. We can’t help but notice it’s just a bit shithouse, is all.



It should be noted the committee represented the depressing bipartisanship for which our Senate is of course so celebrated. Which is to say, weak-as-piss Labor rubber stamping. The committee consists of three Coalition Senators, two Labor and one Green, along with two further Greens listed as “participating members for this inquiry”. The Greens issued a strong dissenting report, calling for a host of sensible amendments and rejecting large parts of what must, therefore, have otherwise been a consensus view (between Rupert Murdoch and his navel).

Bipartisanship.


So what do we do now? Well, unless I’m misreading it, the marginally amended bill still needs to pass the Senate, so if you hear of any Labor Senators ambling in the direction of Damascus, consider rigging up some extra-flashy pyrotechnics and a generous supply of peyote. And if and when we lose, keep on fighting in the form of complaints. Whether you’re a viewer or even a captioner (why not?), complain officially about every error you see, every loss, every time we cover up important information, every program or channel or instant without comprehensive captions. If it falls to you to enforce compliance, do it mercilessly. Don’t ask yourself whether it was a reasonable live error, whether it was a show which maybe didn’t require captions, if it was a technical hiccup at the network. Register every complaint. Don’t feel sorry for captioners. Keeping us honest makes us do better work, and more importantly makes our clients and employers provide us the resources to do better work. We’re on your side, so punish us good.





Disclaimer.

Monday, 9 February 2015

Miss Happens

You know, there has been some weird reporting about live captions lately. The regulator Ofcom in the UK are being reported in various tabloids as suggesting that lots of captioning errors make captions harder to understand. In other news, the sky is suspected of being blue, the Pope showing signs of leaning towards Catholicism and not all bears fully utilise available public restrooms. I’m willing to assume the tabloids buried the lede somewhere and Ofcom went into more detail about error rates, delays, losses and such, which are the nuts and bolts of caption quality, but the tabloid reports seem to leave it at errors=bad, which is a no-brainer. The Mirror at least takes a whimsical approach to it, using errors=bad as a springboard to segue into a compiled listicle of funny mishaps. But the Sunday Express went bewilderingly fire-and-brimstone with it, using what appeared to be around four errors (ironically enough during a discussion of accessibility services) as a cudgel to attack the BBC, without the slightest discussion of what live captioning involves, or what is an acceptable error rate. They suggest that “Western Mark there is an answer” emerged where the caption should have just read “enough”, which I find pretty implausible because I recognise some of the brushwork. “Western Mark” is definitely an unlucky side-effect of that captioner, whoever they may be, saying “question mark”. I’m guessing the “an answer” part was a mishearing, whether human or software, of “enough”. As for “there is”, two possibilities are equally likely. Either they said it, and the Sunday Express had an uncharacteristic lapse in their usually stellar journalistic thoroughness, or they didn’t say it and it was added in an attempt (futile though it may have been) to trigger their voice software’s syntactical context-recognition. “There is enough…” is less likely to be misheard than simply “Enough”. Anyway, as I’ve said before, we’re in the business of the inexact. 
An error every minute is great, an error every two minutes makes you that Jedi-on-amphetamines captioner who we all hate a little bit, and if they think three or four errors is newsworthy, then boy howdy do I have some scoops for them.



On with the mishaps! In the lead-up to BAFTAs and Oscars and such, there’s been some tricky film chatter to caption. I know it’s a cliché to joke about his bafflingly homophonous name, but there was a strange, naughty, zeitgeisty joy in realising “an addict Cumberbatch” had gone to air. An award recipient wanted to “thank these gorges, gorgeous women,” because cliff faces are the true unsung heroes of the film industry. The Clint Eastwood classic Letters From Iwo Jima has a certain stoic profundity. “Letters from your Gmail”… does not. Finally, “a noir writer” emerged as “on a wire writer”, which has a nice kind of poetry.



There’s always a little bit of gambling when place names come up unexpectedly. Dragon has a large native database of place names, and we’ve all added reasonably comprehensive wordlists for the places our captions go to air. Still, there’s always a few in need of refinement. Thus we had “pill bra”, which sounds less like an Australian mining region and more like a place you stash your ecstasy so the bouncer won’t find it. Similarly I’m not sure “Albury wood donger” is a real place. Nor are countries spared, with Guatemala finding its way to air as “quite a Mahler”.

Quite.


Politics also remains a source of endless hilarity, and not just in terms of entertaining “captain’s calls”. Who knows what permutation may govern in coalition in the next UK parliament, but I hope no-one sides with the “glib dams” – their arguments, while pithy, don’t hold water. The NHS has been a hot political topic, but while increasing funding for NHS Blood and Transplant sounds like sound policy, “NHS Lard and Transplant” sounds like a Menulog regret waiting to happen. Voting “by conscience” seems laudable; voting “icon sheds” seems incomprehensible. One politician speaking “in relation to debt” spoke instead about “immolation debt”. And here’s me thinking setting fire to stuff isn’t all that pricey. And politics intruded unwonted when a visit to a toy fair became a visit to a “Tory affair”. Political adversaries became political “at the ferries”, which sounds altogether more pleasant. A colleague’s unfortunate pause made Indonesia’s leader “President Joker”. I guess we all appreciate the success of three-word slogans in politics, like “Yes we can” and “Stop the boats,” so I feel “Kill the Batman” is in with a shot. And I had Syria’s leader down as “resident aside”, which I guess given the displaced people in that area isn’t so much funny as not funny.

The owls are not what they seem.


Nature documentaries – still fun. Usually there’s time to nix these before they go to air, as they’re captioned offline, but they give me a guffaw in my booth. So only I got to see that instead of “our aquatic friends,” the humble fish became “power quantic friends”. And an expert, when asked how fast crocodiles grow, apparently replied “Well, it depends what they’re reading.” Finnegans Wake is, I grant, a challenge to digest. “Wobbegong” is a fun word at the best of times, but Dragon decided “wobbly goal” was a better fit.

Don’t let it get the best of you.


Captioning church remains an error-spotter’s delight. “Liquid myrrh”, admittedly an obscure phrase, came out as “liquid murder”, which would have made for an undeniably more badass (if less pretty smelling) Messiah. And while “coheirs to eternal life” has a stronger scriptural foundation, “co-eds to eternal life” sounds like more fun.

Forgive me.


The gravitas and dignity of history makes captioning errors in historical features all the more starkly silly. Thus I could only giggle as one colleague marked 10 years since a “50-foot salami” buffeted South-East Asia. And again when combat veterans were said to be suffering from “post-dramatic stress” (I guess they method). And I could only gasp, then giggle as another colleague, in a feature on Auschwitz survivors, made the classic “I scream”/“Ice-cream” switcheroo.



Finally, a couple of random mishaps from lifestyle shows. Recreational archery is made decidedly more challenging when the apparatus becomes “bow and error”. The trade-off between cardiovascular fitness and fun seemed reasonable when “aerobic routine” became “erotic routine”. The tips on where an eligible young gentleman can find the best “rattler pad” may come in handy if I want to have my snake bros over to play X-Box. A celebrity was described as being “no shrieking violet,” which is a good thing as that sounds abjectly terrifying. A sport analyst saying “I sense optimism” unintentionally reaffirmed his commitment to Sparkle Motion when it came out as “ice dance optimism”. “Be with you in a sec” and “be with you in a sack” are two very, very different things. And an interactive segment opening with “we’re back to hear all your thoughts” emerged as “wear bacterial thoughts”.



And that, tabloids, is how you make some gosh darn captioning errors. Just a couple of last things before I go. I mentioned the proposed changes to captioning regulation. Well now there is an opportunity to comment on Australia’s captioning regulations. Go do that! And I’ve talked about how much more smoothly the captioning process goes when you’re part of the process rather than an afterthought. This was recently explored in much more detail in a post on iheartsubtitles. And finally, another interesting post on the educational benefits of closed captions. They… uh, exist.




Thursday, 8 January 2015

Quality and Accuracy Part Three: Style and the Great Offline Caper

Happy New Year, captioning enthusiasts! We swan-dive into the thick, opaque molasses of 2015 in interesting captioning times. Australia may or may not have exorcised the spectre of deregulation and quality-cutting, industry-wide security is being tweaked against the threat of hacking (as many have mentioned, captions comprise valuable metadata – a corollary is that it’s data which unsavoury types may covet), but at the cost of some convenience and productivity for captioners, and I discovered on New Year’s Day that captioning live fireworks instead of news headlines at the top of each hour is really quite fun (even if “auld” isn’t the most Dragon-friendly word). So I basically hope 2015 will see continued public support for high-quality captioning, smooth and user-friendly security protocols, and colourful explosions just every day.

If fireworks persist, see your doctor.


Another development over the past few months has been a diversifying of skills for your friendly neighbourhood Rogue Captioner. I mentioned in the very beginning that there are two basic strands of TV captioning – live and offline. Live captioning involves either stenography or respeaking in real time as things go to air like an uncommonly literate hamster, sometimes combined with cueing out prescripted elements. While I remain primarily a live captioner, I’ve been gradually learning most of the steps in the shadowy world of offline captioning, and filling in the sometimes-unpredictable gaps between manic live times with the more methodical offline work. I thought I’d share a little about how it’s done. As it makes sense here to go into style and standards, this post also forms the much-belated third part to my “quality and accuracy” series. The other parts covered losses and errors.



You may be surprised how much more time goes into offline than live captioning. Live captioners typically produce output at a rate of around 5:2, since they require some prep beforehand (and of course a sandwich after), and then usually share the load 50-50 with a co-pilot. So fully captioning a two-hour segment requires two captioners to each prep for roughly half an hour, then alternate 15-minute or 30-minute slots on air, for a total of five captioner-hours. But offline is a really different, and much more chronovorous, ballgame. It’s rarely less than 10:1 – that’s 10 hours of work for one hour of captioned content – and even that kind of efficiency may only happen if you’re an unusually dextrous millipede with opposable thumbs.
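For the arithmetically inclined, here is a rough Python sketch of those ratios. The function names are made up for illustration, and the live figure rests on my assumption that both captioners stay rostered for the whole broadcast while they alternate slots, which is how the five hours add up.

```python
def live_captioner_hours(content_hours, captioners=2, prep_hours_each=0.5):
    """Captioner-hours for a live segment.

    Assumes every captioner preps beforehand and then stays rostered for
    the whole broadcast, alternating on-air slots with their co-pilot.
    """
    return captioners * (prep_hours_each + content_hours)


def offline_captioner_hours(content_hours, ratio=10.0):
    """Captioner-hours for offline work at a given work-to-content ratio."""
    return content_hours * ratio


if __name__ == "__main__":
    print(live_captioner_hours(2))     # 5.0, i.e. the 5:2 rate
    print(offline_captioner_hours(2))  # 20.0 at the 10:1 best case
```

So the same two hours of television costs roughly four times the labour offline, and that is on a good day.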



So where does the extra time come from? Well, the glib answer is that offliners are lazy. The real answer is perfection and timing. Live captions slither and snake their way onto the air, a word at a time, with around a five-second delay and an accuracy rate of between 97.5 and 99 percent. Offline captions appear in carefully sculpted blocks, adhering to a long list of style guidelines, exactly as the speaker is talking. This post will take you through the process of getting a captioned program ready for broadcast, up to the point of a final edit (that part isn’t yet among my responsibilities, so there be sea monsters), and let you in on the sorts of things we need to keep in mind. Two main processes need to happen – first scripting, and then file/fix-up – with some inevitable overlap between them.



So we first receive an episode, in the form of an MPEG file, from a broadcaster. Captioners assign themselves all or part of the runtime of the episode, depending on how much time and caffeine they have available, then get to work creating a script. The first step in scripting is to import it into our offline captioning software. This software is designed to stop, collaborate and listen with both Dragon, the speech-recognition software for respeakers, and the shorthand software used by stenographers. It combines video-navigation functions like play, stop, slow-motion, or (very usefully) jump-back-one-second-and-play, aka the “what was that?” button, with captioning functions like colour change, positioning on screen, and adding, deleting and combining captions. It adds up to an absolutely dizzying array of keyboard shortcuts, and watching someone really experienced use it can be quite baffling. I ain’t there yet, so the shortcut I’ve most mastered is Ctrl-Z, to undo. Next, before you get started, you need to make sure the timecode on the video matches that in the caption file, which can be thought of as the captioning equivalent of the clapper used to synchronise audio and visuals in film.



The software then takes a moment to create some invaluable metadata (I barely knew her data!) which maps out the audio track, calculating the “shape” of the sound in a way which will help it to guess where each caption should fall. Interestingly though, it also maps out the visuals, marking out where all the shot changes fall. When I discuss the second process, file/fix-up, that will come in handy.

Metadata: Nothing Whatsoever to do with Envelopes.


So now the main work of scripting can begin (at the very beginning, which Julie Andrews tells me is a bonza place to kick off). On the first play-through of the file, or “first pass”, we respeak it in much the same way we would for live content, with a few differences. Firstly, since we can pause and go back, there is no sense in paraphrasing to avoid words we don’t know or can’t catch, or to get around cross-talk (characters speaking over each other) or fast dialogue. The first pass can be far from perfect, but it must at least be reasonably complete. While in live captioning it sometimes makes sense to skim or just convey the gist, offline has no place for that. Secondly, the first pass is where we begin to create the timing. We do this by setting markers (more keyboard shortcuts) where a conversation or section of narration or whatever begins and ends. The software then gets fancy. It guesses the breakdown of the captions, based on colour changes, punctuation and a two-line limit, then looks at the shape of the audio from that metadata I mentioned, and roughly matches up each caption to that like an audiovisual OkCupid. So let’s say I’m captioning this scene:


I would put a section opener before “Big man in a suit of armour. Take that off, what are you?” and a section closer after “Genius, billionaire, playboy, philanthropist.” The software would recognise three sentences, each comfortably within two lines in length, and would split it accordingly into three captions. Then it would read the audio track metadata and see three little corresponding spikes at frequencies consistent with human speech. I’ve told it where the first caption begins and the third one ends, so it should make an educated guess that the second caption begins after Steve pauses, and the third begins where the voice soundwave shifts to Tony’s taunting tone. Those sentences were all similar in length, but if they vary, the software can include sentence length in its calculations. It often gets it wrong (single words which are held a bit long like “Stella!”, rapid-fire sentences, and lyrics cause particular problems), but it gives us something to work with when I eventually come back and fix the timing.
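I have no idea what the software actually does under the hood, but the sentence-length part of that guesswork can be sketched in a few lines of Python. This is a deliberately naive stand-in: it ignores the audio shape entirely and just shares out the time between my two markers in proportion to how long each sentence is. The function name and the sample times are invented for illustration.

```python
import re


def rough_timings(text, start, end):
    """Split a marked section into sentences and share out the time.

    Each sentence gets a slice of the section proportional to its
    character count. The real software also looks at the audio shape;
    this sketch does not.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]
    total = sum(len(s) for s in sentences)
    timings, cursor = [], start
    for sentence in sentences:
        duration = (end - start) * len(sentence) / total
        timings.append((round(cursor, 2), round(cursor + duration, 2), sentence))
        cursor += duration
    return timings


if __name__ == "__main__":
    scene = ("Big man in a suit of armour. Take that off, what are you? "
             "Genius, billionaire, playboy, philanthropist.")
    for caption in rough_timings(scene, 0.0, 10.1):
        print(caption)
```

It would get “Stella!” wrong for exactly the reason described above: one short word held for a long time looks, to a length-based guess, like it deserves almost no time at all.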



So after the first pass we have a rough script, roughly timed. The aim of the second pass is to get it word-perfect. We watch through again, pausing to fix any Dragon errors, verify any proper nouns, and standardise spelling to our style guides. For Dragon errors, there’s a handy keyboard shortcut which cycles through homophones of a selected word, so you can quickly turn nay into neigh if the politician voting against the motion turns out to be Mr Ed. For proper noun verification we can use the credits, imdb, or else my company maintains an enviable database filled with soap opera family trees, the baffling names of reality TV contestants who Didn’t Come Here To Make Friends, and street directories for places that never were. I won’t subject you to all of our spelling standards (with some exceptions, reciting the dictionary isn’t the best way to dazzle readers), but one of my favourite documents is our official spelling list of non-verbal sounds. So “eugh” expresses disgust, but “ew” is used to “express disgust, Valley Girl-style”. “Ah” always expresses discovery and “uh” uncertainty (except where it’s within “uh-oh,” “uh-uh” or “uh-huh”), even though what you actually hear may sometimes be the other way around. “Oh” is surprise or an interjection, but “Ohh” is emotional pain (“O” sometimes comes up as a religious invocation, but usually needs specific verification). If it’s quizzical or contemplative it’s “hmm”, but if it conveys agreement or pleasure it’s “mmm”.

Another acceptable usage.


We also resplit captions at this point to be as readable as ephemeral text on a glowy rectangle can be. Here there can be some trade-offs. We try and keep sentences, or clauses, or concepts, or individual speakers, together. We try not to end either a line or a caption with a preposition (of, for, under…), a conjunction (and, but, so…), an article (the, a, an) or a verb – anything, basically, which belongs with the word that follows. We try not to have a book or movie title go over a line or caption break. We don’t use semicolons as they’re difficult to read without the ability to glance back over the first part. We avoid colons in most cases as when captioning they specifically mean “read the screen”. So if a phone number is printed onscreen, we might caption “For a free trial appendectomy, call:”. And more generally we err on the side of shorter sentences as they’re more readable onscreen.
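The “don’t leave a word dangling” rule is the sort of thing a script can at least flag for you. Here is a small Python sketch under loud assumptions: the word list is a tiny illustrative sample rather than our style guide, and it makes no attempt at verbs, which would need proper part-of-speech tagging.

```python
# Words that belong with whatever follows them, so a line or caption
# should not end on one. Illustrative sample only, not a real style guide.
DANGLERS = {
    "of", "for", "under", "in", "on", "with",  # prepositions
    "and", "but", "so", "or",                  # conjunctions
    "the", "a", "an",                          # articles
}


def bad_line_endings(caption):
    """Return the lines of a caption that end on a dangling word.

    A final word followed by punctuation ("What for?") is left alone,
    since the sentence has ended and nothing is left hanging.
    """
    flagged = []
    for line in caption.splitlines():
        words = line.split()
        if words and words[-1].lower() in DANGLERS:
            flagged.append(line)
    return flagged


if __name__ == "__main__":
    print(bad_line_endings("I went to the\nshops and bought a hat."))
```

A flag is only a prompt to look again. Sometimes keeping a speaker or a clause together matters more, which is where the trade-offs come in.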



So once we finish the second pass and a quick spell-check, that wraps up the scripting phase. Next comes a grab bag of chores under the heading “file/fix-up”. Now that we have a word-perfect script, the next big task is timing. A handy keyboard shortcut which moves the video to the beginning of the caption you’re editing becomes your friend at this point. Hopefully many of the captions will be sitting roughly where they need to be, but we go through and meticulously adjust the start and end times of each caption to correspond with when the speaker is doing their speak thing. There’s a few exceptions though – the captions should be no shorter than one second, even if the utterance is, because while a very short caption doesn’t take long to read, it might take a moment to notice. A caption can linger longer for readability if it exceeds 300 words per minute. If a pause is very short, like on a dachshund, we don’t put a gap between captions. It looks more polished, and it also means that when the captions do cease, the viewer will subconsciously know the conversation has paused and they can safely look around the rest of the mise-en-scène without having to be immediately yanked right back to the captions. For similar reasons, we try and align the beginning and end of captions with any relevant shot changes, even if the speaker begins talking slightly before or after the cut. And they often do – there’s a film editing technique called a “sound bridge” which involves using sound to smooth over a visual cut. Sound bridges can also mimic the way our senses work – we begin hearing a sound and then look up. But a cut involves a whole new slab of (visual) information to take in. If a caption comes just before, the viewer might be intently reading it and miss something visual. If it comes just after, they might be taking in the visuals and run out of time to read the caption. If it’s simultaneous, it maximises the time to take both in.
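A couple of those timing rules are mechanical enough to sketch in Python. Treat this as a toy, not a description of our software: the one-second minimum and the 300 words per minute figure come from the rules above, but the snap window of 0.3 seconds and both function names are my own inventions for the example.

```python
def adjust_timing(start, end, shot_changes, min_duration=1.0, snap_window=0.3):
    """Apply two of the timing rules to one caption (times in seconds).

    1. Snap the start and end to a shot change if one is within snap_window.
    2. Stretch the caption so it stays up for at least min_duration.
    """
    def snap(t):
        nearest = min(shot_changes, key=lambda cut: abs(cut - t), default=t)
        return nearest if abs(nearest - t) <= snap_window else t

    start, end = snap(start), snap(end)
    if end - start < min_duration:
        end = start + min_duration
    return start, end


def words_per_minute(text, start, end):
    """Reading rate the caption demands of the viewer."""
    return len(text.split()) * 60.0 / (end - start)


if __name__ == "__main__":
    # A half-second utterance that starts just after a cut at 10.0 seconds.
    print(adjust_timing(10.1, 10.5, shot_changes=[10.0, 14.0]))
    print(words_per_minute("Why shouldn't the guy let off a little steam?", 0.0, 2.0))
```

In practice a human still makes the call on each cut, since only some shot changes are “relevant” and no snap window knows which.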



So back to the above Avengers clip, let’s look at the timing considerations. No-one is talking fast enough to present serious problems with reading speed, and not many of the pauses would be long enough to justify a gap in the captions. Tony’s “Why shouldn’t the guy let off a little steam?” gives a nice example of a sound bridge as the cut from a long shot to a mid shot happens just before he finishes talking. So we’d probably clear it right on the cut, which frees up the viewer to take in the visual tension between Steve and Tony. Whereas for Tony’s last close-up, he starts saying “Genius, billionaire, playboy, philanthropist” just after the cut, but close enough that you’d probably start the caption on the cut. As an added bonus, this makes it easy to see who is talking, as it ties the shot and the caption together, like a comic panel and a speech bubble.



We also do caption positioning at this point. I’ve mentioned the principles of this, and the main things are avoiding speakers’ mouths and important visual information. By default we hug line 20 at the bottom of the screen, raising over any supers when necessary. Interestingly, we also have to raise for 10 seconds after ad breaks in case network promos are added in post-production, after we’ve done our thing (or when the show is repeated). We have to make sure there’s at least a second clear just before and after ad breaks, as that can cause “hanging caption” glitches. For the same reason, a blank caption is needed at the beginning so any late-running ad captions don’t get stuck. We insert labels where it isn’t clear who is speaking, sound effects where relevant, and a captioning company credit at the end. We run a battery of tests which check for errors, short gaps, minimum and maximum lengths, word rates, overlaps, captions too close to shot changes, spelling, homophones and invalid characters (some text we copy in comes with the wrong kinds of apostrophes, which is a headache), and then we…uh, watch the show. We watch it as it will air, or at double speed if we’re running short of time, and look for anything that seems wrong or unclear.
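To give a flavour of that battery of tests, here is a Python sketch covering three of them: minimum length, overlaps and invalid characters. The data layout, the function name and the assumption that curly quotes are the offending characters are all mine. The real checks are more numerous and considerably fussier.

```python
# Curly quotes and apostrophes, assumed here to be the "wrong kinds"
# that sneak in with pasted text.
INVALID_CHARS = {"\u2018", "\u2019", "\u201c", "\u201d"}


def run_checks(captions, min_duration=1.0):
    """Return (index, problem) pairs for a list of (start, end, text) captions."""
    problems = []
    for i, (start, end, text) in enumerate(captions):
        if end - start < min_duration:
            problems.append((i, "too short"))
        if any(ch in INVALID_CHARS for ch in text):
            problems.append((i, "invalid character"))
        if i > 0 and start < captions[i - 1][1]:
            problems.append((i, "overlaps previous caption"))
    return problems


if __name__ == "__main__":
    sample = [
        (0.0, 2.0, "Hello."),
        (1.5, 2.0, "It\u2019s me."),  # too short, curly apostrophe, overlapping
    ]
    for index, problem in run_checks(sample):
        print(index, problem)
```

None of it replaces the last step, which is still a person watching the show.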

Nup, all seems in order.


And then we send to the editor, and go get a sandwich.




Wednesday, 26 November 2014

How do you like them Mishapples?

Hello to blogglers new and old! For the uninitiated, Mishaps is the kind of post where I share the gems of captionfails, mine and those of my colleagues, largely resulting from lapses in judgement on the part of the usually-excellent Dragon voice recognition software into which we murmur our days away.



Democracy in Hong Kong was the first cab off the rank this month. While Mong Kok may one day go down in history as the birthplace of Chinese democracy, “mong cock” as occasionally offered by Dragon might require a visit to the chemist and some awkward phone calls to former lovers. In a similarly intimate vein, the Gospels could certainly have been more fun if Jesus’ prophecy to Peter had climaxed when “the cock grows three times”, as Dragon inferred. Couldn’t help thinking of the Grinch’s condition.



It’s nice to see sophistry named for what it is in political debates, but Dragon decided Nigel Farage rather has to answer for his “soap history” – I guess no man should know that much about EastEnders. Farage also had plenty to say about EU bureaucracy – but since Dragon seems to kind of like paperwork, the captions instead offered “you rock receipt”. On the subject of UKIP, it was suggested we could “see coalitions” forming. Unfortunately, captions viewers were treated to the rather incomprehensible “sequel ocean is”. Personally, I didn’t think the Ocean’s sequels were so crash hot. Clooney kind of phoned them in.

And then I say “She looks like Julia Roberts.” Hilarious.


A random netball allocation provided some unintended entertainment. “Sweet netballing moves”, which is a valid if slightly vague observation about the state of play, was unfortunately captioned as “sweet and appalling moves”. Also instead of players who “wear bibs”, we had players who “wet beards”. Which to me sounds like a rather different game.

A dark, dark game.


Nature documentaries continue to provide counter-intuitive captioning fun. A particularly biodiverse environment was described as having a “huge variety” of birds. Unfortunately Dragon preferred “huge pariah tee” which sounds more like a controversial extra-large shirt. I laughed too at the lethargic pedagogy implied by “Galapagos taught us”. Taught us what? We may never know. Bunny domiciles were referred to as “rabbit warrants”, and I was heartened to hear those cotton-tailed homeowners knew their rights when the adorable rabbit cops came a-knocking. The rainforest canopy, we learned, offers woodland creatures the odd “shady brunch”. I can only infer surreptitious eggs benedicts and bootleg flat whites. Ducks took a turn for the emo, apparently having “wept feet”, and sparrows had their hip hop dreams dashed, due to having a “cracker-type beat” with which to eat their nuts and grains and so forth. And a wombat was referred to as a “big boy” by his handler, which is a fair observation. But Dragon took it in a very nihilistic direction, making him a “big void”.

Negative-space wombats next 5km.


Russian diplomacy remains a constant menace in the news. It perhaps doesn’t help when “in principle” support becomes “end printable support”. Those printouts could prove to be important. I did enjoy watching “Putin’s tirade” become “Putin’s Thai raid” though. When the man badly needs a chicken laksa…well, too bad for Ukraine I guess.

He may also enjoy French Canadian hangover food.


Usually Dragon is really good at context. But I’ve had problems before with countdowns. This time, we saw “Three, two, wine…” Hard to argue with his initiative.



Islamic State keep on making the news. Or more pedantically, “so-called Islamic State”. Awful, really. But it did make me laugh when Dragon rendered it “is Linux 8”. That wily OS.



Ebola is also still in the news, with Dragon unfortunately choosing to focus rather superficially on the fashion. Thus instead of “a full hazmat suit” we had “a fall hazmat suit”, presumably distinguished by a flirty off-the-shoulder hermetically sealed cardigan. And while international aid agencies need people who “can coordinate” their response, when captioned as “camcorder mate” it becomes that one guy who likes to record himself jumping off stuff and put it on YouTube. Presumably his heroic moment will come.

Immortality beckons.


Sometimes there’s a very fine line between right and wrong. Case in point: “children’s supply business” becomes “children supply business”. One makes toys, the other gets you on a watch list.



Sometimes it’s the simplest lines which go awry. Thus “there it is” became “buried as”, which turned an innocuous statement into a Kiwi synopsis of Kill Bill 2.

Buried as, bro.


Dragon keeps trying to convince me Indigenous Australians threw “returning meringues”. Who am I to disagree?



It was unfortunate when a character on a soap called “Evie” was named in a caption as “easy”. Not because it was wrong, but because spoilers!

Easy, let your hair hang down.


Dragon tried to convince viewers there was such a thing as “Barack architecture”. I wonder what that would look like?



This one was an offline file, so I had time to correct, and I knew I might need it. The presenter was talking like a pirate, and I figured I’d try my luck. For the record “me hearties” is interpreted by Dragon as “many parties”. I love how dogeish that sounds.

Wow.


Finally one from an overseas colleague, which apparently went to air during the news. “His Holiness the Pope” came out as “His whole penis the Pope.” ‘Nuff said.

“And then I said, consider it a Papal endowment!!”


Just to finish with, a few interesting things from the captioning world. This sketch on captioning beautifully demonstrates the problems which remain in unmediated voice recognition software. This paper on caption placement concisely analyses some excellent data on where our eyes tend to look while watching the captions. This BBC America ad for closed captions made me laugh. And this blog post on the choices we make as live captioners was well worth a read. Try as we might to be invisible and to translate faithfully, we always impose our own perspective.



Oh yeah, and after my last post decrying the unfortunate changes to captioning laws which were suggested by the Australian Federal Government, some good news. The changes were deferred, pending further input. I’m hopeful that means that now the changes won’t simply be waved through without resistance. Viewers, captioners and content creators have noticed – just in the nick of time.




Disclaimer.

Friday, 31 October 2014

Ill Communication

So Malcolm Turnbull has just announced that captioning requirements will fall within the remit of the Government’s next adorably labelled “Red Tape Repeal Day ™” on 29 October. This is extremely troubling, as captioning requirements in Australia represent carefully designed, evidence-based policy which in their current form balance the needs of stakeholders, who include regular viewers, occasional viewers, content creators, broadcasters, captioners, advertisers and any data analysts who may use captioning output as metadata. Also, as some broadcasters hew tenaciously to the bare minimum of their legal captioning requirements, any loosening of the latter runs a very tangible risk of making your life worse if you use our services.

Not all red tape is bad you guys.


I’ll state at the outset that because these changes to captioning requirements form a small part of a much larger and fairly intricate repeal of broadcasting legislation, I’m relying on relevant sections of the explanatory memoranda authorised by Malcolm Turnbull and the initial report by Media Access Australia (who will no doubt have more to say in future). If I’ve misunderstood anything by not digging through the raw text of The Broadcasting and Other Legislation Amendment (Deregulation) Bill 2014 then please let me know.



The headline change is that free-to-air broadcasters will no longer have to report annually on their compliance or otherwise with captioning requirements. Instead, per Turnbull, “we are moving from annual reporting to a complaints-based approach”. There are a number of problems here. Broadcasters (and in some cases the third-party captioning companies they contract) are the ones best placed to collect data on their own captioning quality. Every time we have a loss, the supervisor at the front desk has to take action. To wake up the captioner, to find another captioner, to check the gateway, to scramble tech support, to contact the network – to do something. Once the crisis is over, they’re therefore perfectly situated to jot down a quick report. Time of day, amount of loss, reason, solution. Or when our accuracy slips below an acceptable threshold, we can refer back to our comprehensive text logs, as well as auditing them randomly to make sure all captioners are maintaining 97.5 or 98 percent. Networks will now only have to keep AV logs for 30 days. At our end we don’t keep comprehensive AV logs, but I assume the networks do anyway, for many reasons.

Why wouldn't they?


So the existing statutory annual report does not involve providers collecting any information we don’t already collect (a phrase Turnbull and Brandis seem to like). Annual reporting is just compiling the info we already have, and I would hope that my employers will continue to collect it anyway, so they can at least satisfy themselves as to their quality – never mind the regulations. As I’ve noted before, we already have a “complaints-based approach” as well. ACMA can hear and investigate complaints, and have done so a few times this year. So don’t be under the impression we’re adding any new types of scrutiny. This is purely about taking away accountability and eliminating records of shitty performance.



And there are some problems with a solely complaints-based model, which is a bit like dissolving the Tax Office and letting people just dob in their tax-dodging neighbours. Viewers don’t necessarily know their rights with respect to captioning. If a show isn’t captioned, they can’t necessarily be sure that it’s required to be. They might complain only to find it exempt from the requirements, or they might not complain because they don’t know they can, or don’t know who to contact, or what kinds of accuracy to expect. If like so many, for instance, you were under the impression we type fast on a normal keyboard, you might be so dazzled by our speed as to be unduly tolerant of bad performance, or bewildered by all the homophones and uncertain whether to complain. Programs with a civically engaged viewership, like Media Watch or The Project, might get dozens of complaints in the event of a sloppily captioned episode, but a dodgy Monster Trucks Almanac might get far fewer. We mustn’t be in the business of discrimination.

But instead the business of awesome.


The next part includes a kernel of sensible policy, but still causes some consternation. ACMA will be required to take into account whether a program is pre-recorded or live, or (most intriguingly) a combination of both, when determining whether captions are up to scratch. In itself this may provide a more nuanced analytical instrument. Our standards do and should vary between pre-recorded and live content – our workflow certainly does. The danger of introducing this in the context of red-tape-obliteration is that live (or “late” – where the media gets to us beforehand but too late for the more involved offline captioning process) could become a get-out-of-jail-free card, where the networks justify substandard content by calling it live. There is also a proposal to make “engineering or technical failures” a cause of exemption. While I hate having my accuracy stats compromised by technical failures, the simple problem here is that the engineering, too, forms part of the networks’ responsibilities. Now they will have less incentive to build and maintain backup and redundancy plans, and to exceed their captioning requirements in case something goes awry. Caption an extra hour every day and you can afford an hour-long catastrophe, any day of the year. They may have less incentive to give us early access to files, or to make scripts and other resources available which improve live captioning. I just fear it will manifest as an overall loosening of the standard.



A similar concern accompanies the planned extension of the deadline to apply for exemptions to captioning standards. I’m not bureaucrat or wonk enough to know the implications of this – I just make captions. But prima facie it looks like a less stringent standard. Like more things may slide.



Things are definitely gonna slide, in all directions but up, as a result of the plan to let associated groups of sports channels “average” across their networks to meet their targets. As it stands now, each individual channel must meet its targets. In some circumstances, such as during the Olympics or World Cup, which may show for example on two of their five channels, they will voluntarily exceed their requirements on those channels. This change, though, would let them use that voluntary and commercially viable exceeding of the rules as a licence to neglect their other targets. Sports networks usually have a predictable weekly schedule and caption accordingly, so it will also likely increase viewer uncertainty about where to expect captions. A sport which required captions last week to meet the quota might not this week, solely because the Commonwealth Games is on the other network. And this too decreases incentives on networks to exceed requirements, in case of technical faults or other problems.
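To make the averaging worry concrete, here is a minimal sketch of the arithmetic. Every number in it is made up for illustration: the 50% quota, the five channel names and the weekly hours are assumptions, not figures from the Bill or from any real network. It just compares a per-channel test with an averaged one.

```python
# Hypothetical numbers throughout: the quota and the hours below are
# illustrative only, not taken from the legislation or any real network.
QUOTA = 0.50  # assumed required share of broadcast hours that are captioned

# channel -> (captioned hours, broadcast hours) over one week
channels = {
    "Sport 1": (160, 168),  # big event, voluntarily over-captioned
    "Sport 2": (150, 168),  # big event, voluntarily over-captioned
    "Sport 3": (40, 168),
    "Sport 4": (40, 168),
    "Sport 5": (40, 168),
}


def per_channel_compliance(channels, quota):
    """Current-style test: every channel is judged on its own share."""
    return {
        name: captioned / broadcast >= quota
        for name, (captioned, broadcast) in channels.items()
    }


def averaged_compliance(channels, quota):
    """Proposed-style test: pool all hours across the group, then compare."""
    total_captioned = sum(c for c, _ in channels.values())
    total_broadcast = sum(b for _, b in channels.values())
    return total_captioned / total_broadcast >= quota


if __name__ == "__main__":
    print(per_channel_compliance(channels, QUOTA))
    print(averaged_compliance(channels, QUOTA))
```

With these invented figures, three of the five channels fail when judged individually, yet the group as a whole passes once the hours are pooled. The two over-captioned event channels carry the other three.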



The next requirement is that new subscription TV channels will be granted a 12-month exemption from captioning requirements. This kind of just makes me sad. A start-up TV channel obviously has its work cut out for it and I understand the impulse to make it easier on them. Anything for a bit of media diversity. But a brand new channel is also such an opportunity. If they had even a modest captioning requirement from the start, they would be more likely to build in efficient systems and procedures, which could then be scaled up. The worst captioning clients to work with are those for whom we are an afterthought (or non-thought); the best are those for whom we are part of the production process.



The last proposal is that programs previously captioned on a different subscription channel will no longer have to be captioned. Firstly, I don’t like it because it discourages networks from including caption files with programs they sell or repeat, which is totally a thing that can be done. Secondly, it sounds a bit like a butchered response to Media Access Australia’s suggestion that, since quotas are confusing, and tracking what is a repeat is difficult, simple 24-hour quotas should be substituted. The profit motive should ensure that the low-hanging fruit of repeat captions get picked anyway. Of course, there doesn’t seem to be anything here about strengthening quotas, so I guess the point is diminished responsibility.

“You wouldn’t like me when I’m disappointed.”


The net effect is an overall decline in standards, stringency and enforcement. I want to finish by illustrating, though, how this will not just potentially, but certainly, damage the quality of captions. As I’ve mentioned, this 98% accuracy standard I maintain for my respeaking means one word in 50 is wrong. At the start of this sentence, this paragraph was at 51 words, so imagine an error in every block of text that size. Now, I maintain it by usually having about 30 minutes of prep or standby time rostered for every 30-minute live session, or before each block of three or four “15-on, 15-off” cycles. Live is live – if I’m unprepared I can still do my thing. But for every minute less prep time, I have to cut corners. Since it’s NFL season, I’ll use the example of an American college football game. Let’s say there’s a game on between the Massachusetts Coffee-Preferers and the New Mexico Coverups. The captioner has had a crash, and their co-pilot isn’t available. A supervisor puts me on the air, and I immediately start talking. Now while on the air, I’ll engage my NFL house-style. I’ll google the teams and put their quarterbacks in my temporary macro slots, then do the same with the coaches. My Dragon won’t know the player names, and I can’t program them in while live (American football teams are huge). But with the playmakers locked in, at least some of what the commentators say can be transmitted without saying “this fellow threw the ball to that fellow, and this other fellow blew the whistle as the first fellow was tackled by yet a third fellow…”. All this googling also means I’m not watching my output, and so any Dragon errors will definitely go uncorrected. It’s good enough for emergencies, but hardly optimal.



Let’s imagine the same situation but with five minutes of prep. Now I can google teams, organise player names into easy-to-read lists and copy them in as required once I’m on the air. I’ll still start off with the quarterbacks and coaches ready to hand, and now I can watch my captions go out while I’m live. A quick sequence of passes will still be beyond me, as I can only juggle a few names at a time this way, but it’s an improvement.



10 minutes of prep, and I’m doing the above but also training the names (both surnames and full names) into my Dragon. So instead of saying “macro one” I’ll actually be saying “Jafarius Jones III”. Now we’re getting somewhere, though once on air I find that half the time it’s coming out as “Jeff ARIAs Joan’s a turd”.



20 minutes of prep, and I start getting time to practise each name off-air, maybe using it in a sentence. I see what works, what needs re-training, what needs a house-style to correct an obvious homophone. Now the player names will be a pretty well-oiled machine (I still keep the list open in case something goes pear-shaped). But then they start talking about past player Steeve Stevesen, whose record is about to be broken that day, and back-room talks about who has been approached to replace flagging coach Nigel Blowhard, and I’m a bit lost.



So with the full 30 minutes, I’ve also skimmed a few sports pages, harvesting the names of other coaching candidates Felix Moneyball and John Lapsalot, and finding out the details of Stevesen’s record. I’ve also got the name of the stadium (the Starbucks-Sons of Liberty Arena) and found its nickname (Whipless-Mochachino-Freedom Park). That and only that is how to exceed 98%.
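For a sense of what those accuracy percentages mean on screen, here is a back-of-envelope sketch. The 98% and 97.5% figures are the ones mentioned in this post. The 160 words-per-minute speaking rate and the function names are assumptions made purely for illustration. Exact fractions are used so the arithmetic doesn't pick up floating-point fuzz.

```python
from fractions import Fraction


def words_per_error(accuracy):
    """Average number of words per error at a given accuracy (e.g. "0.98")."""
    error_rate = 1 - Fraction(accuracy)
    return 1 / error_rate


def expected_errors(accuracy, words_per_minute, minutes):
    """Expected number of wrong words over a session of the given length."""
    error_rate = 1 - Fraction(accuracy)
    return error_rate * words_per_minute * minutes


if __name__ == "__main__":
    ASSUMED_WPM = 160  # assumed speaking rate, not a measured figure
    for accuracy in ("0.98", "0.975"):
        print(
            accuracy,
            "-> one error every",
            words_per_error(accuracy),
            "words;",
            expected_errors(accuracy, ASSUMED_WPM, 30),
            "errors in a 30-minute session",
        )
```

Under that assumed speaking rate, half a percentage point of accuracy is the difference between roughly 96 and 120 wrong words in a half-hour session, which is why the gap between "nearly there" and "there" matters more than it sounds.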



The captioning industry is competitive, and we already don't have enough prep time. Whoever can satisfy the requirements for the lowest price will get the contract, and captioning companies undercut each other on price. There is no possible world where reduced requirements in volume and accuracy won’t be passed on to captioning companies in the form of pricing pressure. Which must then be passed on to captioners in the form of decreased preparation time. Which must then be passed on to viewers in the form of unreadable nonsense. The only winners are the networks’ profit margins, and you, dear reader – I’m going to have so many mishaps to report in the future.


