Recently, while working on my novel, The Technowizard Guardians Of The Infinite Worlds Of Fandom, I decided to do a fun experiment. I went into Scrivener (which, incidentally, is the best writing software on the planet, and I'll soon be selling it here as an affiliate!) and wrote up a regular expression to look for telltale signs of that dreaded bugaboo, the "passive voice." You know what that is, right? The passive voice is when the subject of the sentence receives the action instead of performing it. Y'know, as in, "Frank was gruesomely murdered by a horde of zombies." The zombies committed the act of murder, but Frank is somehow the subject of the sentence! Weird, huh? Well, I did this, and oh . . . my . . . GOD. I have so many occurrences of this shit. I had no idea. I am apparently really bad about overusing this particular crutch of bad, lazy writing. I must suck, right? I mean, really, Andy? You've been at this writing thing this long and you're still pulling this shit? Jeebus Cry-me-a-river! For reals, yo. Get with it, Andy; get with it. Do your job, inner editor! So, I decided to go through the manuscript and eliminate each and every occurrence of this passive voice bastard, wherever and whenever doing so was necessary. Active voice sentences for one and all!
(It's not quite as simple as that, of course; the passive voice does have its uses. Not every sentence can be in the active voice, nor should it. Some sentences need to remain in the passive voice in order to have their desired effect, to sound the right way, or to roll off the tongue correctly. Just for variety's sake, for crying out loud! But for the sake of clarity, a majority should be in the active voice. Because you want your story to be active, to be full of life, to be full of piss and vinegar. You want your story to scream off the page, yelling, "READ ME! I AM ALIVE! READ ME GODDAMN IT!" Not lying there like a wimp, going, "Oh yeah. Pick me up. Or don't. Whatever.")
The first step was looking for all the "to be" verbs ("is, isn't, are, aren't, was, wasn't, were, weren't, be, being, been"), as those often signal passive voice, though plenty of perfectly active sentences use them too. The way I did this with regex is like so:
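Here is a minimal sketch of that search, run through Python's `re` module rather than Scrivener's search box. The pattern is a reconstruction from the verb list above, not necessarily the exact one typed into Scrivener, and the name `find_to_be` is made up for illustration. It accepts both straight and curly apostrophes, on the assumption that the manuscript may contain either.

```python
import re

# The "to be" forms listed above. The class ['’] accepts a straight or a
# curly apostrophe (an assumption about what the manuscript contains).
TO_BE = re.compile(
    r"\b(?:is|isn['’]t|are|aren['’]t|was|wasn['’]t|were|weren['’]t|be|being|been)\b",
    re.IGNORECASE,
)

def find_to_be(text):
    """Return every "to be" form found in text, in order of appearance."""
    return TO_BE.findall(text)

print(find_to_be("Frank was gruesomely murdered by a horde of zombies."))
print(find_to_be("They weren't ready, and it isn’t being fixed."))
```

Every hit is only a candidate. "Frank was tall" trips this search just as readily as "Frank was murdered," so each one still needs a human look.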
The next step is looking for "ed" and "ing" words followed by the word "by." (See? I just used passive voice, right there.) You do that with regex by doing this:
\b([a-z]+ed by|[a-z]+ing by)\b
Put both steps together, with a couple of extra word endings thrown in, and you get one big search:
\b(is|isn\’t|are|aren\’t|was|were|wasn\’t|weren\’t|be|being|been|[a-z]+ed by|[a-z]+ing by|[a-z]+y by|[a-z]+e by)\b
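To see what the "ed"/"ing" plus "by" part of the search catches and misses, here is a small sketch in Python. It uses only that part of the pattern (with balanced parentheses), and the function name `participle_by_hits` is hypothetical. Scrivener's own regex engine may differ from Python's in small ways, so treat this as a way to experiment, not as a drop-in copy.

```python
import re

# A word ending in "ed" or "ing" immediately followed by the word "by".
PARTICIPLE_BY = re.compile(r"\b(?:[a-z]+ed by|[a-z]+ing by)\b", re.IGNORECASE)

def participle_by_hits(text):
    """Return each 'participle by' phrase found in text."""
    return [m.group(0) for m in PARTICIPLE_BY.finditer(text)]

# A true passive: caught.
print(participle_by_hits("Frank was gruesomely murdered by a horde of zombies."))
# A passive with a word between the verb and "by": missed.
print(participle_by_hits("Frank was killed quickly by zombies."))
# An active sentence: flagged anyway (a false positive).
print(participle_by_hits("He stood waiting by the door."))
```

The second and third sentences show both failure modes: anything that slips in between the verb and "by" hides a real passive, while an innocent "waiting by the door" gets flagged.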
The next phase will be looking for words ending in "ed" and "ing," followed by a phrase, followed by the word "by." As soon as I can devise a regex that will look for that, that is. The trick is constructing a regex that will search for precisely that . . . regex is like a programming language all its own, and getting it to search for exactly what you want can be tricky. The problem is that you need to match the "ed" or "ing" word that is closest to the word "by," and the word "by" that is closest to that "ed" or "ing" word.
The thing about passive voice is that it's sneaky. It creeps into your writing a little at a time, and you really have to watch for it. Because it's insidious. Before you know it, you'll be writing "Frank was killed by zombies" all over the place. Buffy will be drained of life by the evil vampire before you even know it, and all the Jedi will have been brutally murdered by Darth Vader before you can blink. Think it can't happen to you? Well, it can, buster. So watch your writing for the passive voice. Watch it carefully. Because when you're not looking, those passive voice constructions will awaken in the middle of the night, crawl out of your manuscript, breed like rabbits, and then crawl back into it and infest it with creeping death. It happened to me, and it can happen to you. So BE VIGILANT! Watch for passive voice!
UPDATE: I found the regex I was looking for. Apparently, there aren't that many ways to search for what I wanted to search for (i.e., an "ed" or "ing" word followed by a phrase, followed by the word "by") without getting a lot of false positives. However, if you're willing to accept a lot of false positives and still turn up a few good ones, you can use the following:
As you can see, yeah, false positives will result from this. However, it will find all sorts of passive voice sentences for you, so that's good. The problem, of course, is that a lot of those false positives will contain true positives as well, but you might easily dismiss them because the match as a whole is too large and encompassing (say, an entire paragraph instead of just one sentence) to read as a passive construction. I haven't figured out a way to compensate for this yet, but I'm working on it.
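The swallowed-paragraph problem is easy to reproduce. The sketch below is in Python, and both patterns are guesses at what a loose "participle, then anything, then by" search might look like; neither is the regex from the update above. `LOOSE` lets the gap run across sentences, while `SAME_SENTENCE` is one possible way to compensate, assuming sentences end in ".", "!" or "?".

```python
import re

# Loose: a participle, then anything at all, then "by". DOTALL lets "."
# cross line breaks too, so one match can span a whole paragraph.
LOOSE = re.compile(r"\b[a-z]+(?:ed|ing)\b.*\bby\b", re.IGNORECASE | re.DOTALL)

# Tighter: the gap may not contain ., ! or ?, so a match cannot run from
# one sentence into the next. The lazy *? stops at the first "by" it finds.
SAME_SENTENCE = re.compile(r"\b[a-z]+(?:ed|ing)\b[^.!?]*?\bby\b", re.IGNORECASE)

text = (
    "Frank walked home. The door was opened quietly by a zombie. "
    "He ran. Later he sat by the fire."
)

loose_hits = [m.group(0) for m in LOOSE.finditer(text)]
tight_hits = [m.group(0) for m in SAME_SENTENCE.finditer(text)]

print(loose_hits)  # one giant match running from "walked" to the last "by"
print(tight_hits)  # just the passive phrase inside the second sentence
```

The loose search buries "opened quietly by" inside one four-sentence match. The tighter one isolates it, though it will still stumble on abbreviations like "Mr." and on dialogue punctuation.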
UPDATE 2: Well, I've found a better one! This one is pretty good, and works in almost every instance. It uses more strictly defined parameters, limiting the search to fifteen words before and after the verb, and before and after the word "by". You can adjust the number of words to your preference, basically setting it to however long your sentences tend to run. But it works damned near perfectly. Awesomesauce thanks to my friend Ken Persinger who came up with this one for me. I take no credit for its creation; this was all Ken's doing!
As you can see, it's a bit more complex than the others. Just change that "15" to however many words you want to include before and after the word "by," and after your verb phrase.
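For anyone who wants to play with the word-limit idea outside Scrivener, here is a rough Python sketch. It is emphatically not Ken's regex. It only caps the number of words allowed between the "ed"/"ing" word and "by," and the helper name `build_pattern` and its `max_words` parameter are invented for this example.

```python
import re

def build_pattern(max_words=15):
    """Participle, then at most max_words other words, then "by"."""
    # Each (?:\W+\w+) is one intervening word; the lazy {0,N}? prefers
    # the shortest gap that still reaches a "by".
    gap = r"(?:\W+\w+){0,%d}?" % max_words
    return re.compile(r"\b[a-z]+(?:ed|ing)\b" + gap + r"\W+by\b", re.IGNORECASE)

sentence = "The city was destroyed in a single terrible night by the dragon."

wide = build_pattern(15).search(sentence)
narrow = build_pattern(3).search(sentence)

print(wide.group(0) if wide else None)      # five words fit inside a 15-word window
print(narrow.group(0) if narrow else None)  # five words do not fit inside 3
```

Two caveats with this sketch: `\w+` splits contractions such as "wasn't" into two "words," and `\W+` will happily step over a period, so a small `max_words` is what keeps matches from wandering into the next sentence.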