We create a function to get a sentence range around a given location in the document. It naturally accounts for double quotes or parentheses based on whether they are present on both sides of the sentence.
Get any sentence range
A function tackles a specific task which is generally useful to other macros, and finding a sentence range is a general enough task that it makes a good candidate function.
Editing macros should play nice with common document elements, and dialog is … let's use a fancy word, ubiquitous in fiction. Our function should intelligently provide a sentence range while taking the dialog structure into account if it's present. Should a sentence range include the double quotes or just the sentence text? A similar argument applies to the parentheses in work-related documents.
Create the empty function
The empty function is basically the same as most of the others we’ve created except we add an optional object parameter. Open the VBA editor with Alt+F11 in Word for Windows (or Option+F11 on a Mac) and type the following.
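A minimal sketch of the empty function follows. The name GetSentenceRange is an assumption made for illustration, while the TargetRange parameter, the As Range return type, and the temporary Nothing result follow the description in this section.

```vba
' GetSentenceRange is a hypothetical name chosen for this sketch
Function GetSentenceRange(Optional TargetRange As Range = Nothing) As Range
    ' Return the sentence range around the target range or the
    ' current selection if no target range is given

    ' Placeholder result until the sentence range steps are added
    Set GetSentenceRange = Nothing
End Function
```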
All functions require the Function keyword followed by a unique name. Any external data are listed as parameters (basically variables) inside the parentheses. Functions always return a value, so we need to specify its data type at the end. The first line inside the function starts with a single quote (called a "comment character") to begin a descriptive comment about what the function does.
What is the parameter?
We want the sentence range around a given target range, so we specify a parameter name TargetRange and give it a data type As Range.
A common use case is just getting the range around the current cursor position, so we'll make the argument Optional and assume the current position if it isn't given when the function is used.
Optional parameters can specify a fixed default value, which must be a constant. Since a Range is an object, the only valid default value is Nothing.
Nothing is the value assigned to objects not yet assigned to a valid document element. See our introduction to functions and subroutines for more information about optional parameters.
What is the returned result?
We’re returning the sentence range, so the return data type is a Range. We add As Range at the end of the header line.
Nothing is a value VBA assigns to object variables not yet associated with a valid document element (or whatever it represents). The returned result is assigned to the function name. It's temporarily set to Nothing since we do not yet have a sentence range.
What’s the plan?
Roughly speaking, the function should work as follows:
- We begin the function with a given target range around which we want to know the sentence range.
- If a range is not given, the function defaults to the current selection or insertion point position in the document.
- Trim any paragraph marks and spaces from the right side of the range.
- Trim double quotes or parentheses if they occur only on one side of the range.
Even though Word includes any trailing paragraph marks in automatic range extensions by default, we remove them because they shouldn’t be part of a sentence range. Spaces are also trimmed because we need to check for punctuation on either side which is easier to do after removing the spaces.
Assign the initial working range
We need a separate working range to manipulate inside the function. If we used the target range parameter directly, any changes would affect the argument on the outside.
Declare the working variables
We declare the working range as well as two other plain text variables to make the following logic easier to read.
Dim is the VBA keyword to declare a variable. Each variable needs a separate data type, or it will default to a Variant type that can store anything. Each variable is explained when it is used.
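The declarations might look something like the lines below, placed near the top of the function body. The names r and sFirst appear later in this article, but sLast is an assumed name for the last character variable.

```vba
' These lines belong inside the function before the other steps
Dim r As Range                          ' working range
Dim sFirst As String, sLast As String   ' first and last characters (sLast is an assumed name)
```

Both text variables get their own As String. Writing Dim sFirst, sLast As String would quietly make sFirst a Variant.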
What is the initial working range?
The initial working range depends on what was given in the target range. A rough conditional statement looks like:
We need to know whether the target range is valid. It may seem awkward to include the invalid case as the first condition, but this structure is easier to read later.
What is Nothing?
Nope, we're not getting into metaphysics or vacuum energy fluctuations (although, that's cool stuff).
VBA literally assigns a value called Nothing to any object not yet assigned to a valid document element. If a target range argument is not provided when this function is used, the target range parameter will default to the fixed value we provided which is Nothing.
Check for a default target range
Since the target range is an object parameter, we need to use the keyword "Is," rather than an equals = sign, to compare it to Nothing.
This is a True or False (Boolean) condition which we can use in a conditional statement to make a decision. This condition also catches an invalid target range argument if the user provides an unassigned range variable.
Define the default working range
When the target range is Nothing, we define the working range based on the Range property of the Selection.
Set is required in VBA when assigning any object variable. This command assigns the current selection or insertion point in the document to our working range variable. In Word, an insertion point is just an empty Selection spanning no content, meaning its Start and End positions in the document would be the same.
We’re using a simple working range name r, so it’s easier to type.
Assign the given target range
When the target range is valid, we just need to Duplicate it as our working range.
We duplicate the target range argument into a separate variable, so any changes made to it inside the function do not leak outside the function.
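The two cases can be sketched together as one conditional statement. The demo subroutine name is hypothetical, and the TargetRange variable is deliberately left unassigned here, so it is Nothing and the first branch runs.

```vba
Sub DemoInitialWorkingRange()
    ' Hypothetical demo of the initial working range assignment
    Dim TargetRange As Range    ' never assigned, so it Is Nothing
    Dim r As Range

    If TargetRange Is Nothing Then
        ' No target given, so use the selection or insertion point
        Set r = Selection.Range
    Else
        ' Work on an independent copy of the given range
        Set r = TargetRange.Duplicate
    End If

    ' Show the working range positions in the Immediate window
    Debug.Print r.Start, r.End
End Sub
```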
Extend the range over the sentence
With the above initial assignments, the working range r is valid in either case, so we can proceed. We need to extend it over the sentence range at that location.
Collapse the range (optional)
If the target range spanned any text, it could result in some unexpected sentence extensions.
Why?
We now know the working range is positioned at the beginning of the initial target range allowing us to identify the most logical single sentence. If you prefer allowing more than one sentence, just omit this collapse command.
Expand over the sentence range
Now, we Expand the range over the sentence.
Expand requires a unit option, so we assign the sentence constant wdSentence (from a table of Word constants). All option assignments require a colon equals := symbol. The Expand command extends the range in both directions to the respective beginning or ending of the given unit but no farther.
The default behavior may include more than one sentence if the range spans a sentence boundary. This isn't inherently incorrect. The above Collapse command keeps the logic simple by focusing on a single sentence at the beginning of the working range.
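As a small sketch, the collapse and expand steps could be tried on the current selection like this. The subroutine name is hypothetical, and the final Select is only there to make the result visible.

```vba
Sub DemoExpandOverSentence()
    ' Hypothetical demo of the collapse and expand steps
    Dim r As Range
    Set r = Selection.Range

    ' Collapse toward the Start position (the default direction)
    r.Collapse

    ' Extend both directions to the sentence boundaries
    r.Expand Unit:=wdSentence

    r.Select
End Sub
```

Commenting out the Collapse line shows the multi-sentence behavior when the selection spans a sentence boundary.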
Omit paragraph mark (suggested)
By default, Word includes an ending paragraph mark in an automatic sentence range extension (or selection) if it exists after a sentence. In fact, Word will include every trailing paragraph mark as empty paragraphs until it encounters some regular text. This behavior isn’t intuitive (to me), so let’s get rid of them.
The majority of the time, only one paragraph mark will be spanned, but we can remove all paragraph marks with a single command using the MoveEndWhile method.
This command literally moves the End position of the range backward or forward in the document. It requires a set of characters to include or exclude (in any order or combination) depending on the direction. A paragraph mark vbCr is a special character from a table of miscellaneous constants, so we assign that character to the Cset option.
An additional Count option specifies how many characters to check. The default is any number of characters forward (up to about a billion). We need to retract the End position backward, so we instead assign the constant wdBackward to the Count option. The two options must be separated by a comma.
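Here is a hedged sketch of the paragraph mark trimming step applied to a sentence at the current position. The subroutine name is hypothetical.

```vba
Sub DemoTrimParagraphMarks()
    ' Hypothetical demo of removing trailing paragraph marks
    Dim r As Range
    Set r = Selection.Range
    r.Collapse
    r.Expand Unit:=wdSentence

    ' Retract the End position backward over any paragraph marks
    r.MoveEndWhile Cset:=vbCr, Count:=wdBackward

    r.Select
End Sub
```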
Why omit paragraph marks?
A paragraph mark is a paragraph mark. It's in the name. In my opinion, omitting them from a sentence range is more intuitive and consistent with the meaning of a sentence within a paragraph.
Allow dialog
Editing functions and macros should work intuitively, and they shouldn’t make us clean up the text after they finish. It half defeats their purpose.
A typical novel contains thousands of lines of dialog at about half of the total word count. If we're manipulating sentences, the macros should properly interpret dialog whether it is a single sentence or just one of several inside double quotes. We need the current function to perform the grunt work when identifying the appropriate sentence range.
A similar effect occurs with parentheses which are more likely in novel notes or even work-related documents. I generally want to exclude a parenthesis if it occurs only on one side of a sentence. For the remainder of this article, we’ll work with double quotes since the logic extends trivially to parentheses.
Use double quote constants
To make this function easier to read, we’ll use the double quote constants mentioned in a separate article. We'll refer to the left and right double quote characters as LeftDQ and RightDQ, respectively.
If you prefer not to use module-level constants (just constants declared at the top of the macro file outside any function or subroutine), copy and paste the definitions into the function somewhere near the top, or you can type and copy the actual text characters from a typical Word document. The latter approach is the most direct, but the characters are similar to straight double quotes, so be careful while testing the function.
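The definitions from the separate article are not shown here, so the following is only one possible way to provide LeftDQ and RightDQ. It assumes small functions built on ChrW because a VBA Const cannot be assigned the result of a function call. The character codes 8220 and 8221 are the left and right curly double quotes.

```vba
' Assumed stand-ins for the definitions in the separate article
Function LeftDQ() As String
    ' Left curly double quote character
    LeftDQ = ChrW(8220)
End Function

Function RightDQ() As String
    ' Right curly double quote character
    RightDQ = ChrW(8221)
End Function
```

Either name can then be used in a comparison just like a constant, such as sFirst = LeftDQ.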
Example text selections
What do we want when determining a sentence range? For concreteness, we'll assume a main macro is using the function to find the sentence range and then selecting the text.
No dialog in the text
An example selection with regular novel text but no dialog is:
The function should just span the whole individual sentence range like normal, so the steps to handle the dialog or parentheses shouldn’t mess anything up.
Single sentence of dialog
If the entire sentence is dialog, our function should span all of it including the double quotes on both sides.
Multiple sentences of dialog
If the text consists of multiple sentences of dialog, then the function should just return the current sentence.
Regardless of which sentence is selected, it would have at most one double quote bordering it. The function should span only the current sentence range and exclude the double quote.
See where this gets a little tricky? Also, see the gotchas below for an exception where it gets even trickier.
Trim any ending spaces
Word automatically extends ranges over any trailing spaces. Identifying the double quotes or parentheses will be easier if we trim them from the end of the range. We again use the MoveEndWhile method.
This time, we're assigning a space " " to the character set option.
Wait a second …
This uses the same MoveEndWhile command as before, even moving backward, except it removes spaces now.
Combine with the paragraph mark trimming step
We can just combine this command with the earlier one by including a space along with the paragraph mark in the character search string.
We add the two characters together to create a revised Cset search string. This is called concatenation which basically just means jam the two strings together to make a new, longer string.
Trim any starting spaces (optional)
Most automatic range extensions (or selections) do not include any preceding spaces, but it can happen for the first sentence of a paragraph. Just in case, we'll trim the beginning spaces using the MoveStartWhile method.
MoveStartWhile works just like MoveEndWhile mentioned above except it moves just the Start position of the range. I find the default Word behavior for preceding spaces unintuitive, so if these spaces are present, we will not restore them to the working range.
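The combined trimming on both ends might be sketched as follows. The subroutine name is hypothetical, and the range setup simply repeats the earlier sentence expansion so the example runs by itself.

```vba
Sub DemoTrimBothEnds()
    ' Hypothetical demo of trimming spaces and paragraph marks
    Dim r As Range
    Set r = Selection.Range
    r.Collapse
    r.Expand Unit:=wdSentence

    ' Trim trailing spaces and paragraph marks in any order
    r.MoveEndWhile Cset:=" " & vbCr, Count:=wdBackward

    ' Trim any leading spaces (the default movement is forward)
    r.MoveStartWhile Cset:=" "

    r.Select
End Sub
```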
Trim double quotes logic
Taking a step closer toward the VBA steps, a rough conditional statement to detect a double quote only on the left side would look like:
Unfortunately, the left double quote test must be performed by itself because it will be trimmed from the left side of the range. The right double quote test is separate.
For the left double quote test, we need the first character of the range, and for the right double quote test, we need the last character.
Check whether to trim a left double quote
We'll start with the left double quote.
Get the first and last characters
We need to get the characters at each end of the range using the Characters collection.
The result of the First property is the range of the first character, so we need to reference its Text property for the actual text. We store this in a variable sFirst using the equals = sign for later use.
Similarly, we get the Last character of the range and store it in a conveniently named variable.
Presumably, the range is not empty, but the logic below still handles everything well.
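A short sketch of storing both characters is below. The subroutine name and the sLast variable name are assumptions, and the example reads from whatever is currently selected.

```vba
Sub DemoFirstLastCharacters()
    ' Hypothetical demo of reading the end characters of a range
    Dim r As Range
    Dim sFirst As String, sLast As String
    Set r = Selection.Range

    ' First and Last are character ranges, so we need their Text
    sFirst = r.Characters.First.Text
    sLast = r.Characters.Last.Text

    Debug.Print "First: " & sFirst & "  Last: " & sLast
End Sub
```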
Left double quote condition
We check whether the first character is a left double quote.
This statement will be interpreted as a True-False (Boolean) value when it's used in an If statement to make a decision. The overlapping notation between assignments and conditions in VBA is unfortunate, but we’re stuck with it.
Missing right double quote condition
The second condition needs to detect a missing right double quote.
The not equals <> symbol literally reads as less than or greater than which is … not equal to something. For text, it basically asks if the two strings are not exactly the same.
Compound left double quote condition
If the range only contains a left double quote, both of the above conditions must be True. We use “And” between them to create a compound condition.
Trim the left double quote from the range
The command to remove the double quote from the left side of the range is the MoveStart method:
The MoveStart command moves the Start position of the range. The default movement is forward by one. That is exactly what we need at this point, so we can omit both the Unit and the Count options. The command does not change the End position unless Start exceeds the End position.
Since the If statement is relatively simple, we can condense it onto one line.
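The one-line version could look like the sketch below. To keep the example runnable by itself, the quote characters are stored in local variables rather than relying on the LeftDQ and RightDQ definitions from the separate article. The subroutine name is hypothetical.

```vba
Sub DemoTrimLeftQuote()
    ' Hypothetical demo of trimming an unmatched left double quote
    Dim r As Range
    Dim sFirst As String, sLast As String
    Dim LeftDQ As String, RightDQ As String

    ' Local stand-ins for the quote definitions
    LeftDQ = ChrW(8220)
    RightDQ = ChrW(8221)

    Set r = Selection.Range
    sFirst = r.Characters.First.Text
    sLast = r.Characters.Last.Text

    ' Move Start forward by one character if only a left quote exists
    If sFirst = LeftDQ And sLast <> RightDQ Then r.MoveStart

    r.Select
End Sub
```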
Check whether to trim a right double quote
Similarly, we may need to remove the right double quote. We need to make sure the left double quote doesn’t exist at the start, but a right double quote does exist at the end. If so, we trim the right double quote from the range.
Trim the right double quote from the range
We trim the right double quote using the MoveEnd method.
The MoveEnd command moves the End position of the range without changing the Start position (unless End precedes Start). The negative Count value moves backward in the document by 1 Unit. The default unit is a character, so we can omit it.
The If statement to remove a right double quote, if appropriate, is:
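A sketch of that line is below. It is a fragment meant to sit inside the function, and it assumes the working range r, the stored characters sFirst and sLast (an assumed name), and the LeftDQ and RightDQ definitions are all available at that point.

```vba
' Fragment inside the function after sFirst and sLast are stored
' Move End backward by one character if only a right quote exists
If sFirst <> LeftDQ And sLast = RightDQ Then r.MoveEnd Count:=-1
```

Since sFirst and sLast were stored before any trimming, at most one of the two quote conditions can be True for a given range.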
Allowing parentheses
We can do the same thing for parentheses simply by swapping out the respective left and right characters in the above conditions.
An open parenthesis "(" and a close parenthesis ")" are plain text characters, so we just include them in straight double quotes for the text comparisons. Handling parentheses is an extra feature that is convenient for some documents.
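The parentheses versions might be sketched as the pair of lines below. Like the double quote conditions, this is a fragment for inside the function that assumes r, sFirst, and sLast (an assumed name) already exist.

```vba
' Fragment inside the function after sFirst and sLast are stored
' Trim an open parenthesis with no matching close parenthesis
If sFirst = "(" And sLast <> ")" Then r.MoveStart

' Trim a close parenthesis with no matching open parenthesis
If sFirst <> "(" And sLast = ")" Then r.MoveEnd Count:=-1
```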
We could combine the double quotes and parentheses character comparisons where all the conditions are evaluated together, but the compound conditions are cumbersome. It’s not really necessary either.
Extend over any ending spaces
Since Word normally includes spaces at the end of an automatic selection or range extension, our range function should mimic that behavior. We again use the MoveEndWhile method, but we need to extend over all spaces at the end of the working range.
We assigned a space " " character to the Cset option, but the default movement is forward, so we can omit the Count option. Writers will probably expect this behavior in Word even if they’re not aware of it.
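As a quick sketch, the extension step can be tried on any selection. The subroutine name is hypothetical.

```vba
Sub DemoExtendOverSpaces()
    ' Hypothetical demo of extending over trailing spaces
    Dim r As Range
    Set r = Selection.Range

    ' Extend the End position forward over any spaces
    r.MoveEndWhile Cset:=" "

    r.Select
End Sub
```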
Gotchas
We should get in the habit of considering potential problems.
What if the range ends up empty?
During the function, we remove several possible characters from the range including spaces, paragraph marks, a double quote, or a parenthesis. If the range ends up being empty by the time the function finishes, is that a problem?
It's not an error in any way because the function just ends as normal and returns the empty range. In fact, the empty range result would indicate something did not work as expected, so it's a valid result. In most use cases, it's probably a logical error where the writer ran the macro on some strange chunk of text, but this function isn’t responsible for catching and correcting anything like that other than not crashing or returning gibberish.
No problem exists here, but that doesn’t mean we shouldn’t consider it when creating the macro. Sometimes such cases can sneak around and bite us.
What about dialog tags?
If the dialog happens to include dialog tags, the function won't work as expected based on the purported intuitive dialog behavior. For example, based on the above range extension logic, the following dialog would look like:
Dialog tags are common in novels, but unfortunately, the correction is beyond the scope of this article. An upcoming member article will handle the general case, but it is a bit more technical.
What if the target range spans more than one sentence?
If the initial range spans one or more sentence boundaries, the Expand method, when used by itself, would extend over all of them. We sidestepped any issue above using the Collapse method, so no gotcha exists in this function.
Personally, I like the bump in generality from not including the Collapse method, but not everyone realizes mint chocolate chip is the best ice cream flavor either. Ho, hum.
Final Function
Putting it all together, the function to get the current sentence range while accounting for dialog or parentheses is:
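The listing below is a sketch assembled from the steps described in this article rather than a copy of the original listing. The function name GetSentenceRange and the variable sLast are assumed names, and LeftDQ and RightDQ are assumed to be defined elsewhere in the module as discussed above.

```vba
Function GetSentenceRange(Optional TargetRange As Range = Nothing) As Range
    ' Return the sentence range around the target range or the
    ' current selection while excluding an unmatched double quote
    ' or parenthesis (assumes LeftDQ and RightDQ exist in the module)

    Dim r As Range
    Dim sFirst As String, sLast As String

    ' Assign the initial working range
    If TargetRange Is Nothing Then
        Set r = Selection.Range
    Else
        Set r = TargetRange.Duplicate
    End If

    ' Extend over the single sentence at the start of the range
    r.Collapse
    r.Expand Unit:=wdSentence

    ' Trim trailing spaces and paragraph marks and any leading spaces
    r.MoveEndWhile Cset:=" " & vbCr, Count:=wdBackward
    r.MoveStartWhile Cset:=" "

    ' Store the first and last characters for the comparisons
    sFirst = r.Characters.First.Text
    sLast = r.Characters.Last.Text

    ' Trim a double quote found on only one side
    If sFirst = LeftDQ And sLast <> RightDQ Then r.MoveStart
    If sFirst <> LeftDQ And sLast = RightDQ Then r.MoveEnd Count:=-1

    ' Trim a parenthesis found on only one side
    If sFirst = "(" And sLast <> ")" Then r.MoveStart
    If sFirst <> "(" And sLast = ")" Then r.MoveEnd Count:=-1

    ' Extend over any trailing spaces like Word normally does
    r.MoveEndWhile Cset:=" "

    Set GetSentenceRange = r
End Function
```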
This function requires the LeftDQ and RightDQ definitions given in a separate article. If you prefer not to use them, a quick and dirty solution is to replace every LeftDQ with a literal left double quote character "“" in straight double quotes. Every right double quote RightDQ should also be replaced with its plain text character "”". Yeah, it's a little difficult to read which is why the function uses the named quote characters.
What are the uses?
A simpler version of this function was used with the aforementioned move sentence macros. Where else might it be used?
We’ve previously created a delete sentence macro, and we could simplify and generalize that macro at the same time. We could further create macros to select, cut, copy, and italicize (for internal dialog) sentences. These additional macros would be nearly trivial to create using this workhorse function.
Examples of using the function
If we're creating a bigger macro, we can use this function as follows. We'll assume our macro has a valid range variable named SomeRange.
This use of the function omits a target range, so the function will assume the current editing position in the document.
We could also specify a range directly in the parentheses.
The parentheses are required because we are assigning the result to a variable.
The details are a little trickier than it seems here. The argument must be a Range variable because we declared the TargetRange parameter As Range.
Since a paragraph range is already a VBA Range, we could skip the intermediate paragraph range assignment.
It looks longer, but it's not really hard to read.
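The three variations might look like the sketch below. It assumes the sentence function is named GetSentenceRange and is available in the same project. The subroutine name and the ParagraphRange variable are hypothetical, and the first paragraph of the selection is used only as a convenient example range.

```vba
Sub DemoGetSentenceRange()
    ' Hypothetical demo of calling the sentence range function
    Dim SomeRange As Range
    Dim ParagraphRange As Range

    ' No argument, so the function uses the current position
    Set SomeRange = GetSentenceRange()

    ' Pass a Range variable as the target
    Set ParagraphRange = Selection.Paragraphs.First.Range
    Set SomeRange = GetSentenceRange(ParagraphRange)

    ' Skip the intermediate paragraph range variable
    Set SomeRange = GetSentenceRange(Selection.Paragraphs.First.Range)

    ' Select the result to see what was found
    SomeRange.Select
End Sub
```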
Improvements
What could we do better?
Messy character trimming conditions
Given the similarity in the quote and parentheses detection steps, we could extract these steps into a separate function. This would condense and even generalize them. It’s not a bad idea, and I would probably do so with my own macros. Other uses of the function are not as likely (perhaps including square brackets or braces?), so this would be mostly for aesthetics. I omitted it from this function.
Account for dialog tags
In my opinion, omitting dialog tags (e.g., she said, he asked, etc.) is a glaring omission in this function. It purports to account for dialog but then omits a common dialog variation. I use dialog tags sparingly (preferring action tags as needed), so it works for most of my use cases.
If we think about it briefly, what would we need?
We need steps that would intelligently handle all the variations for a double quote inside or at the end of the sentence. We need a somewhat general way to handle it. It's definitely worth the effort, but the logic is trickier than the above function, so it is relegated to a slightly more technical member article.
Catch sentence punctuation problems
Word's default sentence parsing algorithm will choke on some common abbreviations and mistake them as the end of a sentence.
- Professional titles like Dr. Sally Doe will split a sentence.
- Latin abbreviations can be problematic like etc. or e.g.
- Many common abbreviations are also suspect such as Vol. or Inc.
- Geographical or address related abbreviations like Ave. or D.C. will do the same.
- Time or measurement based abbreviations like a.m. or ft. still include a pesky period.
- Names with middle initials will also cause trouble, but these are more difficult to catch, in general.
The list is probably much longer than you first thought. We could include all the steps to catch and correct such inconsistencies in a sentence range function making the main macro easier to understand.
We would need some logic to catch and correct the sentence range and then handle each case, so the solution will stretch out more than you might think. The cleanest way to implement it would be with another function, but that's a lesson for another day and probably one for intermediate to advanced users. Some of these may be worth the effort in certain work-related documents if a particular job deals with them frequently. Correcting for all of them every time would be cumbersome.