PCCON Paragraph Formatter   (Version 2.2.3)

Last Updated:   5/3/25  14:08                 Jeffrey Knauth

PCCON is a KEDIT macro that lets you easily format text paragraphs. To be formatted, a paragraph must be delimited by blank lines or by the top or bottom of the file or range. Formatting consists of flowing the text so that the proper number of spaces separates words, taking punctuation into account. It also ensures that each line is the correct length and is appropriately indented from column 1.

PCCON is similar to the KEDIT FLOW command, but has some significant differences and ease-of-use improvements. These are described in PCCON vs the KEDIT FLOW Command at the end of this document.

PCCON Variables

Three variables control how PCCON formats a paragraph. See PCCON Syntax for the invocation syntax.

"margin" specifies the rightmost column a line can use. It corresponds to the margins.2 value (right margin) in the KEDIT SET MARGINS command. Of course if a line consists of a single large word which cannot fit within the right margin, it is allowed to flow over that margin. This may not look pretty, but no text is lost.

"hanging indent" specifies how far the first paragraph line is offset from the second and following lines of the paragraph. A hanging indent of 0 means all the lines of the paragraph are aligned in the same column on the left. A positive hanging indent means the first line starts "hanging indent" columns to the left of the second and subsequent lines. A leading "+" in the hanging indent value is ignored; i.e., "+6" is treated as "6". The default hanging indent value is 3, which covers the very common case of a paragraph starting with a three-character overhang, e.g., "1) " or "a) " as the first word and following space. A negative hanging indent means the first line starts "hanging indent" columns to the right of the subsequent lines. Note that this use of positive and negative is exactly the reverse of the KEDIT SET MARGINS convention.

"inset" specifies how far the leftmost text of the paragraph (in any line) is shifted right from column 1. An inset of 0 means the leftmost text is in column 1. The default value is 0. Contrast this with the way KEDIT SET MARGINS specifies the left margin of a paragraph. PCCON uses the leftmost text, whether it is in the first or second line of the paragraph, to calculate inset. SET MARGINS uses the second line (and subsequent lines) of the paragraph, i.e., the bulk of the paragraph, to specify the left margin and doesn't care if the first line is a hanging indent line that may be further left than the rest of the paragraph.

This figure shows how these PCCON and KEDIT variables apply to the following two paragraphs with hanging indents, one positive and one negative.

[Figure: Formatting diagram]
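
The relationship between inset, hanging indent, and the starting columns of a paragraph's lines can be sketched in a few lines of Python. This is only an illustration of the definitions above, not code taken from the PCCON macro; the function name start_columns is made up for this example.

```python
def start_columns(inset, hanging_indent):
    """Return (first_line_column, later_lines_column), both 1-based.

    Illustrative only: "inset" is measured to the leftmost text of the
    paragraph, whichever line that text happens to be on.
    """
    leftmost = 1 + inset
    if hanging_indent >= 0:
        # Positive: the first line hangs to the left of the later lines.
        return leftmost, leftmost + hanging_indent
    # Negative: the first line starts to the right of the later lines.
    return leftmost - hanging_indent, leftmost


print(start_columns(0, 3))    # (1, 4)   the PCCON defaults
print(start_columns(10, 5))   # (11, 16)
print(start_columns(4, -2))   # (7, 5)
```

With the defaults (inset 0, hanging indent 3), the first line starts in column 1 and the remaining lines start in column 4, which lines them up under the text that follows an overhang such as "1) ".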

Paragraph Types

By adjusting the inset and hanging indent values, PCCON makes it very easy to format paragraphs of various types, such as ordinary paragraphs, list item paragraphs, continuation paragraphs, and list paragraphs.

The following figure illustrates these concepts and the PCCON invocations used to create a succession of paragraphs. See PCCON Syntax for the invocation syntax. Note how USESHAPE is used to analyze the shape of a paragraph and to save the learned INSET and HANGIND shape values for use as defaults in subsequent PCCON invocations.

[Figure: Paragraph formatting options]

PCCON Invocation

You can assign PCCON invocations with various parameter/option settings to whatever key combinations you want. In the table below are some examples you might put in your WINPROF.KEX. The keys shown in the table are the ones I use, but any others could be used. See PCCON Syntax for the invocation syntax.

"PCCON (USESHAPE" is especially handy if paragraph shapes change frequently in the document you are creating. To record a shape soon to be used on other paragraphs, first use "PCCON (USESHAPE" on an existing paragraph with the desired shape, per its first two lines. The current offsets from column 1 of those two lines are the only offsets that matter; the rest of that paragraph's lines can be offset randomly. The "PCCON (USESHAPE" invocation neatly formats this paragraph per the shape of its first two lines, but it also records this shape for future use of PCCON on other paragraphs. See Using "PCCON (USESHAPE" for an example. In fact you can do this shape determination on an already fully formatted paragraph, just to learn and record its shape. The paragraph will be reformatted, but will not change visibly since it had already been formatted.

Command line vs key execution:  If you don't want to use keys for all these ways to invoke PCCON, you can define one key as in the S-C-F10 example below. Then type the desired macro invocation on the command line, put the cursor in the target paragraph, and press the key to which you assigned 'cmdline.3(); "SOS SAVE QCMND RESTORE"'. This key will then act as if you had defined it to have the command line text assigned. (Thanks to Kent Downs for this technique.)

An example line in WINPROF.KEX might be something like this to have the C-F3 key combination invoke the PCCON macro with the specified parameters:    'DEFINE C-F3  "MACRO PCCON (INSET CP HANGIND 0"' (the triple-quote ending mark shown here is actually a double quote followed by a single quote).

Key      Macro Invocation                              Result of Formatting
-------  --------------------------------------------  ------------------------------------------------

These five PCCON invocations are used most frequently.

C-F1     PCCON (INSET 0 HANGIND 0                      Ordinary paragraph, left edge in column 1
C-F2     PCCON                                         List item paragraph (normally has a hanging indent)
C-F3     PCCON (INSET CP HANGIND 0                     Continuation paragraph (aligned with indented text)
C-F5     PCCON (USESHAPE                               Save paragraph shape (lines 1 and 2), then format
C-F4     PCCON (CHGMARGIN                              Change the right margin

These three macros format the lines FROM THE CURSOR LINE DOWN through the rest of the paragraph.

S-C-F2   PCCON (CURSOR                                 List item paragraph using the previously saved shape
S-C-F3   PCCON (CURSOR INSET CP HANGIND 0              Continuation paragraph using previously saved shape
S-C-F5   PCCON (CURSOR USESHAPE                        Save shape (cursor line and next line), then format

The following PCCON invocations are used less frequently.

C-F7     PCCON (HANGIND 0                              List paragraph (normally aligned with hanging indent)
S-C-F1   PCCON 1 (INSET 0 HANGIND 0                    Split into single word lines
C-F6     PCCON (SENTENCE                               Split on sentence ends
S-C-F6   PCCON (PUNCMARK                               Split at all punctuation marks
S-C-F8   PCCON (DISPVALS                               Display PCCON values

These two entries show some related useful actions you can specify with a key DEFINE.

S-C-F4   "SET MARGINS" margins.1() "72"; "COLMARK 73"  Reset right margin to 72; column marker to 73
S-C-F10  cmdline.3(); "SOS SAVE QCMND RESTORE"         See Command line vs key execution above

PCCON Syntax

Format:  PCCON   [ margin | ? ]   [ ( options ]

margin specifies the right margin used to format the paragraph. The default for this parameter is the current KEDIT margins.2 value, which can be changed by the KEDIT SET MARGINS command or by using "PCCON (CHGMARGIN".

 ?          displays help information.

Options:

HANGIND hanging-indent-value
    Format paragraph using the specified hanging indent. A positive value produces a left hanging indent; a negative value produces a right hanging indent. The default is 3.

INSET inset-value | CP
    Format paragraph using the specified inset, where "CP" means "inset = saved inset + saved hanging indent". The default is 0.

USESHAPE
    Use and save the shape of the cursor-designated paragraph, where shape means the inset and hanging indent created by the first two lines of the paragraph.

CURSOR
    Format the paragraph starting with the cursor-designated line, treating that line as if it were line 1 of the paragraph and the following line as line two. Lines prior to the cursor-designated line are not changed. This option is assumed if the cursor is in the prefix area.

SENTENCE
    Split the paragraph on sentence boundaries.

PUNCMARK
    Split the paragraph on punctuation boundaries.

DISPVALS
    Display the current saved paragraph shape values, saved by "PCCON (USESHAPE".

KEDITVALS
    Derive and save inset and hanging indent values from the current KEDIT margins.1 and margins.3 values.

CHGMARGIN
    Change the right margin (KEDIT margins.2 value). The cursor position designates the desired right margin.
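
To make the syntax concrete, here is a rough Python sketch of how an invocation string could be broken into its margin parameter and options. It is not how the KEDIT macro itself parses its arguments; parse_invocation and the dictionary layout are invented for this illustration, and treating option names as case-insensitive is an assumption.

```python
FLAG_OPTIONS = {"USESHAPE", "CURSOR", "SENTENCE", "PUNCMARK",
                "DISPVALS", "KEDITVALS", "CHGMARGIN"}
VALUE_OPTIONS = {"HANGIND", "INSET"}


def parse_invocation(text):
    """Split 'PCCON [margin | ?] [( options' into its parts (sketch only)."""
    head, _, tail = text.partition("(")
    words = head.split()
    if not words or words[0].upper() != "PCCON" or len(words) > 2:
        raise ValueError("expected: PCCON [margin | ?] [( options]")

    result = {"margin": None, "help": False, "options": {}}
    if len(words) == 2:
        if words[1] == "?":
            result["help"] = True
        else:
            result["margin"] = int(words[1])

    tokens = tail.split()
    i = 0
    while i < len(tokens):
        name = tokens[i].upper()
        if name in FLAG_OPTIONS:
            result["options"][name] = True
            i += 1
        elif name in VALUE_OPTIONS:
            if i + 1 >= len(tokens):
                raise ValueError(name + " needs a value")
            value = tokens[i + 1].upper()
            if name == "INSET" and value == "CP":
                result["options"][name] = "CP"
            else:
                # int() accepts a leading "+", so "+6" becomes 6.
                result["options"][name] = int(value)
            i += 2
        else:
            raise ValueError("unknown option: " + tokens[i])
    return result


print(parse_invocation("PCCON (CURSOR INSET CP HANGIND 0"))
```

Any parameter that is left out stays at None or is simply absent from the options, which is where the saved defaults described above would apply.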

Example:  PCCON 56 (INSET 10  HANGIND 5

This would format the cursor-designated paragraph with a right margin of 56. The first line would start in column 11 (1 + the inset of 10). Subsequent lines would be indented an additional 5 spaces. These values apply only for this PCCON invocation. To make these the default values, you can first use "PCCON (CHGMARGIN" (or KEDIT SET MARGINS) to set the default right margin. Then use "PCCON (USESHAPE" on a paragraph whose first two lines have the desired inset and hanging indent.
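
The effect of the three values on the flowed text can be approximated with a small greedy word-wrap in Python. This is a simplified sketch, not the macro's algorithm: the function name flow is hypothetical, words are always joined with a single space (no sentence-end double spacing), and the overhang's internal spacing is not preserved.

```python
def flow(words, margin, inset=0, hanging_indent=3):
    """Greedy flow of words into lines (simplified sketch).

    margin is the rightmost column a line may use; a single word that
    is too long for the margin is allowed to run past it.
    """
    first_indent = inset + max(-hanging_indent, 0)
    later_indent = inset + max(hanging_indent, 0)

    lines = []
    current = ""
    indent = first_indent
    for word in words:
        if not current:
            current = " " * indent + word
        elif len(current) + 1 + len(word) <= margin:
            current += " " + word
        else:
            lines.append(current)
            indent = later_indent
            current = " " * indent + word
    if current:
        lines.append(current)
    return lines


for line in flow("1) alpha beta gamma delta epsilon".split(),
                 margin=20, inset=2, hanging_indent=3):
    print(line)
# Prints:
#   1) alpha beta
#      gamma delta
#      epsilon
```

Calling flow(words, 56, inset=10, hanging_indent=5) gives the layout of the example above: the first line starts in column 11 and the later lines in column 16.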

Miscellaneous Considerations

  1. You can format a paragraph with PCCON as many times as you like, in any way you like, without losing any text except extra spaces. If you make a mistake, just format the paragraph again the way you want it.
  2. Before using PCCON to format a paragraph, be sure to delimit the paragraph with blank lines. PCCON will keep formatting until it finds a blank line or the top or bottom of the file/range. (When the CURSOR option is used, the line above the cursor-designated line is considered to be a blank line.) If you forget to do this paragraph delimiting and accidentally format more than you intended, you can use the KEDIT UNDO command to recover to the pre-format state. Then insert temporary blank lines to delimit the paragraph; next, format the delimited paragraph; finally, remove any extraneous blank lines.
  3. You can add more abbreviations to the ab1, ab2, ab3, etc., variables in the PCCON macro. Abbreviations found in these variables will not be treated as the end of a sentence and thus will not have a second space appended. The abbreviations included are for the most part honorifics that occur before a proper (capitalized) name. The intent is to have "Mr. Smith" be considered as two ordinary words in a sentence and not consider the "." in "Mr." as a sentence ending full stop with the capitalized "Smith" as the start of the next sentence. Only abbreviations which might often be followed by a capitalized word cause this problem and are included in these variables. Other abbreviations, which are not usually followed by capitalized words, don't have to be in the variables.
  4. Sometimes when you are editing a paragraph, you want to move a sentence. This usually requires making several splits in the text to isolate the sentence to be moved and to provide a place for it to move to. After the move, you flow the paragraph back together. PCCON provides the SENTENCE option to automatically split a paragraph into its component sentences. The sentences can then be easily moved around before PCCON is invoked to flow the paragraph back together. The PUNCMARK option splits the paragraph into even smaller, punctuation-bounded pieces, if the SENTENCE option separation isn't granular enough.
  5. Sometimes you want to sort the words of a paragraph consisting of a number of adjacent lines, each of which contains one or more single-word items. Ordinarily you would have to split each line into its component words, one per line, and then invoke KEDIT SORT. If instead you use "PCCON 1 (INSET 0 HANGIND 0" on the paragraph, that will split the paragraph into a number of lines with a single word per line. (The margin value of 1 forces the paragraph formatting to start a new line for each word.) You can then invoke KEDIT SORT. Finally (if desired) you can invoke "PCCON (INSET 0 HANGIND 0" to flow the lines back into a compact paragraph using the default right margin.
  6. PCCON uses the X'FF' character for some special processing. PCCON sometimes substitutes X'FF' for a space character (X'20') to prevent that space's position from being deleted during flow processing. After the flow processing completes, all the X'FF' characters in the paragraph are replaced by normal X'20' space characters. If a X'FF' character, an "umlauted y" (ÿ), had previously been in the paragraph as normal text, it will get translated to a space and/or be deleted by PCCON processing of that paragraph. If this is a problem, the PCCON rqd_blnk variable can be set to some other character, one that never occurs in your text and therefore is safe for PCCON to add and delete.
  7. Leading spaces are not part of a positive hanging indent, i.e., one that hangs off to the left. Those leading spaces are part of the paragraph's inset, not part of the hanging indent. However a positive hanging indent can have trailing spaces; those are part of the hanging indent and are preserved by PCCON.
  8. If the overhang of a positive hanging indent paragraph contains spaces, even multiple spaces between "words", that exact spacing is preserved. This lets you create list paragraphs with complicated overhangs, e.g., "12/25/99  14:35:01".
  9. If you accidentally include text in the right part of a long positive hanging indent in positions which you really wanted to be blank, just move the text to the right (or split it to the next line) to get it outside the overhang area. Then format the paragraph again.
  10. If a paragraph contains any excluded (shadow) lines, PCCON will refuse to format that paragraph. This prevents results that are usually undesired because the user had forgotten that some lines were excluded.
  11. The PCCONALL macro formats all the paragraphs of the file being edited. It uses the current values of inset, hanging indent, and margin. The current margin value can be overridden with the margin parameter. To do this formatting, PCCONALL invokes the PCCON macro for each paragraph in the file. Because excluded lines are not permitted for PCCON, the KEDIT ALL command is issued first to display all file lines, including those previously excluded. PCCONZ is similar, but it uses 0 values for inset and hanging indent.
  12. In Version 2.2, I added a warning message when PCCON is used to format a KEDIT Untitled.# file: "YOU SHOULD NAME THIS FILE BEFORE WORKING ON IT!" The intent is to warn the user not to do much work on such a file since it would be easy to lose that work.
  13. In version 2.2, I added automatic breaks to start a new line if PCCON encounters a '<a', '(<a', '<span', or '(<span' -- lower case required. This neatens some HTML coding.
  14. In version 2.2.1, if the first and only word of a line is just triple-double quotes ("""), the line is treated as an "as is" line (see section below). This allows easy formatting of large comment blocks (documentation strings) in Python.
  15. In version 2.2.2, if a line starts with 'variable-name = """' (without the initial and ending single quotes) and has no other text, then the line is treated as an "as is" line (see section below). This allows a Python programmer to assign variable names to individual documentation strings so specific strings can be accessed later, e.g., for a help display.
  16. Starting in version 2.2.3, if the cursor is in the prefix area when PCCON is invoked, the CURSOR option is assumed.
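
The placeholder technique from item 6 can be illustrated in Python. This is a sketch of the general idea only: the names REQUIRED_BLANK, protect, and flow_line are invented here, and using the placeholder to protect the spaces of an overhang (item 8) is an assumption about where the macro applies it.

```python
REQUIRED_BLANK = "\xff"   # plays the role of the macro's rqd_blnk character


def protect(text):
    """Replace ordinary spaces with the placeholder so they survive the flow."""
    return text.replace(" ", REQUIRED_BLANK)


def flow_line(overhang, body):
    """Collapse extra spaces in body while keeping the overhang's spacing."""
    protected = protect(overhang) + body
    # The "flow" step: split on ordinary spaces only, rejoin with one space.
    words = [w for w in protected.split(" ") if w]
    flowed = " ".join(words)
    # Afterwards every placeholder becomes a normal space again.
    return flowed.replace(REQUIRED_BLANK, " ")


print(flow_line("12/25/99  14:35:01  ", "The   meeting    started late"))
# 12/25/99  14:35:01  The meeting started late

# A placeholder character that was already in the text is lost:
print(flow_line("", "a\xffb"))
# a b
```

The second call shows why the placeholder must be a character that never occurs in your text.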

Using "PCCON (USESHAPE"

"PCCON (USESHAPE" is used to learn the desired shape of a selected paragraph, i.e., the inset and hanging indent values derived from the indentations of the paragraph's first two lines. The macro records those values and then finishes formatting the selected paragraph using that desired shape. The recorded shape values are later used as defaults when PCCON formats other paragraphs. Note that the right margin value is not learned and saved by "PCCON (USESHAPE"; it must be set separately with "PCCON (CHGMARGIN" or KEDIT SET MARGINS. The selected paragraph does not have to be neatly formatted to begin with. Only the indentation of the first two lines is important; the specific text currently on the lines doesn't matter.

Below is a before and after view of two paragraphs showing how the USESHAPE option works. "PCCON (USESHAPE" operated on the Example 1 "before" paragraph to produce the Example 1 "after" paragraph. The indentations of ONLY the first two lines of the Example 1 "before" paragraph were used to produce the "after" paragraph format. Then the Example 2 "before" paragraph was operated on by just "PCCON" (no INSET or HANGIND options required) to produce the Example 2 "after" paragraph. The Example 2 formatting used the shape characteristics learned when Example 1 was formatted. Before the formatting of the Example 2 paragraph, its line indentations could have been completely random, including its lines 1 and 2.

The Example 1 and 2 paragraphs are in the same shape family, whose shape (inset and hanging indent) was determined by formatting Example 1. Any other paragraphs (list item paragraphs, continuation paragraphs, and list paragraphs) later produced using the values recorded from the Example 1 formatting would be in the same shape family.

################################## BEFORE #######################################

    Example 1:  This paragraph
                is sufficiently formatted to allow
  "PCCON (USESHAPE" to determine its desired shape.  Only the
           indentations of the first two lines are used to
                figure out the correct INSET and HANGIND values
                  to format the paragraph and to be saved so just
           "PCCON" can be used to format other paragraphs to the same shape.

                 Example 2:    Those other
  paragraphs can have completely arbitrary placement
           of text, including the indentations of their first two lines
           because the proper INSET and HANGIND values were saved by the
           "PCCON (USESHAPE".  Note the margin used is not learned and
           saved by "PCCON (USESHAPE"; whatever margin is specified on
           "PCCON" or the current default margin will be used.


################################## AFTER ########################################

    Example 1:  This paragraph is sufficiently formatted to allow "PCCON
                (USESHAPE" to determine its desired shape.  Only the
                indentations of the first two lines are used to figure
                out the correct INSET and HANGIND values to format the
                paragraph and to be saved so just "PCCON" can be used to
                format other paragraphs to the same shape.

    Example 2:  Those other paragraphs can have completely arbitrary
                placement of text, including the indentations of their
                first two lines because the proper INSET and HANGIND
                values were saved by the "PCCON (USESHAPE".  Note the
                margin used is not learned and saved by "PCCON
                (USESHAPE"; whatever margin is specified on "PCCON" or
                the current default margin will be used.
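
The shape learning shown above can be sketched in Python as follows. This is an illustration based on the definitions of inset and hanging indent, not the macro's code; learn_shape and continuation_inset are hypothetical names, and giving a one-line paragraph a hanging indent of 0 is an assumption.

```python
def leading_spaces(line):
    """Number of spaces before the first non-space character."""
    return len(line) - len(line.lstrip(" "))


def learn_shape(paragraph_lines):
    """Return (inset, hanging_indent) from the first two lines only."""
    first = leading_spaces(paragraph_lines[0])
    if len(paragraph_lines) > 1:
        second = leading_spaces(paragraph_lines[1])
    else:
        second = first   # assumption: a one-line paragraph has no overhang
    inset = min(first, second)         # leftmost text, whichever line has it
    hanging_indent = second - first    # positive when line 1 is further left
    return inset, hanging_indent


def continuation_inset(saved_inset, saved_hanging_indent):
    """The "INSET CP" rule: saved inset + saved hanging indent."""
    return saved_inset + saved_hanging_indent


before = [
    "    Example 1:  This paragraph",
    "                is sufficiently formatted to allow",
    '  "PCCON (USESHAPE" to determine its desired shape.  Only the',
]
shape = learn_shape(before)
print(shape)                        # (4, 12)
print(continuation_inset(*shape))   # 16
```

For the Example 1 "before" paragraph this gives an inset of 4 and a hanging indent of 12, matching the "after" layout in which the first line starts in column 5 and the remaining lines in column 17. The third line's indentation is ignored.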

"As Is" Lines:  Bypass Formatting Certain Lines

You may have some special lines that you don't want PCCON to format. You want the surrounding lines of the paragraph to be flowed, but you want the special lines to be left "as is". No inset, hanging indent, or margin processing applies to those lines. Perhaps they contain a table or figure where spacing and alignment are important. PCCON lets you flag such "as is" lines by making the first character of the first word of each "as is" line be a "_". If "_" is undesirable to use for this purpose, you can edit the PCCON "break_char" variable to choose another character. You can also set the "break_char" variable to "" (empty string) to completely disable the "as is" function. Also see items 14 and 15 under Miscellaneous Considerations above for some special "as is" handling of Python documentation strings.

Below is a before and after example showing the formatting of a large paragraph which has an embedded set of leave "as is" lines.

################################## BEFORE #######################################

   Here is some unformatted text occurring above
     a few "as is" lines.  You want this text
flowed above those lines,
   but you want the "as is" lines left unchanged.
_                                                   ==============================
_         Name    Age   Score                                           A
_        ------   ---   -----     +---------------------+               |
_        Howard    25     6       |                     |          the range of
_        John      17     4       |    (example of)     |         "as is" lines,
_        Mary      19     5       |    ( a figure )     |         flagged with
_        Charlie    3    12       |                     |         a leading "_"
_                                 +---------------------+               |
_                                                                       V
_                                                   ==============================
      Here is some text after the "as is" lines, but
         still in the same paragraph.  These lines will be flowed after the
         "as is" lines.
      Note that the "_" in each "as is" line is the first character of
the first word of each such line.  It doesn't have to be the only character
         in the word (although it is in this example), nor does it have to be
      at any particular position in the line.  Put it where it looks best,
       but before any other text in the line.


################################## AFTER ########################################

Here is some unformatted text occurring above a few "as is" lines.  You want this
text flowed above those lines, but you want the "as is" lines left unchanged.
_                                                   ==============================
_         Name    Age   Score                                           A
_        ------   ---   -----     +---------------------+               |
_        Howard    25     6       |                     |          the range of
_        John      17     4       |    (example of)     |         "as is" lines,
_        Mary      19     5       |    ( a figure )     |         flagged with
_        Charlie    3    12       |                     |         a leading "_"
_                                 +---------------------+               |
_                                                                       V
_                                                   ==============================
Here is some text after the "as is" lines, but still in the same paragraph.  These
lines will be flowed after the "as is" lines.  Note that the "_" in each "as is"
line is the first character of the first word of each such line.  It doesn't have
to be the only character in the word (although it is in this example), nor does it
have to be at any particular position in the line.  Put it where it looks best,
but before any other text in the line.
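
The "as is" test can be sketched in Python. This is an illustration of the rules described in this section and in Miscellaneous Considerations items 14 and 15, not the macro's code; is_as_is and split_runs are hypothetical names, and the exact whitespace accepted around "=" in the second Python rule is a guess.

```python
import re
from itertools import groupby

BREAK_CHAR = "_"   # mirrors the macro's break_char; "" disables the check

# A line of the form: variable-name = """   (and nothing else)
ASSIGNED_DOCSTRING = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*\s+=\s+"""$')


def is_as_is(line, break_char=BREAK_CHAR):
    """True if the line should be left unformatted (sketch only)."""
    words = line.split()
    if not words:
        return False
    if break_char and words[0].startswith(break_char):
        return True
    stripped = line.strip()
    if stripped == '"""':
        return True
    return bool(ASSIGNED_DOCSTRING.match(stripped))


def split_runs(lines, break_char=BREAK_CHAR):
    """Group consecutive lines into (is_as_is, lines) runs."""
    return [(flag, list(group))
            for flag, group in groupby(lines, key=lambda l: is_as_is(l, break_char))]


paragraph = [
    "   Here is some unformatted text occurring above",
    "_         Name    Age   Score",
    "_        ------   ---   -----",
    "      Here is some text after the lines",
]
for flag, run in split_runs(paragraph):
    print("as is" if flag else "flow ", len(run), "line(s)")
```

Each run flagged False would be flowed on its own, while the runs flagged True are passed through unchanged, which is the behavior shown in the before and after example above.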

PCCON vs the KEDIT FLOW Command

As a paragraph formatter, PCCON is analogous to the KEDIT FLOW command. However, there are some significant differences.

  1. Sometimes FLOW adds an undesired extra space after certain punctuation, e.g., after an abbreviation that it mistakes for the end of a sentence. It also adds an extra space after drive identifiers, e.g., "C:".  PCCON corrects some of these problems. (There is no 100% accurate solution for this because exactly the same text can be interpreted multiple ways, as shown below.) Here are some examples of FLOW vs PCCON processing:
Mr. Smith formatted his C:  drive accidentally.   FLOW often produces extra spaces.
Dr.  Smith formatted his C:  drive accidentally.  FLOW often produces extra spaces.

Mr. Smith formatted his C: drive accidentally.    PCCON almost always does not.
Dr. Smith formatted his C: drive accidentally.    PCCON almost always does not.

How Mr. Smith did this is unknown.                PCCON handles this correctly.
How Dr.  Smith did this is unknown.               PCCON cannot avoid this extra space (see below).

Step 1:  Do this...                               PCCON inserts an extra space after a ":" if the next word is init-capped (my style).
animals: cats, dogs...                            PCCON does NOT insert an extra space if the next word is not init-capped.
Step 1:  Do this...    animals:  cats, dogs...    FLOW inserts an extra space in both cases.

In contrast to FLOW, PCCON manages to avoid adding an extra space after most common honorific abbreviations, e.g., Adm., Amb., Capt., Chm., Col., Cpl., Ct., Gen., Gov., Lt., Maj., Pfc., Pres., Prof., Pvt., Rep., Rev., Sec., Sen., Sgt., and Supt.

However, three abbreviations are ambiguous: Ct., Dr., and St., each of which can be part of a street name as well as an honorific. "Go to Whitebud Dr.   Brown dogs live there." needs the extra space and PCCON provides one. In contrast, consider "I know Dr. Brown has a practice in Raleigh." for which PCCON properly does NOT provide an extra space.

Now consider this unavoidable extra-space situation: "Young Dr.   Brown has a practice in Raleigh." The case of the first character of the word immediately before Dr. ("Young" in this case) is the deciding factor. Usually if it is uppercase, that word is part of the street name ending in Dr., but not always.

  2. PCCON leaves the cursor inside the just-formatted paragraph, whereas FLOW always moves the cursor after that paragraph to prepare for formatting the next one. The FLOW cursor movement can cause the just FLOWed paragraph to be scrolled off the screen if that paragraph was near the bottom of the screen and the current line field is set near the top (as I do). I like to have the cursor more stable when I write a paragraph. Often I type, edit, format (flow), edit, type some more, format again, etc., on the same paragraph and don't like having to drag the cursor back to where I was typing every time I do a format with the FLOW command. I definitely don't like having to scroll to get the paragraph visible again so I can continue editing it. In contrast, PCCON takes care to keep the just PCCONed paragraph on the screen with the cursor placed in it near where the cursor was last located, allowing the editing of that paragraph to continue easily.
  3. Although most paragraphs are left-aligned in column 1, sometimes you need to create paragraphs that have a different shape. For example, the paragraph you are now reading uses a hanging indent. Also, you might want to inset the whole paragraph a certain amount from column 1 to provide emphasis or to align it with the indented text of a previous hanging indent paragraph. The next paragraph is an example of the latter.

    With KEDIT's FLOW command, you must first specify the desired shape of a paragraph by using the SET MARGINS command. Whenever you go to a different shape, you must reissue SET MARGINS with new values, which you must calculate. With PCCON, you do not ever need to calculate parameters and issue SET MARGINS; instead things are much more WYSIWYG -- "What You See Is What You Get." Just choose a model paragraph; if needed, shift around its first two lines to get the desired shape. Then point to the paragraph with the cursor and invoke "PCCON (USESHAPE". The macro neatly formats the model paragraph, but also figures out and records its shape values to later use as PCCON defaults when formatting other paragraphs until you tell PCCON to learn a new shape.

    PCCON also provides an easy way to create continuation and list paragraphs, which align with a previously created hanging indent paragraph. These form a family of paragraphs, all tied to the shape of an initial hanging indent paragraph.

  4. If you have named lines in a paragraph by using the SET POINT command and later format the paragraph with the FLOW command, all the line names for that paragraph will be deleted. PCCON, by contrast, preserves line names. Any names for a particular named line will now be associated with the line that ends up containing the first word of the original named line in the paragraph before PCCON was invoked.
  5. PCCON supports several types of leave "as is" lines.
  6. In contrast to FLOW, PCCON does not support the settings of the KEDIT SET FORMAT command. PCCON always double-spaces at the end of a sentence and recognizes only blank lines (or range delimiters) as paragraph delimiters. Also, PCCON always formats paragraphs with a ragged instead of a justified right edge. Finally, PCCON does not have a target parameter; PCCON formats only a single paragraph, which must be designated by the cursor. The PCCONALL macro, which invokes PCCON multiple times, can be used to format all the paragraphs in a file to the same shape -- same inset and hanging indent for each paragraph. PCCONZ does a similar thing, but uses 0 for the inset and hanging indent values for all the paragraphs.
  7. As previously noted, PCCON's use of positive and negative for hanging indents is exactly reversed from the use by the KEDIT SET MARGINS command. PCCON and FLOW also differ on which paragraph lines determine a paragraph's "inset".
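
The spacing rules described in the first difference above (honorifics, the ambiguous Ct./Dr./St., and the colon rule) can be approximated in Python. This is a sketch inferred from the examples in this document, not the macro's logic: the names are hypothetical, only a few of the abbreviations are listed, requiring a capitalized next word before treating "." as a sentence end is an inference, and quotes or parentheses after the punctuation are ignored.

```python
HONORIFICS = {"Mr.", "Capt.", "Gen.", "Prof.", "Sen.", "Sgt."}  # partial list
AMBIGUOUS = {"Ct.", "Dr.", "St."}   # honorific or end of a street name


def spaces_after(word, next_word, prev_word=None):
    """Return 1 or 2: the number of spaces to put after word."""
    if next_word is None or not next_word[:1].isupper():
        return 1                     # no capitalized word follows
    if word.endswith(":"):
        return 2                     # colon followed by an init-capped word
    if not word.endswith((".", "!", "?")):
        return 1
    if word in AMBIGUOUS:
        # A capitalized word before "Dr." suggests a street name,
        # so the period is taken as a sentence end.
        looks_like_street = prev_word is not None and prev_word[:1].isupper()
        return 2 if looks_like_street else 1
    if word in HONORIFICS:
        return 1
    return 2


def join_words(words):
    """Join words using the spacing chosen by spaces_after."""
    parts = []
    for i, word in enumerate(words):
        prev_word = words[i - 1] if i > 0 else None
        next_word = words[i + 1] if i + 1 < len(words) else None
        parts.append(word)
        if next_word is not None:
            parts.append(" " * spaces_after(word, next_word, prev_word))
    return "".join(parts)


print(join_words("How Dr. Smith did this is unknown.".split()))
# How Dr.  Smith did this is unknown.
print(join_words("I know Dr. Brown has a practice in Raleigh.".split()))
# I know Dr. Brown has a practice in Raleigh.
```

The first call reproduces the unavoidable extra space described above, because "How" is capitalized and so looks like part of a street name.
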
Jeffrey Knauth