
Standards for Path and Filename Nomenclature

Personal Standards For Path and Filename Terminology

I've been writing and designing Unix-based software for 20+ years. In this software domain, as in others, using proper names and terminology for things is crucial to conveying understanding. Despite a strong effort to name things properly, I plead guilty to being inconsistent and imprecise when dealing with names for files and paths. A recent encounter with some old code of mine pushed me to spend a few minutes creating a nomenclature that is accurate and well-defined.

I felt the need to document my conclusions and the thought process I followed in arriving at the nomenclature, thus this note.

Authoritative Documents

A logical first step is to examine the authoritative documents and see what they offer.

The POSIX Standard

The POSIX Standard provides a good starting point. Here are some relevant excerpts:

pathname - A string used to identify a file. It has optional beginning / characters, followed by zero or more filenames separated by / characters. A pathname can optionally contain one or more trailing / characters. Multiple successive / characters are considered to be the same as one /, except for the case of exactly two leading / characters.

filename - A sequence of one or more bytes used to name a file. A filename is sometimes referred to as a pathname component.

That's all there is on the subject in the standard.
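
To make the pathname definition concrete, here is a minimal Python sketch that takes a pathname apart according to the excerpt above. The function name split_pathname and its three-part return value are my own illustration and are not taken from the standard.

```python
def split_pathname(pathname: str):
    """Split a POSIX-style pathname into (leading, filenames, trailing).

    Hypothetical helper: 'leading' is "", "/" or "//", 'filenames' is the
    list of components, 'trailing' says whether trailing slashes follow them.
    """
    stripped = pathname.lstrip("/")
    n_leading = len(pathname) - len(stripped)
    # Exactly two leading slashes are kept distinct; one, or three or more,
    # are treated the same as a single slash.
    if n_leading == 2:
        leading = "//"
    elif n_leading >= 1:
        leading = "/"
    else:
        leading = ""
    # Successive slashes count as one separator, so empty pieces are dropped.
    filenames = [part for part in stripped.split("/") if part]
    # Trailing slashes only count when at least one filename precedes them.
    trailing = bool(filenames) and pathname.endswith("/")
    return leading, filenames, trailing
```

For example, split_pathname("/usr//local/bin/") reports a single leading slash, the filenames usr, local and bin, and a trailing slash, while split_pathname("//host/share") keeps the two leading slashes.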

Unix Man Pages

The following are excerpts from the System V and BSD man pages for the basename command:

BASENAME(1)

NAME
    basename, dirname -- return filename or directory portion
                         of pathname

SYNOPSIS
    basename string
    dirname string

DESCRIPTION (BSD)
    The basename utility deletes any prefix ending with
    the last slash character present in string.

    The dirname utility deletes the filename portion,
    beginning with the last slash character to the end
    of string

DESCRIPTION (SYSV)
    Basename deletes any prefix ending in '/' from string.

    Dirname places on standard output the name of the
    directory in which a file named string would nominally
    be found.

The NAME section above is taken from the BSD man page. I chose it because it specifically used the term pathname from the POSIX standard, although BSD predates the standard.

Disappointingly, most Unix man pages (e.g., Seventh Edition, SVR4) have a NAME section that reads "strip filename affixes", even though the SUMMARY more precisely uses pathname. That NAME line is an example of the fungible use of filename, pathname, etc., that I'm trying to avoid.
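
The split described in the man page excerpts can be tried out with Python's standard posixpath module. This is only a sketch of the idea; the sample pathname is made up, and the Python functions are not exact equivalents of the command-line utilities.

```python
import posixpath

pathname = "/home/user/photo.jpg"

# Everything after the last slash: the filename portion.
filename = posixpath.basename(pathname)
# Everything before the last slash: the directory portion.
directory = posixpath.dirname(pathname)

# Unlike the basename utility, posixpath.basename does not strip a
# trailing slash first, so it yields an empty string here.
trailing = posixpath.basename("/usr/lib/")
```

Here filename is "photo.jpg" and directory is "/home/user". The trailing-slash case is one place where the library call and the utility disagree, which is worth remembering when porting shell logic.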

Filename Components

Unix assigns no special meaning to the characters in a filename, other than the lone . and .. entries. This means that Unix has no concept of file extensions as an indicator of the contents of a file or the application associated with a file.

In practice, however, filenames are often composed of two components: a name and an extension, separated by a . character. For example: photo.jpg.

Nomenclature

Given the preceding, I have defined the following terms for use by me in code and documentation:

pathname - Entire path necessary to unambiguously identify a file.

path - A portion of a pathname.

filename - The rightmost component of a pathname. The value that is returned by basename pathname.

filename extension - If a filename contains a . in other than the first character position, the characters following the . are the filename extension.

filename base - If a filename contains a . in other than the first character position, the characters preceding the . are the filename base.
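
The definitions above can be sketched in Python as follows. The helper name split_filename is hypothetical, and it assumes that when a filename contains several . characters the last one is the separator, a case the definitions do not settle. It also treats the special entries . and .. as having no extension.

```python
import posixpath

def split_filename(filename: str):
    """Return (filename base, filename extension), or (filename, None)
    when the filename has no extension under the definitions above."""
    # The special directory entries are never split.
    if filename in (".", ".."):
        return filename, None
    # Assumption: with several dots, the LAST one is the separator.
    dot = filename.rfind(".")
    # A dot in the first position (index 0) does not count, so names such
    # as .profile have no extension; -1 means there is no dot at all.
    if dot <= 0:
        return filename, None
    return filename[:dot], filename[dot + 1:]

pathname = "/home/user/photo.jpg"
filename = posixpath.basename(pathname)     # rightmost component
base, extension = split_filename(filename)  # filename base and extension
```

With this sketch the pathname /home/user/photo.jpg yields the filename photo.jpg, the filename base photo and the filename extension jpg, while .profile is returned whole with no extension.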

Discussion and Example Usage

  1. pathname is an absolute or fully-qualified path identifying a file. A relative path cannot be a pathname; a relative path is simply a path.

  2. When referring to a true pathname, use pathname rather than 'a path to a file'. In the case of a relative path, it is permissible to use 'a relative path to a file'.

  3. I don't care for filename base because of its similarity to the basename command, which returns the full filename. file name, on its own, seems a better choice, but it would be too easily confused with filename.
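
The distinction drawn in point 1 can be checked mechanically. A minimal sketch, using a hypothetical helper named classify and Python's posixpath.isabs:

```python
import posixpath

def classify(path: str) -> str:
    """Label a string as a 'pathname' or a plain 'path' (hypothetical helper)."""
    # Under the nomenclature above, only an absolute path is a pathname.
    return "pathname" if posixpath.isabs(path) else "path"
```

So classify("/home/user/photo.jpg") gives "pathname", while classify("user/photo.jpg") and classify("./photo.jpg") both give "path".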