Reckon 0.5.1-dev
A Tool to Count Logical Lines of Code
reckon.h File Reference

The primary API of the Reckon library.

#include <stddef.h>
#include <stdbool.h>
#include <stdint.h>
#include "reckon_export.h"

Data Structures

struct  RcnResultState
 The result status type of an operation indicating success or failure.
struct  RcnCountResult
 The result type for a single code analysis operation.
struct  RcnCountResultGroup
 Result type for a group of analysis operations on a single source entity.
struct  RcnSourceText
 A block of source text.
struct  RcnSourceFile
 A structure representing a text source file.
struct  RcnCountResultSet
 The count results for a set of source files.
struct  RcnCountStatistics
 A collection of source code metrics.
struct  RcnStatOptions
 Options to customize the behaviour of counting operations.

Macros

#define RECKON_NUM_SUPPORTED_FORMATS   3
 The total number of supported text formats, including supported programming languages.
#define RECKON_MK_FRMT_OPT(frmt)
 Macro to create a format option bitmask.
#define RECKON_ENV_VAR_DEBUG   "RECKON_DEBUG"
 The name of the environment variable to control debug logging.

Typedefs

typedef enum RcnTextFormat RcnTextFormat
 Enumeration of supported text formats and programming languages.
typedef enum RcnErrorCode RcnErrorCode
 Enumeration of error states.
typedef struct RcnResultState RcnResultState
 The result status type of an operation indicating success or failure.
typedef uint64_t RcnCount
 A count number of some metric within source text.
typedef struct RcnCountResult RcnCountResult
 The result type for a single code analysis operation.
typedef struct RcnCountResultGroup RcnCountResultGroup
 Result type for a group of analysis operations on a single source entity.
typedef struct RcnSourceText RcnSourceText
 A block of source text.
typedef enum RcnFileOpStatus RcnFileOpStatus
 Enumeration of file processing operation status codes.
typedef struct RcnSourceFile RcnSourceFile
 A structure representing a text source file.
typedef struct RcnCountResultSet RcnCountResultSet
 The count results for a set of source files.
typedef struct RcnCountStatistics RcnCountStatistics
 A collection of source code metrics.
typedef enum RcnCountOption RcnCountOption
 Options to specify which counting operations to perform.
typedef enum RcnFormatOption RcnFormatOption
 Options for format-specific analysis behaviours.
typedef struct RcnStatOptions RcnStatOptions
 Options to customize the behaviour of counting operations.

Enumerations

enum  RcnTextFormat { RCN_TEXT_UNFORMATTED = 0, RCN_LANG_C = 1, RCN_LANG_JAVA = 2 }
 Enumeration of supported text formats and programming languages.
enum  RcnErrorCode {
  RCN_ERR_NONE, RCN_ERR_UNSUPPORTED_FORMAT, RCN_ERR_INVALID_INPUT, RCN_ERR_INPUT_TOO_LARGE,
  RCN_ERR_SYNTAX_ERROR, RCN_ERR_ALLOC_FAILURE, RCN_ERR_UNKNOWN
}
 Enumeration of error states.
enum  RcnFileOpStatus {
  RCN_FILE_OP_OK, RCN_FILE_OP_INVALID_PATH, RCN_FILE_OP_FILE_NOT_FOUND, RCN_FILE_OP_IO_ERROR,
  RCN_FILE_OP_ALLOC_FAILURE, RCN_FILE_OP_FILE_TOO_LARGE, RCN_FILE_OP_UNKNOWN_ERROR
}
 Enumeration of file processing operation status codes.
enum  RcnCountOption { RCN_OPT_COUNT_CHARACTERS = 0x01, RCN_OPT_COUNT_WORDS = 0x02, RCN_OPT_COUNT_PHYSICAL_LINES = 0x04, RCN_OPT_COUNT_LOGICAL_LINES = 0x08 }
 Options to specify which counting operations to perform.
enum  RcnFormatOption { RCN_OPT_TEXT_UNFORMATTED = RECKON_MK_FRMT_OPT(RCN_TEXT_UNFORMATTED), RCN_OPT_LANG_C = RECKON_MK_FRMT_OPT(RCN_LANG_C), RCN_OPT_LANG_JAVA = RECKON_MK_FRMT_OPT(RCN_LANG_JAVA) }
 Options for format-specific analysis behaviours.

Functions

RcnCountStatistics * rcnCreateCountStatistics (const char *path)
 Creates a new RcnCountStatistics struct for the specified file path.
void rcnFreeCountStatistics (RcnCountStatistics *stats)
 Frees a previously allocated RcnCountStatistics struct.
void rcnCount (RcnCountStatistics *stats, RcnStatOptions options)
 Performs counting operations using the specified statistics options.
RcnCountResult rcnCountLogicalLines (RcnTextFormat language, RcnSourceText sourceCode)
 Counts the number of logical lines of code in the specified source text.
RcnSourceText rcnMarkLogicalLinesInFile (const char *path)
 Marks the counted logical lines in the source code of the specified file.
RcnSourceText rcnMarkLogicalLinesInSourceText (RcnTextFormat language, RcnSourceText sourceCode)
 Marks the counted logical lines in the specified source code text.
void rcnFreeSourceText (RcnSourceText *source)
 Frees the previously allocated data of a RcnSourceText struct.
RcnCountResult rcnCountPhysicalLines (RcnSourceText source)
 Counts the number of hard physical lines in the specified source text.
RcnCountResult rcnCountWords (RcnSourceText source)
 Counts the number of words in the specified source text.
RcnCountResult rcnCountCharacters (RcnSourceText source)
 Counts the number of characters in the specified source text.

Detailed Description

The primary API of the Reckon library.

Exposes types and function declarations for source code metrics. Provides functionality to count the occurrences of various source code related concepts such as number of words, physical lines and logical lines of code, and other related metrics. The library supports multiple programming languages and file formats. Supported formats are enumerated by the RcnTextFormat enum.

The Reckon library only supports processing text that is encoded in UTF-8 or UTF-16. In the case of UTF-16, a BOM must be present at the start of the text to indicate endianness. For any operation provided by the library, if the input text has encoding errors, the operation finishes gracefully but the computed result is undefined.
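
The BOM requirement for UTF-16 input can be illustrated with a small pre-check that a caller might run before handing text to the library. This is only a sketch: DemoEncoding and demoDetectEncoding are hypothetical names that do not exist in reckon.h, and treating input without a UTF-16 BOM as UTF-8 is an assumption made here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification; not a type from reckon.h. */
typedef enum DemoEncoding {
    DEMO_ENC_UTF8,     /* no UTF-16 BOM found; assumed to be UTF-8 */
    DEMO_ENC_UTF16_LE, /* buffer starts with the bytes FF FE */
    DEMO_ENC_UTF16_BE  /* buffer starts with the bytes FE FF */
} DemoEncoding;

/* Inspects the first two bytes of a buffer for a UTF-16 byte order mark. */
DemoEncoding demoDetectEncoding(const unsigned char *text, size_t size) {
    assert(text != NULL || size == 0);
    if (size >= 2) {
        if (text[0] == 0xFF && text[1] == 0xFE) {
            return DEMO_ENC_UTF16_LE;
        }
        if (text[0] == 0xFE && text[1] == 0xFF) {
            return DEMO_ENC_UTF16_BE;
        }
    }
    return DEMO_ENC_UTF8;
}
```

A check like this only recognises the BOM; it does not validate the text, so input with encoding errors still yields an undefined result as described above.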

The typical usage is to create a RcnCountStatistics struct for either a single file or directory path using the rcnCreateCountStatistics() function. Choose the desired counting operations, formats and other options using a RcnStatOptions struct. Then pass both the created statistics and options to the rcnCount() function to perform the counting. Finally, after having evaluated the computed statistics, free the allocated statistics using the rcnFreeCountStatistics() function.

What follows are definitions of metrics that are computed by this library.

  • Logical Lines of Code (LLC):
    The number of programming-language-specific, non-empty, non-comment program source constructs that each correspond to one complete, semantically cohesive statement or declaration in the grammatical sense of the underlying language, counted independently of physical line breaks, formatting and other visual layout aspects. Logical lines in a source code file are partitions within the top-level statement/declaration units recognized by the language grammar or an approximation thereof. Such units include, but are not limited to, executable statements (e.g. expression statements, return, if, for, while, switch cases), declarations/definitions (e.g. variable, function, type/class definitions), and other language-defined standalone constructs (e.g. import/use/module directives). The LLC count is the number of such units after segmentation. Thus, in contrast to physical lines of code, multiple statements on one physical line count as multiple LLCs, and one statement spanning multiple physical lines counts as one LLC.
  • Physical Lines (PHL):
    The number of hard physical lines in the source text, including blank lines and comments.
  • Words (WRD):
    The number of non-zero-length sequences of printable characters delimited by white space.
  • Characters (CHR):
    The number of Unicode code points. This includes printable as well as non-printable characters. Therefore, this metric includes control characters, like newlines.
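
The Words and Characters definitions above can be made concrete with a short standalone sketch. It is not the library's implementation: demoCountCodePoints and demoCountWords are hypothetical helpers, the input is assumed to be well-formed UTF-8, only ASCII white space is treated as a delimiter, and every non-white-space byte is treated as part of a word (a simplification of "printable characters").

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Counts Unicode code points in well-formed UTF-8 by counting every byte
   that is not a continuation byte (bit pattern 10xxxxxx). */
uint64_t demoCountCodePoints(const char *text, size_t size) {
    assert(text != NULL || size == 0);
    uint64_t count = 0;
    for (size_t i = 0; i < size; ++i) {
        if (((unsigned char) text[i] & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

/* Counts runs of non-white-space bytes separated by ASCII white space. */
uint64_t demoCountWords(const char *text, size_t size) {
    assert(text != NULL || size == 0);
    uint64_t count = 0;
    int inWord = 0;
    for (size_t i = 0; i < size; ++i) {
        if (isspace((unsigned char) text[i])) {
            inWord = 0;
        } else if (!inWord) {
            inWord = 1; /* first byte of a new word */
            ++count;
        }
    }
    return count;
}
```

Under these assumptions, a newline counts as one character, and a token such as "42;" counts as a single word because it contains no white space.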

Please note that the above definitions themselves are not strictly formal and not part of the API contract. Both the definitions as well as the library implementation may evolve in future releases, such that different versions of the Reckon library may compute slightly different results for a particular metric and input combination. For more information, please refer to the official Reckon documentation.

The functions in this library are not MT-safe.

See also
https://docs.raven-computing.com/reckon/latest
Author
Phil Gaiser

Macro Definition Documentation

◆ RECKON_ENV_VAR_DEBUG

#define RECKON_ENV_VAR_DEBUG   "RECKON_DEBUG"

The name of the environment variable to control debug logging.

If the environment has a variable with this name set to "1", then debug logging is enabled and for certain operations additional information is printed on stdout. A variable value of "0" disables all debug logging. If the environment variable is not set, debug logging is disabled by default. The definition of the environment variable only has an effect if the library is compiled as a debug build.

◆ RECKON_MK_FRMT_OPT

#define RECKON_MK_FRMT_OPT(frmt)
Value:
(1ULL << (frmt))

Macro to create a format option bitmask.

Users should prefer to use the RcnFormatOption enumeration, instead of using this macro directly.

Parameters
  frmt  The RcnTextFormat enumerator value.
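
As a rough illustration of how the resulting bitmasks combine, the sketch below repeats the macro and enum definitions from this page so that it compiles on its own; real code would include reckon.h instead. The helper demoIsFormatSelected is hypothetical, and how a format mask is passed to the library (for example through RcnStatOptions) is not shown because the relevant field names are not documented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Definitions repeated from this page so the sketch is self-contained. */
#define RECKON_MK_FRMT_OPT(frmt) (1ULL << (frmt))

typedef enum RcnTextFormat {
    RCN_TEXT_UNFORMATTED = 0,
    RCN_LANG_C = 1,
    RCN_LANG_JAVA = 2
} RcnTextFormat;

typedef enum RcnFormatOption {
    RCN_OPT_TEXT_UNFORMATTED = RECKON_MK_FRMT_OPT(RCN_TEXT_UNFORMATTED),
    RCN_OPT_LANG_C = RECKON_MK_FRMT_OPT(RCN_LANG_C),
    RCN_OPT_LANG_JAVA = RECKON_MK_FRMT_OPT(RCN_LANG_JAVA)
} RcnFormatOption;

/* Hypothetical helper: tests whether the bit for a format is set in a mask. */
bool demoIsFormatSelected(uint64_t formatMask, RcnTextFormat format) {
    assert((unsigned) format < 64u); /* shift count must stay within 64 bits */
    return (formatMask & RECKON_MK_FRMT_OPT(format)) != 0;
}
```

The sketch refers to formats only by their enumerator names, in line with the advice not to rely on concrete numeric values.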

◆ RECKON_NUM_SUPPORTED_FORMATS

#define RECKON_NUM_SUPPORTED_FORMATS   3

The total number of supported text formats, including supported programming languages.

Typedef Documentation

◆ RcnCount

typedef uint64_t RcnCount

A count number of some metric within source text.

This type is used to represent, for example, the count of lines within source text. Shall be treated as a non-negative integer number of arbitrary bit width. In the unlikely event of an overflow, count values wrap around according to standard unsigned integer arithmetic.
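
The wrap-around behaviour matters mostly when counts from many files are summed by the caller. A minimal sketch of a checked addition follows; demoAddCounts is a hypothetical helper rather than part of the library, and the typedef is repeated from this page only to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t RcnCount; /* as documented on this page */

/* Adds two counts using unsigned arithmetic and reports whether the sum
   wrapped around. The check does not depend on the concrete bit width. */
RcnCount demoAddCounts(RcnCount a, RcnCount b, bool *wrapped) {
    assert(wrapped != NULL);
    RcnCount sum = a + b; /* well-defined modulo 2^N for unsigned types */
    *wrapped = sum < a;   /* a wrapped sum is always smaller than either operand */
    return sum;
}
```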

◆ RcnCountOption

Options to specify which counting operations to perform.

Users can combine multiple options using a bitwise OR operation. Do not rely on concrete numeric enumerator values.

◆ RcnCountResult

typedef struct RcnCountResult RcnCountResult

The result type for a single code analysis operation.

Represents the end result of one concrete type of count operation. For example, it will only contain the count of logical lines of code, or only the count of physical lines, depending on the operation performed.

◆ RcnCountResultGroup

typedef struct RcnCountResultGroup RcnCountResultGroup

Result type for a group of analysis operations on a single source entity.

Represents the end results of possibly multiple count operations performed on a single named source entity, like a specific source file. This is used to group multiple different count metrics together into a single type.

◆ RcnCountResultSet

typedef struct RcnCountResultSet RcnCountResultSet

The count results for a set of source files.

Contains a list of source files that are subject to analysis, along with their corresponding count results. Each file in the files list has a corresponding result in the results list at the same index. No checks are performed regarding duplicate files in the list; as a result, uniqueness is not guaranteed.

◆ RcnCountStatistics

typedef struct RcnCountStatistics RcnCountStatistics

A collection of source code metrics.

This type is used to track and store the results for code analysis operations. It contains statistics about a set of source code files, where conceptually every text file that would be part of a source tree is considered a source code file, even if it doesn't contain actual code.

◆ RcnErrorCode

typedef enum RcnErrorCode RcnErrorCode

Enumeration of error states.

All count operations return a RcnResultState struct that contains information about the operation's success or failure. In the latter case, the error code indicates the type of error that has occurred.

◆ RcnFileOpStatus

Enumeration of file processing operation status codes.

RcnSourceFile structs carry this status to indicate the processing state of the file, allowing callers to differentiate between various error conditions. It is guaranteed that the status code indicating success (i.e. no error) evaluates to zero, whereas all codes indicating a detected error evaluate to non-zero values.

◆ RcnFormatOption

Options for format-specific analysis behaviours.

Users can use these options to enable or disable specific formats when processing source files. Multiple options can be combined using a bitwise OR operation. Do not rely on concrete numeric enumerator values.

◆ RcnResultState

typedef struct RcnResultState RcnResultState

The result status type of an operation indicating success or failure.

Count operations return result types that contain this type of state. For a single operation, e.g. rcnCountLogicalLines(), an ok value of true indicates that the operation was fully successful, implying that errorCode is set to RCN_ERR_NONE and errorMessage is NULL. Conversely, if ok is false, then errorCode indicates the type of error that has occurred and errorMessage may or may not be set to provide additional information.

For compound operations, e.g. rcnCount(), an ok value of true indicates that the operation was at least partially successful and parts of the computed compound result are usable. In such a case, errorCode may still indicate one of the encountered errors, usually the last encountered one, and errorMessage may or may not provide more information. This implies that for compound operations an ok value of true might only indicate that no critical error has occurred.
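
The difference between single and compound operations could be handled along the following lines. This is a sketch under stated assumptions: DemoResultState is a local stand-in that mirrors only the three fields named above (the real struct in reckon.h may differ in field types, order and additional members), and DemoOutcome and demoClassifyState are hypothetical names. The error code enumerators are repeated from this page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Enumerators as listed on this page. */
typedef enum RcnErrorCode {
    RCN_ERR_NONE, RCN_ERR_UNSUPPORTED_FORMAT, RCN_ERR_INVALID_INPUT,
    RCN_ERR_INPUT_TOO_LARGE, RCN_ERR_SYNTAX_ERROR, RCN_ERR_ALLOC_FAILURE,
    RCN_ERR_UNKNOWN
} RcnErrorCode;

/* Stand-in for RcnResultState with the documented fields only (assumption). */
typedef struct DemoResultState {
    bool ok;
    RcnErrorCode errorCode;
    const char *errorMessage; /* may be NULL */
} DemoResultState;

typedef enum DemoOutcome {
    DEMO_OUTCOME_SUCCESS, /* fully successful */
    DEMO_OUTCOME_PARTIAL, /* compound operation that encountered some error */
    DEMO_OUTCOME_FAILURE  /* ok is false */
} DemoOutcome;

/* Classifies a state according to the rules described above. */
DemoOutcome demoClassifyState(DemoResultState state, bool isCompound) {
    /* Documented invariant: a successful single operation has no error code. */
    assert(isCompound || !state.ok || state.errorCode == RCN_ERR_NONE);
    if (!state.ok) {
        return DEMO_OUTCOME_FAILURE;
    }
    if (isCompound && state.errorCode != RCN_ERR_NONE) {
        return DEMO_OUTCOME_PARTIAL;
    }
    return DEMO_OUTCOME_SUCCESS;
}
```

For compound results, a caller that needs certainty about every file would still have to inspect the per-file results rather than rely on ok alone.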

◆ RcnSourceFile

typedef struct RcnSourceFile RcnSourceFile

A structure representing a text source file.

Holds metadata and content of a source file to be analyzed. A source file may or may not contain source code written in a programming language. It may as well be regular text, formatted or unformatted. Users typically don't need to create, manipulate or manage the memory of this struct directly, as it is managed internally by the Reckon library to represent and track files.

The file content may or may not be loaded at any given time. Check the isContentRead field to determine if the content was read from the file system. The content.size of a file that has not yet been read is defined to be zero. Thus, empty files that were read will have isContentRead equal to true, content.size of zero and an empty string in content.text.

◆ RcnSourceText

typedef struct RcnSourceText RcnSourceText

A block of source text.

Holds a pointer to the text content and its size in bytes. The source text may or may not be null-terminated. A different type that is composed of this type may further define this explicitly. The text field may contain the bytes of text encoded in any of the supported encodings.

◆ RcnStatOptions

typedef struct RcnStatOptions RcnStatOptions

Options to customize the behaviour of counting operations.

Allows users to specify various options that control how counting operations are performed.

A zero-initialized RcnStatOptions struct will select default behaviour.

◆ RcnTextFormat

Enumeration of supported text formats and programming languages.

Users should not rely on the numeric enumerator value as it may change when support for new formats or programming languages is added in the future and the enumerators are reordered.

Enumeration Type Documentation

◆ RcnCountOption

Options to specify which counting operations to perform.

Users can combine multiple options using a bitwise OR operation. Do not rely on concrete numeric enumerator values.

Enumerator
RCN_OPT_COUNT_CHARACTERS 

Count the number of characters (CHR).

This metric includes control characters, like newlines. The count therefore includes non-printable characters.

RCN_OPT_COUNT_WORDS 

Count the number of words (WRD).

RCN_OPT_COUNT_PHYSICAL_LINES 

Count hard physical lines (PHL).

This option includes all lines, including blank lines and comments.

RCN_OPT_COUNT_LOGICAL_LINES 

Count logical lines of code (LLC).

This option is generally only applicable to source files containing text with a format that supports the notion of logical lines of code. This includes files containing source code written in a programming language but not, for example, plain text files (.txt).

◆ RcnErrorCode

Enumeration of error states.

All count operations return a RcnResultState struct that contains information about the operation's success or failure. In the latter case, the error code indicates the type of error that has occurred.

Enumerator
RCN_ERR_NONE 

No error has occurred.

RCN_ERR_UNSUPPORTED_FORMAT 

The input format or programming language is not supported.

RCN_ERR_INVALID_INPUT 

The input provided was invalid.

RCN_ERR_INPUT_TOO_LARGE 

The input is too large to be processed.

This indicates that the input size exceeds internal limits.

RCN_ERR_SYNTAX_ERROR 

A syntax error was detected.

This usually indicates that an attempt was made to parse programming language source text that is syntactically incorrect in that specific programming language.

RCN_ERR_ALLOC_FAILURE 

A memory allocation failure has occurred.

This usually indicates that the system is out of memory (OOM error).

RCN_ERR_UNKNOWN 

An unknown error has occurred.

This is used as a catch-all for errors that are not further specified.

◆ RcnFileOpStatus

Enumeration of file processing operation status codes.

RcnSourceFile structs carry this status to indicate the processing state of the file, allowing callers to differentiate between various error conditions. It is guaranteed that the status code indicating success (i.e. no error) evaluates to zero, whereas all codes indicating a detected error evaluate to non-zero values.

Enumerator
RCN_FILE_OP_OK 

No error has occurred.

RCN_FILE_OP_INVALID_PATH 

A provided file path is invalid or malformed.

This could mean that a path was deemed invalid either by the Reckon library or the operating system.

RCN_FILE_OP_FILE_NOT_FOUND 

The provided file was not found in the file system.

RCN_FILE_OP_IO_ERROR 

An I/O error has occurred during file processing.

This could indicate issues such as permission denied, file not found, or read/write errors.

RCN_FILE_OP_ALLOC_FAILURE 

A memory allocation failure has occurred during file processing.

RCN_FILE_OP_FILE_TOO_LARGE 

The file is too large to be processed.

This indicates that the file size exceeds internal limits set by the Reckon library.

RCN_FILE_OP_UNKNOWN_ERROR 

An unknown error has occurred.

This is used as a catch-all for errors that are not further specified.
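
Because the success status is guaranteed to evaluate to zero, a status value can be tested directly in a condition. The sketch below shows this with a hypothetical helper, demoCountFailedFiles, that is not part of the library; the enum is repeated from this page so the example compiles on its own, and how the statuses are collected from RcnSourceFile structs is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Enumerators as listed on this page; only the zero value of
   RCN_FILE_OP_OK is relied upon. */
typedef enum RcnFileOpStatus {
    RCN_FILE_OP_OK, RCN_FILE_OP_INVALID_PATH, RCN_FILE_OP_FILE_NOT_FOUND,
    RCN_FILE_OP_IO_ERROR, RCN_FILE_OP_ALLOC_FAILURE,
    RCN_FILE_OP_FILE_TOO_LARGE, RCN_FILE_OP_UNKNOWN_ERROR
} RcnFileOpStatus;

/* Counts how many statuses indicate an error of any kind. */
size_t demoCountFailedFiles(const RcnFileOpStatus *statuses, size_t count) {
    assert(statuses != NULL || count == 0);
    size_t failed = 0;
    for (size_t i = 0; i < count; ++i) {
        if (statuses[i]) { /* non-zero means a detected error */
            ++failed;
        }
    }
    return failed;
}
```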

◆ RcnFormatOption

Options for format-specific analysis behaviours.

Users can use these options to enable or disable specific formats when processing source files. Multiple options can be combined using a bitwise OR operation. Do not rely on concrete numeric enumerator values.

Enumerator
RCN_OPT_TEXT_UNFORMATTED 

Option to select statistics for plain text files written without any explicit formatting.

These are usually files with a '.txt' extension.

RCN_OPT_LANG_C 

Option to select statistics for source code files written in the C programming language.

RCN_OPT_LANG_JAVA 

Option to select statistics for source code files written in the Java programming language.

◆ RcnTextFormat

Enumeration of supported text formats and programming languages.

Users should not rely on the numeric enumerator value as it may change when support for new formats or programming languages is added in the future and the enumerators are reordered.

Enumerator
RCN_TEXT_UNFORMATTED 

Text with no specific formatting, as usually found in files with a '.txt' extension.

RCN_LANG_C 

Source files for the C programming language.

RCN_LANG_JAVA 

Source files for the Java programming language.

Function Documentation

◆ rcnCount()

void rcnCount (RcnCountStatistics *stats, RcnStatOptions options)

Performs counting operations using the specified statistics options.

Processes the source files of the specified statistics and performs analysis operations, e.g. counting the number of logical lines of code, according to the given options. The files inside the given statistics must exist and be readable regular text files.

This function is not idempotent with respect to the same stats struct. Calling it multiple times on the same RcnCountStatistics struct is undefined behaviour.

Parameters
  stats  The statistics to evaluate.
  options  Options to customize the analysis behaviour.

◆ rcnCountCharacters()

RcnCountResult rcnCountCharacters (RcnSourceText source)

Counts the number of characters in the specified source text.

A character is defined as a Unicode code point. This metric includes control characters, like newlines. The returned count therefore includes non-printable characters. See header documentation for details on supported encodings.

Parameters
  source  The source text to count characters in.
Returns
A RcnCountResult containing the character count.

◆ rcnCountLogicalLines()

RcnCountResult rcnCountLogicalLines (RcnTextFormat language, RcnSourceText sourceCode)

Counts the number of logical lines of code in the specified source text.

See header documentation for details on how logical lines of code are defined and for supported encodings.

Parameters
  language  The format of the specified source text. Must denote a supported programming language.
  sourceCode  The source code text to count logical lines in.
Returns
A RcnCountResult struct containing the line count.

◆ rcnCountPhysicalLines()

RcnCountResult rcnCountPhysicalLines (RcnSourceText source)

Counts the number of hard physical lines in the specified source text.

The count includes all physical lines, including blank lines and comments, not only physical lines of code. The result of this function is therefore independent of any programming language. A physical line count can be computed for every text file, independent of its format.

See header documentation for details on how hard physical lines are defined and for supported encodings.

Parameters
  source  The source text to count physical lines in.
Returns
A RcnCountResult struct containing the line count.

◆ rcnCountWords()

RcnCountResult rcnCountWords (RcnSourceText source)

Counts the number of words in the specified source text.

A word is a non-zero-length sequence of printable characters delimited by white space. See header documentation for details on supported encodings.

Parameters
  source  The source text to count words in.
Returns
A RcnCountResult containing the word count.

◆ rcnCreateCountStatistics()

RcnCountStatistics * rcnCreateCountStatistics (const char *path)

Creates a new RcnCountStatistics struct for the specified file path.

The specified file path can denote either a single regular file or a directory containing multiple files and subdirectories. In the case of a directory, all regular files within the directory and subdirectories therein will be part of the RcnCountResultSet of the returned statistics. A relative file path will be interpreted as relative to the underlying current working directory.

A user takes ownership of the returned struct and must free it with rcnFreeCountStatistics().

Parameters
  path  A path in the file system. It is interpreted as a byte sequence in the underlying platform's native encoding.
Returns
A newly allocated RcnCountStatistics struct, or NULL on error.

◆ rcnFreeCountStatistics()

void rcnFreeCountStatistics (RcnCountStatistics *stats)

Frees a previously allocated RcnCountStatistics struct.

Must have been previously allocated using rcnCreateCountStatistics().

Parameters
  stats  The RcnCountStatistics struct to free. May be NULL.

◆ rcnFreeSourceText()

void rcnFreeSourceText (RcnSourceText *source)

Frees the previously allocated data of a RcnSourceText struct.

Use this deallocation function for RcnSourceText structs that were returned by functions of this API that allocate new source text. The struct must not be used after calling this function.

Parameters
  source  The RcnSourceText struct to free. The provided struct and the text field may be NULL.

◆ rcnMarkLogicalLinesInFile()

RcnSourceText rcnMarkLogicalLinesInFile (const char *path)

Marks the counted logical lines in the source code of the specified file.

Reads the file located at the specified file system path and adds source code comments to lines that are counted as logical lines of code. The comments follow the syntax of the programming language used in the file and indicate the count number plus the type of syntactic construct that contributes to the logical line count. One physical line of code can contain an annotation for multiple logical lines. This function can only be used for files that contain text formatted in a supported programming language.

See header documentation for details on how logical lines of code are defined. The text in the file must be encoded with UTF-8. Other encodings are not supported by this function and result in undefined behaviour.

Parameters
  path  The file system path of the source code file to annotate. It is interpreted as a byte sequence in the underlying platform's native encoding. Relative paths are interpreted relative to the current working directory.
Returns
The read source code of the specified file with comments added to the counted lines, as a RcnSourceText with a null-terminated string. The caller takes ownership of the returned struct and must free it with rcnFreeSourceText(). Returns a struct with text set to NULL on error.

◆ rcnMarkLogicalLinesInSourceText()

RcnSourceText rcnMarkLogicalLinesInSourceText (RcnTextFormat language, RcnSourceText sourceCode)

Marks the counted logical lines in the specified source code text.

Creates a copy of the specified source code text and adds source code comments to lines that are counted as logical lines of code. The comments follow the syntax of the specified programming language and indicate the count number plus the type of syntactic construct that contributes to the logical line count. One physical line of code can contain an annotation for multiple logical lines. This function can only be used with RcnTextFormat enumerators that represent a supported programming language.

See header documentation for details on how logical lines of code are defined. The specified source code text must be encoded with UTF-8. Other encodings are not supported by this function and result in undefined behaviour.

Parameters
  language  The format of the specified source code. Must denote a supported programming language.
  sourceCode  The source code text to annotate.
Returns
A copy of the specified source code with comments added to the counted lines, as a RcnSourceText with a null-terminated string, regardless of whether the input string is null-terminated. The caller takes ownership of the returned struct and must free it with rcnFreeSourceText(). Returns a struct with text set to NULL on error.