Custom parsing

For many users, the data provided by the simple API is enough. In some advanced cases you may find it necessary to use this more customizable parsing mechanism.

First, define a visitor that implements the CxxVisitor protocol. Then you can create an instance of it and pass it to the CxxParser.

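from cxxheaderparser.parser import CxxParser

# MyVisitor is your own class implementing the CxxVisitor protocol
visitor = MyVisitor()
parser = CxxParser(filename, content, visitor)
parser.parse()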

# do something with the data collected by the visitor

Your visitor should do something with the data as the various callbacks are called. See the SimpleCxxVisitor for inspiration.
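As a rough illustration of a visitor that acts on its callbacks, the sketch below records every #include it is told about. IncludeCollector is a hypothetical name, and the filename strings are placeholders. The class is standalone and driven by hand so that it runs without cxxheaderparser installed. For real use you would most likely subclass NullVisitor, so that every other callback is a no-op, and pass the instance to CxxParser.

```python
from typing import Any, List


class IncludeCollector:
    """Hypothetical visitor fragment: remembers every #include it sees."""

    def __init__(self) -> None:
        self.includes: List[str] = []

    def on_include(self, state: Any, filename: str) -> None:
        # The parser would call this once per #include directive.
        self.includes.append(filename)


# Stand-in for the parser: invoke the callback by hand (assumption: the
# real parser passes a state object, which this sketch does not inspect).
collector = IncludeCollector()
for name in ("<vector>", '"mylib.h"'):
    collector.on_include(None, name)

print(collector.includes)
```

After parser.parse() returns, the collected data would be read from the visitor in the same way, here via collector.includes.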

API

class cxxheaderparser.parser.CxxParser(filename, content, visitor, options=None, encoding=None)

Single-use parser object

Parameters:
  • filename (str) –

  • content (Optional[str]) –

  • visitor (CxxVisitor) –

  • options (Optional[ParserOptions]) –

  • encoding (Optional[str]) –

parse()

Parse the header contents

Return type:

None

class cxxheaderparser.visitor.CxxVisitor(*args, **kwargs)

Defines the interface used by the parser to emit events

on_class_end(state)

Called when the end of a class/struct/union is encountered.

When a variable like this is declared:

struct X {

} x;

Then the callbacks from on_class_start through on_class_end are emitted, along with on_variable for each instance declared.

Parameters:

state (ClassBlockState) –

Return type:

None

on_class_field(state, f)

Called when a field of a class is encountered

Parameters:
Return type:

None

on_class_friend(state, friend)

Called when a friend declaration is encountered

Parameters:
Return type:

None

on_class_method(state, method)

Called when a method is encountered inside a class declaration

Parameters:
Return type:

None

on_class_start(state)

Called when a class/struct/union is encountered

When part of a typedef:

typedef struct { } X;

This is called first, followed by on_typedef for each typedef instance encountered. The compound type object is passed as the type to the typedef.

If this function returns False, the visitor will not be called for any items inside this class (including on_class_end)

Parameters:

state (ClassBlockState) –

Return type:

Optional[bool]
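The return-value convention can be sketched as follows. SkipTypedefClasses is a hypothetical visitor fragment that uses only the typedef attribute documented for ClassBlockState. The state objects are SimpleNamespace stand-ins so the example runs on its own; in real use the parser supplies them.

```python
from types import SimpleNamespace
from typing import Any, Optional


class SkipTypedefClasses:
    """Hypothetical visitor fragment: ignores classes parsed as a typedef."""

    def __init__(self) -> None:
        self.skipped = 0

    def on_class_start(self, state: Any) -> Optional[bool]:
        # Returning False asks the parser not to call the visitor for
        # anything inside this class, including on_class_end.
        if state.typedef:
            self.skipped += 1
            return False
        # Returning None (or True) keeps visiting the class contents.
        return None


visitor = SkipTypedefClasses()
# Stand-ins for ClassBlockState (assumption: only .typedef is needed here).
skip = visitor.on_class_start(SimpleNamespace(typedef=True))
keep = visitor.on_class_start(SimpleNamespace(typedef=False))
print(skip, keep, visitor.skipped)
```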

on_concept(state, concept)

Called when a concept is encountered. For example:

template <class T>
concept Meowable = is_meowable<T>;

Parameters:
Return type:

None

on_deduction_guide(state, guide)

Called when a deduction guide is encountered

Parameters:
Return type:

None

on_enum(state, enum)

Called after an enum is encountered

Parameters:
Return type:

None

on_extern_block_end(state)

Called when an extern block ends

Parameters:

state (ExternBlockState) –

Return type:

None

on_extern_block_start(state)

Called when an extern block is encountered. For example:

extern "C" {

}

If this function returns False, the visitor will not be called for any items inside this block (including on_extern_block_end)

Parameters:

state (ExternBlockState) –

Return type:

Optional[bool]
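A visitor could use this hook to ignore a whole extern block based on its linkage. The sketch below is an assumption-laden illustration: SkipExternC is a hypothetical name, and because it is not stated here whether linkage is stored with or without its surrounding quotes, the code strips quotes before comparing. The states are SimpleNamespace stand-ins for ExternBlockState.

```python
from types import SimpleNamespace
from typing import Any, Optional


class SkipExternC:
    """Hypothetical visitor fragment: skips the contents of extern "C" blocks."""

    def on_extern_block_start(self, state: Any) -> Optional[bool]:
        # Normalize the linkage so both C and "C" spellings compare equal.
        if state.linkage.strip('"') == "C":
            # False: no callbacks for items inside this block,
            # including on_extern_block_end.
            return False
        return None


visitor = SkipExternC()
results = [
    visitor.on_extern_block_start(SimpleNamespace(linkage=value))
    for value in ('"C"', "C", "C++")
]
print(results)
```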

on_forward_decl(state, fdecl)

Called when a forward declaration is encountered

Parameters:
Return type:

None

on_function(state, fn)

Called when a function is encountered that isn’t part of a class

Parameters:
Return type:

None

on_include(state, filename)

Called once for each #include directive encountered

Parameters:
Return type:

None

on_method_impl(state, method)

Called when a method implementation is encountered outside of a class declaration. For example:

void MyClass::fn() {
    // does something
}

Note

The above implementation is ambiguous, as it technically could be a function in a namespace. The parser emits on_method_impl because that is the more likely interpretation in common code.

Parameters:
Return type:

None

on_namespace_alias(state, alias)

Called when a namespace alias is encountered

Parameters:
Return type:

None

on_namespace_end(state)

Called at the end of a namespace block

Parameters:

state (NamespaceBlockState) –

Return type:

None

on_namespace_start(state)

Called when a namespace directive is encountered

If this function returns False, the visitor will not be called for any items inside this namespace (including on_namespace_end)

Parameters:

state (NamespaceBlockState) –

Return type:

Optional[bool]

on_parse_start(state)

Called when parsing begins

Parameters:

state (NamespaceBlockState) –

Return type:

None

on_pragma(state, content)

Called once for each #pragma directive encountered

Parameters:
Return type:

None

on_template_inst(state, inst)

Called when an explicit template instantiation is encountered

Parameters:
Return type:

None

on_typedef(state, typedef)

Called for each typedef instance encountered. For example:

typedef int T, *PT;

Will result in on_typedef being called twice, once for T and once for PT (a pointer to int)

Parameters:
Return type:

None

on_using_alias(state, using)

Called when a using alias is encountered. For example:

using foo = int;

template <typename T>
using VectorT = std::vector<T>;

Parameters:
Return type:

None

on_using_declaration(state, using)

Called when a using declaration is encountered. For example:

using NS::ClassName;

Parameters:
Return type:

None

on_using_namespace(state, namespace)

Called when a using namespace directive is encountered. For example:

using namespace std;

Parameters:
Return type:

None

on_variable(state, v)

Called when a global variable is encountered

Parameters:
Return type:

None

class cxxheaderparser.visitor.NullVisitor

This visitor does nothing

on_class_end(state)
Parameters:

state (ClassBlockState) –

Return type:

None

on_class_field(state, f)
Parameters:
Return type:

None

on_class_friend(state, friend)
Parameters:
Return type:

None

on_class_method(state, method)
Parameters:
Return type:

None

on_class_start(state)
Parameters:

state (ClassBlockState) –

Return type:

Optional[bool]

on_concept(state, concept)
Parameters:
Return type:

None

on_deduction_guide(state, guide)
Parameters:
Return type:

None

on_enum(state, enum)
Parameters:
Return type:

None

on_extern_block_end(state)
Parameters:

state (ExternBlockState) –

Return type:

None

on_extern_block_start(state)
Parameters:

state (ExternBlockState) –

Return type:

Optional[bool]

on_forward_decl(state, fdecl)
Parameters:
Return type:

None

on_function(state, fn)
Parameters:
Return type:

None

on_include(state, filename)
Parameters:
Return type:

None

on_method_impl(state, method)
Parameters:
Return type:

None

on_namespace_alias(state, alias)
Parameters:
Return type:

None

on_namespace_end(state)
Parameters:

state (NamespaceBlockState) –

Return type:

None

on_namespace_start(state)
Parameters:

state (NamespaceBlockState) –

Return type:

Optional[bool]

on_parse_start(state)
Parameters:

state (NamespaceBlockState) –

Return type:

None

on_pragma(state, content)
Parameters:
Return type:

None

on_template_inst(state, inst)
Parameters:
Return type:

None

on_typedef(state, typedef)
Parameters:
Return type:

None

on_using_alias(state, using)
Parameters:
Return type:

None

on_using_declaration(state, using)
Parameters:
Return type:

None

on_using_namespace(state, namespace)
Parameters:
Return type:

None

on_variable(state, v)
Parameters:
Return type:

None

Parser state

class cxxheaderparser.parserstate.BaseState(parent, location)
Parameters:
location: Location

Approximate location that the parsed element was found at

parent: Union[NamespaceBlockState[TypeVar(T), TypeVar(PT)], ExternBlockState[TypeVar(T), TypeVar(PT)], ClassBlockState[TypeVar(T), TypeVar(PT)], None]

parent state

user_data: TypeVar(T)

Uninitialized user data available for use by visitor implementations. You should set this in a *_start method.
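One way to use user_data is to carry per-block bookkeeping down the state tree. The sketch below stores a nesting depth, set in *_start callbacks and derived from the parent state. DepthTracker is a hypothetical name, and the states are SimpleNamespace stand-ins with only the parent and user_data attributes described above. It assumes every ancestor state already had user_data set by a *_start callback.

```python
from types import SimpleNamespace
from typing import Any, Optional


class DepthTracker:
    """Hypothetical visitor fragment: stores nesting depth in user_data."""

    def on_parse_start(self, state: Any) -> None:
        # The root state has no parent, so its depth is 0.
        state.user_data = 0

    def on_namespace_start(self, state: Any) -> Optional[bool]:
        # user_data starts out unset; derive it from the parent's value.
        state.user_data = state.parent.user_data + 1
        return None


tracker = DepthTracker()
# Stand-in states: a root block containing two nested namespaces.
root = SimpleNamespace(parent=None, user_data=None)
outer = SimpleNamespace(parent=root, user_data=None)
inner = SimpleNamespace(parent=outer, user_data=None)

tracker.on_parse_start(root)
tracker.on_namespace_start(outer)
tracker.on_namespace_start(inner)

depths = [root.user_data, outer.user_data, inner.user_data]
print(depths)
```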

class cxxheaderparser.parserstate.ClassBlockState(parent, location, class_decl, access, typedef, mods)
Parameters:
access: str

Current access level for items encountered

class_decl: ClassDecl

class decl block being processed

mods: ParsedTypeModifiers

modifiers to apply to following variables

parent: Union[NamespaceBlockState[TypeVar(T), TypeVar(PT)], ExternBlockState[TypeVar(T), TypeVar(PT)], ClassBlockState[TypeVar(T), TypeVar(PT)]]

parent state

typedef: bool

Currently parsing as a typedef

class cxxheaderparser.parserstate.ExternBlockState(parent, location, linkage)
Parameters:
linkage: str

The linkage for this extern block

parent: Union[ExternBlockState[TypeVar(T), TypeVar(PT)], NamespaceBlockState[TypeVar(T), TypeVar(PT)]]

parent state

class cxxheaderparser.parserstate.NamespaceBlockState(parent, location, namespace)
Parameters:
namespace: NamespaceDecl

The incremental namespace for this block

parent: Union[ExternBlockState[TypeVar(T), TypeVar(PT)], NamespaceBlockState[TypeVar(T), TypeVar(PT)]]

parent state

class cxxheaderparser.parserstate.PT

type of custom user data for a parent state

alias of TypeVar('PT')

class cxxheaderparser.parserstate.ParsedTypeModifiers(vars, both, meths)
both: Dict[str, LexToken]

Alias for field number 1

meths: Dict[str, LexToken]

Alias for field number 2

validate(*, var_ok, meth_ok, msg)
Parameters:
  • var_ok (bool) –

  • meth_ok (bool) –

  • msg (str) –

Return type:

None

vars: Dict[str, LexToken]

Alias for field number 0

class cxxheaderparser.parserstate.T

custom user data for this state type

alias of TypeVar('T')

Preprocessor

Contains optional preprocessor support functions

exception cxxheaderparser.preprocessor.PreprocessorError
cxxheaderparser.preprocessor.make_gcc_preprocessor(*, defines=[], include_paths=[], retain_all_content=False, encoding=None, gcc_args=['g++'], print_cmd=True)

Creates a preprocessor function that uses g++ to preprocess the input text.

gcc is a high-performance and accurate preprocessor, but it will raise an error if an #include directive can’t be resolved or your input contains some other oddity.

Parameters:
  • defines (List[str]) – list of #define macros specified as “key value”

  • include_paths (List[str]) – list of directories to search for included files

  • retain_all_content (bool) – If False, only the parsed file content will be retained

  • encoding (Optional[str]) – If specified any include files are opened with this encoding

  • gcc_args (List[str]) – This is the path to G++ and any extra args you might want

  • print_cmd (bool) – Prints the gcc command as it’s executed

Return type:

Callable[[str, Optional[str]], str]

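from cxxheaderparser.options import ParserOptions
from cxxheaderparser.preprocessor import make_gcc_preprocessor
from cxxheaderparser.simple import parse_file

pp = make_gcc_preprocessor()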
options = ParserOptions(preprocessor=pp)

parse_file(content, options=options)
cxxheaderparser.preprocessor.make_msvc_preprocessor(*, defines=[], include_paths=[], retain_all_content=False, encoding=None, msvc_args=['cl.exe'], print_cmd=True)

Creates a preprocessor function that uses cl.exe from Microsoft Visual Studio to preprocess the input text. cl.exe is not typically on the path, so you may need to open the correct developer tools shell or pass in the correct path to cl.exe in the msvc_args parameter.

cl.exe will throw an error if a file referenced by an #include directive is not found.

Parameters:
  • defines (List[str]) – list of #define macros specified as “key value”

  • include_paths (List[str]) – list of directories to search for included files

  • retain_all_content (bool) – If False, only the parsed file content will be retained

  • encoding (Optional[str]) – If specified any include files are opened with this encoding

  • msvc_args (List[str]) – This is the path to cl.exe and any extra args you might want

  • print_cmd (bool) – Prints the command as it’s executed

Return type:

Callable[[str, Optional[str]], str]

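from cxxheaderparser.options import ParserOptions
from cxxheaderparser.preprocessor import make_msvc_preprocessor
from cxxheaderparser.simple import parse_file

pp = make_msvc_preprocessor()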
options = ParserOptions(preprocessor=pp)

parse_file(content, options=options)
cxxheaderparser.preprocessor.make_pcpp_preprocessor(*, defines=[], include_paths=[], retain_all_content=False, encoding=None, passthru_includes=None)

Creates a preprocessor function that uses pcpp (which must be installed separately) to preprocess the input text.

If missing #include files are encountered, this preprocessor will ignore the error. This preprocessor is pure python so it’s very portable, and is a good choice if performance isn’t critical.

Parameters:
  • defines (List[str]) – list of #define macros specified as “key value”

  • include_paths (List[str]) – list of directories to search for included files

  • retain_all_content (bool) – If False, only the parsed file content will be retained

  • encoding (Optional[str]) – If specified any include files are opened with this encoding

  • passthru_includes (Optional[Pattern]) – If specified any #include directives that match the compiled regex pattern will be part of the output.

Return type:

Callable[[str, Optional[str]], str]

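from cxxheaderparser.options import ParserOptions
from cxxheaderparser.preprocessor import make_pcpp_preprocessor
from cxxheaderparser.simple import parse_file

pp = make_pcpp_preprocessor()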
options = ParserOptions(preprocessor=pp)

parse_file(content, options=options)
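
All three factories return a callable of type Callable[[str, Optional[str]], str], so a hand-written function of the same shape could presumably be passed as ParserOptions(preprocessor=...) too. The sketch below is only an illustration under assumptions: the argument order (filename, then optional content) and the meaning of content=None are guesses from the type alone, and strip_export_macro and MYLIB_EXPORT are made-up names.

```python
from typing import Optional


def strip_export_macro(filename: str, content: Optional[str]) -> str:
    """Hypothetical preprocessor: removes a made-up MYLIB_EXPORT macro."""
    if content is None:
        # Assumption: no content means the file should be read from disk.
        with open(filename, encoding="utf-8") as fp:
            content = fp.read()
    # Return the text that the parser should see.
    return content.replace("MYLIB_EXPORT ", "")


cleaned = strip_export_macro("demo.h", "MYLIB_EXPORT int add(int a, int b);")
print(cleaned)
```

A function like this performs no macro expansion or #include handling; it only shows the expected call shape.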