Scoping

RWYQ阿伟 | 2025-02-05 | 笔记 | 0条留言 | 10 | 2025-02-05更新

Scoping

Namespaces

With few exceptions, place code in a namespace. Namespaces should have unique names based on the project name, and possibly its path. Do not use using-directives (e.g., using namespace foo). Do not use inline namespaces. For unnamed namespaces, see Internal Linkage.


Namespaces subdivide the global scope into distinct, named scopes, and so are useful for preventing name collisions in the global scope.


Namespaces provide a method for preventing name conflicts in large programs while allowing most code to use reasonably short names.

For example, if two different projects have a class Foo in the global scope, these symbols may collide at compile time or at runtime. If each project places their code in a namespace, project1::Foo and project2::Foo are now distinct symbols that do not collide, and code within each project's namespace can continue to refer to Foo without the prefix.

Inline namespaces automatically place their names in the enclosing scope. Consider the following snippet, for example:

namespace outer {
inline namespace inner {
  void foo();
}  // namespace inner
}  // namespace outer

The expressions outer::inner::foo() and outer::foo() are interchangeable. Inline namespaces are primarily intended for ABI compatibility across versions.


Namespaces can be confusing, because they complicate the mechanics of figuring out what definition a name refers to.

Inline namespaces, in particular, can be confusing because names aren't actually restricted to the namespace where they are declared. They are only useful as part of some larger versioning policy.

In some contexts, it's necessary to repeatedly refer to symbols by their fully-qualified names. For deeply-nested namespaces, this can add a lot of clutter.


Namespaces should be used as follows:

  • Follow the rules on Namespace Names.

  • Terminate multi-line namespaces with comments as shown in the given examples.

  • Namespaces wrap the entire source file after includes, gflags definitions/declarations and forward declarations of classes from other namespaces.

    // In the .h file
    namespace mynamespace {
    
    // All declarations are within the namespace scope.
    // Notice the lack of indentation.
    class MyClass {
     public:
      ...
      void Foo();
    };
    
    }  // namespace mynamespace
    // In the .cc file
    namespace mynamespace {
    
    // Definition of functions is within scope of the namespace.
    void MyClass::Foo() {
      ...
    }
    
    }  // namespace mynamespace

    More complex .cc files might have additional details, like flags or using-declarations.

    #include "a.h"
    
    ABSL_FLAG(bool, someflag, false, "a flag");
    
    namespace mynamespace {
    
    using ::foo::Bar;
    
    ...code for mynamespace...    // Code goes against the left margin.
    
    }  // namespace mynamespace
  • To place generated protocol message code in a namespace, use the package specifier in the .proto file. See Protocol Buffer Packages for details.

  • Do not declare anything in namespace std, including forward declarations of standard library classes. Declaring entities in namespace std is undefined behavior, i.e., not portable. To declare entities from the standard library, include the appropriate header file.

  • You may not use a using-directive to make all names from a namespace available.

    // Forbidden -- This pollutes the namespace.
    using namespace foo;
  • Do not use Namespace aliases at namespace scope in header files except in explicitly marked internal-only namespaces, because anything imported into a namespace in a header file becomes part of the public API exported by that file.

    // Shorten access to some commonly used names in .cc files.
    namespace baz = ::foo::bar::baz;
    // Shorten access to some commonly used names (in a .h file).
    namespace librarian {
    namespace internal {  // Internal, not part of the API.
    namespace sidetable = ::pipeline_diagnostics::sidetable;
    }  // namespace internal
    
    inline void my_inline_function() {
      // namespace alias local to a function (or method).
      namespace baz = ::foo::bar::baz;
      ...
    }
    }  // namespace librarian
  • Do not use inline namespaces.

  • Use namespaces with "internal" in the name to document parts of an API that should not be mentioned by users of the API.

    // We shouldn't use this internal name in non-absl code.
    using ::absl::container_internal::ImplementationDetail;
  • Single-line nested namespace declarations are preferred in new code, but are not required.

Internal Linkage

When definitions in a .cc file do not need to be referenced outside that file, give them internal linkage by placing them in an unnamed namespace or declaring them static. Do not use either of these constructs in .h files.


All declarations can be given internal linkage by placing them in unnamed namespaces. Functions and variables can also be given internal linkage by declaring them static. This means that anything you're declaring can't be accessed from another file. If a different file declares something with the same name, then the two entities are completely independent.


Use of internal linkage in .cc files is encouraged for all code that does not need to be referenced elsewhere. Do not use internal linkage in .h files.

Format unnamed namespaces like named namespaces. In the terminating comment, leave the namespace name empty:

namespace {
...
}  // namespace

Nonmember, Static Member, and Global Functions

Prefer placing nonmember functions in a namespace; use completely global functions rarely. Do not use a class simply to group static members. Static methods of a class should generally be closely related to instances of the class or the class's static data.


Nonmember and static member functions can be useful in some situations. Putting nonmember functions in a namespace avoids polluting the global namespace.


Nonmember and static member functions may make more sense as members of a new class, especially if they access external resources or have significant dependencies.


Sometimes it is useful to define a function not bound to a class instance. Such a function can be either a static member or a nonmember function. Nonmember functions should not depend on external variables, and should nearly always exist in a namespace. Do not create classes only to group static members; this is no different than just giving the names a common prefix, and such grouping is usually unnecessary anyway.

If you define a nonmember function and it is only needed in its .cc file, use internal linkage to limit its scope.

Local Variables

Place a function's variables in the narrowest scope possible, and initialize variables in the declaration.

C++ allows you to declare variables anywhere in a function. We encourage you to declare them in a scope as local as possible, and as close to the first use as possible. This makes it easier for the reader to find the declaration and see what type the variable is and what it was initialized to. In particular, initialization should be used instead of declaration and assignment, e.g.,:

int i;
i = f();      // Bad -- initialization separate from declaration.
int i = f();  // Good -- declaration has initialization.
int jobs = NumJobs();
// More code...
f(jobs);      // Bad -- declaration separate from use.
int jobs = NumJobs();
f(jobs);      // Good -- declaration immediately (or closely) followed by use.
std::vector<int> v;
v.push_back(1);  // Prefer initializing using brace initialization.
v.push_back(2);
std::vector<int> v = {1, 2};  // Good -- v starts initialized.

Variables needed for ifwhile and for statements should normally be declared within those statements, so that such variables are confined to those scopes. E.g.:

while (const char* p = strchr(str, '/')) str = p + 1;

There is one caveat: if the variable is an object, its constructor is invoked every time it enters scope and is created, and its destructor is invoked every time it goes out of scope.

// Inefficient implementation:
for (int i = 0; i < 1000000; ++i) {
  Foo f;  // My ctor and dtor get called 1000000 times each.
  f.DoSomething(i);
}

It may be more efficient to declare such a variable used in a loop outside that loop:

Foo f;  // My ctor and dtor get called once each.
for (int i = 0; i < 1000000; ++i) {
  f.DoSomething(i);
}

Static and Global Variables

Objects with static storage duration are forbidden unless they are trivially destructible. Informally this means that the destructor does not do anything, even taking member and base destructors into account. More formally it means that the type has no user-defined or virtual destructor and that all bases and non-static members are trivially destructible. Static function-local variables may use dynamic initialization. Use of dynamic initialization for static class member variables or variables at namespace scope is discouraged, but allowed in limited circumstances; see below for details.

As a rule of thumb: a global variable satisfies these requirements if its declaration, considered in isolation, could be constexpr.


Every object has a storage duration, which correlates with its lifetime. Objects with static storage duration live from the point of their initialization until the end of the program. Such objects appear as variables at namespace scope ("global variables"), as static data members of classes, or as function-local variables that are declared with the static specifier. Function-local static variables are initialized when control first passes through their declaration; all other objects with static storage duration are initialized as part of program start-up. All objects with static storage duration are destroyed at program exit (which happens before unjoined threads are terminated).

Initialization may be dynamic, which means that something non-trivial happens during initialization. (For example, consider a constructor that allocates memory, or a variable that is initialized with the current process ID.) The other kind of initialization is static initialization. The two aren't quite opposites, though: static initialization always happens to objects with static storage duration (initializing the object either to a given constant or to a representation consisting of all bytes set to zero), whereas dynamic initialization happens after that, if required.


Global and static variables are very useful for a large number of applications: named constants, auxiliary data structures internal to some translation unit, command-line flags, logging, registration mechanisms, background infrastructure, etc.


Global and static variables that use dynamic initialization or have non-trivial destructors create complexity that can easily lead to hard-to-find bugs. Dynamic initialization is not ordered across translation units, and neither is destruction (except that destruction happens in reverse order of initialization). When one initialization refers to another variable with static storage duration, it is possible that this causes an object to be accessed before its lifetime has begun (or after its lifetime has ended). Moreover, when a program starts threads that are not joined at exit, those threads may attempt to access objects after their lifetime has ended if their destructor has already run.


Decision on destruction

When destructors are trivial, their execution is not subject to ordering at all (they are effectively not "run"); otherwise we are exposed to the risk of accessing objects after the end of their lifetime. Therefore, we only allow objects with static storage duration if they are trivially destructible. Fundamental types (like pointers and int) are trivially destructible, as are arrays of trivially destructible types. Note that variables marked with constexpr are trivially destructible.

const int kNum = 10;  // Allowed

struct X { int n; };
const X kX[] = {{1}, {2}, {3}};  // Allowed

void foo() {
  static const char* const kMessages[] = {"hello", "world"};  // Allowed
}

// Allowed: constexpr guarantees trivial destructor.
constexpr std::array<int, 3> kArray = {1, 2, 3};
// bad: non-trivial destructor
const std::string kFoo = "foo";

// Bad for the same reason, even though kBar is a reference (the
// rule also applies to lifetime-extended temporary objects).
const std::string& kBar = StrCat("a", "b", "c");

void bar() {
  // Bad: non-trivial destructor.
  static std::map<int, int> kData = {{1, 0}, {2, 0}, {3, 0}};
}

Note that references are not objects, and thus they are not subject to the constraints on destructibility. The constraint on dynamic initialization still applies, though. In particular, a function-local static reference of the form static T& t = *new T; is allowed.

Decision on initialization

Initialization is a more complex topic. This is because we must not only consider whether class constructors execute, but we must also consider the evaluation of the initializer:

int n = 5;    // Fine
int m = f();  // ? (Depends on f)
Foo x;        // ? (Depends on Foo::Foo)
Bar y = g();  // ? (Depends on g and on Bar::Bar)

All but the first statement expose us to indeterminate initialization ordering.

The concept we are looking for is called constant initialization in the formal language of the C++ standard. It means that the initializing expression is a constant expression, and if the object is initialized by a constructor call, then the constructor must be specified as constexpr, too:

struct Foo { constexpr Foo(int) {} };

int n = 5;  // Fine, 5 is a constant expression.
Foo x(2);   // Fine, 2 is a constant expression and the chosen constructor is constexpr.
Foo a[] = { Foo(1), Foo(2), Foo(3) };  // Fine

Constant initialization is always allowed. Constant initialization of static storage duration variables should be marked with constexpr or constinit. Any non-local static storage duration variable that is not so marked should be presumed to have dynamic initialization, and reviewed very carefully.

By contrast, the following initializations are problematic:

// Some declarations used below.
time_t time(time_t*);      // Not constexpr!
int f();                   // Not constexpr!
struct Bar { Bar() {} };

// Problematic initializations.
time_t m = time(nullptr);  // Initializing expression not a constant expression.
Foo y(f());                // Ditto
Bar b;                     // Chosen constructor Bar::Bar() not constexpr.

Dynamic initialization of nonlocal variables is discouraged, and in general it is forbidden. However, we do permit it if no aspect of the program depends on the sequencing of this initialization with respect to all other initializations. Under those restrictions, the ordering of the initialization does not make an observable difference. For example:

int p = getpid();  // Allowed, as long as no other static variable
                   // uses p in its own initialization.

Dynamic initialization of static local variables is allowed (and common).

Common patterns

  • Global strings: if you require a named global or static string constant, consider using a constexpr variable of string_view, character array, or character pointer, pointing to a string literal. String literals have static storage duration already and are usually sufficient. See TotW #140.

  • Maps, sets, and other dynamic containers: if you require a static, fixed collection, such as a set to search against or a lookup table, you cannot use the dynamic containers from the standard library as a static variable, since they have non-trivial destructors. Instead, consider a simple array of trivial types, e.g., an array of arrays of ints (for a "map from int to int"), or an array of pairs (e.g., pairs of int and const char*). For small collections, linear search is entirely sufficient (and efficient, due to memory locality); consider using the facilities from absl/algorithm/container.h for the standard operations. If necessary, keep the collection in sorted order and use a binary search algorithm. If you do really prefer a dynamic container from the standard library, consider using a function-local static pointer, as described below .

  • Smart pointers (std::unique_ptrstd::shared_ptr): smart pointers execute cleanup during destruction and are therefore forbidden. Consider whether your use case fits into one of the other patterns described in this section. One simple solution is to use a plain pointer to a dynamically allocated object and never delete it (see last item).

  • Static variables of custom types: if you require static, constant data of a type that you need to define yourself, give the type a trivial destructor and a constexpr constructor.

  • If all else fails, you can create an object dynamically and never delete it by using a function-local static pointer or reference (e.g., static const auto& impl = *new T(args...);).

thread_local Variables

thread_local variables that aren't declared inside a function must be initialized with a true compile-time constant, and this must be enforced by using the constinit attribute. Prefer thread_local over other ways of defining thread-local data.


Variables can be declared with the thread_local specifier:

thread_local Foo foo = ...;

Such a variable is actually a collection of objects, so that when different threads access it, they are actually accessing different objects. thread_local variables are much like static storage duration variables in many respects. For instance, they can be declared at namespace scope, inside functions, or as static class members, but not as ordinary class members.

thread_local variable instances are initialized much like static variables, except that they must be initialized separately for each thread, rather than once at program startup. This means that thread_local variables declared within a function are safe, but other thread_local variables are subject to the same initialization-order issues as static variables (and more besides).

thread_local variables have a subtle destruction-order issue: during thread shutdown, thread_local variables will be destroyed in the opposite order of their initialization (as is generally true in C++). If code triggered by the destructor of any thread_local variable refers to any already-destroyed thread_local on that thread, we will get a particularly hard to diagnose use-after-free.


  • Thread-local data is inherently safe from races (because only one thread can ordinarily access it), which makes thread_local useful for concurrent programming.

  • thread_local is the only standard-supported way of creating thread-local data.


  • Accessing a thread_local variable may trigger execution of an unpredictable and uncontrollable amount of other code during thread-start or first use on a given thread.

  • thread_local variables are effectively global variables, and have all the drawbacks of global variables other than lack of thread-safety.

  • The memory consumed by a thread_local variable scales with the number of running threads (in the worst case), which can be quite large in a program.

  • Data members cannot be thread_local unless they are also static.

  • We may suffer from use-after-free bugs if thread_local variables have complex destructors. In particular, the destructor of any such variable must not call any code (transitively) that refers to any potentially-destroyed thread_local. This property is hard to enforce.

  • Approaches for avoiding use-after-free in global/static contexts do not work for thread_locals. Specifically, skipping destructors for globals and static variables is allowable because their lifetimes end at program shutdown. Thus, any "leak" is managed immediately by the OS cleaning up our memory and other resources. By contrast, skipping destructors for thread_local variables leads to resource leaks proportional to the total number of threads that terminate during the lifetime of the program.


thread_local variables at class or namespace scope must be initialized with a true compile-time constant (i.e., they must have no dynamic initialization). To enforce this, thread_local variables at class or namespace scope must be annotated with constinit (or constexpr, but that should be rare):

   constinit thread_local Foo foo = ...;

thread_local variables inside a function have no initialization concerns, but still risk use-after-free during thread exit. Note that you can use a function-scope thread_local to simulate a class- or namespace-scope thread_local by defining a function or static method that exposes it:

Foo& MyThreadLocalFoo() {
  thread_local Foo result = ComplicatedInitialization();
  return result;
}

Note that thread_local variables will be destroyed whenever a thread exits. If the destructor of any such variable refers to any other (potentially-destroyed) thread_local we will suffer from hard to diagnose use-after-free bugs. Prefer trivial types, or types that provably run no user-provided code at destruction to minimize the potential of accessing any other thread_local.

thread_local should be preferred over other mechanisms for defining thread-local data.


博客内容遵循 署名-非商业性使用-相同方式共享4.0国际(CC BY-NC-SA 4.0)协议。

本文链接:https://rwyqboy.top/post/1893.html

版权声明:本文由阿伟的笔记本发布,如需转载请注明出处。

...

上一篇
Header Files
下一篇
Classes
发表评论:

◎欢迎参与讨论,请在这里发表您的看法、交流您的观点。