Socket
Socket
Sign inDemoInstall

github.com/golang/tools/go/pointer

Package Overview
Dependencies
3
Alerts
File Explorer

Install Socket

Detect and block malicious and high-risk dependencies

Install

    github.com/golang/tools/go/pointer

Package pointer implements Andersen's analysis, an inclusion-based pointer analysis algorithm first described in (Andersen, 1994). A pointer analysis relates every pointer expression in a whole program to the set of memory locations to which it might point. This information can be used to construct a call graph of the program that precisely represents the destinations of dynamic function and method calls. It can also be used to determine, for example, which pairs of channel operations operate on the same channel. The package allows the client to request a set of expressions of interest for which the points-to information will be returned once the analysis is complete. In addition, the client may request that a callgraph is constructed. The example program in example_test.go demonstrates both of these features. Clients should not request more information than they need since it may increase the cost of the analysis significantly. Our algorithm is INCLUSION-BASED: the points-to sets for x and y will be related by pts(y) ⊇ pts(x) if the program contains the statement y = x. It is FLOW-INSENSITIVE: it ignores all control flow constructs and the order of statements in a program. It is therefore a "MAY ALIAS" analysis: its facts are of the form "P may/may not point to L", not "P must point to L". It is FIELD-SENSITIVE: it builds separate points-to sets for distinct fields, such as x and y in struct { x, y *int }. It is mostly CONTEXT-INSENSITIVE: most functions are analyzed once, so values can flow in at one call to the function and return out at another. Only some smaller functions are analyzed with consideration of their calling context. It has a CONTEXT-SENSITIVE HEAP: objects are named by both allocation site and context, so the objects returned by two distinct calls to f: are distinguished up to the limits of the calling context. It is a WHOLE PROGRAM analysis: it requires SSA-form IR for the complete Go program and summaries for native code. See the (Hind, PASTE'01) survey paper for an explanation of these terms. The analysis is fully sound when invoked on pure Go programs that do not use reflection or unsafe.Pointer conversions. In other words, if there is any possible execution of the program in which pointer P may point to object O, the analysis will report that fact. By default, the "reflect" library is ignored by the analysis, as if all its functions were no-ops, but if the client enables the Reflection flag, the analysis will make a reasonable attempt to model the effects of calls into this library. However, this comes at a significant performance cost, and not all features of that library are yet implemented. In addition, some simplifying approximations must be made to ensure that the analysis terminates; for example, reflection can be used to construct an infinite set of types and values of those types, but the analysis arbitrarily bounds the depth of such types. Most but not all reflection operations are supported. In particular, addressable reflect.Values are not yet implemented, so operations such as (reflect.Value).Set have no analytic effect. The pointer analysis makes no attempt to understand aliasing between the operand x and result y of an unsafe.Pointer conversion: It is as if the conversion allocated an entirely new object: The analysis cannot model the aliasing effects of functions written in languages other than Go, such as runtime intrinsics in C or assembly, or code accessed via cgo. The result is as if such functions are no-ops. However, various important intrinsics are understood by the analysis, along with built-ins such as append. The analysis currently provides no way for users to specify the aliasing effects of native code. ------------------------------------------------------------------------ The remaining documentation is intended for package maintainers and pointer analysis specialists. Maintainers should have a solid understanding of the referenced papers (especially those by H&L and PKH) before making making significant changes. The implementation is similar to that described in (Pearce et al, PASTE'04). Unlike many algorithms which interleave constraint generation and solving, constructing the callgraph as they go, this implementation for the most part observes a phase ordering (generation before solving), with only simple (copy) constraints being generated during solving. (The exception is reflection, which creates various constraints during solving as new types flow to reflect.Value operations.) This improves the traction of presolver optimisations, but imposes certain restrictions, e.g. potential context sensitivity is limited since all variants must be created a priori. A type is said to be "pointer-like" if it is a reference to an object. Pointer-like types include pointers and also interfaces, maps, channels, functions and slices. We occasionally use C's x->f notation to distinguish the case where x is a struct pointer from x.f where is a struct value. Pointer analysis literature (and our comments) often uses the notation dst=*src+offset to mean something different than what it means in Go. It means: for each node index p in pts(src), the node index p+offset is in pts(dst). Similarly *dst+offset=src is used for store constraints and dst=src+offset for offset-address constraints. Nodes are the key datastructure of the analysis, and have a dual role: they represent both constraint variables (equivalence classes of pointers) and members of points-to sets (things that can be pointed at, i.e. "labels"). Nodes are naturally numbered. The numbering enables compact representations of sets of nodes such as bitvectors (or BDDs); and the ordering enables a very cheap way to group related nodes together. For example, passing n parameters consists of generating n parallel constraints from caller+i to callee+i for 0<=i<n. The zero nodeid means "not a pointer". For simplicity, we generate flow constraints even for non-pointer types such as int. The pointer equivalence (PE) presolver optimization detects which variables cannot point to anything; this includes not only all variables of non-pointer types (such as int) but also variables of pointer-like types if they are always nil, or are parameters to a function that is never called. Each node represents a scalar part of a value or object. Aggregate types (structs, tuples, arrays) are recursively flattened out into a sequential list of scalar component types, and all the elements of an array are represented by a single node. (The flattening of a basic type is a list containing a single node.) Nodes are connected into a graph with various kinds of labelled edges: simple edges (or copy constraints) represent value flow. Complex edges (load, store, etc) trigger the creation of new simple edges during the solving phase. Conceptually, an "object" is a contiguous sequence of nodes denoting an addressable location: something that a pointer can point to. The first node of an object has a non-nil obj field containing information about the allocation: its size, context, and ssa.Value. Objects include: Many objects have no Go types. For example, the func, map and chan type kinds in Go are all varieties of pointers, but their respective objects are actual functions (executable code), maps (hash tables), and channels (synchronized queues). Given the way we model interfaces, they too are pointers to "tagged" objects with no Go type. And an *ssa.Global denotes the address of a global variable, but the object for a Global is the actual data. So, the types of an ssa.Value that creates an object is "off by one indirection": a pointer to the object. The individual nodes of an object are sometimes referred to as "labels". For uniformity, all objects have a non-zero number of fields, even those of the empty type struct{}. (All arrays are treated as if of length 1, so there are no empty arrays. The empty tuple is never address-taken, so is never an object.) An tagged object has the following layout: The T node's typ field is the dynamic type of the "payload": the value v which follows, flattened out. The T node's obj has the otTagged flag. Tagged objects are needed when generalizing across types: interfaces, reflect.Values, reflect.Types. Each of these three types is modelled as a pointer that exclusively points to tagged objects. Tagged objects may be indirect (obj.flags ⊇ {otIndirect}) meaning that the value v is not of type T but *T; this is used only for reflect.Values that represent lvalues. (These are not implemented yet.) Variables of the following "scalar" types may be represented by a single node: basic types, pointers, channels, maps, slices, 'func' pointers, interfaces. Pointers Basic types (bool, string, numbers, unsafe.Pointer) Channels Maps Slices Functions Interfaces reflect.Value reflect.Type Aggregate types: Aggregate types are treated as if all directly contained aggregates are recursively flattened out. Structs Arrays Tuples (T, ...) FUNCTION CALLS We implement Hash-Value Numbering (HVN), a pre-solver constraint optimization described in Hardekopf & Lin, SAS'07. This is documented in more detail in hvn.go. We intend to add its cousins HR and HU in future. The solver is currently a naive Andersen-style implementation; it does not perform online cycle detection, though we plan to add solver optimisations such as Hybrid- and Lazy- Cycle Detection from (Hardekopf & Lin, PLDI'07). It uses difference propagation (Pearce et al, SQC'04) to avoid redundant re-triggering of closure rules for values already seen. Points-to sets are represented using sparse bit vectors (similar to those used in LLVM and gcc), which are more space- and time-efficient than sets based on Go's built-in map type or dense bit vectors. Nodes are permuted prior to solving so that object nodes (which may appear in points-to sets) are lower numbered than non-object (var) nodes. This improves the density of the set over which the PTSs range, and thus the efficiency of the representation. Partly thanks to avoiding map iteration, the execution of the solver is 100% deterministic, a great help during debugging. Andersen, L. O. 1994. Program analysis and specialization for the C programming language. Ph.D. dissertation. DIKU, University of Copenhagen. David J. Pearce, Paul H. J. Kelly, and Chris Hankin. 2004. Efficient field-sensitive pointer analysis for C. In Proceedings of the 5th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering (PASTE '04). ACM, New York, NY, USA, 37-42. http://doi.acm.org/10.1145/996821.996835 David J. Pearce, Paul H. J. Kelly, and Chris Hankin. 2004. Online Cycle Detection and Difference Propagation: Applications to Pointer Analysis. Software Quality Control 12, 4 (December 2004), 311-337. http://dx.doi.org/10.1023/B:SQJO.0000039791.93071.a2 David Grove and Craig Chambers. 2001. A framework for call graph construction algorithms. ACM Trans. Program. Lang. Syst. 23, 6 (November 2001), 685-746. http://doi.acm.org/10.1145/506315.506316 Ben Hardekopf and Calvin Lin. 2007. The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code. In Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation (PLDI '07). ACM, New York, NY, USA, 290-299. http://doi.acm.org/10.1145/1250734.1250767 Ben Hardekopf and Calvin Lin. 2007. Exploiting pointer and location equivalence to optimize pointer analysis. In Proceedings of the 14th international conference on Static Analysis (SAS'07), Hanne Riis Nielson and Gilberto Filé (Eds.). Springer-Verlag, Berlin, Heidelberg, 265-280. Atanas Rountev and Satish Chandra. 2000. Off-line variable substitution for scaling points-to analysis. In Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation (PLDI '00). ACM, New York, NY, USA, 47-56. DOI=10.1145/349299.349310 http://doi.acm.org/10.1145/349299.349310 This program demonstrates how to use the pointer analysis to obtain a conservative call-graph of a Go program. It also shows how to compute the points-to set of a variable, in this case, (C).f's ch parameter.


Version published

FAQs

Last updated on 01 Jun 2023

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc