Heap Abstraction

Author

Bor-Yuh Evan Chang

Published

Thursday, November 20, 2025

\(\newcommand{\TirName}[1]{\text{#1}} \newcommand{\inferlab}[3]{ \let\and\qquad \begin{array}{l} \TirName{#1} \\ \displaystyle \frac{#2}{#3} \end{array} } \newcommand{\inferright}[3]{ \let\and\qquad \displaystyle \frac{#2}{#3}\TirName{#1} } \newcommand{\inferrule}[3][]{\inferlab{#1}{#2}{#3}} \newcommand{\infer}[3][]{\inferrule[#1]{#2}{#3}} \)

What questions do you have? What questions does your neighbor have?

Announcements

  • Project presentation (Mon Dec 8 1:30-4 p.m.)

  • Project paper (Fri Dec 12 AoE)

  • Extra support

    • Extra office hours (before and after class today 8:30am-1:00pm)
    • Happy to review your “proposals” over break

See advice on final projects.

Questions?

Exercise: Chaotic Iteration

\[ \begin{array}{rrll} \hat{\Sigma}( \ell_0 ) & = & \hat{\rho}_0 \\ \hat{\Sigma}( \ell_i ) & = & \bigsqcup_{ \ell_j \mathrel{\mathord{-}\mkern-4mu[c]\mkern-4mu\mathord{\rightarrow}} \ell_i \in g} \{ \hat{\rho}' \mid \hat{\Sigma}(\ell_j) \vdash c \Downarrow\hat{\rho}' \} & \text{for $i \neq 0$} \end{array} \]

The chaotic iteration algorithm computes the least fixed-point invariant \(\hat{\Sigma}_\omega\) by iterating at each location \(\ell_i\) until convergence.

r = 1;
while
  (n > 0) {
    r = r * x;
    n = n - 1;
  }
  

Fill in the following table where each \(\hat{\Sigma}_i\) is the \(i\)th iterate using the sign abstract domain. Cell \(\hat{\Sigma}_0(\ell_0)\) is \(\hat{\rho}_0\), which consider to be \(\top\).

\(\hat{\Sigma}_0\) \(\hat{\Sigma}_1\) \(\hat{\Sigma}_2\) \(\hat{\Sigma}_3\) \(\hat{\Sigma}_4\) \(\hat{\Sigma}_5\) \(\hat{\Sigma}_6\) \(\hat{\Sigma}_7\) \(\hat{\Sigma}_8\) \(\hat{\Sigma}_9\) \(\hat{\Sigma}_{10}\) \(\hat{\Sigma}_\omega\)
\(\ell_0\) \(\top\)
\(\ell_1\) \(\bot\)
\(\ell_2\) \(\bot\)
\(\ell_3\) \(\bot\)
\(\ell_4\) \(\bot\)
\(\ell_5\) \(\bot\)
\(\ell_6\) \(\bot\)

Chaotic Iteration

Let us use the convention that empty cells mean that the abstract store at location is unchanged from the previous iterate and \(\cdot\) means we have detected a fixed point at that location.

\(\hat{\Sigma}_0\) \(\hat{\Sigma}_1\) \(\hat{\Sigma}_2\) \(\hat{\Sigma}_3\) \(\hat{\Sigma}_4\) \(\hat{\Sigma}_5\) \(\hat{\Sigma}_6\) \(\hat{\Sigma}_7\) \(\hat{\Sigma}_8\) \(\hat{\Sigma}_9\) \(\hat{\Sigma}_{10}\) \(\hat{\Sigma}_{\omega}\)
\(\ell_0\) \(\top\)
\(\ell_1\) \(\bot\) \(\texttt{r} \mapsto\mathsf{+}\)
\(\ell_2\) \(\bot\) \(\texttt{r} \mapsto\mathsf{+}\) \(\top\) \(\cdot\)
\(\ell_3\) \(\bot\) \(\texttt{r} \mapsto\mathsf{+}\) \(\top\)
\(\ell_4\) \(\bot\) \(\top\) \(\cdot\)
\(\ell_5\) \(\bot\) \(\top\) \(\cdot\)
\(\ell_6\) \(\bot\) \(\top\)

Exercise: Reflection: Chaotic Iteration

What is the worst-case complexity of chaotic iteration?

For efficiency, chaotic iteration is typically done by following the reverse post-order of the weak topological order of the strongly-connected components. What does this mean?

How could we generalize chaotic interation to infinite height domains?

Reflection: Chaotic Iteration

What is the worst-case complexity of chaotic iteration?

For a lattice with height \(h\) and procedure with \(n\) program locations, chaotic iteration is \(O(h \cdot n)\) in the worst case.

For efficiency, chaotic iteration is typically done by following the reverse post-order of the weak topological order of the strongly-connected components. What does this mean?

Intuitively, find the fixed points of inner loops before outer loops.

How could we generalize chaotic interation to infinite height domains?

Intuitively, apply widening \(\mathbin{\nabla}\) at some program location within each loop.

Exercise: Widening

i = 0;
while
  (i < n) {
    i = i + 1;
  }
  

Fill in the following table where each \(\hat{\Sigma}_i\) is the \(i\)th iterate using the interval abstract domain with widening at the loop head \(\ell_2\). Cell \(\hat{\Sigma}_0(\ell_0)\) is \(\hat{\rho}_0\), which consider to be \(\top\).

\(\hat{\Sigma}_0\) \(\hat{\Sigma}_1\) \(\hat{\Sigma}_2\) \(\hat{\Sigma}_3\) \(\hat{\Sigma}_4\) \(\hat{\Sigma}_5\) \(\hat{\Sigma}_6\) \(\hat{\Sigma}_7\) \(\hat{\Sigma}_8\) \(\hat{\Sigma}_9\) \(\hat{\Sigma}_{10}\) \(\hat{\Sigma}_\omega\)
\(\ell_0\) \(\top\)
\(\ell_1\) \(\bot\)
\(\ell_2\) \(\bot\)
\(\ell_3\) \(\bot\)
\(\ell_4\) \(\bot\)
\(\ell_5\) \(\bot\)

Widening

\(\hat{\Sigma}_0\) \(\hat{\Sigma}_1\) \(\hat{\Sigma}_2\) \(\hat{\Sigma}_3\) \(\hat{\Sigma}_4\) \(\hat{\Sigma}_5\) \(\hat{\Sigma}_6\) \(\hat{\Sigma}_7\) \(\hat{\Sigma}_8\) \(\hat{\Sigma}_\omega\)
\(\ell_0\) \(\top\)
\(\ell_1\) \(\bot\) \(\texttt{i} \mapsto[0, 0]\)
\(\ell_2\) \(\bot\) \(\texttt{i} \mapsto[0, 0]\) \(\texttt{i} \mapsto[0, \infty)\) \(\cdot\)
\(\ell_3\) \(\bot\) \(\texttt{i} \mapsto[0, 0]\) \(\texttt{i} \mapsto[0, \infty)\)
\(\ell_4\) \(\bot\) \(\texttt{i} \mapsto[1, 1]\) \(\texttt{i} \mapsto[1, \infty)\)
\(\ell_5\) \(\bot\) \(\texttt{i} \mapsto[0, \infty)\)

Summary: An Abstract Interpretation Template

Syntax

expressions \(\quad e\)

statements \(\quad s\)

Concrete Semantic Domains

values \(\quad v\)

states \(\quad \sigma\)

Concrete Semantics

\(\fbox{$\sigma \vdash e \Downarrow v$}\)

\(\fbox{$\sigma \vdash s \Downarrow\sigma'$}\)

Forward-Reachable Collecting Semantics

\(\begin{array}{rrrl} \text{value properties} & V& \mathrel{::=}& \circ\mid V,v \end{array}\)

\(\begin{array}{rrrl} \text{state properties} & \Sigma& \mathrel{::=}& \circ\mid\Sigma,\sigma \end{array}\)

\(\fbox{$\lbrace\!\lbrack\!\lbrack e \rbrack\!\rbrack\!\rbrace = \lambda \Sigma.V$}\)

\(\fbox{$\lbrace\!\lbrack\!\lbrack s \rbrack\!\rbrack\!\rbrace = \lambda \Sigma.\Sigma'$}\)

\(\lbrace\!\lbrack\!\lbrack e \rbrack\!\rbrack\!\rbrace\,\Sigma = \{ v \mid \text{$\sigma \vdash e \Downarrow v$ and $\sigma\in \Sigma$} \}\)

\(\lbrace\!\lbrack\!\lbrack s \rbrack\!\rbrack\!\rbrace\,\Sigma = \{ \sigma' \mid \text{$\sigma \vdash s \Downarrow\sigma'$ and $\sigma\in \Sigma$} \}\)

Abstract Domains

\(\begin{array}{rrrl} \text{abstract values} & \hat{v} \end{array}\)

\(\begin{array}{rrrl} \text{abstract states} & \hat{\sigma} \end{array}\)

\(\fbox{$\gamma(\hat{v}) = V$}\)

\(\fbox{$\gamma(\hat{\sigma}) = \Sigma$}\)

\(\fbox{$\lbrack\!\lbrack e \rbrack\!\rbrack^\sharp = \lambda \hat{\sigma}.\hat{v}$}\)

\(\fbox{$\lbrack\!\lbrack s \rbrack\!\rbrack^\sharp = \lambda \hat{\sigma}.\hat{\sigma}'$}\)

\(\gamma( \lbrack\!\lbrack e \rbrack\!\rbrack^\sharp\,\hat{\sigma} ) \supseteq \lbrace\!\lbrack\!\lbrack e \rbrack\!\rbrack\!\rbrace\, \gamma(\hat{\sigma}) \quad \)for all \(\hat{\sigma}\)

\(\gamma( \lbrack\!\lbrack s \rbrack\!\rbrack^\sharp\,\hat{\sigma} ) \supseteq \lbrace\!\lbrack\!\lbrack s \rbrack\!\rbrack\!\rbrace\, \gamma(\hat{\sigma}) \quad \)for all \(\hat{\sigma}\)

Dynamic Memory Allocation

JavaScripty Syntax

\(\begin{array}{rrrl} \text{expressions} & e& \mathrel{::=}& \cdots \mid e\texttt{.}f \end{array}\)

\(\begin{array}{rrrl} \text{statements} & s& \mathrel{::=}& \cdots \mid x \mathrel{\texttt{=}}\texttt{\{\}} \mid e_1\texttt{.}f \mathrel{\texttt{=}}e_2 \end{array}\)

\[ \text{fields} \quad f \]

Concrete Semantic Domains

\(\begin{array}{rrrl} \text{values} & v& \mathrel{::=}& \cdots \mid a \\ \text{addresses} & a \end{array}\)

\(\begin{array}{rrrl} \text{states} & \sigma& \mathrel{::=}& \rho \\ \text{stores} & \rho& \mathrel{::=}& \circ \mid\rho,l \mapsto v \\ \text{l-values} & l& \mathrel{::=}& x\mid a\mathord{.}f \end{array}\)

Exercise: Abstract Domains for Dynamic Memory Allocation

What are concrete semantic components are unbounded? Suggest some abstraction ideas.

Points-To Analysis

Allocation Sites

\(\begin{array}{rrrl} \text{statements} & s& \mathrel{::=}& \cdots \mid\ell\colon s \end{array}\)

locations \(\quad \ell\)

Statement \(\ell\colon s\) is a location-labeled statement. For example, \(\ell\colon x \mathrel{\texttt{=}}\texttt{\{\}}\) is an object allocation statement \(x \mathrel{\texttt{=}}\texttt{\{\}}\) at location \(\ell\).

Pointer Language: Syntax

\(\begin{array}{rrrl} \text{statements} & s& \mathrel{::=}& x \mathrel{\texttt{=}}y \mid\ell\colon x \mathrel{\texttt{=}}\texttt{\{\}} \mid x \mathrel{\texttt{=}}y\texttt{.}f \mid x\texttt{.}f \mathrel{\texttt{=}}y \mid s_1\;s_2 \end{array}\)

variables \(\quad x , y\)

\(\begin{array}{rrrl} \text{values} & v& \mathrel{::=}& a^{\ell} \\ \text{addresses} & a \\ \text{locations} & \ell \end{array}\)

\(\begin{array}{rrrl} \text{states} & \sigma& \mathrel{::=}& \rho \\ \text{stores} & \rho& \mathrel{::=}& \circ \mid\rho,l \mapsto v \\ \text{l-values} & l& \mathrel{::=}& x\mid a^{\ell} \mathord{.}f \end{array}\)

The composite statement \(s_1\;s_2\) executes an arbitrary interleaving of \(s_1\) and \(s_2\) an arbitrary number of times (i.e., the program is viewed as a bag of the atomic statements).

Variants of this language \(s\) are called Andersen-style statements. Note that there is no expression language, and statements are in an A-normal form.

Exercise: Pointer Language: Concrete Semantics

\[ \fbox{$\sigma \vdash s \Downarrow\sigma'$} \]

Define the judgment form \(\sigma \vdash s \Downarrow\sigma'\) to say, “In state \(\sigma\), pointer statement \(s\) can evaluate to \(\sigma'\).” Note that evaluation for pointer statements is non-deterministic.

Pointer Language: Concrete Semantics

\[ \fbox{$\sigma \vdash s \Downarrow\sigma'$} \]

\(\infer{ }{ \sigma \vdash x \mathrel{\texttt{=}}y \Downarrow \sigma(x \leftarrow \sigma(y) ) }\)

\(\infer{ a\notin \operatorname{dom}(\sigma) }{ \sigma \vdash \ell\colon x \mathrel{\texttt{=}}\texttt{\{\}} \Downarrow \sigma(x \leftarrow a^{\ell} ) }\)

\(\infer{ a^{\ell} = \sigma(y) }{ \sigma \vdash x \mathrel{\texttt{=}}y\texttt{.}f \Downarrow \sigma(x \leftarrow \sigma( a^{\ell}\mathord{.}f ) ) }\)

\(\infer{ a^{\ell} = \sigma(x) }{ \sigma \vdash x\texttt{.}f \mathrel{\texttt{=}}y \Downarrow \sigma( a^{\ell}\mathord{.}f \leftarrow \sigma(y) ) }\)

\(\infer{ \sigma \vdash s_1 \Downarrow\sigma' \and \sigma' \vdash s_1\;s_2 \Downarrow\sigma'' }{ \sigma \vdash s_1\;s_2 \Downarrow\sigma'' }\)

\(\infer{ \sigma \vdash s_2 \Downarrow\sigma' \and \sigma' \vdash s_1\;s_2 \Downarrow\sigma'' }{ \sigma \vdash s_1\;s_2 \Downarrow\sigma'' }\)

\(\infer{ }{ \sigma \vdash s_1\;s_2 \Downarrow\sigma }\)

Exercise: Allocation-Site Address Abstraction

Define an abstract domain \(\hat{v}\) and its concretization \(\gamma(\hat{v})\) for the allocation-site abstraction.

Allocation-Site Address Abstraction: Domain

\(\begin{array}{rrrl} \text{value properties} & V& \mathrel{::=}& A \\ \text{address properties} & A& \mathrel{::=}& \circ\mid A,a^{\ell} \end{array}\)

\(\begin{array}{rrrl} \text{abstract values} & \hat{v}& \mathrel{::=}& \circ\mid\hat{v},\hat{a} \\ \text{abstract addresses} & \hat{a}& \mathrel{::=}& \ell \end{array}\)

Abstract addresses \(\hat{a}\) are allocation sites, and abstract values \(\hat{v}\) are sets of abstract addresses. Sets of abstract addresses are also called points-to sets.

Allocation-Site Address Abstraction: Concretization

\(\fbox{$\gamma(\hat{v}) = V$}\)

\(\fbox{$\gamma(\hat{a}) = A$}\)

\(\begin{array}{rrl} \gamma(\circ) & \mathrel{:=}& \emptyset \\ \gamma( \hat{v},\hat{a} ) & \mathrel{:=}& \gamma(\hat{v}) \cup \gamma(\hat{a}) \end{array}\)

\(\begin{array}{rrl} \gamma(\ell) & \mathrel{:=}& \{ a^{\ell} \mid \text{true} \} \end{array}\)

Exercise: Store Abstraction

Define an abstract domain \(\hat{\rho}\) and its concretization \(\gamma(\hat{\rho})\) to lift the allocation-site abstraction to stores.

Field-Sensitive Store Abstraction

\[ \begin{array}{rrrl} \text{abstract states} & \hat{\sigma}& \mathrel{::=}& \hat{\rho} \\ \text{abstract stores} & \hat{\rho}& \mathrel{::=}& \circ \mid\hat{\rho},\hat{l} \mapsto\hat{v} \\ \text{abstract l-values} & \hat{l}& \mathrel{::=}& x\mid\hat{a}\mathord{.}f \end{array} \]

The field-sensitive store abstraction is a non-relational lifting of the allocation-site abstraction to stores.

Field-Sensitive Store Abstraction: Concretization

\(\fbox{$\gamma(\hat{\rho}) = P$}\)

\(\begin{array}{rrl} \gamma(\circ) & \mathrel{:=}& \emptyset \\ \gamma( \hat{\rho},x \mapsto\hat{v} ) & \mathrel{:=}& \{ \rho,x \mapsto v \mid v\in \gamma(\hat{v}) \} \\ \gamma( \hat{\rho},\hat{a}\mathord{.}f \mapsto\hat{v} ) & \mathrel{:=}& \{ \rho,a^{\ell}\mathord{.}f \mapsto v \mid \text{$a^{\ell} \in \gamma(\hat{a})$ and $v\in \gamma(\hat{v})$} \} \end{array}\)

Points-To Analysis as an Abstract Interpretation

Define points-to analysis as an abstract interpretation \(\lbrack\!\lbrack s \rbrack\!\rbrack^\sharp = \lambda \hat{\sigma}.\hat{\sigma}'\) on pointer statements \(s\).

Points-To Analysis as an Abstract Interpretation

\[ \fbox{$\lbrack\!\lbrack s \rbrack\!\rbrack^\sharp = \lambda \hat{\sigma}.\hat{\sigma}'$} \]

\[ \begin{array}{rrl} \lbrack\!\lbrack x \mathrel{\texttt{=}}y \rbrack\!\rbrack^\sharp\,\hat{\sigma} & \mathrel{:=}& \hat{\sigma}(x \leftarrow \hat{\sigma}(y) ) \\ \lbrack\!\lbrack \ell\colon x \mathrel{\texttt{=}}\texttt{\{\}} \rbrack\!\rbrack^\sharp\,\hat{\sigma} & \mathrel{:=}& \hat{\sigma}(x \leftarrow \ell) \\ \lbrack\!\lbrack x \mathrel{\texttt{=}}y\texttt{.}f \rbrack\!\rbrack^\sharp\,\hat{\sigma} & \mathrel{:=}& \hat{\sigma}(x \leftarrow \bigcup_{\hat{a}\in \hat{\sigma}(y)} \hat{\sigma}( \hat{a}\mathord{.}f ) ) \\ \lbrack\!\lbrack x\texttt{.}f \mathrel{\texttt{=}}y \rbrack\!\rbrack^\sharp\,\hat{\sigma} & \mathrel{:=}& \bigsqcup_{\hat{a}\in \hat{\sigma}(x)} \hat{\sigma}( \hat{a}\mathord{.}f \leftarrow \hat{\sigma}(y) ) \\ \lbrack\!\lbrack s_1\;s_2 \rbrack\!\rbrack^\sharp\,\hat{\sigma} & \mathrel{:=}& \operatorname{lfp}\,\lambda \hat{\sigma}'. \hat{\sigma}\mathbin{\sqcup}\lbrack\!\lbrack s_1 \rbrack\!\rbrack^\sharp\,\hat{\sigma}' \mathbin{\sqcup}\lbrack\!\lbrack s_2 \rbrack\!\rbrack^\sharp{\hat{\sigma}'} \end{array} \]

Points-To Analysis as Set Constraints

\[ \fbox{$\hat{\sigma}( \hat{l}) \supseteq \hat{v}$} \]

\(\infer{ x \mathrel{\texttt{=}}y }{ \hat{\sigma}( x) \supseteq \hat{\sigma}(y) }\)

\(\infer{ \ell\colon x \mathrel{\texttt{=}}\texttt{\{\}} }{ \hat{\sigma}( x) \supseteq \{ \ell \} }\)

\(\infer{ x \mathrel{\texttt{=}}y\texttt{.}f \and \hat{a}\in \hat{\sigma}(y) }{ \hat{\sigma}( x) \supseteq \hat{\sigma}( \hat{a}\mathord{.}f ) }\)

\(\infer{ x\texttt{.}f \mathrel{\texttt{=}}y \and \hat{a}\in \hat{\sigma}(x) }{ \hat{\sigma}( \hat{a}\mathord{.}f ) \supseteq \hat{\sigma}( y) }\)

The judgment form \(s\) says, “Atomic statement \(s\) is in the program (i.e., the bag of atomic statements).” The program is an implicit parameter of the judgment form.

The result of a points-to analysis is the least \(\hat{\sigma}\) that satisfies \(\hat{\sigma}( \hat{l}) \supseteq \hat{v}\) for a given program.

This subset-based formulation is called Andersen’s Analysis. Replacing subset \(\supseteq\)-constraints with equality \(=\)-constraints is called unification-based or Steensgaard’s Analysis.

Points-To Analysis as Set Constraints

\[ \fbox{$\operatorname{pt}( \hat{l}) \supseteq \hat{v}$} \]

\(\infer{ x \mathrel{\texttt{=}}y }{ \operatorname{pt}( x) \supseteq \operatorname{pt}(y) }\)

\(\infer{ \ell\colon x \mathrel{\texttt{=}}\texttt{\{\}} }{ \operatorname{pt}( x) \supseteq \{ \ell \} }\)

\(\infer{ x \mathrel{\texttt{=}}y\texttt{.}f \and \hat{a}\in \hat{\sigma}(y) }{ \operatorname{pt}( x) \supseteq \operatorname{pt}( \hat{a}\mathord{.}f ) }\)

\(\infer{ x\texttt{.}f \mathrel{\texttt{=}}y \and \hat{a}\in \operatorname{pt}(x) }{ \operatorname{pt}( \hat{a}\mathord{.}f ) \supseteq \operatorname{pt}( y) }\)

Field-Insensitive Store Abstraction

\(\begin{array}{rrrl} \text{abstract states} & \hat{\sigma}& \mathrel{::=}& \hat{\rho} \\ \text{abstract stores} & \hat{\rho}& \mathrel{::=}& \circ \mid\hat{\rho},\hat{l} \mapsto\hat{v} \\ \text{abstract l-values} & \hat{l}& \mathrel{::=}& x\mid\hat{a} \end{array}\)

\(\infer{ x \mathrel{\texttt{=}}y }{ \hat{\sigma}( x) \supseteq \hat{\sigma}(y) }\)

\(\infer{ \ell\colon x \mathrel{\texttt{=}}\texttt{\{\}} }{ \hat{\sigma}( x) \supseteq \{ \ell \} }\)

\(\infer{ x \mathrel{\texttt{=}}y\texttt{.}f \and \hat{a}\in \hat{\sigma}(y) }{ \hat{\sigma}( x) \supseteq \hat{\sigma}( \hat{a}) }\)

\(\infer{ x\texttt{.}f \mathrel{\texttt{=}}y \and \hat{a}\in \hat{\sigma}(x) }{ \hat{\sigma}( \hat{a}) \supseteq \hat{\sigma}( y) }\)

Field-Based Store Abstraction

\(\begin{array}{rrrl} \text{abstract states} & \hat{\sigma}& \mathrel{::=}& \hat{\rho} \\ \text{abstract stores} & \hat{\rho}& \mathrel{::=}& \circ \mid\hat{\rho},\hat{l} \mapsto\hat{v} \\ \text{abstract l-values} & \hat{l}& \mathrel{::=}& x\mid f \end{array}\)

\(\infer{ x \mathrel{\texttt{=}}y }{ \hat{\sigma}( x) \supseteq \hat{\sigma}(y) }\)

\(\infer{ \ell\colon x \mathrel{\texttt{=}}\texttt{\{\}} }{ \hat{\sigma}( x) \supseteq \{ \ell \} }\)

\(\infer{ x \mathrel{\texttt{=}}y\texttt{.}f }{ \hat{\sigma}( x) \supseteq \hat{\sigma}( f) }\)

\(\infer{ x\texttt{.}f \mathrel{\texttt{=}}y }{ \hat{\sigma}( f) \supseteq \hat{\sigma}( y) }\)