# Data Flow Analyzer
The dataflow analyzer is a helper in lint which makes writing certain
kinds of lint checks a lot easier.
Let's say you have an API which creates an object, and then you want to
make sure that at some point a particular method is called on the same
instance.
There are a lot of scenarios like this;
* Calling `show` on a message in a Toast or Snackbar
* Calling `commit` or `apply` on a transaction
* Calling `recycle` on a TypedArray
* Calling `enqueue` on a newly created work request
and so on. I didn't include calling close on a file object since you
typically use try-with-resources for those.
Here are some examples:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
getFragmentManager().beginTransaction().commit() // OK
val t1 = getFragmentManager().beginTransaction() // NEVER COMMITTED
val t2 = getFragmentManager().beginTransaction() // OK
t2.commit()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here we are creating 3 transactions. The first one is committed
immediately. The second one is never committed. And the third one
is.
This example shows us creating multiple transactions, and that
demonstrates that solving this problem isn't as simple as just visiting
the method and seeing if the code invokes `Transaction#commit`
anywhere; we have to make sure that it's invoked on all the instances
we care about.
## Usage
To use the dataflow analyzer, you basically extend the
`DataFlowAnalyzer` class, and override one or more of its callbacks,
and then tell it to analyze a method scope.
!!! Tip
In recent versions of lint, there is a new special subclass of the
`DataFlowAnalyzer`, `TargetMethodDataFlowAnalyzer`, which makes it
easier to write flow analyzers where you are looking for a specific
“cleanup” or close function invoked on an instance. See the separate
section on `TargetMethodDataFlowAnalyzer` below for more information.
For the above transaction scenario, it might look like this:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
override fun getApplicableMethodNames(): List =
listOf("beginTransaction")
override fun visitMethodCall(context: JavaContext, node: UCallExpression, method: PsiMethod) {
val containingClass = method.containingClass
val evaluator = context.evaluator
if (evaluator.extendsClass(containingClass, "android.app.FragmentManager", false)) {
// node is a call to FragmentManager.beginTransaction(),
// so this expression will evaluate to an instance of
// a Transaction. We want to track this instance to see
// if we eventually call commit on it.
var foundCommit = false
val visitor = object : DataFlowAnalyzer(setOf(node)) {
override fun receiver(call: UCallExpression) {
if (call.methodName == "commit") {
foundCommit = true
}
}
}
val method = node.getParentOfType(UMethod::class.java)
method?.accept(visitor)
if (!foundCommit) {
context.report(Incident(...))
}
}
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As you can see, the `DataFlowAnalyzer` is a visitor, so when we find a
call we're interested in, we construct a `DataFlowAnalyzer` and
initialize it with the instance we want to track, and then we visit the
surrounding method with this visitor.
The visitor will invoke the `receiver` method whenever the instance is
invoked as the receiver of a method call; this is the case with
`t2.commit()` in the above example; here “t2” is the receiver, and
`commit` is the method call name.
With the above setup, basic value tracking is working; e.g. it will
correctly handle the following case:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
val t = getFragmentManager().beginTransaction().commit()
val t2 = t
val t3 = t2
t3.commit()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
However, there's a lot that can go wrong, which we'll need to deal with. This is explained in the following sections
## Self-referencing Calls
The Transaction API has a number of utility methods; here's a partial
list:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~java linenumbers
public abstract class FragmentTransaction {
public abstract int commit();
public abstract int commitAllowingStateLoss();
public abstract FragmentTransaction show(Fragment fragment);
public abstract FragmentTransaction hide(Fragment fragment);
public abstract FragmentTransaction attach(Fragment fragment);
public abstract FragmentTransaction detach(Fragment fragment);
public abstract FragmentTransaction add(int containerViewId, Fragment fragment);
public abstract FragmentTransaction add(Fragment fragment, String tag);
public abstract FragmentTransaction addToBackStack(String name);
...
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The reason all these methods return a `FragmentTransaction` is to make it easy to chain calls; e.g.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~java linenumbers
final int id = getFragmentManager().beginTransaction()
.add(new Fragment(), null)
.addToBackStack(null)
.commit();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In order to correctly analyze this, we'd need to know what the implementation of `add` and `addToBackStack` return. If we know that they simply return “this”, then it's easy; we can transfer the instance through the call.
And this is what the `DataFlowAnalyzer` will try to do by default. When
it encounters a call on our tracked receivers, it will try to guess
whether that method is returning itself. It has several heuristics for
this:
* The return type is the same as its surrounding class, or a subtype of
it
* It's an extension method returning the same type
* It's not named something which indicates a new instance (such as
clone, copy, or to*X*), unless `ignoreCopies()` is overridden to
return false
In our example, the above heuristics work, so out of the box, the lint check would correctly handle this scenario.
But there may be cases where you either don't want these heuristics, or you want to add your own. In these cases, you would override the `returnsSelf` method on the flow analyzer and apply your own logic:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
val visitor = object : DataFlowAnalyzer(setOf(node)) {
override fun returnsSelf(call: UCallExpression): Boolean {
return super.returnsSelf(call) || call.methodName == "copy"
}
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Kotlin Scoping Functions
With this in place, lint will track the flow through the method.
This includes handling Kotlin's scoping functions as well. For
example, it will automatically handle scenarios like the
following:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
transaction1.let { it.commit() }
transaction2.apply { commit() }
with (transaction3) { commit() }
transaction4.also { it.commit() }
getFragmentManager.let {
it.beginTransaction()
}.commit()
// complex (contrived and unrealistic) example:
transaction5.let {
it.also {
it.apply {
with(this) {
commit()
}
}
}
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Limitations
It doesn't try to “execute”, constant evaluation (maybe)
if/else
## Escaping Values
What if your check gets invoked on a code snippet like this:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
fun createTransaction(): FragmentTransaction =
getFragmentManager().beginTransaction().add(new Fragment(), null)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here, we're not calling `commit`, so our lint check would issue a
warning. However, it's quite possible and likely that elsewhere,
there's code using it, like this:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
val transaction = createTransaction()
...
transaction.commit()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Ideally, we'd perform global analysis to handle this, but that's not
currently possible. However, we *can* analyze some additional non-local
scenarios, and more importantly, we need to ensure that we don't offer false positive warnings in the above scenario.
### Returns
In the above case, our tracked transaction “escapes” the method that
we're analyzing through either an implicit return as in the above
Kotlin code or via an explicit return.
The analyzer has a callback method to let us know when this is happening. We can override that callback to remember that the value escapes, and if so, ignore the missing commit:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
var foundCommit = false
var escapes = false
val visitor = object : DataFlowAnalyzer(setOf(node)) {
override fun returns(expression: UReturnExpression) {
escapes = true
}
override fun argument(call: UCallExpression, reference: UElement) {
super.argument(call, reference)
}
override fun field(field: UElement) {
super.field(field)
}
}
node.getParentOfType(UMethod::class.java)?.accept(visitor)
if (!escapes && !foundCommit) {
context.report(Incident(...))
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
### Parameters
Another way our transaction can “escape” out of the method such that we
no longer know for certain whether it gets committed is via a method
call.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
fun test() {
val transaction = getFragmentManager().beginTransaction()
process(transaction)
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here, it's possible that the `process` method will proceed to actually
commit the transaction.
If we have source, we could resolve the call and take a look at the
method implementation (see the “Non Local Analysis” section below), but
in the general case, if a value escapes, we'll want to do something similar to a returned value. The analyzer has a callback for this, `argument`, which is invoked whenever our tracked value is passed into a method as an argument. The callback gives us both the argument and the call in case we want to handle conditional logic based on the specific method call.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
var escapes = false
val visitor = object : DataFlowAnalyzer(setOf(node)) {
...
override fun argument(call: UCallExpression, reference: UElement) {
escapes = true
}
...
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(By default, the analyzer will ignore calls that look like logging calls since those are probably safe and not true escapes; you can
customize this by overriding `ignoreArgument()`.)
### Fields
Finally, a value may escape a local method context if it gets stored
into a field:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
fun initialize() {
this.transaction = createTransaction()
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As with returns and method calls, the analyzer has a callback to make
it easy to handle when this is the case:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
var escapes = false
val visitor = object : DataFlowAnalyzer(setOf(node)) {
...
override fun field(field: UElement) {
escapes = true
}
...
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As you can see, it's passing in the field that is being stored to, in
case you want to perform additional analysis to track field values; see
the next section.
!!! Tip
There is a special subclass of the `DataFlowAnalyzer`, called
`EscapeCheckingDataFlowAnalyzer`, which you can extend instead. This
handles recording all the scenarios where the instance escapes from
the method, and at the end you can just check its `escaped` property.
## Non Local Analysis
In the above examples, if we found that the value escaped via a return
or method call or storage in a field, we simply gave up. In some cases
we can do better than that.
* If the field we stored it into is a private field, we can visit
the surrounding class, and check each reference to the field. If we
can see that the field never escapes the class, we can perform the
same analysis (using the data flow analyzer!) on each method where
it's referenced.
* Similarly, if the method which returns the value is private, we can
visit the surrounding class and see how the method is invoked, and
track the value returned from it in each usage.
* Finally, if the value escapes as an argument to a call, we can
resolve that call, and if it's to a method we have source for (which
doesn't have to be in the same class, as long as it's in the same
module), we can perform the analysis in that method as well, even
reusing the same flow analyzer!
Complications: - storing in a field, returning, intermediate variables, self-referencing methods, scoping functions,
## Examples
Here are some existing usages of the data flow analyzer in lint's
built-in rules.
### Simple Example
For WorkManager, ensure that newly created work tasks eventually
get enqueued:
[Source](https://cs.android.com/android-studio/platform/tools/base/+/mirror-goog-studio-main:lint/libs/lint-checks/src/main/java/com/android/tools/lint/checks/WorkManagerDetector.kt)
[Test](https://cs.android.com/android-studio/platform/tools/base/+/mirror-goog-studio-main:lint/libs/lint-tests/src/test/java/com/android/tools/lint/checks/WorkManagerDetectorTest.kt)
### Complex Example
For the Slices API, apply a number of checks on chained calls constructing slices, checking that you only specify a single timestamp, that you don't mix icons and actions, etc etc.
[Source](https://cs.android.com/android-studio/platform/tools/base/+/mirror-goog-studio-main:lint/libs/lint-checks/src/main/java/com/android/tools/lint/checks/SliceDetector.kt)
[Test](https://cs.android.com/android-studio/platform/tools/base/+/mirror-goog-studio-main:lint/libs/lint-tests/src/test/java/com/android/tools/lint/checks/SliceDetectorTest.kt)
## TargetMethodDataFlowAnalyzer
The `TargetMethodDataFlowAnalyzer` is a special subclass of the
`DataFlowAnalyzer` which makes it simple to see if you eventually wind up
calling a target method on a particular instance. For example, calling
`close` on a file that was opened, or calling `start` on an animation you
created.
In addition, there is an extension function on `UMethod` which visits
this analyzer, and then checks for various conditions, e.g. whether the
instance “escaped” (for example by being stored in a field or passed to
another method), in which case you probably don't want to conclude (and
report) that the close method is never called. It also handles failures
to resolve, where it remembers whether there was a resolve failure, and
if so it looks to see if it finds a likely match (with the same name as
the target function), and if so also makes sure you don't report a false
positive.
A simple way to do this is as follows:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~kotlin linenumbers
val targets = mapOf("show" to listOf("android.widget.Toast",
"com.google.android.material.snackbar.Snackbar")
val analyzer = TargetMethodDataFlowAnalyzer.create(node, targets)
if (method.isMissingTarget(analyzer)) {
context.report(...)
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can subclass `TargetMethodDataFlowAnalyzer` directly and override the
`getTargetMethod` methods and any other UAST visitor methods if you want
to customize the behavior further.
One advantage of using the `TargetMethodDataFlowAnalyzer` is that it also
correctly handles method references.