An Accurate Analysis of the BlockCanary Source Code

Published: 2023-03-24 11:06:59 | Category: Tutorials

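Where jank comes from
From the screen rendering mechanism we know that Android drives rendering with the vsync signal: the software layer computes the frame data and puts it into a buffer, and the hardware layer reads the buffer and draws it to the screen. On a 60 Hz display the rendering period is about 16 ms (1000 ms / 60 ≈ 16.7 ms), which is how we see a continuously changing picture.

If a frame takes longer than that, the app janks. This article only deals with jank caused by the software layer (if the bottleneck is in the hardware, the remedy is a different device). So the software layer has to finish its work for a frame in under 16 ms. Where does that work run?

In a Handler; more precisely, in the Handler of the UI thread.

On Android, processes talk to each other through Binder, and threads talk to each other through Handler.

When the software layer receives the vsync signal from the hardware layer, the Java layer posts a message to the UI thread's Handler to compute the view data, that is, to run measure, layout and draw. In code this is the performTraversals() method of ViewRootImpl, which we covered in the article on View measurement, layout and drawing.

Handler messages are all dispatched by a Looper. So we can get hold of the UI thread's Looper, hook detection code in before and after it dispatches each message, and work out how long that message took to execute.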

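// Get the UI thread's Looper
Looper uiLooper = Looper.getMainLooper();
// Time at which the current message started
long time;
// Runs before a message is dispatched: record the time
void preHandle() {
    time = System.currentTimeMillis();
}
// Runs after a message is dispatched: compute the elapsed time and report
void postHandle() {
    long delay = System.currentTimeMillis() - time;
    if (delay > 16) {
        // Treat this as jank and react, for example by printing the main thread's stack
    }
}
// Hook the two methods above in before and after uiLooper dispatches a message
// (supposing such a method existed)
uiLooper.xxxxxxx();

So how do we hook these two functions into uiLooper? As it turns out, Looper already has a usable API.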

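How to detect jank in an app
Following the above, we only need to record the time before and after a message executes, compute the difference, and compare it with the jank threshold we pass in. If the difference exceeds the threshold, we treat it as jank, dump the main thread's stack, and show it to the developer.

So how do we find the hook points before and after a message executes?

Looper itself provides a method for setting a log printer: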

/**
 * Control logging of messages as they are processed by this Looper. If
 * enabled, a log message will be written to <var>printer</var>
 * at the beginning and ending of each message dispatch, identifying the
 * target Handler and message contents.
 *
 * @param printer A Printer object that will receive log messages, or
 * null to disable message logging.
 */
public void setMessageLogging(@Nullable Printer printer) {
    mLogging = printer;
}

In other words, the printer we set is used to print a log line before and after each message executes. The code that does this lives in Looper's loop() method:

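// Printed before the message executes
// This must be in a local variable, in case a UI event sets the logger
final Printer logging = me.mLogging;
if (logging != null) {
    logging.println(">>>>> Dispatching to " + msg.target + " " +
            msg.callback + ": " + msg.what);
}
// ... the message itself is dispatched here: msg.target.dispatchMessage(msg) ...
// Printed after the message has finished executing
if (logging != null) {
    logging.println("<<<<< Finished to " + msg.target + " " + msg.callback);
}

Building on this, we can pass in a custom Printer, override println(), and compute the time between the call before the message and the call after it. If the difference exceeds a target value (say 500 ms), we treat it as jank.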

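So how do we tell whether println() was called before or after the message executed?

We can look at the text being printed: the line before execution starts with ">>>>> Dispatching to" and the line after execution starts with "<<<<< Finished to", so we could write:

public void println(String msg) {
    if (msg.startsWith(">>>>> Dispatching to")) {
        // Before the message executes
    } else {
        // After the message executes
    }
}

But this is crude, and string matching on every message is wasteful besides; there is another way.

The log lines before and after a message always come in pairs: every "before" has an "after". So we can keep a boolean that says whether the next print is the one before a message, and flip it on every print. For example: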

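// Whether the next print is the one before a message executes.
// The very first print always precedes a message, so the initial value is true.
private boolean isPre = true;
public void println(String msg) {
    if (isPre) {
        // Printed before the message: it is about to run, so start dumping the main thread's stack
    } else {
        // Printed after the message: stop dumping the thread's stack
    }
    // Flip on every call: if this print was "before", the next is "after", and vice versa
    isPre = !isPre;
}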
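
The toggle idea can be tried out on a plain JVM without Android. The following is only a sketch under stated assumptions: Printer here is a hypothetical one-method stand-in for android.util.Printer, MiniLooper is a made-up miniature of the logging calls in Looper.loop(), and ToggleMonitor is an illustrative name, not a BlockCanary class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for android.util.Printer so the sketch runs on a plain JVM. */
interface Printer {
    void println(String x);
}

/** Toggle-based monitor: the 1st, 3rd, ... calls mark a message start, the others its end. */
class ToggleMonitor implements Printer {
    private final long thresholdMillis;
    private boolean started = false;
    private long startNanos;
    /** Durations (in ms) of every message that exceeded the threshold. */
    final List<Long> blockedMillis = new ArrayList<>();

    ToggleMonitor(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    @Override
    public void println(String x) {
        if (!started) {
            // Call before the message: remember when it started
            startNanos = System.nanoTime();
            started = true;
        } else {
            // Call after the message: measure and compare with the threshold
            long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
            started = false;
            if (elapsedMillis > thresholdMillis) {
                blockedMillis.add(elapsedMillis);
            }
        }
    }
}

/** Made-up miniature of Looper.loop(): logs before and after running each task. */
class MiniLooper {
    static void dispatchAll(List<Runnable> messages, Printer logging) {
        for (Runnable msg : messages) {
            if (logging != null) {
                logging.println(">>>>> Dispatching to " + msg);
            }
            msg.run();
            if (logging != null) {
                logging.println("<<<<< Finished to " + msg);
            }
        }
    }
}
```

Feeding MiniLooper a list with two no-op tasks and one task that sleeps well past the threshold should leave exactly one entry in blockedMillis. The monitor never looks at the log text, which is the point of the toggle.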
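
Good, we now know the core principle. Let's look at how an existing project, BlockCanary, implements it.

BlockCanary
Basic usage
1 Add the dependency
dependencies {
    // Used in both debug and release builds; a notification pops up when jank is detected
    // (on current Android Gradle Plugin versions, use implementation instead of compile)
    compile 'com.github.markzhai:blockcanary-android:1.5.0'
    // Alternative: enable only in debug builds
    // debugCompile 'com.github.markzhai:blockcanary-android:1.5.0'
    // releaseCompile 'com.github.markzhai:blockcanary-no-op:1.5.0'
}
2 Integrate in code
First define an AppBlockCanaryContext that extends BlockCanaryContext and override a few of its methods. Only the key parts are shown here: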

public class AppBlockCanaryContext extends BlockCanaryContext {
    private static final String TAG = "AppBlockCanaryContext";

    /**
     * Returns a qualifier; app name, version code, channel and so on can be used to identify the build
     */
    public String provideQualifier() {
        return "my_app" + BuildConfig.VERSION_CODE;
    }

    /**
     * Returns a user id used for identification
     */
    public String provideUid() {
        return "10086";
    }

    /**
     * Returns the network type, for example 2G, 3G, 4G or wifi
     */
    public String provideNetworkType() {
        return "wifi";
    }

    /**
     * Config monitor duration, after this time BlockCanary will stop, use
     * with {@code BlockCanary}'s isMonitorDurationEnd
     *
     * @return monitor last duration (in hour)
     */
    public int provideMonitorDuration() {
        return -1;
    }

    /**
     * Returns the threshold, in milliseconds, above which a message counts as jank;
     * choose it according to the performance of the device
     */
    public int provideBlockThreshold() {
        return 500;
    }

    /**
     * Interval between thread dumps; while a block is in progress the main thread
     * is dumped once per interval
     */
    public int provideDumpInterval() {
        return provideBlockThreshold();
    }

    /**
     * Path where the logs are saved
     */
    public String providePath() {
        return "/blockcanary_log/";
    }

    /**
     * Whether to show a notification when jank is detected
     */
    public boolean displayNotification() {
        return true;
    }

    /**
     * Called when jank is detected; log it here or upload it to your own server
     */
    public void onBlock(Context context, BlockInfo blockInfo) {
        Log.d(TAG, "onBlock: " + blockInfo);
    }
}

Then call it from the onCreate() method of your Application:

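public class MainApplication extends Application {
    @Override
    public void onCreate() {
        super.onCreate();
        // Pass in the AppBlockCanaryContext we created above
        BlockCanary.install(this, new AppBlockCanaryContext()).start();
    }
}

When jank occurs we get a notification, and the log line we print ourselves shows up in Logcat.

This article focuses on the source code; for complete usage instructions see the project on GitHub.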

Source code analysis
We start by following the main path, BlockCanary.install():

public static BlockCanary install(Context context, BlockCanaryContext blockCanaryContext) {
    // Initialize BlockCanaryContext
    BlockCanaryContext.init(context, blockCanaryContext);
    // Initialize state. BlockCanaryContext.get() is the argument we passed in, and
    // displayNotification() is the "show notification" flag we defined above
    setEnabled(context, DisplayActivity.class, BlockCanaryContext.get().displayNotification());
    // Create the singleton and return it
    return get();
}

// Implementation of get()
public static BlockCanary get() {
    if (sInstance == null) {
        synchronized (BlockCanary.class) {
            if (sInstance == null) {
                sInstance = new BlockCanary();
            }
        }
    }
    return sInstance;
}

// Constructor of BlockCanary
private BlockCanary() {
    BlockCanaryInternals.setContext(BlockCanaryContext.get());
    // Create the core class, which covers log analysis, stack dumping and CPU sampling
    mBlockCanaryCore = BlockCanaryInternals.getInstance();
    // Key step: add an interceptor. The interceptor is the AppBlockCanaryContext we passed in;
    // its onBlock() method is called when jank is detected
    mBlockCanaryCore.addBlockInterceptor(BlockCanaryContext.get());
    // If notifications are disabled, stop here
    if (!BlockCanaryContext.get().displayNotification()) {
        return;
    }
    // Otherwise add a service that shows the notification
    mBlockCanaryCore.addBlockInterceptor(new DisplayService());
}

On a side path, the key methods of BlockCanaryContext:

static void init(Context context, BlockCanaryContext blockCanaryContext) {
    // Keep the Context
    sApplicationContext = context;
    // Keep the BlockCanaryContext argument, that is, our custom AppBlockCanaryContext
    sInstance = blockCanaryContext;
}

public static BlockCanaryContext get() {
    if (sInstance == null) {
        throw new RuntimeException("BlockCanaryContext null");
    } else {
        // Return the instance saved above
        return sInstance;
    }
}

Now back to the main path. Next comes blockCanary.start():

// Start monitoring
public void start() {
    // A boolean flag guards against starting twice
    if (!mMonitorStarted) {
        mMonitorStarted = true;
        // As expected, the Printer is installed with setMessageLogging(); the argument is what we need to study
        Looper.getMainLooper().setMessageLogging(mBlockCanaryCore.monitor);
    }
}

// Stop monitoring
public void stop() {
    if (mMonitorStarted) {
        mMonitorStarted = false;
        // Remove the Printer
        Looper.getMainLooper().setMessageLogging(null);
        // Stop dumping stacks
        mBlockCanaryCore.stackSampler.stop();
        // Stop sampling the CPU
        mBlockCanaryCore.cpuSampler.stop();
    }
}

Since a Printer is passed in, we need to look at mBlockCanaryCore.monitor. First, the code that created mBlockCanaryCore above:

mBlockCanaryCore = BlockCanaryInternals.getInstance();

// Just a singleton; the constructor is the interesting part
static BlockCanaryInternals getInstance() {
    if (sInstance == null) {
        synchronized (BlockCanaryInternals.class) {
            if (sInstance == null) {
                sInstance = new BlockCanaryInternals();
            }
        }
    }
    return sInstance;
}

// The constructor
public BlockCanaryInternals() {
    // Stack sampler: the first argument is the UI thread, the second is the dump interval we configured
    stackSampler = new StackSampler(Looper.getMainLooper().getThread(), sContext.provideDumpInterval());
    // CPU sampler: the argument is the dump interval we configured
    cpuSampler = new CpuSampler(sContext.provideDumpInterval());
    // Key step: set the log Printer
    setMonitor(new LooperMonitor(new LooperMonitor.BlockListener() {
        // Called when jank is detected
        @Override
        public void onBlockEvent(long realTimeStart, long realTimeEnd, long threadTimeStart, long threadTimeEnd) {
            // Fetch the stack samples taken during the block
            ArrayList<String> threadStackEntries = stackSampler.getThreadStackEntries(realTimeStart, realTimeEnd);
            if (!threadStackEntries.isEmpty()) {
                // Build a BlockInfo and hand it to the interceptors; this is how our
                // AppBlockCanaryContext's onBlock() ends up being called
                BlockInfo blockInfo = BlockInfo.newInstance()
                        .setMainThreadTimeCost(realTimeStart, realTimeEnd, threadTimeStart, threadTimeEnd)
                        .setCpuBusyFlag(cpuSampler.isCpuBusy(realTimeStart, realTimeEnd)) // whether the CPU was busy during the block
                        .setRecentCpuRate(cpuSampler.getCpuRateInfo()) // the recently sampled CPU usage
                        .setThreadStackEntries(threadStackEntries) // the sampled stacks
                        .flushString();
                // Save the block information
                LogWriter.save(blockInfo.toString());
                // If there are interceptors, call onBlock() on each; recall the ones we added earlier
                if (mInterceptorChain.size() != 0) {
                    for (BlockInterceptor interceptor : mInterceptorChain) {
                        interceptor.onBlock(getContext().provideContext(), blockInfo);
                    }
                }
            }
        }
    },
    getContext().provideBlockThreshold(), // the jank threshold we configured
    getContext().stopWhenDebugging())); // whether to pause while a debugger is attached; defaults to true, since everything is slow under a debugger
    LogWriter.cleanObsolete();
}

Next we need to look at the LooperMonitor class:

// As expected it implements Printer, so println() is the method to focus on
class LooperMonitor implements Printer {
    // Arguments: the callback for jank, the jank threshold, and whether to pause while debugging
    public LooperMonitor(BlockListener blockListener, long blockThresholdMillis, boolean stopWhenDebugging) {
        if (blockListener == null) {
            throw new IllegalArgumentException("blockListener should not be null.");
        }
        mBlockListener = blockListener;
        mBlockThresholdMillis = blockThresholdMillis;
        mStopWhenDebugging = stopWhenDebugging;
    }

    // The core method
    @Override
    public void println(String x) {
        // Do nothing while a debugger is attached
        if (mStopWhenDebugging && Debug.isDebuggerConnected()) {
            return;
        }
        // Here too, a boolean tells "before the message" from "after the message"
        if (!mPrintingStarted) {
            // Record the start time
            mStartTimestamp = System.currentTimeMillis();
            mStartThreadTimestamp = SystemClock.currentThreadTimeMillis();
            mPrintingStarted = true;
            // Start dumping stack and CPU information
            startDump();
        } else {
            // Record the end time
            final long endTime = System.currentTimeMillis();
            mPrintingStarted = false;
            // Check for jank and notify
            if (isBlock(endTime)) {
                notifyBlockEvent(endTime);
            }
            // Stop dumping
            stopDump();
        }
    }

    // Was this message a block?
    private boolean isBlock(long endTime) {
        // It is if the elapsed time exceeds the threshold we passed in
        return endTime - mStartTimestamp > mBlockThresholdMillis;
    }

    // Notify about the block
    private void notifyBlockEvent(final long endTime) {
        final long startTime = mStartTimestamp;
        final long startThreadTime = mStartThreadTimestamp;
        final long endThreadTime = SystemClock.currentThreadTimeMillis();
        HandlerThreadFactory.getWriteLogThreadHandler().post(new Runnable() {
            @Override
            public void run() {
                // This calls back into the BlockListener created in the BlockCanaryInternals constructor
                mBlockListener.onBlockEvent(startTime, endTime, startThreadTime, endThreadTime);
            }
        });
    }

    // Start dumping
    private void startDump() {
        // Dump stack information
        if (null != BlockCanaryInternals.getInstance().stackSampler) {
            BlockCanaryInternals.getInstance().stackSampler.start();
        }
        // Dump CPU information
        if (null != BlockCanaryInternals.getInstance().cpuSampler) {
            BlockCanaryInternals.getInstance().cpuSampler.start();
        }
    }

    // Stop dumping
    private void stopDump() {
        if (null != BlockCanaryInternals.getInstance().stackSampler) {
            BlockCanaryInternals.getInstance().stackSampler.stop();
        }
        if (null != BlockCanaryInternals.getInstance().cpuSampler) {
            BlockCanaryInternals.getInstance().cpuSampler.stop();
        }
    }
}

The logic above is simple. When BlockCanary is installed, the BlockCanaryInternals constructor creates the LooperMonitor together with stackSampler and cpuSampler, the two classes that capture stack and CPU information; blockCanary.start() then registers the LooperMonitor on the main Looper. When a message is about to execute, the monitor records the start time and starts dumping. When the message finishes, it records the end time and stops dumping. If the end time minus the start time exceeds the threshold we passed in, the message counts as jank, and the sampled stack and CPU information is packed into a BlockInfo and handed to the developer through the callback.

Now let's look at the code that dumps stack and CPU information, starting with the parent class AbstractSampler, which holds the entry point start():

abstract class AbstractSampler {
    protected AtomicBoolean mShouldSample = new AtomicBoolean(false);

    // The entry point
    public void start() {
        // An atomic flag prevents starting twice
        if (mShouldSample.get()) {
            return;
        }
        mShouldSample.set(true);
        // Remove the previous callback
        HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
        // Post a new one. Note the delay: getSampleDelay() returns 0.8 times the block threshold,
        // which in our AppBlockCanaryContext is the same value as the dump interval
        HandlerThreadFactory.getTimerThreadHandler().postDelayed(mRunnable, BlockCanaryInternals.getInstance().getSampleDelay());
    }

    // The matching stop method
    public void stop() {
        if (!mShouldSample.get()) {
            return;
        }
        mShouldSample.set(false);
        HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
    }
}

As we can see, it works by posting a Runnable. Here is that Runnable:

protected long mSampleInterval;
private Runnable mRunnable = new Runnable() {
    @Override
    public void run() {
        // Core call: doSample()
        doSample();
        if (mShouldSample.get()) {
            // Post itself again
            HandlerThreadFactory.getTimerThreadHandler().postDelayed(mRunnable, mSampleInterval);
        }
    }
};

So doSample() is called in a loop, and the loop interval is the dump interval we configured in AppBlockCanaryContext.
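
To see when samples actually fire, the start/re-post/stop cycle can be replayed with a virtual clock. This is a sketch under assumptions, not BlockCanary code: VirtualTimer is a hypothetical stand-in for the timer thread's Handler that tracks one pending task and never really waits, and PeriodicSampler only records the virtual time at which doSample() would have run.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical virtual-time stand-in for a Handler; holds at most one pending task. */
class VirtualTimer {
    long now = 0;
    private Runnable pending;
    private long dueAt;

    void postDelayed(Runnable r, long delayMillis) {
        pending = r;
        dueAt = now + delayMillis;
    }

    void removeCallbacks(Runnable r) {
        if (pending == r) {
            pending = null;
        }
    }

    /** Advances virtual time to 'until', running the pending task each time it comes due. */
    void advanceTo(long until) {
        while (pending != null && dueAt <= until) {
            Runnable r = pending;
            pending = null;
            now = dueAt;
            r.run();
        }
        now = until;
    }
}

/** Illustrative sampler that re-posts itself after every sample while it is enabled. */
class PeriodicSampler implements Runnable {
    private final VirtualTimer timer;
    private final long intervalMillis;
    private boolean shouldSample = false;
    /** Virtual times at which a sample was taken. */
    final List<Long> sampleTimes = new ArrayList<>();

    PeriodicSampler(VirtualTimer timer, long intervalMillis) {
        this.timer = timer;
        this.intervalMillis = intervalMillis;
    }

    void start(long initialDelayMillis) {
        if (shouldSample) {
            return; // already running
        }
        shouldSample = true;
        timer.removeCallbacks(this);
        timer.postDelayed(this, initialDelayMillis);
    }

    void stop() {
        if (!shouldSample) {
            return;
        }
        shouldSample = false;
        timer.removeCallbacks(this);
    }

    @Override
    public void run() {
        sampleTimes.add(timer.now); // stands in for doSample()
        if (shouldSample) {
            timer.postDelayed(this, intervalMillis);
        }
    }
}
```

With an initial delay of 400 and an interval of 500 (the values that follow from a 500 ms threshold in the configuration above), a message that runs for 1000 ms is sampled at 400 and 900, while a message that finishes after 100 ms is never sampled because stop() removes the pending callback first.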


Now for the core method doSample(). It is an abstract method that each sampler overrides; first, the implementation in StackSampler:

private static final int DEFAULT_MAX_ENTRY_COUNT = 100;
private static final LinkedHashMap<Long, String> sStackMap = new LinkedHashMap<>();
private int mMaxEntryCount = DEFAULT_MAX_ENTRY_COUNT;

@Override
protected void doSample() {
    StringBuilder stringBuilder = new StringBuilder();
    // Walk the stack trace of the monitored thread (mCurrentThread is the main thread
    // passed to the constructor) and turn it into a String
    for (StackTraceElement stackTraceElement : mCurrentThread.getStackTrace()) {
        stringBuilder
                .append(stackTraceElement.toString())
                .append(BlockInfo.SEPARATOR);
    }
    // Add each dumped stack trace to sStackMap, evicting the oldest entry when the map is full
    synchronized (sStackMap) {
        // mMaxEntryCount defaults to 100
        if (sStackMap.size() == mMaxEntryCount && mMaxEntryCount > 0) {
            sStackMap.remove(sStackMap.keySet().iterator().next());
        }
        // The key is the current time, the value is the stack trace just dumped
        sStackMap.put(System.currentTimeMillis(), stringBuilder.toString());
    }
}

The core logic: take the stack trace of the monitored main thread and store it in a map that keeps at most the 100 most recent samples, where the key is the time and the value is the stack trace.
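
The detail that makes this work is that Thread.getStackTrace() can be called from one thread on another, so a timer thread can see where a stuck main thread currently is. The sketch below shows that on a plain JVM; the class and method names (StackSampleDemo, slowTask, the "fake-main" thread) are made up for illustration, and the latches only exist to hold the worker inside slowTask() while the sample is taken.

```java
import java.util.concurrent.CountDownLatch;

class StackSampleDemo {
    /** Simulates a long-running message: signals that it started, then blocks until released. */
    static void slowTask(CountDownLatch started, CountDownLatch release) {
        started.countDown();
        try {
            release.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Starts a worker stuck in slowTask(), samples its stack from the calling thread, returns the dump. */
    static String sampleBlockedThread() {
        final CountDownLatch started = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> slowTask(started, release), "fake-main");
        worker.start();
        StringBuilder dump = new StringBuilder();
        try {
            started.await(); // the worker is now inside slowTask()
            // Same loop shape as StackSampler.doSample(), but on the worker thread's stack
            for (StackTraceElement element : worker.getStackTrace()) {
                dump.append(element.toString()).append("\r\n");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown(); // let the worker finish
        }
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return dump.toString();
    }
}
```

The returned dump contains a frame for StackSampleDemo.slowTask, which is the kind of line a developer reads in a BlockCanary report to find the slow call.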


Remember how onBlockEvent() obtained the stack information? That's right, through

stackSampler.getThreadStackEntries(realTimeStart, realTimeEnd)

It is implemented in StackSampler as follows:

// Look in the sStackMap filled above for entries whose time lies between startTime and endTime,
// and return them in a List.
public ArrayList<String> getThreadStackEntries(long startTime, long endTime) {
    ArrayList<String> result = new ArrayList<>();
    synchronized (sStackMap) {
        for (Long entryTime : sStackMap.keySet()) {
            if (startTime < entryTime && entryTime < endTime) {
                result.add(BlockInfo.TIME_FORMATTER.format(entryTime)
                        + BlockInfo.SEPARATOR
                        + BlockInfo.SEPARATOR
                        + sStackMap.get(entryTime));
            }
        }
    }
    return result;
}

The implementation is simple: scan sStackMap for entries whose time lies strictly between startTime and endTime, and return them as a List.
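
The store-and-query pair can be condensed into one small class. The sketch below is an alternative formulation rather than BlockCanary's own code: SampleStore is a hypothetical name, and instead of removing the first key by hand it relies on LinkedHashMap's removeEldestEntry() hook, which the JDK calls after each insertion.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative bounded store of timestamped samples with a time-window query. */
class SampleStore {
    private final LinkedHashMap<Long, String> samples;

    SampleStore(final int maxEntries) {
        // Insertion-ordered map that drops its oldest entry once it grows past maxEntries
        samples = new LinkedHashMap<Long, String>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    synchronized void record(long timeMillis, String stack) {
        samples.put(timeMillis, stack);
    }

    /** Returns the samples taken strictly between startMillis and endMillis, oldest first. */
    synchronized List<String> between(long startMillis, long endMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<Long, String> entry : samples.entrySet()) {
            long time = entry.getKey();
            if (startMillis < time && time < endMillis) {
                result.add(entry.getValue());
            }
        }
        return result;
    }

    synchronized int size() {
        return samples.size();
    }
}
```

For a block that ran from 1000 to 1600 with samples recorded at 900, 1400 and 1700, between(1000, 1600) returns only the 1400 sample. Because entries are only ever inserted with increasing timestamps and never re-read by key, dropping the eldest entry amounts to oldest-first eviction.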


Next, the implementation of doSample() in CpuSampler:

@Override
protected void doSample() {
    BufferedReader cpuReader = null;
    BufferedReader pidReader = null;
    try {
        // Open the "/proc/stat" file
        cpuReader = new BufferedReader(new InputStreamReader(
                new FileInputStream("/proc/stat")), BUFFER_SIZE);
        // Its first line holds the system-wide cumulative CPU time counters
        // (the source calls this variable cpuRate)
        String cpuRate = cpuReader.readLine();
        if (cpuRate == null) {
            cpuRate = "";
        }
        // Get the process id
        if (mPid == 0) {
            mPid = android.os.Process.myPid();
        }
        // Open this process's "/proc/<pid>/stat" file
        pidReader = new BufferedReader(new InputStreamReader(
                new FileInputStream("/proc/" + mPid + "/stat")), BUFFER_SIZE);
        // It holds the process's own cumulative CPU time counters
        String pidCpuRate = pidReader.readLine();
        if (pidCpuRate == null) {
            pidCpuRate = "";
        }
        // Parse both lines
        parse(cpuRate, pidCpuRate);
    } catch (Throwable throwable) {
        Log.e(TAG, "doSample: ", throwable);
    } finally {
        try {
            if (cpuReader != null) {
                cpuReader.close();
            }
            if (pidReader != null) {
                pidReader.close();
            }
        } catch (IOException exception) {
            Log.e(TAG, "doSample: ", exception);
        }
    }
}

The core logic above: read the system-wide CPU counters from /proc/stat, read this process's CPU counters from /proc/<pid>/stat, and then parse the data. The parsing lives in the parse() method:


// Keep at most 10 entries
private static final int MAX_ENTRY_COUNT = 10;
// Holds the CPU information
private final LinkedHashMap<Long, String> mCpuInfoEntries = new LinkedHashMap<>();

private void parse(String cpuRate, String pidCpuRate) {
    // Split into an array
    String[] cpuInfoArray = cpuRate.split(" ");
    if (cpuInfoArray.length < 9) {
        return;
    }
    // Parse field by field, by index
    long user = Long.parseLong(cpuInfoArray[2]);
    long nice = Long.parseLong(cpuInfoArray[3]);
    long system = Long.parseLong(cpuInfoArray[4]);
    long idle = Long.parseLong(cpuInfoArray[5]);
    long ioWait = Long.parseLong(cpuInfoArray[6]);
    long total = user + nice + system + idle + ioWait
            + Long.parseLong(cpuInfoArray[7])
            + Long.parseLong(cpuInfoArray[8]);
    String[] pidCpuInfoList = pidCpuRate.split(" ");
    if (pidCpuInfoList.length < 17) {
        return;
    }
    // utime + stime + cutime + cstime of this process
    long appCpuTime = Long.parseLong(pidCpuInfoList[13])
            + Long.parseLong(pidCpuInfoList[14])
            + Long.parseLong(pidCpuInfoList[15])
            + Long.parseLong(pidCpuInfoList[16]);
    // Turn the deltas since the previous sample into a String and store it
    if (mTotalLast != 0) {
        StringBuilder stringBuilder = new StringBuilder();
        long idleTime = idle - mIdleLast;
        long totalTime = total - mTotalLast;
        stringBuilder
                .append("cpu:")
                .append((totalTime - idleTime) * 100L / totalTime)
                .append("% ")
                .append("app:")
                .append((appCpuTime - mAppCpuTimeLast) * 100L / totalTime)
                .append("% ")
                .append("[")
                .append("user:").append((user - mUserLast) * 100L / totalTime)
                .append("% ")
                .append("system:").append((system - mSystemLast) * 100L / totalTime)
                .append("% ")
                .append("ioWait:").append((ioWait - mIoWaitLast) * 100L / totalTime)
                .append("% ]");
        // Store the result in mCpuInfoEntries; the key is again the current time,
        // and the oldest entry is dropped once the map grows past the limit
        synchronized (mCpuInfoEntries) {
            mCpuInfoEntries.put(System.currentTimeMillis(), stringBuilder.toString());
            if (mCpuInfoEntries.size() > MAX_ENTRY_COUNT) {
                for (Map.Entry<Long, String> entry : mCpuInfoEntries.entrySet()) {
                    Long key = entry.getKey();
                    mCpuInfoEntries.remove(key);
                    break;
                }
            }
        }
    }
    // Update the "last" values for the next round
    mUserLast = user;
    mSystemLast = system;
    mIdleLast = idle;
    mIoWaitLast = ioWait;
    mTotalLast = total;
    mAppCpuTimeLast = appCpuTime;
}

The logic here mirrors StackSampler: compute the CPU information and store it in mCpuInfoEntries, where the key is the current time and the value is the CPU information as a String, again dropping the oldest entry when the map is full.
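
The percentages come from the difference between two snapshots of cumulative counters, not from a single reading. Here is a minimal sketch of that arithmetic with made-up sample lines; CpuSnapshot and busyPercent are illustrative names, and unlike the listing above the sketch returns 0 when no time has elapsed instead of dividing by zero. Note that the aggregate line in /proc/stat has two spaces after "cpu", so split(" ") yields an empty string at index 1 and the user field lands at index 2.

```java
/** Illustrative snapshot of the aggregate "cpu" line of /proc/stat. */
class CpuSnapshot {
    final long user;
    final long system;
    final long idle;
    final long ioWait;
    final long total;

    CpuSnapshot(long user, long system, long idle, long ioWait, long total) {
        this.user = user;
        this.system = system;
        this.idle = idle;
        this.ioWait = ioWait;
        this.total = total;
    }

    /** Parses a line such as "cpu  100 0 50 800 50 0 0 0 0 0" (two spaces after "cpu"). */
    static CpuSnapshot parse(String cpuLine) {
        String[] fields = cpuLine.split(" ");
        if (fields.length < 9) {
            throw new IllegalArgumentException("unexpected cpu line: " + cpuLine);
        }
        long user = Long.parseLong(fields[2]);
        long nice = Long.parseLong(fields[3]);
        long system = Long.parseLong(fields[4]);
        long idle = Long.parseLong(fields[5]);
        long ioWait = Long.parseLong(fields[6]);
        long irq = Long.parseLong(fields[7]);
        long softIrq = Long.parseLong(fields[8]);
        long total = user + nice + system + idle + ioWait + irq + softIrq;
        return new CpuSnapshot(user, system, idle, ioWait, total);
    }

    /** Percentage of non-idle time between two snapshots; 0 if no time elapsed. */
    static long busyPercent(CpuSnapshot previous, CpuSnapshot current) {
        long totalDelta = current.total - previous.total;
        if (totalDelta <= 0) {
            return 0;
        }
        long idleDelta = current.idle - previous.idle;
        return (totalDelta - idleDelta) * 100L / totalDelta;
    }
}
```

For the snapshots "cpu  100 0 50 800 50 0 0 0 0 0" and "cpu  160 0 70 900 70 0 0 0 0 0", the totals are 1000 and 1200, idle grows by 100 out of 200 elapsed ticks, and busyPercent() returns 50.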


Remember how onBlockEvent() obtained the CPU information? That's right, through cpuSampler.getCpuRateInfo(), implemented as follows:

// Get the CPU usage information
public String getCpuRateInfo() {
    StringBuilder sb = new StringBuilder();
    synchronized (mCpuInfoEntries) {
        // Walk mCpuInfoEntries and append every entry to the returned String
        for (Map.Entry<Long, String> entry : mCpuInfoEntries.entrySet()) {
            long time = entry.getKey();
            sb.append(BlockInfo.TIME_FORMATTER.format(time))
                    .append(' ')
                    .append(entry.getValue())
                    .append(BlockInfo.SEPARATOR);
        }
    }
    return sb.toString();
}

This simply converts the CPU information stored in mCpuInfoEntries into a single String and returns it.

(Editor: 汽车网)
