Fixr: Mining and Understanding Bug Fixes

Here's what we are doing.

Detection of Bugs in Android Apps via Dynamic Callback Analysis

When implementing an Android application, the developer writes a series of classes that implement methods called by the Android framework. The framework uses these methods to tell the app what has happened to the phone; in response, the app communicates how it wants to react by calling methods defined by the framework. This continuing dialog allows an application to interact successfully and responsively with its surroundings. However, the dialog can feel foreign and comes with a large number of constraints that make development difficult, especially when callbacks can be invoked in an order that the developer did not realize was possible. Our research aims to create automated methods for finding places where this ordering goes wrong in existing Android applications by monitoring their runtime behavior and searching for possible erroneous executions.
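
As a rough illustration of the idea (a minimal sketch, not the project's actual analysis), a recorded callback trace can be compared against the transitions a developer expected. The table EXPECTED_NEXT and the function find_unexpected_transitions are hypothetical names introduced here, and the table deliberately encodes a mistaken assumption: that onPause is always followed by onStop.

```python
# Hypothetical table of the callback orderings a developer assumed.
# It wrongly assumes that onPause is always followed by onStop.
EXPECTED_NEXT = {
    None: {"onCreate"},
    "onCreate": {"onStart"},
    "onStart": {"onResume"},
    "onResume": {"onPause"},
    "onPause": {"onStop"},
    "onStop": {"onDestroy"},
    "onDestroy": set(),
}


def find_unexpected_transitions(trace, expected_next=EXPECTED_NEXT):
    """Return (index, previous, callback) for every step the table does not allow."""
    unexpected = []
    previous = None  # None marks the start of the trace
    for index, callback in enumerate(trace):
        if callback not in expected_next.get(previous, set()):
            unexpected.append((index, previous, callback))
        previous = callback
    return unexpected


# A trace in which the activity is paused and then resumed without stopping.
observed = ["onCreate", "onStart", "onResume", "onPause", "onResume"]
surprises = find_unexpected_transitions(observed)
```

Each reported step marks a point where the app may be running code under an assumption that the observed execution does not satisfy, which makes it a candidate for closer inspection.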

Identifying Bug Communities with Qualitative Isomorphism in CDFGs

Most existing API mining techniques do not precisely consider the control and data flow in the program. This activity of our project explores a richer abstraction of API usage: a representation of the control and data flow of a program, known as an Abstract Control Data Flow Graph (ACDFG). For each program, we extract a set of ACDFGs, each capturing a unique slice of the program corresponding to a specific API of interest (in particular, an Android framework API call). For each pair of programs, we compute an approximate isomorphism relation between their ACDFGs. This relation characterizes precisely how two programs are similar and also provides a similarity measure between them. Using it, we are developing techniques to identify sets of programs that use the API in a similar fashion, thus enabling us to distinguish between normal and anomalous usages of an API.
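
To make the notion of a similarity measure concrete, the sketch below swaps in a much cruder technique than approximate graph isomorphism: the Jaccard overlap of labeled edge sets. Everything here is an assumption for illustration, including the edge encoding (source label, edge kind, target label), the example graphs, and the function names; none of it is the project's actual algorithm.

```python
def edge_overlap_similarity(graph_a, graph_b):
    """Jaccard similarity of two labeled edge sets, between 0.0 and 1.0."""
    if not graph_a and not graph_b:
        return 1.0  # two empty graphs are treated as identical
    return len(graph_a & graph_b) / len(graph_a | graph_b)


def least_typical(graphs):
    """Return the name whose mean similarity to the other graphs is lowest.

    Expects a dict of at least two named edge sets.
    """
    def mean_similarity(name):
        others = [other for other in graphs if other != name]
        total = sum(edge_overlap_similarity(graphs[name], graphs[other]) for other in others)
        return total / len(others)

    return min(graphs, key=mean_similarity)


# Hypothetical graphs: each edge is (source label, edge kind, target label).
QUERY, FIRST, CLOSE = "SQLiteDatabase.query", "Cursor.moveToFirst", "Cursor.close"
example_graphs = {
    "app_a": {(QUERY, "control", FIRST), (FIRST, "control", CLOSE), (QUERY, "data", CLOSE)},
    "app_b": {(QUERY, "control", FIRST), (FIRST, "control", CLOSE), (QUERY, "data", FIRST)},
    "app_c": {(QUERY, "control", FIRST)},  # never closes the cursor
}
outlier = least_typical(example_graphs)
```

In this toy data the two programs that close the cursor resemble each other more than either resembles the third, so the third is reported as the least typical usage. A real approximate isomorphism would also have to match nodes whose labels differ, which exact set overlap cannot do.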

Interactive Application for Information Visualization

After extracting data from Android repositories on GitHub, we obtain huge amounts of app data, ranging from developer commit messages and raw source code to syntactic diffs of source code. While much of this data is effectively machine accessible through curation by our Apache Solr servers, the native query interfaces provided by Solr Admin are not ideal for human interaction and understanding. To address this gap, our team is developing interactive support tools that augment existing search functionality and the visualization of graph-based results (e.g., program CDFGs). This includes the development of a mobile app that exercises these advanced features. Simultaneously, we are developing an interactive tool for visualizing dynamic traces extracted from crowd-sourcing mobile analytics frameworks such as Firebase. As a whole, our work here provides advanced visualization tools for future Android developers, enabling them to harness the power of communal knowledge buried in open-source software enclaves.

Feature Extraction from Commits in Public Android Software Repositories

In this activity of our project, we develop scalable cloud computing services that support the front line of this project: extraction, processing, and curation of data from repositories in public software enclaves such as GitHub. In particular, our work here involves developing scalable means of discovering and extracting information that ranges from basic metadata in commit histories (e.g., commit messages, parent-child relations) to syntactic information in raw source code (e.g., the bag of framework API calls and imports). The data extracted here will constitute the core of the corpus that the other research activities of this project use as input. By using state-of-the-art cluster computing systems (Apache Spark), full-text search services (Apache Solr), and machine-learning algorithms, we aim to develop an industrial-strength code search engine that enables developers to submit their buggy code fragments and query for potential fixes hidden in the sea of data we extract.
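
As a small illustration of one such syntactic feature, the sketch below counts Android framework imports in a single Java source string. It is a simplified stand-in: the regular expression and the function name bag_of_framework_imports are assumptions made for this example, and a production pipeline would more likely work from a parsed syntax tree than from pattern matching.

```python
import re
from collections import Counter

# Matches lines such as "import android.os.Bundle;" or "import android.view.*;".
# The prefix match also accepts androidx packages.
IMPORT_PATTERN = re.compile(
    r"^\s*import\s+(?:static\s+)?(android[\w.]*\*?)\s*;", re.MULTILINE
)


def bag_of_framework_imports(java_source):
    """Count each Android framework import that appears in the source text."""
    return Counter(IMPORT_PATTERN.findall(java_source))


example_source = """
package com.example.app;

import android.os.Bundle;
import android.view.View;
import java.util.List;

public class MainActivity extends android.app.Activity {}
"""
features = bag_of_framework_imports(example_source)
```

The resulting counts could then be stored alongside the commit metadata, for instance as fields of an indexed document, so that later analyses can filter repositories by the framework packages they touch.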

Fixr Android App Builder Farm

Far too often, before we can fix the apps, we have to fix the build scripts. The use of build automation tools (e.g., Gradle, Maven, Ant) is prevalent on GitHub, yet so are bad practices in the usage of such tools. From hard-coded dependencies and build paths to improper or missing build configurations, such bad practices hamper the buildability of the app, thus blocking any efforts in automated analysis of build artifacts (e.g., bytecode, Android APKs). Part of our team's current research and engineering effort is dedicated to mitigating this problem. In particular, our team is working to improve the buildability of unsolicited GitHub Android repositories. We have developed an app building script that applies various heuristics to fix the most common problems in Gradle, Ant, and Eclipse projects without human intervention. We have also implemented a prototype cloud service that automates the building of Android GitHub repositories from our corpus, stores build outputs and metadata, and facilitates querying and retrieval of such information.
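
The heuristics themselves are not described here, so the following is only a guess at what the simplest kind of check might look like: flagging quoted absolute paths in a build script, one of the bad practices named above. The pattern and the function name find_hard_coded_paths are hypothetical, and the sketch only detects the problem rather than repairing it.

```python
import re

# A quoted string that starts with "/" or with a Windows drive prefix such as "C:\".
ABSOLUTE_PATH = re.compile(r"""['"]((?:/|[A-Za-z]:\\)[^'"]+)['"]""")


def find_hard_coded_paths(build_script):
    """Return (line number, path) for each quoted absolute path in the script."""
    findings = []
    for line_number, line in enumerate(build_script.splitlines(), start=1):
        for match in ABSOLUTE_PATH.finditer(line):
            findings.append((line_number, match.group(1)))
    return findings


example_script = """android {
    compileSdkVersion 23
}
dependencies {
    compile files('/home/dev/libs/foo.jar')
    compile 'com.android.support:appcompat-v7:23.0.0'
}
"""
flagged = find_hard_coded_paths(example_script)
```

A fixing heuristic could then try to replace each flagged path with a project-relative one, although whether that is safe depends on where the referenced file actually lives.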

MUSE Team

Bor-Yuh Evan Chang

Pavol Černý

Sriram Sankaranarayanan

Tom Yeh

Kenneth M. Anderson
