RoboTA-common-errors - Automated software engineering assessment

RoboTA (Robot Teaching Assistant) is a group of Python packages that provide a framework for the assessment of software engineering practices. The focus of RoboTA is the assessment of student software engineering coursework, though it has a wider scope in the assessment of general good practice in software engineering.

The robota-core package collects information about a project from a number of sources: git repositories, issue trackers and CI servers. It is designed to be provider agnostic; for example, repository data can come from GitLab or GitHub.

The robota-common-errors package then uses this information to assess the project against a number of common software engineering errors or bad practices. Examples include committing non-project files to a git repository or merging a git branch into the remote tracking branch instead of the local one. The included errors are designed to be general and project agnostic, but it would be easy to add new methods to enforce project- or group-specific standards, such as a standard format for git commit messages or ensuring that issues are always assigned to an individual as soon as they are opened.

RoboTA was developed in the Computer Science department at the University of Manchester. From 2018 to March 2021, development of RoboTA was funded by the Institute of Coding.

Installation

To install as a Python module, type

python -m pip install .

from the root directory. Developers should install in editable (development) mode using

python -m pip install -e .

If you are using a Python virtual environment, activate it before running the above commands.

RoboTA Config

RoboTA requires access to a number of data sources to collect data to operate on.

Details of these data sources, and the information required to connect to them, are provided in the robota config YAML file.

Documentation on the config file can be found in the RoboTA config section of the documentation.

An example robota config file can be found in the root of the robota-common-errors repository.

API Reference

RoboTA config

RoboTA reads in various types of data from different sources. These are specified in the robota config file. The main data source documentation is included in the robota-core documentation. This documentation is supplementary to that, so you should read that documentation first.

robota-common-errors only makes use of a subset of the possible data types in robota-core and adds a single additional data type.

Data types

Issues, ci, repository and remote_provider are data types from robota-core that are used in robota-common-errors. They are documented in the robota-core documentation.

There is one additional data type for robota-common-errors:

common-errors

The location of the yaml file that defines the common errors.

Valid sources:

  • local_path

  • gitlab

Required keys:

  • file_name - The path of the YAML file containing the common error definitions

This key may specify sub-folder(s) in the git repository, e.g.

file_name: config_files/error_definitions.yaml

Data Sources

Data sources are unchanged from robota-core.

Getting started with RoboTA common errors

Once the robota-common-errors package has been installed, you can either run the provided report, which outputs detected common errors to an HTML page, or import the functions and incorporate the outputs into your own workflow.

Running the report

The common-errors report is not intended as the primary method for displaying the detected common errors, but is provided as a simple example of a possible output.

To run the common errors report, from the common-errors root directory run:

python robota_common_errors/report.py

This will produce a report in the webpages folder called common-error-report.html.

Optional parameters for the script are:

  • --config_path - This is the path to the robota config file. The default is ./robota-config.yaml

  • --output - This is the path where the report will be output. The default is ./webpages

  • --start - This is the start date of the report. Artifacts (commits, issues etc.) with a creation date older than this will not be considered in the report. The default is 2020-01-01

  • --end - This is the end date of the report. Artifacts newer than this will not be considered in the report. The default is the current date
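As a rough illustration of how the options above combine, the sketch below assembles a command line for the report script from Python. The helper name build_report_command and the example dates are assumptions made for illustration; only the script path and the four flags come from the text above.

```python
import sys
from typing import List, Optional


def build_report_command(config_path: Optional[str] = None,
                         output: Optional[str] = None,
                         start: Optional[str] = None,
                         end: Optional[str] = None) -> List[str]:
    """Assemble the argument list for the report script (hypothetical helper).

    Options left as None are omitted so the script falls back to its defaults.
    """
    command = [sys.executable, "robota_common_errors/report.py"]
    options = {"--config_path": config_path, "--output": output,
               "--start": start, "--end": end}
    for flag, value in options.items():
        if value is not None:
            command += [flag, value]
    return command


# Illustrative dates, written in the same format as the documented default.
example_command = build_report_command(start="2021-01-01", end="2021-03-31")
print(example_command[1:])
```

The resulting list could be passed to subprocess.run(command, check=True) from the common-errors root directory.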

Using common errors as a library

If you want to analyse common errors and use the data in your own scripts, use the robota_common_errors.report.identify_common_errors() function. It takes the robota config together with start and end times as arguments (the API reference lists the first parameter as robota_config: dict rather than a path) and returns a list of CommonError objects. These can then be converted into whatever display format you want to use.
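A minimal sketch of post-processing such a list, assuming only the name attribute and count_errors() method listed for CommonError in the API reference. The helper summarise_occurrences and the FakeError stand-in are hypothetical; they exist only so the example runs without RoboTA being configured.

```python
from typing import Dict, Iterable


def summarise_occurrences(common_errors: Iterable) -> Dict[str, int]:
    """Map each detected error's name to its number of occurrences.

    Relies only on the documented `name` attribute and `count_errors()` method.
    Errors with no occurrences are left out of the summary.
    """
    summary = {}
    for error in common_errors:
        occurrences = error.count_errors()
        if occurrences > 0:
            summary[error.name] = occurrences
    return summary


class FakeError:
    """Hypothetical stand-in for CommonError, used only to exercise the helper."""

    def __init__(self, name: str, rows: list):
        self.name = name
        self.error_details = rows

    def count_errors(self) -> int:
        # Assumption for this stand-in: one table row per occurrence.
        return len(self.error_details)


errors = [FakeError("Wrong way merge", [["abc123"], ["def456"]]),
          FakeError("Repeated revert", [])]
print(summarise_occurrences(errors))
```

In real use, the list returned by identify_common_errors() would take the place of the stand-in objects.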

Adding new common errors

Common errors are described in the error_definitions.yaml file. The marking_function key in the error description is the name of the method in the code that will be called to assess the error. The code that actually assesses the error is located in the common_error_functions.py file. Functions to assess a common error must have the signature marking_function_name(data_source: DataServer, common_error: CommonError) and should populate the CommonError with the result and return this.
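The signature described above can be sketched as follows, using a made-up project rule that commit subjects start with an issue reference. This is an illustration under stated assumptions, not code from the package: get_commit_subjects() is a hypothetical accessor rather than a documented DataServer method, and the two stub classes stand in for DataServer and CommonError so that the example runs on its own.

```python
import re
from typing import List

# Illustrative house rule (an assumption): subjects start with "#<number> ".
ISSUE_PREFIX = re.compile(r"^#\d+\s")


class StubDataServer:
    """Stand-in for DataServer; get_commit_subjects() is a hypothetical accessor."""

    def __init__(self, subjects: List[str]):
        self._subjects = subjects

    def get_commit_subjects(self) -> List[str]:
        return list(self._subjects)


class StubCommonError:
    """Stand-in for CommonError exposing the two documented feedback methods."""

    def __init__(self):
        self.detail_titles: List[str] = []
        self.error_details: List[List[str]] = []

    def add_feedback_titles(self, feedback_titles: List[str]) -> None:
        self.detail_titles = feedback_titles

    def add_feedback(self, feedback: List[List[str]]) -> None:
        self.error_details = feedback


def missing_issue_reference(data_source, common_error):
    """Flag commit subjects that do not start with an issue reference."""
    rows = [[subject] for subject in data_source.get_commit_subjects()
            if not ISSUE_PREFIX.match(subject)]
    # Populate the error with a one-column table, then return it.
    common_error.add_feedback_titles(["Commit subject"])
    common_error.add_feedback(rows)
    return common_error


result = missing_issue_reference(
    StubDataServer(["#12 Fix login redirect", "tidy up"]), StubCommonError())
print(result.error_details)
```

To register a function like this, its name would go in the marking_function key of the corresponding entry in error_definitions.yaml.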

robota_common_errors

robota_common_errors package

Subpackages
robota_common_errors.output_templates package
Submodules
robota_common_errors.output_templates.build_webpages module
robota_common_errors.output_templates.build_webpages.build_webpages(template_dir: Path, output_dir: Path)

Create ancillary pages from templates and copy them to output_dir.

Parameters
  • template_dir – The path to load page templates from.

  • output_dir – The path to output generated pages to.

Module contents
Submodules
robota_common_errors.analysis module

An example of a simple data analysis script that might be run on the JSON output of common_error_collection.py.

robota_common_errors.analysis.main(file_name: str)
robota_common_errors.analysis.read_json(file_path: Path) -> DataFrame
robota_common_errors.common_error_collection module
class robota_common_errors.common_error_collection.CommonErrorReport(course: str, year: str, team_name: str, error_name: str, detail_titles: List[str], error_details: List[List[str]], num_occurrences: int)

Bases: object

course: str
detail_titles: List[str]
error_details: List[List[str]]
error_name: str
num_occurrences: int
team_name: str
year: str
robota_common_errors.common_error_collection.main(config_path: str, output_name: str)
robota_common_errors.common_error_collection.output_to_json(file_name: str, common_error_reports: List[CommonErrorReport])
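To make the shape of this data concrete, here is a minimal sketch that serialises records with the CommonErrorReport fields listed above and aggregates them again using only the standard library. The JSON layout (an array of objects), the Report dataclass and the sample values are assumptions for illustration; the file written by output_to_json may be laid out differently.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Report:
    """Local mirror of the documented CommonErrorReport fields."""
    course: str
    year: str
    team_name: str
    error_name: str
    detail_titles: List[str]
    error_details: List[List[str]]
    num_occurrences: int


def reports_to_json(reports: List[Report]) -> str:
    """Serialise reports as a JSON array of objects (assumed layout)."""
    return json.dumps([asdict(report) for report in reports], indent=2)


def occurrences_by_error(json_text: str) -> Counter:
    """Total num_occurrences per error_name across all teams."""
    totals = Counter()
    for record in json.loads(json_text):
        totals[record["error_name"]] += record["num_occurrences"]
    return totals


# Made-up sample data.
sample = [
    Report("SE101", "2021", "team-a", "Wrong way merge", ["Commit"], [["abc"]], 1),
    Report("SE101", "2021", "team-b", "Wrong way merge", ["Commit"], [["def"], ["ghi"]], 2),
]
totals = occurrences_by_error(reports_to_json(sample))
print(totals)
```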
robota_common_errors.common_error_functions module

These functions specifically test the common errors.

robota_common_errors.common_error_functions.commit_unnecessary_files(data_source: DataServer, common_error: CommonError) -> CommonError

Certain files shouldn’t be committed to a Git repo, and should probably be specified in a .gitignore file.

robota_common_errors.common_error_functions.committing_comments(data_source: DataServer, common_error: CommonError) -> CommonError

Students often commit commented-out code, which is bad practice. This function checks the diffs for any Java-style comment syntax.

robota_common_errors.common_error_functions.committing_conflict_markup(data_source: DataServer, common_error: CommonError) -> CommonError

Conflicts should be resolved before committing. Any commit conflict markup which gets committed shows that the conflict has not been properly resolved.
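One way such a check could work is to scan the added lines of a unified diff for Git's conflict markers. The sketch below is an illustration under that assumption and is not the package's implementation; the function name find_conflict_markup and the sample diff are made up.

```python
import re
from typing import List

# An added diff line ("+" prefix) whose content starts with a conflict marker:
# "<<<<<<< ", "=======" on its own, or ">>>>>>> ".
CONFLICT_MARKER = re.compile(r"^\+(?:<{7} |={7}$|>{7} )")


def find_conflict_markup(diff_text: str) -> List[str]:
    """Return the added lines of a unified diff that look like conflict markers."""
    return [line for line in diff_text.splitlines() if CONFLICT_MARKER.match(line)]


sample_diff = "\n".join([
    "@@ -1,3 +1,7 @@",
    " def greet():",
    "+<<<<<<< HEAD",
    "+    print('hello')",
    "+=======",
    "+    print('hi')",
    "+>>>>>>> feature",
])
print(find_conflict_markup(sample_diff))
```

Only added lines are inspected, so markers that were already present before the commit are not attributed to it.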

robota_common_errors.common_error_functions.confusing_branch_names(data_source: DataServer, common_error: CommonError) -> CommonError

Some branch names are obviously confusing and should be avoided. For example, a local branch called origin/branch would have a remote tracking branch called origin/origin/branch.
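The origin/branch case can be expressed as a simple prefix test. The following is a minimal sketch, assuming the list of remote names is known; the function name confusing_names and the default remote tuple are illustrative, and the package may apply other rules as well.

```python
from typing import Iterable, List


def confusing_names(branch_names: Iterable[str],
                    remote_names: Iterable[str] = ("origin",)) -> List[str]:
    """Return local branch names that start with a remote's name followed by '/'."""
    prefixes = tuple(f"{remote}/" for remote in remote_names)
    return [name for name in branch_names if name.startswith(prefixes)]


print(confusing_names(["develop", "origin/feature", "feature/origin"]))
```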

robota_common_errors.common_error_functions.get_comment_line_number(diff, line_num_in_diff)

Return starting line number of hunk which contains line_num_in_diff
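For readers unfamiliar with unified diffs, the sketch below shows one way to find the start line of the hunk containing a given diff line, by reading the preceding "@@ -a,b +c,d @@" header. It is an assumption-laden illustration, not the package's code: it takes the diff as plain text, treats line_num_in_diff as a zero-based index, and reports the new-file start line, any of which may differ from the real function.

```python
import re
from typing import Optional

# Unified diff hunk header, e.g. "@@ -10,6 +12,8 @@"; group 1 is the new-file start line.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def hunk_start_line(diff_text: str, line_num_in_diff: int) -> Optional[int]:
    """Return the new-file start line of the hunk containing a given diff line.

    `line_num_in_diff` is a zero-based index into the lines of `diff_text`.
    Returns None if no hunk header precedes that line.
    """
    start = None
    for index, line in enumerate(diff_text.splitlines()):
        if index > line_num_in_diff:
            break
        match = HUNK_HEADER.match(line)
        if match:
            # Remember the most recent header seen at or before the target line.
            start = int(match.group(1))
    return start


sample_diff = "\n".join([
    "@@ -1,2 +1,3 @@", " a", "+b", " c",
    "@@ -10,2 +11,3 @@", " x", "+y", " z",
])
print(hunk_start_line(sample_diff, 2), hunk_start_line(sample_diff, 6))
```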

robota_common_errors.common_error_functions.get_commit_diffs(commits: List[Commit], data_source: DataServer) -> Dict[Commit, List[Diff]]

Get the diffs for each commit in commits.

While it would be fewer API calls to do a diff of all commits at once, the API refuses to return the diff if it is too big. This means doing it one commit at a time is safer.

robota_common_errors.common_error_functions.get_line_num_in_diff(diff_up_to_match: Diff, match: Match)

Return line number in diff of a regex match object

robota_common_errors.common_error_functions.get_subject_errors(message: str) -> List[str]

Detect errors with the subject line of git commit messages.

Parameters

message – Commit message

Returns

List of detected problems.

robota_common_errors.common_error_functions.good_commit_messages(data_source: DataServer, common_error: CommonError) -> CommonError

Git commit messages should have a standardised formatting.

robota_common_errors.common_error_functions.merge_remote_branch(data_source: DataServer, common_error: CommonError) -> CommonError

Identify cases where students have merged a remote tracking branch. They should be pulling before making local commits or rebasing their local branch onto the remote branch.

robota_common_errors.common_error_functions.repeated_revert(data_source: DataServer, common_error: CommonError) -> CommonError

Identify cases where students have repeatedly attempted to revert commits. This is often because they think that a commit revert is like an undo command.
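One possible signal for this behaviour is a commit subject that reverts an earlier revert. The sketch below looks for the nested 'Revert "Revert "...' prefix that Git's default revert message can produce. It is an illustrative heuristic, not the package's implementation; newer Git versions or hand-edited messages may word such commits differently, and the function name reverted_reverts is made up.

```python
from typing import Iterable, List

# Prefix of a default-style subject for a commit that reverts a revert.
NESTED_REVERT_PREFIX = 'Revert "Revert "'


def reverted_reverts(subjects: Iterable[str]) -> List[str]:
    """Return commit subjects that appear to revert an earlier revert."""
    return [subject for subject in subjects
            if subject.startswith(NESTED_REVERT_PREFIX)]


history = [
    'Revert "Add login page"',
    'Revert "Revert "Add login page""',
    "Add login page",
]
print(reverted_reverts(history))
```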

robota_common_errors.common_error_functions.wrong_use_of_estimate(data_source: DataServer, common_error: CommonError) -> CommonError

Identify cases where students have misused the /estimate command when trying to record time estimates for a gitlab issue.

robota_common_errors.common_error_functions.wrong_use_of_spend(data_source: DataServer, common_error: CommonError) -> CommonError

Identify cases where students have misused the /spend command when trying to record time spent working on a gitlab issue.

robota_common_errors.common_error_functions.wrong_way_merge(data_source: DataServer, common_error: CommonError) -> CommonError

Identify cases where students have merged develop into their feature branch rather than the feature branch into the development branch.
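A rough way to spot this pattern is to parse merge commit subjects of the form "Merge branch 'X' into Y". The sketch below assumes such default-style subjects and a development branch named develop. Both are assumptions, the function name is made up, and the package may rely on the commit graph rather than on message text.

```python
import re

# Default-style merge subject, e.g. "Merge branch 'develop' into feature-login".
MERGE_SUBJECT = re.compile(r"^Merge branch '([^']+)' into (\S+)$")


def is_wrong_way_merge(subject: str, development_branch: str = "develop") -> bool:
    """True if the subject records the development branch being merged into another branch."""
    match = MERGE_SUBJECT.match(subject)
    if match is None:
        return False
    source, target = match.groups()
    return source == development_branch and target != development_branch


print(is_wrong_way_merge("Merge branch 'develop' into feature-login"))
print(is_wrong_way_merge("Merge branch 'feature-login' into develop"))
```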

robota_common_errors.common_errors module

Common errors are suboptimal student behaviors that occur regularly. If a common error is identified, feedback is given on how to correct it or avoid it in future.

Common errors are much like Tasks except they are not assigned any credit.

class robota_common_errors.common_errors.CommonError(common_error: dict)

Bases: object

Representation of a common error that could be identified

Variables
  • name – The name of the common error.

  • marking_function – The name of the function used to assess the common error.

  • description – A textual description of the common error.

  • detail_titles – Titles of a table to be printed on the marking report.

  • error_details – Rows of a table to be printed on the marking report.

add_feedback(feedback: List[List[str]])
add_feedback_titles(feedback_titles: List[str])
count_errors() -> int

Sum the occurrences of this error by checking error details.

robota_common_errors.common_errors.assess_common_errors(data_source: DataServer, common_errors: List[CommonError]) -> List[CommonError]

Go through all of the possible common errors, running the assessment function for each one.

robota_common_errors.common_errors.get_error_descriptions(robota_config: dict) -> List[CommonError]

Get the textual description of common errors to identify.

robota_common_errors.common_errors.validate_data_sources(data_source: DataServer, common_error: CommonError)

Check the data sources that are available from the DataServer.

robota_common_errors.report module

Generates an HTML report of common errors for a git repo.

robota_common_errors.report.count_errors(common_errors: List[CommonError]) -> int

Count how many different common error types were detected.

robota_common_errors.report.identify_common_errors(robota_config: dict, start: datetime, end: datetime) -> List[CommonError]

The main function which gets the common errors.

robota_common_errors.report.output_html_report(common_errors: List[CommonError], data_source_info: dict, common_error_summary: dict, output_dir: str)
robota_common_errors.report.run_html_error_report(start: str, end: str, config_path: str, output_dir: str, substitution_variables: dict)

Get common errors and output them in the form of an HTML report.

robota_common_errors.report.summarise_common_errors(common_errors: List[CommonError])
robota_common_errors.report.summarise_data_sources(robota_config: dict, start: datetime, end: datetime) -> dict
robota_common_errors.report.update_template(common_errors: List[CommonError], data_source_info: dict, common_error_summary: dict)

Produce the HTML report by writing the marking results to an HTML template.

Parameters
  • common_errors – A list of common errors and feedback on them.

  • data_source_info – A dictionary of information about the data sources that were used.

  • common_error_summary – A dictionary of stats about the common errors that is printed in the report.

Module contents