Plugis

Plugins are a powerful way to extend the functionality of the application by defining instructions. These instructions can be applied to a data node, which contains data that comes from chains of previous instructions. Writing a plugin is an excellent way to contribute to the project, and the How to add plugins section provides more details on how to do so.

Implemented plugins

Instruction of all plugins should include plugin_name key.

plugin name	plugin required fields
filter	plugin_name, filter_condition
drop_na	plugin_name

filter

The filter plugin allows the user to filter the data based on a set of rules. Internally it uses the query function in the pandas library.

This plugin conducts the following steps:

Load data from the database and join them based on the id column.
Apply the filter_condition as a query to the data.
Select the id column from the filtered data.
Generate output_data dictionary and assign the selected ids to d_index field.
Return the output_data dictionary.

See the Data Node section for more details on d_index.

Here are a couple of examples:

>>> instruction_1 = {"plugin_name": "filter", "filter_condition": "exposure > 10"}
>>> instruction_2 = {"plugin_name": "filter", "filter_condition": "cs_white > 0.45 & cs_white < 0.8"}
>>> instruction_3 = {"plugin_name": "filter", "filter_condition": "cs_household_income > 15000"}

drop_na

This is similar to the filter plugin. The only difference is that it drops rows that contain NaN values.

This plugin conducts the following steps:

Load data from the database and join them based on the id column.
Drop rows with missing values.
Select the id column from the remaining data.
Generate output_data dictionary and assign the selected ids to d_index field.
Return the output_data dictionary.

Here is an example:

>>> instruction_4 = {"plugin_name": "drop_na"}