.. _rules_vfields:

.. raw:: html

.. role:: bolditalic
   :class: bolditalic

Rules and V-Fields
======================

.. contents::
   :local:
   :depth: 2

.. toctree::
   :numbered:
   :maxdepth: 2

Rules
-------

**Rules** are validations created to improve data accuracy. Each submitted file is evaluated against a set of rules. These rules are derived both from the :ref:`dataset specifications ` and from checks identified by the Commonwealth and jurisdictions to draw attention to unusual trends that have commonly been found over the history of the project. Rules for datasets can be found via the `metadata `_ site.

Each rule has a set of elements, some of which are used in reporting. These include:

* **Name** - Unique shortname used to identify a rule.
* **Class**

  - Anomaly: A field or combination of fields contains data that is likely to be incorrect.
  - Barren: The record is expected to have child records but none are present. This can occur if the child record exists but has irredeemable errors.
  - Exceptional: Identifies indicators derived from data combinations that are exceptional on statistical (normative) criteria. Exceptional indicators may point to errors in one or more of the component data elements, or may be based on correct data.
  - Historical: Information from previous years is used to find changes between years. Examples include establishments opening, closing or being renamed, and significant changes in items that are expected to be stable. The value provided may be correct but should be checked.
  - Inconsistent: There is a logical inconsistency between two fields or derived data items.
  - Invalid: A field contains incorrect data that is misformatted or outside the Domain.
  - Missing: A field contains no meaningful data. Depending on the entry involved, it may be all spaces, all zeroes, or a Missing value in the Domain (e.g. "9") if applicable to the dataset.
  - Skeleton: Structural comparisons to the SKL file to check that the same set of entities is used, or that there is a statistical match between files.

* **Priority** - The priority of rules has been determined by the jurisdictions and Commonwealth to enable users to focus on data issues with the greatest impact on the accuracy of reporting.

  - Low
  - Medium
  - High

* **Bulk** - Simple rules that result in a high number of similar issues, such as spaces being used to indicate missing data rather than the appropriate missing value, are reported in bulk, that is, as a total count of the times the issue occurs in the submission file.
* **Message** - Short message that briefly describes the issue. The following extensions control how values in a message are formatted:

  - $xxx.perc - this extension formats the numbers as percentages
  - $xxx.commas - this extension formats the numbers with commas
  - $xxx.dmy and $xxx.ddmmyyyy - these extensions format the numbers as dates

* **Mark** - Indicates on which field or record the error is marked.
* **Description** - Detailed description of the issue.
* **SQL** - Outlines the SQL implementation of the rule.

VFields
--------

Some rules use **Virtual Elements** (*VFields*): fields that are not supplied directly in your data but are instead calculated from a variety of fields in the submitted data file. VFields and their SQL can be found via the `metadata site `_.

Each VField has the following elements:

* **Name** - Unique shortname used to identify a VField.
* **Base** - Indicates on which record type the calculation is based.
* **Title** - Descriptive title of the VField.
* **SQL** - Outlines the SQL implementation of the virtual field calculation.
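
The relationship between a VField and a rule that uses it can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the ``Record`` fields, the ``v_duration_days`` VField and the ``DurationNegative`` rule do not come from any dataset specification, and the actual rules are implemented in SQL as noted above, not in Python.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    # Hypothetical submitted fields (assumed for illustration only).
    record_id: str
    start_date: date
    end_date: date


def v_duration_days(record: Record) -> int:
    """Hypothetical VField: calculated from two submitted fields,
    never supplied directly in the data file."""
    return (record.end_date - record.start_date).days


@dataclass
class Rule:
    # A subset of the rule elements described in this section.
    name: str        # unique shortname
    rule_class: str  # e.g. "Inconsistent"
    priority: str    # "Low", "Medium" or "High"
    message: str     # short description of the issue

    def failures(self, records):
        # Flag records whose derived duration is negative,
        # i.e. the end date precedes the start date.
        return [r.record_id for r in records if v_duration_days(r) < 0]


records = [
    Record("A1", date(2020, 1, 1), date(2020, 1, 10)),
    Record("A2", date(2020, 3, 5), date(2020, 3, 1)),
]

rule = Rule(
    name="DurationNegative",
    rule_class="Inconsistent",
    priority="High",
    message="End date is earlier than start date",
)

flagged = rule.failures(records)
print(flagged)
```

In this sketch the rule never reads the raw dates itself. It depends only on the derived value, which mirrors how a rule can be written against a VField rather than against the submitted fields.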