Testing Stage
The Testing Stage has two related aims:
Ensure that the researcher’s code and operating environment conform to the POPROX API and expectations.
Ensure that the output of the researcher’s code meets POPROX standards.
Conformance testing
We will test the following aspects of the researcher’s code for conformance:
Does the code conform to the POPROX API? Does the API endpoint accept all of the inputs that POPROX might send it, and does the endpoint return outputs that match the API expectations? See [POPROX API documentation] for additional details.
Does the code produce recommendations within the expected service level? We expect recommendation results for each user to be returned within 25 seconds. Occasional timeouts or hiccups are acceptable, although the percentage of requests that must be answered within this window is still to be determined. In a live experiment, baseline recommendation results will be used in place of the experimenter’s results if those results are not available in time, but this may have consequences for experiment validity.
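The service-level check described above can be rehearsed locally before conformance testing. The following is a minimal sketch, not POPROX's actual test harness: the names `recommend_or_baseline`, `on_time_fraction`, `recommender`, and `baseline` are hypothetical, and only the 25-second window comes from this page. It times a single call, falls back to a baseline result when the window is exceeded, and reports the share of requests answered in time.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

TIME_BUDGET_SECONDS = 25.0  # per-user window stated in this section


def recommend_or_baseline(recommender, baseline, request, budget=TIME_BUDGET_SECONDS):
    """Run a (hypothetical) recommender callable under a time budget.

    Returns (result, on_time, elapsed_seconds). If the recommender does not
    finish within `budget` seconds, the baseline callable's result is used.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    start = time.perf_counter()
    future = pool.submit(recommender, request)
    try:
        result, on_time = future.result(timeout=budget), True
    except FutureTimeout:
        # Too slow: substitute the baseline, as a live experiment would.
        result, on_time = baseline(request), False
    finally:
        # Do not block on the slow call; its worker thread finishes on its own.
        pool.shutdown(wait=False)
    return result, on_time, time.perf_counter() - start


def on_time_fraction(on_time_flags):
    """Share of requests answered within the budget (0.0 for an empty list)."""
    flags = list(on_time_flags)
    return sum(flags) / len(flags) if flags else 0.0
```

Collecting the `on_time` flag over a batch of test requests and passing the list to `on_time_fraction` gives a rough estimate of how often baseline results would be substituted in a live experiment.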
Testing is an ongoing process, and experimenters will have opportunities to update their codebases in response to feedback from the team. Conformance testing must be successful before dark live testing can begin.
Content evaluation
At the same time as the technical aspects of the code are being evaluated, the content of the produced newsletters will be examined against the POPROX content standards to assess whether the newsletters are acceptable as POPROX output. This is a more subjective assessment made by your POPROX consultant, with special attention paid to any changes in formatting or programmatically added content, such as explanations.
Document content standards
Dark live testing
Once content evaluation and conformance testing are complete, experimental code will undergo one week of dark live testing. During this week, the experimenter will be expected to produce newsletters for their specific POPROX subjects under the same conditions as the live experiment, using the same API calls and the same constraints. The only difference is that the newsletters will not be delivered to subscribers; instead, they will be reviewed by the POPROX team.